Content Moderation and User Engagement Insights for Schools

Objective

My goal was to identify trends and behaviors related to content moderation, including how flagged, deleted, ignored, or reported content affects user engagement. I also aimed to provide actionable insights to improve the efficiency of content review processes and optimize engagement across the platform.

High-Level Findings

Bar Graph 1

Bar Graph of Flagged Content Updates

The chart above shows the timeframes in which students returned to update their flagged content after the initial scan. Most updates occurred on the same day the content was flagged, accounting for 1,784 posts. A secondary group of updates came more than a week later (730 posts), with minimal activity the next day (45 posts) or later within the same week (25 posts).

Key Insight: The majority of user updates happen immediately on the same day, while a smaller secondary group returns after an extended period of over a week.

Bar Graph 2

Bar Graph of Confidence Scores

The chart above illustrates the confidence scores for the various content moderation categories, including toxicity, obscenity, insults, and polarizing content. Confidence levels across most categories are relatively low.

Key Insight: The low confidence scores across categories suggest that the content moderation model may struggle to classify inappropriate content accurately. This could explain why 10% of flagged content is ignored and only 4% is deleted. A more robust model with improved confidence levels could help reduce ignored flagged content and improve deletion rates.
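
As an illustration, the average confidence per category can be computed directly from the score columns. This is only a minimal sketch; the column names below are hypothetical placeholders, since the dataset's actual schema isn't shown here.

# Minimal sketch: average model confidence per moderation category.
# The column names are hypothetical placeholders for the model's score fields.
confidence_columns = ['toxicity_score', 'obscenity_score', 'insult_score', 'polarizing_score']

mean_confidence = data[confidence_columns].mean().sort_values()
print(mean_confidence)  # categories with the lowest average confidence appear first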

Conclusion

I analyzed how students interact with flagged content and the actions they take, such as deleting, ignoring, or reporting it. The majority of flagged content is addressed on the same day it is flagged, while a smaller group of students returns after a week or more.

My analysis revealed that 10% of flagged content is ignored, compared to only 4% that is deleted. This suggests that much of the flagged content is not actually inappropriate, or that the flagging system is overly sensitive, which is consistent with its low confidence scores.
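
For reference, outcome shares like these can be derived from the per-post moderation outcome. A minimal sketch, assuming hypothetical 'is_flagged' and 'moderation_action' columns:

# Minimal sketch: share of flagged posts by moderation outcome.
# 'is_flagged' and 'moderation_action' are hypothetical column names.
flagged = data[data['is_flagged']]
action_share = flagged['moderation_action'].value_counts(normalize=True) * 100
print(action_share.round(1))  # e.g. percent ignored, deleted, reported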

Low confidence scores are a likely contributor to the system's difficulty in distinguishing inappropriate content, and they may lead users to ignore flagged content rather than act on it. Improving the model's confidence through enhanced datasets, better training processes, and targeted feature engineering will be critical to achieving better content moderation outcomes.

By focusing on categories with the lowest confidence, such as polarizing content and identity attacks, we can ensure that the system more accurately flags inappropriate material, resulting in higher deletion rates and fewer ignored flags.

Potential Next Steps

Python Code Showcase

This project utilized Python extensively for exploratory data analysis, cleaning, statistical insights, and visualizations. Below are some key examples demonstrating my proficiency in working with data using Pandas, custom functions, and visualization tools like Matplotlib and Seaborn.

Exploratory Data Analysis


import pandas as pd

# 'data' is the flagged-content dataset, assumed to already be loaded into a DataFrame

# Check data types
print(data.dtypes)

# Check the shape (number of rows and columns)
print(data.shape)

# Get summary statistics
print(data.describe(include='all'))

# Group by 'business_customer_id' to count the number of posts for each user
posts_per_student = data.groupby('business_customer_id')['id'].count()
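
As a quick follow-up, the per-student counts can be summarized and sorted to surface the most active accounts:

# Summarize posting activity and surface the most active accounts
print(posts_per_student.describe())
print(posts_per_student.sort_values(ascending=False).head(10))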

Data Cleaning and Transformation


# Drop rows with any missing values
data_cleaned = data.dropna()

# Convert 'createdAt' and 'updatedAt' to datetime
data_cleaned['createdAt'] = pd.to_datetime(data_cleaned['createdAt'], errors='coerce')
data_cleaned['updatedAt'] = pd.to_datetime(data_cleaned['updatedAt'], errors='coerce')

# Calculate the difference in days between 'createdAt' and 'updatedAt'
data_cleaned['update_duration_days'] = (data_cleaned['updatedAt'] - data_cleaned['createdAt']).dt.days
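
Before classifying the durations, it helps to sanity-check the new column, for example:

# Sanity-check the computed durations (unexpected negatives or NaNs would show up here)
print(data_cleaned['update_duration_days'].describe())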

Custom Function for Classification


# Define a function to classify the date difference
def classify_update_duration(days):
    if days == 0:
        return 'Same Day'
    elif days == 1:
        return 'Next Day'
    elif days <= 7:
        return 'Within a Week'
    else:
        return 'More than a Week'
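
To connect this function to the chart above, it is applied to the duration column and the results are counted. A minimal sketch, assuming the cleaned DataFrame from the previous step:

# Apply the classification to each post's update duration and count posts per timeframe
classification_counts = (
    data_cleaned['update_duration_days']
    .apply(classify_update_duration)
    .value_counts()
)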

Statistical Insights and Visualization


import matplotlib.pyplot as plt
import seaborn as sns

# Order the timeframe categories explicitly (sorting the index would give alphabetical order)
category_order = ['Same Day', 'Next Day', 'Within a Week', 'More than a Week']
classification_counts = classification_counts.reindex(category_order, fill_value=0)

# Convert the counts to a DataFrame for reporting
classification_counts_df = classification_counts.reset_index()
classification_counts_df.columns = ['Update Duration', 'Count']

# Set style
sns.set(style="whitegrid")

# Create a figure and axis object
fig, ax = plt.subplots(figsize=(10, 6))

# Plot the bar chart
classification_counts.plot(kind='bar', color=sns.color_palette("Blues_r", len(classification_counts)), ax=ax)

# Set title
plt.title('Days Returned After Initial Scan', fontsize=16, weight='bold')

# Set x and y labels
plt.xlabel('Timeframe Returned', fontsize=14, weight='bold')
plt.ylabel('Number of Posts Updated', fontsize=14, weight='bold')

# Rotate x-axis labels
plt.xticks(rotation=45, fontsize=12)

# Add gridlines
plt.grid(True, which='both', axis='y', linestyle='--', linewidth=0.7)

# Add data labels
for idx, value in enumerate(classification_counts):
    ax.text(idx, value + 1, str(value), ha='center', va='bottom', fontsize=12)

# Display plot
plt.tight_layout()
plt.show()

These examples highlight my ability to clean and transform data, derive statistical insights, and create professional-quality visualizations that communicate actionable findings effectively.

Explore the Full Jupyter Notebook