Case Study · AI Analytics

AI Content Quality Monitoring

A monitoring framework for AI-assisted content moderation quality, model risk, and human review workload.

Focus: AI quality · Risk monitoring · Human review operations
Stack: Python · SQL · scikit-learn · Power BI-ready outputs
Type: Portfolio case study
Routing console · threshold 0.50
Auto-approve: 80.3% (25616)
Human review: 10.4% (3325)
Escalate: 1.9% (617)

Risk scores · sample
C-0042: 0.03 · safe
C-0117: 0.67 · review
C-0293: 0.94 · escalate
C-0381: 0.21 · safe
C-0504: 0.48 · review
Comments analyzed: 159571
Auto-approved: 80.3%
High-risk escalations: 617
Weighted avg F1: 0.705
The problem

AI moderation does not fail in one way.

A missed threat creates a safety risk. A false flag creates a poor user experience. Both matter, but they do not have the same operational cost. This framework sits between an AI classifier and a human review team to monitor moderation quality, review workload, and model risk.

What needed monitoring
Safety: Rare but high-risk labels needed escalation
Quality: False positives could damage user experience
Operations: Review workload changed with threshold choices
Decision: Teams needed clear routing rules, not only model scores
Findings
What the monitoring layer revealed.
01

Risk was not evenly distributed

Most comments were safe, but rare categories like threat and identity hate carried higher review risk.

02

Model quality varied by label

Threat and identity_hate were the weakest labels by F1 because they had less training signal.

03

Thresholds changed workload

Changing confidence thresholds shifted the balance between safety coverage and human review volume.

04

Review routing mattered

The framework separated auto-approve, human review, and escalation decisions.

05

Priority scoring improved triage

A final priority score combined severity, multi-label toxicity, and model uncertainty.

06

Dashboard outputs made monitoring repeatable

The pipeline produced Power BI-ready tables for quality, workload, and threshold tracking.

Method
Building the analytics layer around the model.

Four layers, applied in sequence.

I

Data preparation

Processed comments, created label features, risk scores, and risk tiers.
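A minimal sketch of this step, assuming a Jigsaw-style six-label schema; the severity weights and tier boundaries are illustrative assumptions, not the case study's actual values:

```python
import pandas as pd

# Hypothetical label columns (Jigsaw-style toxicity schema).
LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]
# Assumed severity weights: critical labels contribute more to risk.
WEIGHTS = {"toxic": 1, "severe_toxic": 3, "obscene": 1,
           "threat": 4, "insult": 1, "identity_hate": 3}

def add_risk_features(df: pd.DataFrame) -> pd.DataFrame:
    """Add a weighted risk score and a coarse risk tier per comment."""
    out = df.copy()
    # Weighted sum of binary label columns.
    out["risk_score"] = sum(out[label] * w for label, w in WEIGHTS.items())
    # Bucket the score into monitoring tiers (boundaries are assumptions).
    out["risk_tier"] = pd.cut(out["risk_score"],
                              bins=[-1, 0, 2, 5, 100],
                              labels=["safe", "low", "medium", "critical"])
    return out
```

The tier column is what makes rare, high-severity combinations visible in aggregate monitoring even when they are a tiny share of traffic.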

II

Baseline model

TF-IDF + OneVsRest Logistic Regression baseline to generate per-label probabilities.
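The baseline can be sketched as a scikit-learn pipeline; the hyperparameters here are assumptions, since the case study only names the technique:

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline

# One binary column per label; order matters for reading predict_proba output.
LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

# Hyperparameters (max_features, ngram_range, max_iter) are illustrative.
baseline = make_pipeline(
    TfidfVectorizer(max_features=50_000, ngram_range=(1, 2), sublinear_tf=True),
    OneVsRestClassifier(LogisticRegression(max_iter=1000)),
)

# Usage (X: list of comment strings, Y: (n, 6) binary label matrix):
#   baseline.fit(X, Y)
#   probs = baseline.predict_proba(X_new)  # per-label probabilities, shape (m, 6)
```

OneVsRest fits one logistic regression per label, which is what makes per-label probabilities, and therefore per-label routing, possible.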

III

Routing logic

Converted model scores into auto-approve, human review, and escalation decisions.
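A minimal sketch of the routing step. The 0.50 threshold matches the case study; the auto-approve cutoff (half the threshold) is an assumption added for illustration:

```python
LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]
# Labels that always escalate when confident, per the decision rules.
HIGH_RISK = {"threat", "identity_hate", "severe_toxic"}

def route(probs: dict, threshold: float = 0.50) -> str:
    """Map per-label probabilities to a routing decision."""
    # Escalate whenever a high-risk label crosses the threshold, however rare.
    if any(probs[label] >= threshold for label in HIGH_RISK):
        return "escalate"
    # Auto-approve only when every label score is clearly low (assumed cutoff).
    if max(probs.values()) < threshold * 0.5:
        return "auto-approve"
    # Everything uncertain or medium-confidence goes to a human.
    return "human-review"
```

On the sample scores shown earlier, a comment at 0.03 auto-approves, 0.48 goes to review, and 0.94 on a high-risk label escalates.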

IV

Monitoring outputs

Created dashboard-ready tables for model performance, review queue, threshold scenarios, and workload monitoring.
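The output step can be sketched as a small export function; the column names (`decision`, `max_score`, `risk_tier`) and file names are hypothetical:

```python
import pandas as pd

def build_monitoring_tables(results: pd.DataFrame, out_dir: str = ".") -> None:
    """Write Power BI-ready CSVs from routed results.

    `results` is assumed to hold one row per comment with `decision`,
    `max_score`, and `risk_tier` columns produced by the routing step.
    """
    # Review-queue workload: volume and mean score per routing decision.
    workload = (results.groupby("decision")
                       .agg(comments=("decision", "size"),
                            avg_score=("max_score", "mean"))
                       .reset_index())
    workload.to_csv(f"{out_dir}/workload_by_decision.csv", index=False)

    # Risk-tier distribution for quality monitoring.
    tiers = (results["risk_tier"].value_counts()
                                 .rename_axis("risk_tier")
                                 .reset_index(name="comments"))
    tiers.to_csv(f"{out_dir}/risk_tier_distribution.csv", index=False)
```

Rerunning this on each batch is what turns one-off evaluation into repeatable monitoring.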

Evidence

What the monitoring system showed.

Dataset composition · 159571 total comments
Safe: 143346
Toxic: 16225
Critical-risk: 87

Dataset is heavily skewed toward safe content. Rare labels require separate monitoring.

Routing split · threshold 0.50
Auto-approved: 80.3% (25616)
Human review: 10.4% (3325)
Escalated: 1.9% (617)

80% auto-approved keeps reviewer workload manageable while surfacing high-risk items.

Weakest labels · F1 score
threat: 0.367
identity_hate: 0.376
severe_toxic: 0.428
weighted avg: 0.705

Low F1 on high-risk labels means these categories need human review regardless of score.

Threshold scenario analysis
Modeled trade-offs before choosing a policy threshold
0.40 · Safety-first · Flagged: higher · Residual risk: lower
0.60 · Balanced · Flagged: medium · Residual risk: medium
0.80 · Conservative · Flagged: lower · Residual risk: higher

Each threshold scenario was modeled to show workload and residual risk before any policy change.
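The scenario analysis can be sketched as a simple sweep; the toxic-label input and the exact metric definitions are assumptions for illustration:

```python
import numpy as np

def threshold_scenarios(max_scores, y_toxic, thresholds=(0.40, 0.60, 0.80)):
    """Model workload vs. residual risk for candidate thresholds.

    max_scores: highest per-label probability for each comment.
    y_toxic: 1 where the comment is truly toxic (ground truth or audit labels).
    """
    rows = []
    for t in thresholds:
        flagged = max_scores >= t
        rows.append({
            "threshold": t,
            # Reviewer workload: share of traffic flagged at this threshold.
            "flagged_rate": float(flagged.mean()),
            # Residual risk: share of traffic that is toxic but not flagged.
            "residual_rate": float((~flagged & (y_toxic == 1)).mean()),
        })
    return rows
```

Raising the threshold lowers the flagged rate and raises the residual rate, which is exactly the trade-off the three scenarios above summarise.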

Decision

Route risk, not just scores.

The model output becomes useful only when it is translated into operational decisions. High-confidence safe content can be auto-approved, uncertain content should go to review, and high-risk labels should be escalated even when they are rare.

Auto-approve: Low-risk content with a low maximum predicted score
Human review: Uncertain content or medium-confidence toxic signals
Escalate: Threat, identity hate, severe toxicity, or a high final priority score
Monitor: Threshold changes, false positives, false negatives, and reviewer workload
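The final priority score referenced in the escalation rule can be sketched as follows; the severity weights, the 0.5/0.3/0.2 mix, and the uncertainty term are illustrative assumptions, since the case study does not publish its exact formula:

```python
# Assumed severity weights per label.
SEVERITY = {"toxic": 1, "severe_toxic": 3, "obscene": 1,
            "threat": 4, "insult": 1, "identity_hate": 3}

def priority_score(probs: dict) -> float:
    """Combine severity, multi-label breadth, and uncertainty into one triage score."""
    # Severity: probability-weighted sum of label severities, normalised to [0, 1].
    severity = sum(SEVERITY[l] * p for l, p in probs.items()) / sum(SEVERITY.values())
    # Breadth: how many labels fire at once (multi-label toxicity).
    breadth = sum(p >= 0.5 for p in probs.values()) / len(probs)
    # Uncertainty: highest when the strongest label score sits near 0.5.
    uncertainty = 1 - abs(max(probs.values()) - 0.5) * 2
    return 0.5 * severity + 0.3 * breadth + 0.2 * uncertainty
```

Sorting the review queue by this score puts the riskiest and most uncertain comments in front of reviewers first.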
Operational impact

A clearer view of quality, risk, and workload.

The framework gives Trust & Safety or AI Operations teams a practical way to monitor quality, prioritize review, and understand how threshold choices affect workload and residual risk.

The value is not the classifier alone. It is the control layer around it.

Comments analyzed: 159571
Flagged at threshold 0.50: 10612
Auto-approved: 80.3%
High-risk escalations: 617
Weighted average F1: 0.705
Recommendations

Monitor the system, not just the model.

1

Track performance by label

Rare labels should be monitored separately because aggregate metrics hide risk.

2

Keep high-risk labels in human review

Threat and identity_hate should not rely on automatic action alone.

3

Use threshold scenarios before changing policy

Every threshold change should show the workload and residual risk trade-off.

4

Prioritize review by risk and uncertainty

Human reviewers should see the riskiest and most uncertain comments first.

5

Create recurring dashboard outputs

Teams need repeatable monitoring tables, not one-off model evaluation.

My role

What I owned.

I structured the monitoring framework, prepared the dataset, trained a baseline classifier, designed routing rules, built the priority scoring logic, created threshold scenario analysis, and generated dashboard-ready outputs for moderation quality and review workload.

What this shows
Monitor AI-assisted workflows with operational and risk context.
Translate model scores into human review and escalation decisions.
Build analytics outputs that help teams manage quality, workload, and risk.
Next.

Explore the rest of the work.

The repository includes data preparation, classifier training, routing logic, priority scoring, threshold scenario analysis, and dashboard-ready monitoring outputs.

Maïssa Bounar © 2026
