Multi-Class Cyberbullying Detection: Classical vs. Transformers
Explore a comparative NLP study on detecting cyberbullying using Logistic Regression and DistilBERT, including semantic analysis and ethical trade-offs.
MIAA · NLP Final Project · April 2026
INTEGRATED PROJECT
MIAA
A Comparative Evaluation of Classical and Transformer-Based Architectures for Multi-Class Cyberbullying Detection in Social Media
Mina Faltos
Classical ML & Preprocessing
Student ID: 20260193
João Fernandes
DistilBERT & Semantic Analysis
Student ID: 20260482
April 29, 2026
Problem Formulation
Why This Matters
Online harassment has escalated across social platforms — existing binary detectors miss nuanced, identity-targeted abuse
Identity-based attacks (age, religion, ethnicity) require multi-class detection beyond simple toxic/non-toxic labels
Slang and reclaimed language create semantic depth that classical bag-of-words models cannot resolve
Tweet Categories
Age
Ethnicity
Gender
Religion
Other
Not Cyberbullying
Research Objectives
Pre-process and represent a corpus of English tweets
Build a classical ML model for multi-class text classification
Fine-tune a pretrained transformer model (DistilBERT)
Compare approaches using robust evaluation metrics
Analyze semantic similarities via SBERT embeddings (PCA & t-SNE)
Reflect critically on performance, limitations, and ethics
Strategic Research Objectives
Identity Harassment Detection
Addressing three under-served identity axes — Age, Religion, and Ethnicity — where harassment uses coded language invisible to binary classifiers. Multi-class framing enables granular moderation policies.
Semantic Depth in Slang & Ambiguity
Slang, reclaimed language, sarcasm, and indirect harassment evade TF-IDF bag-of-words representations. Transformer attention mechanisms capture contextual polarity shifts (e.g. 'savage' as affectionate vs. derogatory).
Model Selection for Scalability
Balancing DistilBERT's +3.39 pp Macro F1 gain against a 55× inference latency overhead vs. Logistic Regression. Real-world deployment requires explicit cost-benefit analysis per platform scale.
Methodology & Dataset Introduction
LABELED TWITTER CORPUS
47,459
labeled English tweets
6 balanced classes · ~7,910 per class · 80/20 stratified split
KaggleHub · Kaggle Datasets
age
ethnicity
gender
religion
other_cyberbullying
not_cyberbullying
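A hypothetical loading sketch: the deck cites KaggleHub as the source but not the exact dataset handle, so the slug and filename below are placeholders, not the project's real identifiers.

```python
# Hypothetical KaggleHub download; "<owner>/<cyberbullying-tweets>" is a placeholder handle.
import kagglehub
import pandas as pd

path = kagglehub.dataset_download("<owner>/<cyberbullying-tweets>")
df = pd.read_csv(f"{path}/cyberbullying_tweets.csv")   # filename assumed
```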
DUAL-TIER ENGINEERING PIPELINE
Tier 1 · Basic Cleaning
Lowercasing, URL removal, @mention anonymization
Hashtag symbol removal (content preserved)
Used for: DistilBERT input
Tier 2 · Advanced Normalization
Stop-word removal (negations preserved)
WordNet lemmatization
Used for: TF-IDF input ONLY
⚠ Applying Tier 2 to DistilBERT input destroys the syntactic structure its attention mechanism relies on
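A minimal sketch of the two tiers, assuming NLTK for stop-words and WordNet lemmatization; the function names and regex patterns are ours, not the project's.

```python
# Sketch of the dual-tier pipeline (regexes and names are illustrative).
import re
import nltk
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer

nltk.download("stopwords", quiet=True)
nltk.download("wordnet", quiet=True)

NEGATIONS = {"no", "not", "none", "never"}               # preserved per the slide
STOP_WORDS = set(stopwords.words("english")) - NEGATIONS
lemmatizer = WordNetLemmatizer()

def tier1_clean(text: str) -> str:
    """Tier 1: the only preprocessing fed to DistilBERT."""
    text = text.lower()
    text = re.sub(r"https?://\S+", "", text)    # URL removal
    text = re.sub(r"@\w+", "@user", text)       # @mention anonymization
    text = text.replace("#", "")                # drop hashtag symbol, keep content
    return re.sub(r"\s+", " ", text).strip()    # collapse whitespace

def tier2_normalize(text: str) -> str:
    """Tier 2: stop-word removal + lemmatization, for TF-IDF input ONLY."""
    tokens = [t for t in tier1_clean(text).split() if t not in STOP_WORDS]
    # Note: mapping "bullying" -> "bully" needs verb POS; the default (noun) is shown.
    return " ".join(lemmatizer.lemmatize(t) for t in tokens)
```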
Corpus Statistics
47,459
Total Tweets
6
Balanced Classes
80 / 20
Train / Test Split
KaggleHub
Data Source
CLASS DISTRIBUTION
| Class               | Description                   | Count  | Split (Train/Test) |
|---------------------|-------------------------------|--------|--------------------|
| age                 | Age-based cyberbullying       | ~7,910 | 6,328 / 1,582      |
| ethnicity           | Ethnicity-based cyberbullying | ~7,910 | 6,328 / 1,582      |
| gender              | Gender-based cyberbullying    | ~7,910 | 6,328 / 1,582      |
| religion            | Religion-based cyberbullying  | ~7,910 | 6,328 / 1,582      |
| other_cyberbullying | Other harassment types        | ~7,910 | 6,328 / 1,582      |
| not_cyberbullying   | Benign non-harassing content  | ~7,910 | 6,328 / 1,582      |
| TOTAL               | —                             | 47,459 | 37,967 / 9,492     |
Stratified split (random_state=42) ensures equal class representation across train and test sets.
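A sketch of that split, assuming the Kaggle column names tweet_text and cyberbullying_type:

```python
# Stratified 80/20 split with the seed reported above; column names are assumptions.
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    df["tweet_text"],
    df["cyberbullying_type"],
    test_size=0.20,                       # -> 37,967 train / 9,492 test
    stratify=df["cyberbullying_type"],    # equal class shares in each split
    random_state=42,
)
```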
Dual Preprocessing Pipeline
Mina Faltos
Classical Tier
Raw Tweet Input
Tier 2 Normalization
Lowercasing
Stop-word removal (negations preserved: no, not, none, never)
WordNet Lemmatization (bullying → bully)
TF-IDF Vectorization
ngram_range = (1,2)
max_features = 5,000
Classical Models
Logistic Regression + Naïve Bayes
[Scikit-learn]
[NLTK WordNet]
[GridSearchCV]
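A minimal sketch of this tier under the hyperparameters shown; the C search grid, max_iter, and variable names are our assumptions, and MultinomialNB(alpha=1.0) can be swapped in for the Naïve Bayes run.

```python
# Classical tier sketch: TF-IDF features + tuned Logistic Regression.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

pipe = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2), max_features=5000)),
    ("clf", LogisticRegression(class_weight="balanced", max_iter=1000)),
])

# 3-fold search scored on Macro F1, as on the benchmarks slide; the grid is illustrative.
grid = GridSearchCV(pipe, {"clf__C": [0.1, 1, 10]}, cv=3, scoring="f1_macro")
grid.fit(X_train_tier2, y_train)   # Tier 2-normalized text (name assumed)
```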
João Fernandes
Deep Learning Tier
Raw Tweet Input
Tier 1 Basic Cleaning
Lowercasing, URL removal
@mention anonymization
Hashtag symbol stripping
Regex-based whitespace collapse
WordPiece Tokenization
DistilBertTokenizerFast
max_length = 128 tokens
DistilBERT Fine-tuning
FP16 + Accelerate
[HuggingFace Transformers]
[Accelerate]
[FP16]
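A fine-tuning sketch using the HuggingFace Trainer: max_length=128 and fp16 match the slide, while batch size, epochs, and the tokenized dataset variables are assumptions.

```python
# DistilBERT tier sketch; train_ds / test_ds are assumed tokenized Datasets.
from transformers import (DistilBertForSequenceClassification,
                          DistilBertTokenizerFast, Trainer, TrainingArguments)

tokenizer = DistilBertTokenizerFast.from_pretrained("distilbert-base-uncased")
model = DistilBertForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=6)

def tokenize(batch):
    # Tier 1-cleaned text only; truncated/padded to 128 WordPiece tokens
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=128)

args = TrainingArguments(
    output_dir="distilbert-cyberbullying",
    fp16=True,                        # mixed precision (~1.5x speedup on GPU)
    per_device_train_batch_size=32,   # assumption
    num_train_epochs=3,               # assumption
)
Trainer(model=model, args=args,
        train_dataset=train_ds, eval_dataset=test_ds).train()
```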
Classical Model Benchmarks
| Model               | Config                         | Accuracy | Macro F1 | Weighted F1 |
|---------------------|--------------------------------|----------|----------|-------------|
| Naïve Bayes         | alpha = 1.0                    | 77.17%   | 0.7644   | 0.7652      |
| Logistic Regression | C=1, class_weight='balanced'   | 82.68%   | 0.8268 ✓ | 0.8276      |

Both models use TF-IDF (1,2)-grams with 5,000 features, tuned via GridSearchCV (3-fold, scoring: Macro F1). Naïve Bayes shows a higher false-negative rate on ambiguous classes; Logistic Regression leads by +5.51 pp accuracy (+6.24 pp Macro F1).

Strong performance: age, ethnicity, and religion are lexically distinct classes, near the ceiling for bag-of-words models.
Weakness: not_cyberbullying (F1=0.56) and other_cyberbullying (F1=0.64) expose contextual blindness.
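The per-class figures above can be read straight off scikit-learn's classification report; a sketch reusing the grid object from the pipeline sketch earlier:

```python
# Per-class precision / recall / F1 for the tuned classical model.
from sklearn.metrics import classification_report

y_pred = grid.predict(X_test_tier2)
print(classification_report(y_test, y_pred, digits=4))
# Expect near-ceiling F1 on age/ethnicity/religion and the dips on
# not_cyberbullying (~0.56) and other_cyberbullying (~0.64) noted above.
```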
DistilBERT Training Performance
Architecture & Configuration
Fine-Tuning Hyperparameters
34.6 min
Wall-Time (CPU Training)
12–15 min
GPU Training (16GB VRAM)
FP16 · 1.5×
Mixed Precision Speedup
distilbert-base-uncased
66M
Parameters
40%
Smaller than BERT
60%
Faster Inference
~97%
Performance Retained
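A quick way to sanity-check the 66M figure (the exact count varies slightly with the classification head):

```python
# Parameter count check for the fine-tuning checkpoint.
from transformers import DistilBertForSequenceClassification

model = DistilBertForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=6)
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params / 1e6:.1f}M parameters")   # ~66-67M vs. ~110M for bert-base
```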
Final Performance Comparison
| Model                     | Accuracy | Macro F1 | Weighted F1 | Δ Macro F1 vs. LR |
|---------------------------|----------|----------|-------------|-------------------|
| Naïve Bayes (alpha=1.0)   | 77.17%   | 0.7644   | 0.7652      | −6.24 pp          |
| Logistic Regression (C=1) | 82.68%   | 0.8268   | 0.8276      | baseline          |
| DistilBERT (fine-tuned)   | 86.29%   | 0.8607   | 0.8641      | +3.39 pp          |
DistilBERT vs. Logistic Regression
+3.61 pp Accuracy · +3.39 pp Macro F1
Largest gain: not_cyberbullying
+8.8 pp F1 (hardest class)
other_cyberbullying
+5.6 pp F1
DistilBERT's gains are concentrated on semantically ambiguous categories where bag-of-words fails. Identity-specific classes (age, ethnicity, religion) see only marginal +0.7–1.9 pp improvement — classical methods are already near-optimal for lexically-distinct content.
Semantic Cohesion Analysis — Part D
SentenceTransformer · all-MiniLM-L6-v2 · 300 tweets/class · 1,800 total samples
INTRA-CLASS COSINE SIMILARITY
Mean pairwise cosine similarity within each class — higher = tighter, more coherent cluster
| Class               | Mean cosine similarity | Note               |
|---------------------|------------------------|--------------------|
| age                 | 0.74                   | Tightest cluster   |
| ethnicity           | 0.71                   |                    |
| religion            | 0.69                   |                    |
| gender              | 0.61                   |                    |
| other_cyberbullying | 0.47                   | High heterogeneity |
| not_cyberbullying   | 0.44                   | Most dispersed     |
Identity-specific classes form tight semantic neighborhoods; ambiguous classes scatter across embedding space
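A sketch of the cohesion metric, assuming 300 sampled tweets per class as stated above (the age_tweets variable name is ours):

```python
# Mean pairwise cosine similarity within one class of SBERT embeddings.
import numpy as np
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity

sbert = SentenceTransformer("all-MiniLM-L6-v2")     # 384-dim embeddings

def intra_class_similarity(texts):
    emb = sbert.encode(texts)                       # (n, 384)
    sim = cosine_similarity(emb)                    # (n, n) pairwise matrix
    upper = np.triu_indices_from(sim, k=1)          # unique pairs only
    return sim[upper].mean()

# e.g. intra_class_similarity(age_tweets)  ->  ~0.74 on this corpus
```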
GEOMETRIC VERDICT
Latent Islands Confirmed
t-SNE (perplexity=40) and PCA projections both confirm tight, well-separated clusters for age, ethnicity, and religion.
Overlapping Core Region
not_cyberbullying and other_cyberbullying share a central overlapping region in embedding space — confirming fundamental semantic ambiguity.
Model-Agnostic Finding
This ambiguity is visible before any classifier is trained; it is a property of the data, not a modeling limitation.
t-SNE · perplexity=40
PCA · PC1+PC2
384-dim embeddings
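The two projections, matching the settings listed above (random_state and the embeddings variable are ours; plotting code omitted):

```python
# 2-D projections of the (1800, 384) SBERT embedding matrix.
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

pca_2d = PCA(n_components=2).fit_transform(embeddings)        # PC1 + PC2
tsne_2d = TSNE(n_components=2, perplexity=40,
               random_state=42).fit_transform(embeddings)
# Scatter-plot either projection colored by class to see the "latent islands".
```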
Technical Trade-offs & Ethics
COMPUTATIONAL SCALABILITY
ETHICAL CONSIDERATIONS
| Dimension       | Logistic Regression | DistilBERT       |
|-----------------|---------------------|------------------|
| Training time   | <60 s (CPU)         | ~12–15 min (GPU) |
| Inference speed | ~50,000 tweets/s    | ~900 tweets/s    |
| Model size      | <200 MB             | ~600 MB          |
| Macro F1        | 0.8268              | 0.8607           |
55× inference latency overhead
+3.39 pp Macro F1 gain
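The 55× figure is a throughput ratio; a rough, hardware-dependent way to measure it (bert_predict is a hypothetical batched inference function):

```python
# Throughput measurement sketch; absolute numbers depend heavily on hardware.
import time

def tweets_per_second(predict_fn, texts):
    start = time.perf_counter()
    predict_fn(texts)
    return len(texts) / (time.perf_counter() - start)

# tweets_per_second(grid.predict, X_test_tier2)   # ~50,000/s on CPU
# tweets_per_second(bert_predict, test_texts)     # ~900/s (hypothetical fn)
```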
Explainability: Logistic Regression allows full feature-weight inspection via SHAP/LIME.
DistilBERT's attention weights are unreliable as explanations (Jain & Wallace, 2019); faithful attribution requires methods such as Integrated Gradients.
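For the linear model, even plain coefficient inspection (a lighter alternative to SHAP/LIME) exposes the learned n-gram weights; names reuse the earlier pipeline sketch:

```python
# Top positively weighted n-grams per class for the tuned Logistic Regression.
import numpy as np

best = grid.best_estimator_
features = np.array(best.named_steps["tfidf"].get_feature_names_out())
clf = best.named_steps["clf"]

for label, coefs in zip(clf.classes_, clf.coef_):
    top = features[np.argsort(coefs)[::-1][:5]]   # 5 strongest n-grams
    print(f"{label}: {', '.join(top)}")
```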
Representation Bias
English-Twitter-centric dataset; performance degrades on non-Western harassment patterns
Pre-training Bias
DistilBERT inherits BookCorpus/Wikipedia biases along gender, ethnicity, and religion axes
Asymmetric Harm
False negatives leave victims unprotected. Standard Macro F1 is insufficient; class-specific cost matrices are needed
Marginalized Speech
Reclaimed language used affirmatively within communities risks misclassification as bullying
Human-in-the-loop review required for ambiguous classifications (not_cyberbullying / other)
INTEGRATED PROJECT MIAA · 2026
Q & A
Thank you for your attention.
Mina Faltos
Classical ML · Preprocessing · Error Analysis
Student ID: 20260193
João Fernandes
DistilBERT · Semantic Analysis · Visualization
Student ID: 20260482
NLP — Master's in AI & Advanced Analytics (MIAA)
April 29, 2026
DistilBERT: 86.29% Accuracy
Macro F1: 0.8607
+3.39 pp vs. Logistic Regression
47,459 tweets · 6 classes · Stratified 80/20
- machine-learning
- nlp
- cyberbullying-detection
- distilbert
- data-science
- text-classification
- academic-project
- artificial-intelligence