Multi-Class Cyberbullying Detection: Classical vs. Transformers
Explore a comparative NLP study on detecting cyberbullying using Logistic Regression and DistilBERT, including semantic analysis and ethical trade-offs.
MIAA · NLP Final Project · April 2026
INTEGRATED PROJECT
MIAA
A Comparative Evaluation of Classical and Transformer-Based Architectures for Multi-Class Cyberbullying Detection in Social Media
Mina Faltos
Classical ML & Preprocessing
Student ID: 20260193
João Fernandes
DistilBERT & Semantic Analysis
Student ID: 20260482
April 29, 2026
Problem Formulation
Why This Matters
Online harassment has escalated across social platforms — existing binary detectors miss nuanced, identity-targeted abuse
Identity-based attacks (age, religion, ethnicity) require multi-class detection beyond simple toxic/non-toxic labels
Slang and reclaimed language create semantic depth that classical bag-of-words models cannot resolve
Tweet Categories
Age
Ethnicity
Gender
Religion
Other
Not Cyberbullying
Research Objectives
Pre-process and represent a corpus of English tweets
Build a classical ML model for multi-class text classification
Fine-tune a pretrained transformer model (DistilBERT)
Compare approaches using robust evaluation metrics
Analyze semantic similarities via SBERT embeddings (PCA & t-SNE)
Reflect critically on performance, limitations, and ethics
Strategic Research Objectives
Identity Harassment Detection
Addressing three under-served identity axes — Age, Religion, and Ethnicity — where harassment uses coded language invisible to binary classifiers. Multi-class framing enables granular moderation policies.
Semantic Depth in Slang & Ambiguity
Slang, reclaimed language, sarcasm, and indirect harassment evade TF-IDF bag-of-words representations. Transformer attention mechanisms capture contextual polarity shifts (e.g. 'savage' as affectionate vs. derogatory).
Model Selection for Scalability
Balancing DistilBERT's +3.39 pp Macro F1 gain against a 55× inference latency overhead vs. Logistic Regression. Real-world deployment requires explicit cost-benefit analysis per platform scale.
Methodology & Dataset Introduction
LABELED TWITTER CORPUS
47,459
labeled English tweets
6 balanced classes · ~7,910 per class · 80/20 stratified split
KaggleHub · Kaggle Datasets
age
ethnicity
gender
religion
other_cyberbullying
not_cyberbullying
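A hypothetical loading sketch: the deck cites KaggleHub as the source but not the exact dataset handle, so the slug and filename below are placeholders, not the project's real identifiers.

```python
# Hypothetical KaggleHub download; "<owner>/<cyberbullying-tweets>" is a placeholder handle.
import kagglehub
import pandas as pd

path = kagglehub.dataset_download("<owner>/<cyberbullying-tweets>")
df = pd.read_csv(f"{path}/cyberbullying_tweets.csv")   # filename assumed
```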
DUAL-TIER ENGINEERING PIPELINE
Tier 1 · Basic Cleaning
Lowercasing, URL removal, @mention anonymization
Hashtag symbol removal (content preserved)
Used for: DistilBERT input
Tier 2 · Advanced Normalization
Stop-word removal (negations preserved)
WordNet lemmatization
Used for: TF-IDF input ONLY
⚠ Applying Tier 2 to DistilBERT input destroys the syntactic structure its attention mechanism relies on
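A minimal sketch of the two tiers, assuming NLTK for stop-words and WordNet lemmatization; the function names and regex patterns are ours, not the project's.

```python
# Sketch of the dual-tier pipeline (regexes and names are illustrative).
import re
import nltk
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer

nltk.download("stopwords", quiet=True)
nltk.download("wordnet", quiet=True)

NEGATIONS = {"no", "not", "none", "never"}               # preserved per the slide
STOP_WORDS = set(stopwords.words("english")) - NEGATIONS
lemmatizer = WordNetLemmatizer()

def tier1_clean(text: str) -> str:
    """Tier 1: the only preprocessing fed to DistilBERT."""
    text = text.lower()
    text = re.sub(r"https?://\S+", "", text)    # URL removal
    text = re.sub(r"@\w+", "@user", text)       # @mention anonymization
    text = text.replace("#", "")                # drop hashtag symbol, keep content
    return re.sub(r"\s+", " ", text).strip()    # collapse whitespace

def tier2_normalize(text: str) -> str:
    """Tier 2: stop-word removal + lemmatization, for TF-IDF input ONLY."""
    tokens = [t for t in tier1_clean(text).split() if t not in STOP_WORDS]
    # Note: mapping "bullying" -> "bully" needs verb POS; the default (noun) is shown.
    return " ".join(lemmatizer.lemmatize(t) for t in tokens)
```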
Corpus Statistics
47,459
Total Tweets
6
Balanced Classes
80 / 20
Train / Test Split
KaggleHub
Data Source
CLASS DISTRIBUTION
| Class               | Description                   | Count  | Split (Train/Test) |
|---------------------|-------------------------------|--------|--------------------|
| age                 | Age-based cyberbullying       | ~7,910 | 6,328 / 1,582      |
| ethnicity           | Ethnicity-based cyberbullying | ~7,910 | 6,328 / 1,582      |
| gender              | Gender-based cyberbullying    | ~7,910 | 6,328 / 1,582      |
| religion            | Religion-based cyberbullying  | ~7,910 | 6,328 / 1,582      |
| other_cyberbullying | Other harassment types        | ~7,910 | 6,328 / 1,582      |
| not_cyberbullying   | Benign non-harassing content  | ~7,910 | 6,328 / 1,582      |
| TOTAL               | —                             | 47,459 | 37,967 / 9,492     |
Stratified split (random_state=42) ensures equal class representation across train and test sets.
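A sketch of that split, assuming the Kaggle column names tweet_text and cyberbullying_type:

```python
# Stratified 80/20 split with the seed reported above; column names are assumptions.
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    df["tweet_text"],
    df["cyberbullying_type"],
    test_size=0.20,                       # -> 37,967 train / 9,492 test
    stratify=df["cyberbullying_type"],    # equal class shares in each split
    random_state=42,
)
```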
Dual Preprocessing Pipeline
Mina Faltos
Classical Tier
Raw Tweet Input
Tier 2 Normalization
Lowercasing
Stop-word removal (negations preserved: no, not, none, never)
WordNet Lemmatization (bullying → bully)
TF-IDF Vectorization
ngram_range = (1,2)
max_features = 5,000
Classical Models
Logistic Regression + Naïve Bayes
[Scikit-learn]
[NLTK WordNet]
[GridSearchCV]
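A minimal sketch of this tier under the hyperparameters shown; the C search grid, max_iter, and variable names are our assumptions, and MultinomialNB(alpha=1.0) can be swapped in for the Naïve Bayes run.

```python
# Classical tier sketch: TF-IDF features + tuned Logistic Regression.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

pipe = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2), max_features=5000)),
    ("clf", LogisticRegression(class_weight="balanced", max_iter=1000)),
])

# 3-fold search scored on Macro F1, as on the benchmarks slide; the grid is illustrative.
grid = GridSearchCV(pipe, {"clf__C": [0.1, 1, 10]}, cv=3, scoring="f1_macro")
grid.fit(X_train_tier2, y_train)   # Tier 2-normalized text (name assumed)
```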
João Fernandes
Deep Learning Tier
Raw Tweet Input
Tier 1 Basic Cleaning
Lowercasing, URL removal
@mention anonymization
Hashtag symbol stripping
Regex-based whitespace collapse
WordPiece Tokenization
DistilBertTokenizerFast
max_length = 128 tokens
DistilBERT Fine-tuning
FP16 + Accelerate
[HuggingFace Transformers]
[Accelerate]
[FP16]
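A fine-tuning sketch using the HuggingFace Trainer: max_length=128 and fp16 match the slide, while batch size, epochs, and the tokenized dataset variables are assumptions.

```python
# DistilBERT tier sketch; train_ds / test_ds are assumed tokenized Datasets.
from transformers import (DistilBertForSequenceClassification,
                          DistilBertTokenizerFast, Trainer, TrainingArguments)

tokenizer = DistilBertTokenizerFast.from_pretrained("distilbert-base-uncased")
model = DistilBertForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=6)

def tokenize(batch):
    # Tier 1-cleaned text only; truncated/padded to 128 WordPiece tokens
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=128)

args = TrainingArguments(
    output_dir="distilbert-cyberbullying",
    fp16=True,                        # mixed precision (~1.5x speedup on GPU)
    per_device_train_batch_size=32,   # assumption
    num_train_epochs=3,               # assumption
)
Trainer(model=model, args=args,
        train_dataset=train_ds, eval_dataset=test_ds).train()
```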
Classical Model Benchmarks
| Model               | Config                         | Accuracy | Macro F1 | Weighted F1 |
|---------------------|--------------------------------|----------|----------|-------------|
| Naïve Bayes         | alpha = 1.0                    | 77.17%   | 0.7644   | 0.7652      |
| Logistic Regression | C=1, class_weight='balanced'   | 82.68%   | 0.8268 ✓ | 0.8276      |

Both models use TF-IDF (1,2)-grams with 5,000 features, tuned via GridSearchCV (3-fold, scoring: Macro F1). Naïve Bayes shows a higher false-negative rate on ambiguous classes; Logistic Regression leads by +5.51 pp accuracy (+6.24 pp Macro F1).

Strong performance: age, ethnicity, and religion are lexically distinct classes, near the ceiling for bag-of-words models.
Weakness: not_cyberbullying (F1=0.56) and other_cyberbullying (F1=0.64) expose contextual blindness.
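The per-class figures above can be read straight off scikit-learn's classification report; a sketch reusing the grid object from the pipeline sketch earlier:

```python
# Per-class precision / recall / F1 for the tuned classical model.
from sklearn.metrics import classification_report

y_pred = grid.predict(X_test_tier2)
print(classification_report(y_test, y_pred, digits=4))
# Expect near-ceiling F1 on age/ethnicity/religion and the dips on
# not_cyberbullying (~0.56) and other_cyberbullying (~0.64) noted above.
```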
DistilBERT Training Performance
Architecture & Configuration
Fine-Tuning Hyperparameters
34.6 min
Wall-Time (CPU Training)
12–15 min
GPU Training (16GB VRAM)
FP16 · 1.5×
Mixed Precision Speedup
distilbert-base-uncased
66M
Parameters
40%
Smaller than BERT
60%
Faster Inference
~97%
Performance Retained
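A quick way to sanity-check the 66M figure (the exact count varies slightly with the classification head):

```python
# Parameter count check for the fine-tuning checkpoint.
from transformers import DistilBertForSequenceClassification

model = DistilBertForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=6)
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params / 1e6:.1f}M parameters")   # ~66-67M vs. ~110M for bert-base
```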
Final Performance Comparison
| Model                     | Accuracy | Macro F1 | Weighted F1 | Δ Macro F1 vs. LR |
|---------------------------|----------|----------|-------------|-------------------|
| Naïve Bayes (alpha=1.0)   | 77.17%   | 0.7644   | 0.7652      | −6.24 pp          |
| Logistic Regression (C=1) | 82.68%   | 0.8268   | 0.8276      | baseline          |
| DistilBERT (fine-tuned)   | 86.29%   | 0.8607   | 0.8641      | +3.39 pp          |
DistilBERT vs. Logistic Regression
+3.61 pp Accuracy · +3.39 pp Macro F1
Largest gain: not_cyberbullying
+8.8 pp F1 (hardest class)
other_cyberbullying
+5.6 pp F1
DistilBERT's gains are concentrated on semantically ambiguous categories where bag-of-words fails. Identity-specific classes (age, ethnicity, religion) see only marginal +0.7–1.9 pp improvement — classical methods are already near-optimal for lexically-distinct content.
Semantic Cohesion Analysis — Part D
SentenceTransformer · all-MiniLM-L6-v2 · 300 tweets/class · 1,800 total samples
INTRA-CLASS COSINE SIMILARITY
Mean pairwise cosine similarity within each class — higher = tighter, more coherent cluster
| Class               | Mean cosine similarity | Note               |
|---------------------|------------------------|--------------------|
| age                 | 0.74                   | Tightest cluster   |
| ethnicity           | 0.71                   |                    |
| religion            | 0.69                   |                    |
| gender              | 0.61                   |                    |
| other_cyberbullying | 0.47                   | High heterogeneity |
| not_cyberbullying   | 0.44                   | Most dispersed     |
Identity-specific classes form tight semantic neighborhoods; ambiguous classes scatter across embedding space
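A sketch of the cohesion metric, assuming 300 sampled tweets per class as stated above (the age_tweets variable name is ours):

```python
# Mean pairwise cosine similarity within one class of SBERT embeddings.
import numpy as np
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity

sbert = SentenceTransformer("all-MiniLM-L6-v2")     # 384-dim embeddings

def intra_class_similarity(texts):
    emb = sbert.encode(texts)                       # (n, 384)
    sim = cosine_similarity(emb)                    # (n, n) pairwise matrix
    upper = np.triu_indices_from(sim, k=1)          # unique pairs only
    return sim[upper].mean()

# e.g. intra_class_similarity(age_tweets)  ->  ~0.74 on this corpus
```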
GEOMETRIC VERDICT
Latent Islands Confirmed
t-SNE (perplexity=40) and PCA projections both confirm tight, well-separated clusters for age, ethnicity, and religion.
Overlapping Core Region
not_cyberbullying and other_cyberbullying share a central overlapping region in embedding space — confirming fundamental semantic ambiguity.
Model-Agnostic Finding
This ambiguity is visible before any classifier is trained; it is a property of the data, not a modeling limitation.
t-SNE · perplexity=40
PCA · PC1+PC2
384-dim embeddings
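The two projections, matching the settings listed above (random_state and the embeddings variable are ours; plotting code omitted):

```python
# 2-D projections of the (1800, 384) SBERT embedding matrix.
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

pca_2d = PCA(n_components=2).fit_transform(embeddings)        # PC1 + PC2
tsne_2d = TSNE(n_components=2, perplexity=40,
               random_state=42).fit_transform(embeddings)
# Scatter-plot either projection colored by class to see the "latent islands".
```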
Technical Trade-offs & Ethics
COMPUTATIONAL SCALABILITY
ETHICAL CONSIDERATIONS
| Dimension       | Logistic Regression | DistilBERT       |
|-----------------|---------------------|------------------|
| Training time   | <60 s (CPU)         | ~12–15 min (GPU) |
| Inference speed | ~50,000 tweets/s    | ~900 tweets/s    |
| Model size      | <200 MB             | ~600 MB          |
| Macro F1        | 0.8268              | 0.8607           |
55× inference latency overhead
+3.39 pp Macro F1 gain
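The 55× figure is a throughput ratio; a rough, hardware-dependent way to measure it (bert_predict is a hypothetical batched inference function):

```python
# Throughput measurement sketch; absolute numbers depend heavily on hardware.
import time

def tweets_per_second(predict_fn, texts):
    start = time.perf_counter()
    predict_fn(texts)
    return len(texts) / (time.perf_counter() - start)

# tweets_per_second(grid.predict, X_test_tier2)   # ~50,000/s on CPU
# tweets_per_second(bert_predict, test_texts)     # ~900/s (hypothetical fn)
```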
Explainability: Logistic Regression allows full feature-weight inspection via SHAP/LIME.
DistilBERT's attention weights are unreliable as explanations (Jain & Wallace, 2019); faithful attribution requires methods such as Integrated Gradients.
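For the linear model, even plain coefficient inspection (a lighter alternative to SHAP/LIME) exposes the learned n-gram weights; names reuse the earlier pipeline sketch:

```python
# Top positively weighted n-grams per class for the tuned Logistic Regression.
import numpy as np

best = grid.best_estimator_
features = np.array(best.named_steps["tfidf"].get_feature_names_out())
clf = best.named_steps["clf"]

for label, coefs in zip(clf.classes_, clf.coef_):
    top = features[np.argsort(coefs)[::-1][:5]]   # 5 strongest n-grams
    print(f"{label}: {', '.join(top)}")
```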
Representation Bias
English-Twitter-centric dataset; performance degrades on non-Western harassment patterns
Pre-training Bias
DistilBERT inherits BookCorpus/Wikipedia biases along gender, ethnicity, and religion axes
Asymmetric Harm
False negatives leave victims unprotected. Standard Macro F1 is insufficient; class-specific cost matrices are needed
Marginalized Speech
Reclaimed language used affirmatively within communities risks misclassification as bullying
Human-in-the-loop review required for ambiguous classifications (not_cyberbullying / other)
INTEGRATED PROJECT MIAA · 2026
Q & A
Thank you for your attention.
Mina Faltos
Classical ML · Preprocessing · Error Analysis
Student ID: 20260193
João Fernandes
DistilBERT · Semantic Analysis · Visualization
Student ID: 20260482
NLP — Master's in AI & Advanced Analytics (MIAA)
April 29, 2026
DistilBERT: 86.29% Accuracy
Macro F1: 0.8607
+3.39 pp vs. Logistic Regression
47,459 tweets · 6 classes · Stratified 80/20
- machine-learning
- nlp
- cyberbullying-detection
- distilbert
- data-science
- text-classification
- academic-project
- artificial-intelligence