Made by Bobr AI

AI/ML Classification for Suspicious Mule Account Detection

Learn about graph-augmented and temporally-aware ensemble classification architectures for real-time detection of financial mule accounts.

TEAM PHOENIX
LIVE MODEL · F1 0.82
TEAM PHOENIX // BANKING AI HACKATHON 2026

AI/ML Classification of Suspicious Mule Accounts

Graph-augmented, temporally-aware ensemble classification for real-time mule account detection.

Mule accounts are rarely unusual in isolation — they are unusual in context. They operate in multi-tiered architectures (Tier 1 Collectors → Tier 2 Relays → Tier 3 Consolidators) specifically designed to evade traditional, rule-based banking monitors.

PRESENTED BY TEAM PHOENIX
[Figure: tx_flow across TIER 1 · VICTIMS, TIER 2 · COLLECTOR, TIER 3 · RELAYS, then dispersal]
v1.0 · CLASSIFIED · PHOENIX-MULE-DET
BUILD // 2026.03 SYS // NOMINAL LAT 28.61°N · LON 77.20°E
02 // PROBLEM ANALYSIS

Why Standard ML Fails on Tabular Data

The 18 provided tabular features capture account-level aggregates, but suffer from feature blindness — they show what an account does, but not who it interacts with. Fraudsters actively probe bank thresholds, using transaction structuring to mimic legitimate behavior and bypass basic anomaly detection. Static data snapshots also miss time-burst patterns.

TABULAR VIEW · BLIND (18 FEATURES)
ACCOUNT A · LEGIT · ACCT_ID: 0x7A2F · TXN_COUNT: 142 · AVG_AMT: $2,840 · BAL: $18.2K · ACTIVE_DAYS: 38 · KYC: VERIFIED
ACCOUNT B · MULE · ACCT_ID: 0x9C18 · TXN_COUNT: 138 · AVG_AMT: $2,790 · BAL: $17.9K · ACTIVE_DAYS: 36 · KYC: VERIFIED
Account A (legit) ≈ Account B (mule): indistinguishable on tabular features.

VS

GRAPH VIEW · EXPOSED (+ CONTEXT)
ACCOUNT B · fan-in: 11 · fan-out: 9 · betweenness: 0.87
The same Account B is revealed as a Tier 2 Relay via graph context.
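A minimal sketch of how fan-in and fan-out could be counted from a raw transfer list, assuming each transaction is a (sender, receiver) pair. The function name `fan_in_out` and the toy account IDs are hypothetical; only the 11 / 9 counts for Account B come from the slide.

```python
from collections import defaultdict

def fan_in_out(edges):
    """Return {account: (fan_in, fan_out)} counting distinct counterparties."""
    senders = defaultdict(set)    # account -> distinct accounts that paid it
    receivers = defaultdict(set)  # account -> distinct accounts it paid
    for src, dst in edges:
        receivers[src].add(dst)
        senders[dst].add(src)
    accounts = set(senders) | set(receivers)
    return {a: (len(senders[a]), len(receivers[a])) for a in accounts}

# Toy graph (hypothetical IDs): 11 sources pay B, and B pays 9 sinks.
edges = [(f"src{i}", "B") for i in range(11)]
edges += [("B", f"dst{i}") for i in range(9)]

degrees = fan_in_out(edges)
print(degrees["B"])  # (11, 9)
```

None of the account-level aggregates in the tabular view carries this counterparty information, which is why the two views disagree about Account B.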
03 // OPTIMIZATION STRATEGY

The Class Imbalance Trap

// optimizing the wrong metric
ACCURACY IS A TRAP.

The target variable F3924 typically flags only 1–3% of accounts as actual mules. At 2% prevalence, a model that predicts 'legitimate' for everything reaches 98% accuracy but catches zero fraud. Standard tabular XGBoost misses roughly 1 in 3 fraudulent accounts.
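The accuracy trap can be reproduced with a few lines of arithmetic. This is an illustrative sketch on synthetic labels (1,000 accounts, 2% mules, chosen to match the prevalence quoted above), not the team's evaluation code; `confusion_metrics` is a hypothetical helper.

```python
def confusion_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall, f1) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    return accuracy, precision, recall, f1

y_true = [1] * 20 + [0] * 980   # synthetic: 2% mules
y_pred = [0] * 1000             # "everything is legitimate"

acc, prec, rec, f1 = confusion_metrics(y_true, y_pred)
print(acc, rec, f1)  # 0.98 0.0 0.0
```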

// OUR APPROACH
Our system strictly optimizes for F1-Score and Recall at fixed precision.
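One way to read "Recall at fixed precision" is: among all score thresholds that keep precision at or above a floor, report the best recall. The sketch below assumes that reading; the function name, the toy scores, and the 0.75 precision floor are hypothetical.

```python
def recall_at_precision(y_true, scores, min_precision):
    """Best recall over all thresholds whose precision >= min_precision."""
    total_pos = sum(y_true)
    if total_pos == 0:
        return 0.0
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    best_recall, tp, fp, i = 0.0, 0, 0, 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Admit every account tied at this score before measuring.
        while i < len(ranked) and ranked[i][0] == threshold:
            tp += ranked[i][1]
            fp += 1 - ranked[i][1]
            i += 1
        if tp / (tp + fp) >= min_precision:
            best_recall = max(best_recall, tp / total_pos)
    return best_recall

# Toy validation set (hypothetical scores and labels).
y_val = [1, 1, 0, 1, 0, 1, 0, 0]
s_val = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.10]

print(recall_at_precision(y_val, s_val, min_precision=0.75))  # 0.75
print(recall_at_precision(y_val, s_val, min_precision=1.00))  # 0.5
```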
F1-SCORE · BASELINE vs PHOENIX ARCHITECTURE
TABULAR BASELINE · xgboost // features only · F1 0.60
PHOENIX ENSEMBLE · graph + temporal + tabular · F1 0.82
About +37% relative lift (0.60 → 0.82) on the metric that matters
ACCURACY (misleading) · BASELINE 0.98 · PHOENIX 0.98 · ≈ IDENTICAL // USELESS
04 // SYSTEM ARCHITECTURE

Dual-Pipeline Stacking Ensemble

The Data Foundation fuses the 18 tabular features with constructed transaction graphs. Pipeline A runs XGBoost on an enriched 31-feature matrix (the 18 original features plus 13 engineered signals). Pipeline B uses a 2-layer Graph Attention Network (GAT) to model structural relationships. A stacking meta-learner combines both pipelines' probabilities into the final risk score.

01 · INGEST · RAW DATA · 18 tabular features · transaction logs
02 · ENGINEER · Graph Feature Preprocessor (network topology · centrality · degree) · Temporal Feature Preprocessor (velocity · burst patterns · time deltas) → 31-FEATURE ENRICHED MATRIX
03 · CLASSIFY · PIPELINE A · XGBoost (tabular · gradient boosted trees · 31-feat matrix · Focal Loss · class-weighted) · PIPELINE B · 2-Layer GAT (Graph Attention Network · structural · GraphSMOTE oversampling · multi-head attn)
04 · ENSEMBLE · STACKING META-LEARNER · Logistic Reg · synthesis of both pipelines' probs
05 · ALERT · SHAP ALERTS · risk score + reason codes
DATA FOUNDATION → DUAL-PIPELINE · PARALLEL INFERENCE · ENSEMBLE FUSION → COMPLIANCE LAYER
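The stacking step can be illustrated with a tiny logistic-regression meta-learner trained on the two pipelines' probabilities. This is a sketch under stated assumptions: the out-of-fold probabilities and labels are made up, plain gradient descent stands in for whatever solver the team used, and `fit_meta_learner` / `meta_score` are hypothetical names.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_meta_learner(base_probs, labels, lr=0.5, epochs=2000):
    """Fit logistic regression on (p_xgb, p_gat) pairs by gradient descent."""
    w, b, n = [0.0, 0.0], 0.0, len(labels)
    for _ in range(epochs):
        gw, gb = [0.0, 0.0], 0.0
        for (p_xgb, p_gat), y in zip(base_probs, labels):
            err = sigmoid(w[0] * p_xgb + w[1] * p_gat + b) - y
            gw[0] += err * p_xgb
            gw[1] += err * p_gat
            gb += err
        w = [w[0] - lr * gw[0] / n, w[1] - lr * gw[1] / n]
        b -= lr * gb / n
    return w, b

def meta_score(probs, w, b):
    """Final risk score from the two base-model probabilities."""
    return sigmoid(w[0] * probs[0] + w[1] * probs[1] + b)

# Hypothetical out-of-fold predictions: (Pipeline A, Pipeline B), label.
oof = [(0.9, 0.8), (0.4, 0.9), (0.8, 0.7), (0.2, 0.1),
       (0.3, 0.2), (0.1, 0.3), (0.5, 0.2)]
y = [1, 1, 1, 0, 0, 0, 0]

w, b = fit_meta_learner(oof, y)
print(meta_score((0.9, 0.9), w, b) > meta_score((0.1, 0.1), w, b))  # True
```

Training the meta-learner on out-of-fold predictions, rather than on predictions for rows the base models were fit on, is the usual way to keep a stacked model from overfitting to its base learners.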
05 // FEATURE ENGINEERING

Unlocking Signal via Graph & Temporal Engineering

The Graph Feature Preprocessor extracts structural telemetry; temporal extraction computes velocity and dormancy-spike ratios; behavioral flags catch automated structuring.

// 6 OF 13 ENGINEERED SIGNALS SHOWN
ENGINEERED FEATURE · TYPE · CRIMINAL BEHAVIOR DETECTED
betweenness_centrality · GRAPH · Tier 2 Relay identification
scatter_ratio · GRAPH · Fan-out / Dispersal mules
in_out_degree_skew · GRAPH · Tier 1 Collectors
rolling_txn_velocity · TEMPORAL · Time-burst laundering
dormancy_spike_ratio · TEMPORAL · Dormant-account reactivation
threshold_proximity_score · BEHAVIORAL · Automated structuring
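The deck names these signals but does not define them, so the sketch below shows one plausible formulation of the temporal and behavioral ones. Everything here is an assumption: the 8-hour window, the 10,000 reporting threshold, the 5% margin, and the helper `longest_dormancy_days` (a building block a dormancy-spike ratio could use) are illustrative choices, not the team's definitions.

```python
from datetime import datetime, timedelta

def rolling_txn_velocity(timestamps, window=timedelta(hours=8)):
    """Max number of transactions inside any sliding window of this width."""
    ts = sorted(timestamps)
    best, left = 0, 0
    for right in range(len(ts)):
        while ts[right] - ts[left] > window:
            left += 1
        best = max(best, right - left + 1)
    return best

def longest_dormancy_days(timestamps):
    """Longest gap, in days, between consecutive transactions."""
    ts = sorted(timestamps)
    gaps = [(b - a).total_seconds() / 86400 for a, b in zip(ts, ts[1:])]
    return max(gaps, default=0.0)

def threshold_proximity_score(amounts, threshold=10_000.0, margin=0.05):
    """Share of amounts sitting just below a reporting threshold."""
    if not amounts:
        return 0.0
    floor = threshold * (1 - margin)
    return sum(1 for a in amounts if floor <= a < threshold) / len(amounts)

# Synthetic account: one old transaction, then 31 in under 8 hours.
burst_start = datetime(2026, 6, 12, 6, 0)
timestamps = [burst_start - timedelta(days=38)]
timestamps += [burst_start + timedelta(minutes=15 * i) for i in range(31)]

print(rolling_txn_velocity(timestamps))   # 31
print(longest_dormancy_days(timestamps))  # 38.0
print(threshold_proximity_score([9800, 9650, 9900, 1200]))  # 0.75
```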
// VISUAL EXAMPLE
High-Betweenness Relay
A single node bridging two otherwise-disconnected clusters — a structural fingerprint of a Tier 2 relay.
[Figure: example node with high betweenness bridging CLUSTER A and CLUSTER B]
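To make the relay fingerprint concrete, here is a small pure-Python version of Brandes' algorithm for unnormalized betweenness on an unweighted, undirected graph. The two three-node clusters and the relay node `R` are a made-up example; a production pipeline would more likely call a graph library than hand-roll this.

```python
from collections import deque

def betweenness(adj):
    """Brandes' algorithm: unnormalized betweenness, unweighted undirected graph."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        order, dist = [], {s: 0}
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        sigma[s] = 1
        queue = deque([s])
        while queue:                  # breadth-first search from s
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):     # accumulate dependencies back toward s
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return {v: c / 2 for v, c in score.items()}  # each pair was counted twice

def to_adjacency(edges):
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

# Hypothetical graph: two tight clusters joined only through relay R.
edges = [("a1", "a2"), ("a2", "a3"), ("a1", "a3"),
         ("b1", "b2"), ("b2", "b3"), ("b1", "b3"),
         ("a1", "R"), ("R", "b1")]
scores = betweenness(to_adjacency(edges))
print(max(scores, key=scores.get), scores["R"])  # R 9.0
```

All nine cluster-A-to-cluster-B shortest paths pass through `R`, which is why it scores highest even though it has only two neighbors.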
06 // IMBALANCE MITIGATION

Four-Layer Class Imbalance Correction

Imbalanced data passes through four targeted correction layers before producing the final decision boundary.

L1 · Data Layer · GraphSMOTE interpolates synthetic minority nodes along network edges.
L2 · Algorithm Layer · Severe cost-sensitive penalties on minority misclassifications in tree splits.
L3 · Loss Function · Focal Loss in the GAT forces gradients to focus on deceptive relay accounts.
L4 · Threshold Layer · Decision boundary post-tuned on the validation Precision-Recall curve.
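A minimal sketch of the loss and cost-sensitive layers, assuming the standard binary focal loss FL(p_t) = -α_t (1 - p_t)^γ log(p_t) with the deck's γ = 2.0. The α value and the idea of deriving a positive-class weight directly from the 98:2 class ratio are assumptions for illustration, not confirmed settings.

```python
import math

def focal_loss(p, y, gamma=2.0, alpha=0.75):
    """Binary focal loss for predicted mule probability p and label y."""
    p = min(max(p, 1e-7), 1 - 1e-7)            # avoid log(0)
    p_t = p if y == 1 else 1 - p               # probability of the true class
    alpha_t = alpha if y == 1 else 1 - alpha   # hypothetical class balance term
    return -alpha_t * (1 - p_t) ** gamma * math.log(p_t)

# An easy mule (already scored 0.9) contributes far less than a hard one (0.1).
easy = focal_loss(0.9, 1)
hard = focal_loss(0.1, 1)
print(easy < hard)  # True

# Cost-sensitive weighting: with 98% legit / 2% mule, weighting positives by
# the negative-to-positive ratio gives each class equal total weight.
pos_weight = 98 / 2
print(pos_weight)  # 49.0
```

With γ = 0 and α = 0.5 the expression reduces to half the ordinary cross-entropy, so γ is the knob that shifts gradient mass toward hard examples.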
INPUT · CLASS RATIO 49:1 · 98% LEGIT / 2% MULE
STAGE 1 · DATA LAYER · GraphSMOTE · synthetic node interpolation · synthetic minority nodes ↑
STAGE 2 · ALGORITHM LAYER · Cost-Sensitive · weighted tree splits · class weights × penalty
STAGE 3 · LOSS LAYER · Focal Loss · hard-example gradients · γ = 2.0
STAGE 4 · THRESHOLD LAYER · PR-tuned decision boundary · τ = 0.34
OUTPUT · BALANCED DECISIONS · RECALL 0.88 · PRECISION 0.78 · F1 SCORE 0.82 · FINAL DECISION BOUNDARY VALIDATED // HOLDOUT
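Threshold post-tuning can be sketched as a sweep over candidate cut-offs on validation scores, keeping the one with the highest F1. The validation scores below are toy values and `best_f1_threshold` is a hypothetical helper; the deck's τ = 0.34 would come from running something like this on the real validation set.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0

def best_f1_threshold(y_true, scores):
    """Return (threshold, f1) maximizing F1 when predicting score >= threshold."""
    best_t, best_f1 = 1.0, 0.0
    for t in sorted(set(scores)):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 0)
        fn = sum(1 for y, s in zip(y_true, scores) if s < t and y == 1)
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1

# Toy validation set (hypothetical scores and labels).
y_val = [1, 1, 0, 1, 0, 1, 0, 0]
s_val = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.10]

tau, f1 = best_f1_threshold(y_val, s_val)
print(tau, f1)  # 0.4 0.8

# Sanity check on the reported operating point.
print(round(f1_from(0.78, 0.88), 3))  # 0.827
```

The last line gives about 0.83 from the displayed precision and recall, which is compatible with the reported 0.82 if those two inputs are themselves rounded.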
XAI LAYER · SHAP v0.45
07 // EXPLAINABILITY · COMPLIANCE

Explainable AI (XAI) for Banking Compliance

A risk score of 0.91 alone is legally insufficient to freeze an account or file a Suspicious Activity Report (SAR).

Our system converts exact SHAP feature contributions into standardized, human-readable compliance reason codes ready for analyst review.

Raw SHAP values → Reason Code mapping → Analyst-ready alert
// SAR-ready · audit-traceable · regulator-approved language
HIGH RISK ALERT
2026-06-12 · 14:32:08 UTC | ALERT ID #A7823-42
ACCOUNT · Account #A7823
RISK SCORE · 0.91 / 1.00
TRIGGERED REASON CODES · SHAP Δ
STRUCT-FANOUT · scatter_ratio in 99th percentile · +0.34
TEMP-BURST · inactive 38 days, 31 transactions in 8 hours · +0.28
GRAPH-BETW · betweenness centrality 0.87, Tier 2 relay topology · +0.19
analyst: j.mehta · audit-log enabled
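The SHAP-to-reason-code step can be sketched as a lookup plus a ranking. The three codes and the SHAP deltas are taken from the alert above; the mapping of TEMP-BURST to `dormancy_spike_ratio`, the `min_contribution` cut-off, and the function name are assumptions, and the SHAP values are supplied as a plain dict rather than computed.

```python
# Feature -> compliance reason code (TEMP-BURST mapping is an assumption).
REASON_CODES = {
    "scatter_ratio": "STRUCT-FANOUT",
    "dormancy_spike_ratio": "TEMP-BURST",
    "betweenness_centrality": "GRAPH-BETW",
}

def reason_codes(shap_values, top_k=3, min_contribution=0.05):
    """Return up to top_k (code, feature, contribution), largest first."""
    ranked = sorted(shap_values.items(), key=lambda kv: kv[1], reverse=True)
    selected = []
    for feature, contribution in ranked:
        if contribution < min_contribution or len(selected) == top_k:
            break
        code = REASON_CODES.get(feature)
        if code is not None:          # skip features with no mapped code
            selected.append((code, feature, contribution))
    return selected

# Per-feature SHAP contributions for one account (last two are made up).
shap_values = {
    "scatter_ratio": 0.34,
    "dormancy_spike_ratio": 0.28,
    "betweenness_centrality": 0.19,
    "avg_amt": 0.02,
    "txn_count": -0.05,
}

for code, feature, delta in reason_codes(shap_values):
    print(f"{code} · {feature} · {delta:+.2f}")
```

Only positive contributions above the cut-off are reported, so features that pushed the score down never appear as reasons for an alert.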
DEPLOY READY · phoenix@hackathon.ai
08 // ROADMAP · BENCHMARKS

Operational Roadmap & Benchmarks

Progressing from a blind tabular baseline to a fully stacked, graph-aware ensemble raises the target F1-Score from ~0.60 to 0.82. Tabular scoring runs in near real time to support rapid transaction holds, while graph features are recomputed in a daily batch. The system ingests compliance-officer dispositions as new labels to adapt to concept drift.

// F1-SCORE PROGRESSION · MODEL EVOLUTION
0.60 · TABULAR BASELINE · XGBoost · 18 features
0.72 (+0.12) · + GRAPH FEATURES · graph preprocessing · centrality
0.76 (+0.04) · + TEMPORAL FEATURES · velocity · dormancy windows
0.82 (+0.06) · FULL ENSEMBLE ARCHITECTURE · stacked · graph + temporal + tabular · TARGET · PHOENIX
REAL-TIME · <200ms tabular scoring · transaction-hold latency
BATCH · Daily graph recomputation · 4-hr window
ADAPTIVE · Continuous learning · analyst dispositions → labels
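The "analyst dispositions → labels" loop could be as simple as the sketch below, assuming each reviewed alert carries an account ID and a disposition string. The disposition names (`confirmed_mule`, `false_positive`, `pending`) and the function name are hypothetical.

```python
# Hypothetical disposition vocabulary; unresolved cases yield no label.
DISPOSITION_TO_LABEL = {"confirmed_mule": 1, "false_positive": 0}

def labels_from_dispositions(dispositions):
    """Turn (account_id, disposition) pairs into {account_id: label}."""
    return {
        account: DISPOSITION_TO_LABEL[disposition]
        for account, disposition in dispositions
        if disposition in DISPOSITION_TO_LABEL
    }

reviewed = [
    ("A7823", "confirmed_mule"),
    ("B1042", "false_positive"),
    ("C5510", "pending"),        # still under review, so not used for training
]

new_labels = labels_from_dispositions(reviewed)
print(new_labels)  # {'A7823': 1, 'B1042': 0}
```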
// Team Phoenix · ready to deploy · F1 0.82 · production-grade