SVHN Digit Classification with MobileNetV2 Transfer Learning
Learn how to classify Street View House Numbers (SVHN) using transfer learning with MobileNetV2, Keras, and TensorFlow. Includes preprocessing and code samples.
SVHN Digit Classification
Leveraging Transfer Learning with MobileNetV2
Interactive Agenda
1. Dataset & Exploration
2. Preprocessing & Model
3. Training Analysis
4. Optimization & Deploy
Introduction & Task Overview
The Goal: Classify digits (0-9) from the SVHN Cropped dataset using Transfer Learning.
• Dataset: SVHN (Street View House Numbers)
• Challenge: Real-world clutter, variable lighting
• Approach: MobileNetV2 pretrained backbone
• Outcome: Evaluated & deployed model
Dataset Exploration (EDA)
73,257
Training Examples
26,032
Test Examples
Preprocessing Pipeline
IMG_SIZE = 96

def format_image(image, label):
    # Resize to match MobileNetV2 input
    image = tf.image.resize(image, (IMG_SIZE, IMG_SIZE))
    # Normalize (0-255 -> 0.0-1.0)
    image = image / 255.0
    return image, label

# Pipeline optimization
train = raw_train.map(format_image).cache().shuffle(1000).batch(32)
Key Operations:
• Resizing: 96x96 pixels
• Normalization: Scaling pixel intensity to [0, 1]
• Augmentation: Random rotation & zoom
• Caching: Optimized data throughput
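The pipeline code above handles resizing, normalization, and caching, but not the augmentation step. A minimal sketch of random rotation and zoom using Keras preprocessing layers (the 0.1 factors here are illustrative assumptions, not values from the article):

```python
import tensorflow as tf

# Augmentation as a small Keras model, applied only to the training split
augment = tf.keras.Sequential([
    tf.keras.layers.RandomRotation(0.1),  # rotate by up to ±10% of a full turn
    tf.keras.layers.RandomZoom(0.1),      # zoom in/out by up to 10%
])

# Wired into the pipeline after format_image (sketch):
# train = raw_train.map(format_image) \
#     .map(lambda x, y: (augment(x, training=True), y)) \
#     .cache().shuffle(1000).batch(32)
```

Passing `training=True` ensures the random transforms are actually applied; at inference time the layers become identity operations.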
Model Architecture
We employ a Transfer Learning strategy using MobileNetV2 features with a custom classification head.
Implementation (Keras)
# Base Model (Pretrained)
base_model = tf.keras.applications.MobileNetV2(
    input_shape=(IMG_SIZE, IMG_SIZE, 3),
    include_top=False,
    weights='imagenet'
)
base_model.trainable = False  # Freeze weights

# Custom Head
# Note: SeparableConv2D needs a spatial feature map, so it must come
# before GlobalAveragePooling2D, which collapses the spatial dimensions.
model = tf.keras.Sequential([
    base_model,
    tf.keras.layers.SeparableConv2D(1024, (3, 3), padding='same',
                                    activation='relu'),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax')
])
Note: The base model's weights are frozen so the classification head trains on fixed ImageNet features.
Training Configuration
• Optimizer: Adam (LR = 1e-3)
• Loss: Sparse Categorical Crossentropy
• Epochs: 3
• Steps per Epoch: 50 (compute constraint)
• EarlyStopping (patience = 2)
• ReduceLROnPlateau (factor = 0.3)
• Validation split: 10%
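The configuration above can be sketched as a compile-and-fit call. This is a self-contained illustration using a tiny stand-in model and synthetic data; substitute the MobileNetV2 model and SVHN pipeline built earlier, and restore `steps_per_epoch=50`:

```python
import tensorflow as tf

# Stand-in for the MobileNetV2-based model defined earlier
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(96, 96, 3)),
    tf.keras.layers.Dense(10, activation='softmax'),
])

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy'],
)

callbacks = [
    tf.keras.callbacks.EarlyStopping(patience=2, restore_best_weights=True),
    tf.keras.callbacks.ReduceLROnPlateau(factor=0.3, patience=1),
]

# Synthetic data so the snippet runs as-is
x = tf.random.uniform((32, 96, 96, 3))
y = tf.random.uniform((32,), maxval=10, dtype=tf.int32)
x_val = tf.random.uniform((8, 96, 96, 3))
y_val = tf.random.uniform((8,), maxval=10, dtype=tf.int32)

history = model.fit(x, y, validation_data=(x_val, y_val),
                    epochs=3, callbacks=callbacks, verbose=0)
```

`restore_best_weights=True` is an assumption added for robustness; the article only specifies the patience value.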
Baseline Results (3 Epochs)
The model shows steady convergence even within the limited training budget: accuracy climbed from 38% to 46% over the three epochs.
Model Evaluation
49.82%
Test Set Accuracy
Confusions noted:
• Digits 2 and 3 are frequently confused
• Varied lighting reduces prediction confidence
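Per-digit confusions like the 2-vs-3 mix-up are typically read off a confusion matrix built from test-set predictions. A minimal NumPy sketch (the labels below are illustrative, not the article's actual results):

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes=10):
    # m[i, j] counts examples of true class i predicted as class j
    m = np.zeros((num_classes, num_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        m[t, p] += 1
    return m

# Illustrative labels; in practice y_pred comes from
# np.argmax(model.predict(test_images), axis=1)
y_true = np.array([2, 2, 3, 3, 3, 7])
y_pred = np.array([2, 3, 3, 2, 3, 7])

cm = confusion_matrix(y_true, y_pred)
# Off-diagonal entries cm[2, 3] and cm[3, 2] reveal the 2<->3 mix-ups
```

Large values off the diagonal point directly at the digit pairs the model struggles to separate.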
Optimization: Learning Rate Tuning
Reducing the Learning Rate to 1e-4 significantly improved convergence, yielding >51% accuracy by Epoch 2.
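Applying the tuned rate amounts to re-compiling with a fresh Adam instance at 1e-4, so stale optimizer moments from the 1e-3 run don't carry over. A sketch with a stand-in model (substitute the MobileNetV2 model from earlier):

```python
import tensorflow as tf

# Stand-in for the trained MobileNetV2-based model
model = tf.keras.Sequential([
    tf.keras.layers.Dense(10, activation='softmax'),
])

# Re-compile with the reduced learning rate before continuing training
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy'],
)
```

Note that `model.compile` resets optimizer state but leaves the learned weights untouched, so training resumes from where the 1e-3 run left off.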
Deployment & Prediction
# Save and reload the trained model
model.save('svhn_mobilenetv2.keras')
model = tf.keras.models.load_model('svhn_mobilenetv2.keras')

def predict_image(img_path):
    # Preprocess single image
    img = load_and_resize(img_path)
    img_batch = tf.expand_dims(img, 0)
    # Inference: the final Dense layer already applies softmax,
    # so the outputs are probabilities as-is
    preds = model.predict(img_batch)
    score = preds[0]
    return np.argmax(score), 100 * np.max(score)
Ethical AI Considerations
Bias & Generalization
Model may fail on house numbers with unusual fonts or backgrounds not present in SVHN training data.
Compute Efficiency
Using Transfer Learning (MobileNetV2) reduces energy consumption compared to training from scratch.
Conclusion & Future Work
Successfully implemented a transfer learning pipeline achieving >50% accuracy on a challenging 10-class dataset with minimal training.
Q & A
- transfer-learning
- mobilenetv2
- computer-vision
- deep-learning
- tensorflow
- keras
- svhn-dataset
- image-classification