Predicting Seismic Hazards with Machine Learning & MLP
Learn how Machine Learning and Deep Learning models like KNN and MLP are used for seismic hazard prediction in mining using the Seismic-Bumps dataset.
Seismic Hazard Prediction via Machine Learning
Project Analysis & Neural Network Implementation
Student Name | ID: 12345678
Project Objectives
<ul><li><strong>Hazard Prediction:</strong> Forecasting seismic bumps in mining environments based on sensor data.</li><li><strong>Model Comparison:</strong> Evaluating Machine Learning vs. Deep Learning approaches.</li><li><strong>Key Algorithms:</strong><ul><li>Unsupervised: K-Means Clustering</li><li>Supervised (Classic): KNN (K-Nearest Neighbors)</li><li>Deep Learning: MLP (Multi-Layer Perceptron)</li></ul></li></ul>
Dataset Overview: "Seismic-Bumps"
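<ul><li><strong>Source:</strong> UCI Machine Learning Repository.</li><li><strong>Volume:</strong> 2584 records, 10 feature columns.</li><li><strong>Class Imbalance:</strong> High imbalance between 'Hazardous' (1) and 'Non-Hazardous' (0) states: only 170 of the 2584 records (about 6.6%) are hazardous.</li><li><strong>Key Features:</strong><br>- <em>genergy</em>: Seismic energy recorded during the previous shift by the most active geophone.<br>- <em>gpuls</em>: Number of pulses recorded during the previous shift by the most active geophone.<br>- <em>gdenergy</em>: Deviation of the previous shift's energy from the average energy of the eight preceding shifts.</li></ul>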
Methodology: Preprocessing
Label Encoding
Machine learning models require numerical input, so textual categorical columns (e.g., the shift type) must be converted to integer codes.<br><br><strong>Example:</strong><br>Shift 'N' (preparation) → <span style='color:#003366; font-weight:bold;'>0</span><br>Shift 'W' (coal-getting) → <span style='color:#003366; font-weight:bold;'>1</span>
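The encoding step can be sketched in a few lines of plain Python. This is a minimal illustration, not the project's code: the helper names and the sample column are hypothetical, and the codes are assigned in alphabetical order, which is also how scikit-learn's `LabelEncoder` orders its classes.

```python
def fit_label_encoder(values):
    """Map each distinct category to an integer code, in sorted order."""
    return {label: code for code, label in enumerate(sorted(set(values)))}


def transform(values, mapping):
    """Replace every category with its integer code."""
    return [mapping[v] for v in values]


# Hypothetical sample of a categorical 'shift' column
# (the UCI file codes shifts as 'W' = coal-getting, 'N' = preparation).
shifts = ["W", "N", "W", "W", "N"]

mapping = fit_label_encoder(shifts)
encoded = transform(shifts, mapping)

print(mapping)   # category -> integer code
print(encoded)   # the column as integers
```

Label encoding implies an ordering between codes, which is harmless for a two-valued column but worth keeping in mind for columns with more categories.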
Methodology: Scaling (RobustScaler)
<strong>Why RobustScaler?</strong><br>Seismic energy data contains massive outliers (extreme spikes). Standard scaling subtracts the mean and divides by the standard deviation, and both statistics are pulled strongly by these anomalies. RobustScaler instead centers on the Median (Q2) and divides by the Interquartile Range (IQR), so the scaled features stay stable even when extreme physical events are present.
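To make the difference concrete, here is a small standard-library sketch that applies both scalings to a toy energy column with one spike. The values and function names are illustrative assumptions; the robust version follows the same recipe as scikit-learn's `RobustScaler` defaults (median centering, 25th to 75th percentile range).

```python
import statistics


def robust_scale(values):
    """Centre on the median and divide by the interquartile range."""
    median = statistics.median(values)
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return [(v - median) / iqr for v in values]


def standard_scale(values):
    """Centre on the mean and divide by the (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


# Hypothetical energy readings: four ordinary values and one extreme spike.
energy = [1.0, 2.0, 3.0, 4.0, 100.0]

robust = robust_scale(energy)
standard = standard_scale(energy)

print("robust:  ", robust)
print("standard:", [round(v, 3) for v in standard])
```

With the spike present, standard scaling squeezes the four ordinary readings into a very narrow band, while the robust version keeps them spread out and leaves the spike as a clearly separate large value.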
Unsupervised Learning: K-Means
<ul><li><strong>Goal:</strong> Discovery of natural groups within the data without pre-labeled answers.</li><li><strong>Configuration:</strong> K=3 Clusters.</li><li><strong>Mechanism:</strong> The algorithm identifies 'centroids' (centers of gravity) and groups seismic events based on physical similarity (Energy/Pulses).</li></ul>
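The centroid mechanism can be sketched with NumPy as follows. This is a simplified illustration under stated assumptions: the two-column toy data stands in for scaled energy and pulse features, and the deterministic farthest-point initialisation is chosen here for reproducibility; it is not necessarily what the project used.

```python
import numpy as np


def init_centroids(points, k):
    """Farthest-point initialisation: start at the first point, then
    repeatedly add the point farthest from all centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        chosen = np.array(centroids)
        dists = np.linalg.norm(points[:, None, :] - chosen[None, :, :], axis=2)
        centroids.append(points[dists.min(axis=1).argmax()])
    return np.array(centroids)


def kmeans(points, k=3, n_iter=50):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    centroids = init_centroids(points, k)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each centroid to the mean of its members
        # (an empty cluster keeps its previous centroid).
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids


# Hypothetical scaled (energy, pulses) pairs forming three loose groups.
points = np.array([
    [1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
    [5.0, 5.0], [5.1, 4.9], [4.8, 5.2],
    [9.0, 1.0], [9.2, 1.1], [8.9, 0.8],
])

labels, centroids = kmeans(points, k=3)
print(labels)
print(centroids)
```

The cluster numbers themselves carry no meaning; only the grouping does, which is why the clusters have to be interpreted afterwards (for example by inspecting each centroid's energy level).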
Classification: K-Nearest Neighbors (KNN)
<strong>Logic:</strong> Risk determination based on the majority class of the 5 closest historical data points.<br><br><strong>Metric:</strong> Euclidean Distance in a 10-dimensional feature space.<br><br><strong>Application:</strong> A classic baseline model to benchmark complex neural networks against.
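The voting logic is short enough to write out directly. The sketch below is a plain-Python illustration with hypothetical data; it uses two features instead of ten purely to keep the example readable, and the Euclidean distance works the same way in any number of dimensions.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=5):
    """Return the majority class among the k nearest training points."""
    # Pair every training point's Euclidean distance with its label.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical scaled records: class 0 = non-hazardous, class 1 = hazardous.
train_X = [
    (0.1, 0.2), (0.0, 0.1), (0.2, 0.0), (0.3, 0.3), (0.1, 0.0),
    (2.0, 2.1), (2.2, 1.9), (1.9, 2.0), (2.1, 2.2),
]
train_y = [0, 0, 0, 0, 0, 1, 1, 1, 1]

print(knn_predict(train_X, train_y, (0.2, 0.2)))  # neighbours are all class 0
print(knn_predict(train_X, train_y, (2.0, 2.0)))  # four of five neighbours are class 1
```

Because the vote depends on raw distances, features must share a comparable scale, which is one reason the scaling step precedes this model.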
Deep Learning: MLP Architecture
<strong>Structure:</strong><br>Input Layer (10 Features) → Hidden Layer (32 Neurons) → Dropout → Hidden Layer (16 Neurons) → Output (1 Neuron)<br><br><strong>Activation Functions:</strong><br><ul><li><em>ReLU:</em> For hidden layers (handling non-linearity).</li><li><em>Sigmoid:</em> For output (probability 0.0 - 1.0).</li></ul>
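As a rough illustration of how data flows through this architecture, the following NumPy sketch runs a forward pass with randomly initialised, untrained weights. The weight scale, seed, and batch are arbitrary assumptions; the point is only to show the layer shapes and where each activation applies, not to reproduce the trained model.

```python
import numpy as np

rng = np.random.default_rng(0)


def relu(z):
    return np.maximum(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


# Untrained weights with the layer sizes from the slide: 10 -> 32 -> 16 -> 1.
W1, b1 = rng.normal(0.0, 0.1, (10, 32)), np.zeros(32)
W2, b2 = rng.normal(0.0, 0.1, (32, 16)), np.zeros(16)
W3, b3 = rng.normal(0.0, 0.1, (16, 1)), np.zeros(1)


def forward(x):
    """Inference-time forward pass for a batch of shape (n, 10)."""
    h1 = relu(x @ W1 + b1)
    # The Dropout layer sits here during training; at inference it is a no-op.
    h2 = relu(h1 @ W2 + b2)
    return sigmoid(h2 @ W3 + b3)  # one hazard probability per record


batch = rng.normal(size=(4, 10))  # four hypothetical scaled records
probs = forward(batch)
n_params = sum(a.size for a in (W1, b1, W2, b2, W3, b3))

print(probs.shape)
print(n_params)  # total trainable parameters for these layer sizes
```

A network of this size has fewer than a thousand trainable parameters, which is small enough to train quickly on a dataset of a few thousand records.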
Stabilization Techniques
<div style='margin-bottom: 40px;'><strong>Dropout (Rate 0.2)</strong><br>Randomly disabling 20% of neurons during training iterations. This forces the network to learn robust features rather than memorizing noise (overfitting prevention).</div><div><strong>EarlyStopping</strong><br>Monitors validation loss. Automatically halts training when model performance ceases to improve, saving computational resources and preventing overtraining.</div>
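Both techniques are simple enough to sketch by hand. The code below is an illustrative approximation rather than the framework implementation: `dropout` uses the common "inverted" formulation (survivors are rescaled so the expected activation is unchanged), and `early_stopping_epoch` uses a patience of 3 epochs, which is an assumed value not stated in the slides.

```python
import numpy as np


def dropout(activations, rate=0.2, rng=None):
    """Inverted dropout: zero a random fraction `rate` of the units and
    rescale the survivors so the expected activation stays the same."""
    if rng is None:
        rng = np.random.default_rng()
    keep_mask = rng.random(activations.shape) >= rate
    return activations * keep_mask / (1.0 - rate)


def early_stopping_epoch(val_losses, patience=3):
    """Return the (0-based) epoch at which training would stop: the first
    epoch where validation loss has failed to improve `patience` times in a row."""
    best = float("inf")
    waited = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, waited = loss, 0
        else:
            waited += 1
            if waited >= patience:
                return epoch
    return len(val_losses) - 1  # never triggered: ran to the end


# Dropout on a block of ones: roughly 20% become 0, the rest become 1.25.
hidden = np.ones((1000, 32))
dropped = dropout(hidden, rate=0.2, rng=np.random.default_rng(0))
zero_fraction = float((dropped == 0).mean())

# Hypothetical validation-loss history: improvement stalls after epoch 2.
val_losses = [0.60, 0.45, 0.40, 0.41, 0.42, 0.43, 0.39]
stop_epoch = early_stopping_epoch(val_losses, patience=3)

print(round(zero_fraction, 3))
print(stop_epoch)
```

In the toy history, training halts before the final, slightly better epoch is ever reached, which is the usual trade-off of a short patience window.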
Results: Correlation Analysis
Spearman rank correlation shows a strong positive monotonic relationship (ρ ≈ 0.76) between Seismic Energy and Number of Pulses: records with higher energy tend to have higher pulse counts.
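For readers unfamiliar with the statistic, Spearman's coefficient is the Pearson correlation computed on ranks instead of raw values, which makes it insensitive to the extreme magnitudes typical of energy readings. A minimal standard-library sketch, using made-up numbers rather than the project's data:

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for idx in order[i:j + 1]:
            ranks[idx] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Plain Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical energy and pulse readings (not taken from the dataset).
energy = [10, 200, 3000, 45000, 500]
pulses = [5, 40, 90, 80, 300]

rho = spearman(energy, pulses)
print(round(rho, 2))
```

Any strictly increasing relationship, linear or not, yields a coefficient of 1, so the statistic measures how consistently the two quantities rise together rather than how linear the relationship is.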
Results: KNN Evaluation
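<strong>Figure 2: Confusion Matrix</strong><br><br>Accuracy reached ~91%. The model classifies safe (non-hazardous) states reliably, but detecting the minority 'hazardous' class remains difficult, a challenge common in seismic data. Because non-hazardous records dominate the dataset, accuracy alone can overstate performance; the per-class counts in the confusion matrix are the more informative measure.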
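The following sketch shows how the cells of a binary confusion matrix translate into accuracy, precision, and recall. The labels are a hypothetical toy example constructed to mimic an imbalanced problem; they are not the project's predictions or results.

```python
def confusion_counts(y_true, y_pred):
    """Count the four cells of a binary confusion matrix (1 = hazardous)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return tp, tn, fp, fn


def safe_div(num, den):
    """Avoid division by zero when a class is never predicted or present."""
    return num / den if den else 0.0


# Hypothetical imbalanced sample: 18 safe records, 2 hazardous ones.
y_true = [0] * 18 + [1] * 2
y_pred = [0] * 17 + [1] + [0, 1]  # one false alarm, one missed hazard

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
accuracy = safe_div(tp + tn, len(y_true))
precision = safe_div(tp, tp + fp)  # of predicted hazards, how many were real
recall = safe_div(tp, tp + fn)     # of real hazards, how many were caught

print(tp, tn, fp, fn)
print(accuracy, precision, recall)
```

In this toy case accuracy is high even though half of the hazardous records are missed, which is why recall on the hazardous class deserves separate attention in a safety setting.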
Results: Cluster Visualization
The K-Means algorithm separated the data into three regimes. 'Pink' points mark the high-energy cluster, interpreted as the most dangerous seismic state.
Results: Neural Network Training
The close alignment between the Training (Blue) and Validation (Pink) curves indicates a stable model with minimal overfitting, consistent with the regularizing effect of the Dropout layer and EarlyStopping.
Final Comparison: KNN vs MLP
The Neural Network (MLP) outperformed the classic KNN algorithm. The MLP's ability to capture non-linear relationships in the seismic data led to superior predictive performance.
Conclusions
<ul><li><strong>High Accuracy:</strong> AI models achieved >90% accuracy in predicting seismic states.</li><li><strong>Preprocessing is Key:</strong> RobustScaler proved essential for handling natural anomalies in seismic energy data.</li><li><strong>Future Potential:</strong> Deep Learning (MLP) shows the most promise for real-time Early Warning Systems due to its handling of non-linearities and stability.</li></ul>
- machine-learning
- deep-learning
- seismic-hazard
- neural-networks
- data-science
- mining-safety
- mlp
- knn






