Predictive Maintenance: Machine Failure Prediction Analysis

Explore a machine learning study on failure prediction for CNC milling workstations using sensor data, XGBoost, and comparative feature engineering.

#predictive-maintenance #machine-learning #failure-prediction #ai4i-2020 #xgboost #industrial-ai #data-science

Predictive Maintenance: Machine Failure Prediction

Petar Yankov

Project Purpose

Predict machine failure early using raw sensor data

Compare two feature sets: Sensors-Only vs. Sensors + Indicators

Focus: Improving reliability and industrial decision support

Domain Understanding

Context: CNC Milling Workstation

Sensors reflect: Machine load, vibration, and temperature.
Goal: Reduce downtime & prevent expensive, catastrophic failures.

Dataset: AI4I 2020

Source: Synthetic, cleaned dataset (UCI Repository).
Data Quality: No missing values → minimal cleaning needed.
Challenge: Strong class imbalance (approx 97% vs 3%).

Feature Engineering Strategy

Version A: Sensors Only

• Air temperature [K]
• Process temperature [K]
• Rotational speed [rpm]
• Torque [Nm]
• Tool wear [min]

Version B: + Indicators

• TWF (Tool Wear Failure)
• HDF (Heat Dissipation Failure)
• PWF (Power Failure)
• OSF (Overstrain Failure)
• RNF (Random Failure)

Version A is realistic. Version B includes failure flags (near-target leakage).
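
The two versions can be written down as plain column lists. This is a minimal sketch, assuming the names above match the dataset's column headers; `select_features` and the sample record are hypothetical and only illustrate the shape of each feature vector.

```python
# Column names as listed on this slide; assumed to match the CSV headers.
SENSOR_COLUMNS = [
    "Air temperature [K]",
    "Process temperature [K]",
    "Rotational speed [rpm]",
    "Torque [Nm]",
    "Tool wear [min]",
]
INDICATOR_COLUMNS = ["TWF", "HDF", "PWF", "OSF", "RNF"]

FEATURE_SETS = {
    "A": SENSOR_COLUMNS,                      # sensors only
    "B": SENSOR_COLUMNS + INDICATOR_COLUMNS,  # sensors + failure flags
}


def select_features(row, version):
    """Return one record's feature values for the given version (hypothetical helper)."""
    return [row[name] for name in FEATURE_SETS[version]]


# Illustrative record; the values are placeholders, not results from the study.
sample_row = {
    "Air temperature [K]": 298.1,
    "Process temperature [K]": 308.6,
    "Rotational speed [rpm]": 1551,
    "Torque [Nm]": 42.8,
    "Tool wear [min]": 0,
    "TWF": 0, "HDF": 0, "PWF": 0, "OSF": 0, "RNF": 0,
}

features_a = select_features(sample_row, "A")  # 5 sensor values
features_b = select_features(sample_row, "B")  # 5 sensor values + 5 flags
```

Keeping the flags in a separate list makes it hard to leak them into the realistic model by accident.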

Label Analysis: The 0/1 Mismatch

Mismatch: Any Indicator (OR) vs. Actual Target

            Indicator = 0                     Indicator = 1 (OR)
Target = 0  9643 (Correct)                    18 (Indicator present, No Fail)
Target = 1  9 (Fail present, No Indicator)    330 (Correct)

Only 27 total mismatches. Indicators are almost a direct definition of the target.
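
The comparison above can be sketched as a small cross-tabulation. This is an illustration rather than the study's code: the target column name `Machine failure` is an assumption, the helpers are hypothetical, and the three toy rows only show the mechanics. The `reported` counts are the ones from this slide.

```python
from collections import Counter

INDICATORS = ["TWF", "HDF", "PWF", "OSF", "RNF"]


def any_indicator(row):
    """Return 1 if at least one failure flag is set (logical OR), else 0."""
    return int(any(row[name] for name in INDICATORS))


def cross_tab(rows, target="Machine failure"):
    """Count records per (any_indicator, target) pair."""
    return Counter((any_indicator(row), row[target]) for row in rows)


# Toy rows, only to show how the table is built.
toy_rows = [
    {"TWF": 0, "HDF": 0, "PWF": 0, "OSF": 0, "RNF": 0, "Machine failure": 0},
    {"TWF": 0, "HDF": 1, "PWF": 0, "OSF": 0, "RNF": 0, "Machine failure": 1},
    {"TWF": 0, "HDF": 0, "PWF": 0, "OSF": 0, "RNF": 1, "Machine failure": 0},
]
toy_table = cross_tab(toy_rows)

# Counts reported on this slide, keyed by (any_indicator, target).
reported = {(0, 0): 9643, (1, 0): 18, (0, 1): 9, (1, 1): 330}
total = sum(reported.values())
mismatches = sum(n for (flag, label), n in reported.items() if flag != label)
mismatch_rate = mismatches / total
```

With the reported counts, the mismatches are 18 + 9 = 27 of 10,000 records, or 0.27%.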

Failure Type Distribution

Heat Dissipation and Power Failures are the most common causes in this dataset.

Feature Correlations

We calculated the correlation between each feature and the 'Machine Failure' label.

The indicator flags correlate strongly with the label, which explains the near-perfect performance of Version B.
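
For two binary variables, the Pearson correlation reduces to the phi coefficient, which can be computed straight from a 2x2 count table. The sketch below is an illustration, not the per-feature calculation behind the chart: it applies the formula to the combined "any indicator" flag, using the counts from the label analysis slide.

```python
import math


def phi_coefficient(n00, n01, n10, n11):
    """Pearson correlation of two binary variables from their 2x2 counts.

    n_ij is the number of records with flag == i and target == j.
    """
    numerator = n11 * n00 - n10 * n01
    denominator = math.sqrt(
        (n10 + n11)    # records with flag == 1
        * (n00 + n01)  # records with flag == 0
        * (n01 + n11)  # records with target == 1
        * (n00 + n10)  # records with target == 0
    )
    return numerator / denominator


# Any-indicator flag vs. target, counts taken from the label analysis slide.
phi = phi_coefficient(n00=9643, n01=9, n10=18, n11=330)
```

For these counts the coefficient comes out at roughly 0.96, which is what a near-definition of the target would be expected to show.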

Modeling Strategy

Algorithms

• Random Forest
• XGBoost
• Gradient Boosting

Why?

• Handle non-linear relationships well
• Robust to mixed feature types
• Interpretable (Feature Importance)

Imbalance Handling

• Class Weights
• Scale Pos Weight (XGB)
• No SMOTE used (no oversampled records added)
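
Both weighting schemes come down to simple arithmetic on the class counts. The sketch below assumes the "balanced" heuristic (n_samples / (n_classes * class_count)) and the negative-to-positive ratio commonly used for XGBoost's `scale_pos_weight`; the deck does not state the exact values used. The counts are the full-dataset totals implied by the label analysis slide, whereas in practice they would be computed on the training split only.

```python
def balanced_class_weights(counts):
    """Weight each class by n_samples / (n_classes * class_count)."""
    n_samples = sum(counts.values())
    n_classes = len(counts)
    return {label: n_samples / (n_classes * n) for label, n in counts.items()}


def neg_pos_ratio(counts):
    """Ratio of negative to positive samples, a common scale_pos_weight choice."""
    return counts[0] / counts[1]


# 9643 + 18 records without failure, 9 + 330 records with failure.
class_counts = {0: 9661, 1: 339}

weights = balanced_class_weights(class_counts)  # one weight per class
ratio = neg_pos_ratio(class_counts)             # about 28.5 for these counts
```

The two schemes agree in effect: the ratio of the balanced weights (failure vs. no failure) equals the negative-to-positive ratio.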

Results: Version A (Sensors Only)

Analysis

• XGBoost performed best overall (best balance across metrics).
• Recall is prioritized to catch failures.
• ~98% accuracy (misleading: with a 97/3 class split, accuracy says little about how many failures are caught).
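
To see why accuracy alone misleads here, consider a trivial baseline that never predicts a failure. The sketch below is an illustration, not a result from the study: `binary_metrics` is a hypothetical helper, and the class counts are the ones implied by the label analysis slide.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision and recall from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        # Define precision/recall as 0.0 when the denominator is empty.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Baseline that always predicts "no failure":
# all 339 failures are missed, all 9661 normal records are correct.
baseline = binary_metrics(tp=0, fp=0, fn=339, tn=9661)
```

The baseline reaches about 96.6% accuracy with a recall of zero, so a ~98% score is only a small step above doing nothing, and recall is the more informative number.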

Results: Version B (With Indicators)

Analysis

• Near-perfect scores across all models.
• Confirms that Indicators are proxy labels.
• Serves as a theoretical Upper Bound.

XAI, Demo & Reflection

1. Interpretability (XAI)

Trust is key in industry. Feature importance helps engineers understand why the model flags a machine.
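
The deck does not say which importance measure was used. As one model-agnostic option, the sketch below shows permutation importance: shuffle one feature column and measure how much accuracy drops. The toy model and data are hypothetical stand-ins for a trained classifier and real sensor readings.

```python
import random


def accuracy(model, X, y):
    """Share of rows where the model's prediction matches the label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Mean drop in accuracy when a single feature column is shuffled."""
    rng = random.Random(seed)
    base = accuracy(model, X, y)
    importances = []
    for col in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)
            X_perm = [row[:col] + [value] + row[col + 1:]
                      for row, value in zip(X, shuffled)]
            drops.append(base - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


def toy_model(row):
    """Stand-in classifier: flags a failure when torque (column 0) is high."""
    return int(row[0] > 60)


# Toy data with two columns: [torque, tool wear].
X = [[float(t), float(w)] for t in range(40, 80, 2) for w in (0, 100, 200)]
y = [toy_model(row) for row in X]

scores = permutation_importance(toy_model, X, y)
```

Here the torque column gets a positive score and the tool wear column gets zero, because the toy model ignores it. On a real model the same ranking tells an engineer which sensors drive the predictions.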

2. Demo / Prototype

[Image: simple web dashboard for predictive maintenance, showing risk gauges and sensor input fields]

3. Reflection

✔ Strong comparative analysis
⚠ Next: Use real-world time-series data & timestamps.
