# Hybrid Models for S&P 500 Forecasting & News Sentiment
> Explore how integrating FinBERT sentiment and BERTopic modeling improves S&P 500 forecasting during market crises in this financial machine learning study.

Tags: financial-forecasting, machine-learning, sentiment-analysis, topic-modeling, nlp, sp500, ai
## Advancing Financial Time Series Forecasting
* **Research Question:** Can structured news sentiment improve next-day S&P 500 predictions? 
* **Data:** 16 years of market data (2008–2023) and 82,110 financial headlines.
* **Technology Stack:** FinBERT for sentiment scoring, BERTopic for topic modeling, and SHAP for explainable AI.
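
To connect headline-level sentiment to a next-day target, per-headline scores have to be collapsed into one value per trading day. The sketch below is illustrative and is not the study's pipeline. It assumes a three-class sentiment output (positive, negative, neutral), scores each headline as P(positive) minus P(negative), and takes the daily mean. The function names, the averaging rule, and the sample probabilities are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def headline_score(probs):
    """Collapse three class probabilities into one score in [-1, 1]."""
    return probs["positive"] - probs["negative"]


def daily_sentiment(headlines):
    """Average headline scores per calendar date.

    `headlines` is an iterable of (date_string, probs) pairs, where `probs`
    stands in for a sentiment classifier's output (assumed format).
    """
    by_day = defaultdict(list)
    for day, probs in headlines:
        by_day[day].append(headline_score(probs))
    # Sort by date string so the result is in chronological order
    return {day: mean(scores) for day, scores in sorted(by_day.items())}


# Made-up probabilities, for illustration only
sample = [
    ("2020-03-16", {"positive": 0.05, "negative": 0.85, "neutral": 0.10}),
    ("2020-03-16", {"positive": 0.10, "negative": 0.60, "neutral": 0.30}),
    ("2020-03-17", {"positive": 0.50, "negative": 0.20, "neutral": 0.30}),
]
daily = daily_sentiment(sample)
```

Other aggregation rules (median, headline-count weighting) would slot into the same structure.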

## Methodology and Features
* **Feature Sets:** Four feature sets were tested: price-only data (Set A), basic sentiment (Set B), topic-structured sentiment (Set C), and volatility-weighted topic sentiment (Set D).
* **Models:** Gradient-boosted trees (LightGBM, XGBoost, CatBoost) and recurrent neural networks (LSTM, GRU).
* **Evaluation Framework:** A seven-check process that includes walk-forward cross-validation and regime analysis.
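
Walk-forward cross-validation, mentioned above, can be sketched as an expanding-window splitter. This is a minimal illustration, not the study's exact configuration: the function name, fold count, and minimum training size are assumptions.

```python
def walk_forward_splits(n_samples, n_folds, min_train):
    """Yield (train_indices, test_indices) with an expanding training window.

    Each test block starts where its training window ends, so the model
    never sees observations from the future of the period it is scored on.
    """
    if min_train >= n_samples:
        raise ValueError("min_train must be smaller than n_samples")
    test_size = (n_samples - min_train) // n_folds
    if test_size < 1:
        raise ValueError("not enough samples for the requested folds")
    for k in range(n_folds):
        train_end = min_train + k * test_size
        test_end = train_end + test_size
        yield range(0, train_end), range(train_end, test_end)


# Hypothetical sizes: 20 trading days, 3 folds, at least 8 days of history
splits = list(walk_forward_splits(n_samples=20, n_folds=3, min_train=8))
for train_idx, test_idx in splits:
    print(f"train 0..{train_idx[-1]}  test {test_idx[0]}..{test_idx[-1]}")
```

scikit-learn's `TimeSeriesSplit` provides a comparable splitter if a library implementation is preferred.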

## Key Findings
* **Regime Analysis:** Averaged over all periods, the models perform close to a coin flip (AUC ≈ 0.50), but volatility-weighted sentiment (Set D) reached an AUC of 0.568 during high-volatility (crisis) periods.
* **Market Conditions:** Sentiment features add noise in calm markets but contribute predictive value during market stress (high VIX).
* **Feature Importance:** Technical indicators dominate generally, but topic-derived features like credit-rating sentiment rank highly during turbulent periods.
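
The regime comparison above amounts to computing AUC separately on high-volatility and low-volatility days. The sketch below shows one way to do that in plain Python. The VIX threshold of 30, the function names, and the toy data are assumptions for illustration and do not reproduce the study's results.

```python
def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute AUC")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def select(indices, values):
    """Pick the entries of `values` at the given positions."""
    return [values[i] for i in indices]


def auc_by_regime(labels, scores, vix, threshold):
    """Report AUC overall and split by a VIX threshold (assumed regime rule)."""
    high = [i for i, v in enumerate(vix) if v >= threshold]
    low = [i for i, v in enumerate(vix) if v < threshold]
    return {
        "overall": roc_auc(labels, scores),
        "high_vol": roc_auc(select(high, labels), select(high, scores)),
        "low_vol": roc_auc(select(low, labels), select(low, scores)),
    }


# Toy data: label 1 = next-day up move, scores = model probabilities
labels = [1, 0, 1, 0, 1, 0, 1, 0]
scores = [0.6, 0.5, 0.3, 0.4, 0.9, 0.1, 0.8, 0.2]
vix = [12, 13, 14, 15, 35, 40, 38, 42]
result = auc_by_regime(labels, scores, vix, threshold=30.0)
```

Reporting the per-regime values next to the overall figure makes it visible when a model is useful only under stress.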

## Contributions
* Proposed regime-aware evaluation as a standard for financial machine learning.
* Demonstrated how global accuracy metrics can hide model utility in specific market conditions.