Stock Price Prediction ML

Accepted for publication in the Journal of Emerging Investigators

June 2024 – Present

Overview

This research investigates whether adding Twitter sentiment features improves LSTM-based stock price prediction. The study tested three major stocks, Apple (AAPL), Tesla (TSLA), and Microsoft (MSFT), comparing baseline technical-indicator models against sentiment-enhanced variants over a one-year period (September 2021 to September 2022).

Contrary to popular assumptions in the financial machine learning literature, the research found that sentiment features raised prediction error for all three stocks, although the increase was statistically significant only for Tesla. This provides empirical evidence against the naive integration of social media sentiment into price prediction models.

The hypothesis was tested by comparing a one-layer baseline LSTM trained on technical indicators against a three-layer sentiment-augmented LSTM that added daily Twitter sentiment metrics (mean polarity, polarity dispersion, and tweet count). Both models used early stopping and dropout, and were validated with five-fold time series cross-validation that preserves chronological ordering.

Averaged across the three stocks, the sentiment models' RMSE was 32.1% higher than the baseline's, although only Tesla showed statistically significant degradation (t = 6.50, p = 0.003). The sentiment models showed signs of overfitting (lower training loss but higher validation loss), and permutation importance analysis indicated that sentiment features contributed less than 5% of total predictive importance. These findings suggest that publicly available tweet-level sentiment data may carry too little information to improve predictions for heavily traded, large-capitalization technology companies, and may instead reduce model performance by adding noise.
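
The permutation importance analysis mentioned above can be illustrated with a small sketch: shuffle one feature column at a time and record how much the RMSE rises. This is not the study's code. The `predict` callable, the toy data, and the two feature roles are hypothetical, and a flat feature matrix is assumed rather than the windowed sequences an LSTM consumes.

```python
import math
import random

def rmse(y_true, y_pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean rise in RMSE when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    base = rmse(y, predict(X))
    rises = []
    for j in range(len(X[0])):
        total = 0.0
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the rows with only column j replaced by shuffled values.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            total += rmse(y, predict(X_perm)) - base
        rises.append(total / n_repeats)
    return rises

def importance_shares(rises):
    """Normalize non-negative RMSE rises so they sum to 1."""
    clipped = [max(r, 0.0) for r in rises]
    total = sum(clipped)
    return [c / total if total else 0.0 for c in clipped]

# Toy data (hypothetical): the target depends only on feature 0 ("technical");
# feature 1 ("sentiment") is pure noise that the toy model ignores.
data_rng = random.Random(1)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(200)]
y = [2.0 * row[0] for row in X]

def predict(rows):
    return [2.0 * r[0] for r in rows]

shares = importance_shares(permutation_importance(predict, X, y))
```

In this toy setup all of the importance lands on feature 0, because shuffling a column the model ignores cannot change its error. A share below 5% for the sentiment columns, as reported above, would be read the same way.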

Key Finding

Sentiment-enhanced models had about 32% higher RMSE than the baseline on average

Across all three stocks tested (80,793 tweets analyzed, Sep 2021 – Sep 2022), adding Twitter sentiment features to LSTM models consistently worsened prediction accuracy compared to technical-indicator-only baselines.

Authors

Leo Chang

Lead Researcher & Developer

Princeton Day School

Aditya Saraf

Co-Researcher

Cornell University

Jenjen Chen

Co-Researcher

Yardley, PA

Methodology

Data Collection

Gathered daily stock prices for AAPL, TSLA, and MSFT from September 2021 to September 2022 via Yahoo Finance, alongside 80,793 labeled tweets mentioning these ticker symbols from a publicly available Kaggle dataset.

Feature Engineering

Constructed 13 technical features (log returns, intraday high-low range, close-to-open change, 5/10/20-day SMAs, price-to-SMA ratios, 14-day RSI, volume moving average, volume ratio, rolling volatility) plus 3 sentiment metrics (mean polarity, polarity standard deviation, tweet count) for each trading day.
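
A few of the features listed above can be sketched in plain Python. This is a minimal illustration, not the study's pipeline. The function names are hypothetical, the RSI here uses simple averages of gains and losses (the source does not say whether Wilder smoothing was used), and the polarity dispersion is computed as a population standard deviation, which is also an assumption.

```python
import math
from collections import defaultdict
from statistics import mean, pstdev

def log_returns(close):
    """r_t = ln(C_t / C_{t-1}); the first day has no previous close."""
    return [None] + [math.log(c / p) for p, c in zip(close, close[1:])]

def sma(values, window):
    """Simple moving average; None until a full window is available."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window:i + 1]) / window)
    return out

def rsi(close, period=14):
    """RSI from simple averages of gains and losses over `period` daily changes."""
    changes = [c - p for p, c in zip(close, close[1:])]
    out = [None] * len(close)
    for i in range(period, len(close)):
        window = changes[i - period:i]  # the `period` changes ending on day i
        gain = sum(x for x in window if x > 0) / period
        loss = sum(-x for x in window if x < 0) / period
        out[i] = 100.0 if loss == 0 else 100.0 - 100.0 / (1.0 + gain / loss)
    return out

def daily_sentiment(tweets):
    """Map (date, polarity) pairs to {date: (mean polarity, std, tweet count)}."""
    by_day = defaultdict(list)
    for day, polarity in tweets:
        by_day[day].append(polarity)
    return {d: (mean(p), pstdev(p), len(p)) for d, p in by_day.items()}

# Illustrative prices and polarities, not data from the study.
close = [100.0, 102.0, 101.0, 105.0, 107.0, 106.0]
sma5 = sma(close, 5)
price_to_sma5 = [c / s if s is not None else None for c, s in zip(close, sma5)]
rsi3 = rsi(close, period=3)
sentiment = daily_sentiment([
    ("2021-09-30", 0.5), ("2021-09-30", -0.1), ("2021-10-01", 0.2),
])
```

Rows whose longest window is not yet full carry `None` and would normally be dropped before training.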

Model Architecture

Designed two LSTM architectures: a baseline model with a single 50-unit layer using dropout (0.2), and a sentiment-enhanced model with three stacked LSTM layers (128/64/32 units) plus batch normalization, L2 regularization, and dropout (0.2–0.3). Both used the Adam optimizer with early stopping.
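
The two architectures described above might look roughly like this in Keras. This is a sketch under stated assumptions, not the authors' implementation: the look-back window of 20 days, the L2 strength, the early-stopping patience, the exact placement of batch normalization and dropout, and the mean-squared-error loss are not given in the source and are chosen here only for illustration.

```python
from tensorflow import keras
from tensorflow.keras import layers, regularizers

WINDOW = 20  # assumed look-back length in trading days (not stated in the source)

def build_baseline(n_features=13):
    """Single 50-unit LSTM layer with dropout 0.2."""
    return keras.Sequential([
        keras.Input(shape=(WINDOW, n_features)),
        layers.LSTM(50),
        layers.Dropout(0.2),
        layers.Dense(1),
    ])

def build_sentiment(n_features=16, l2=1e-4):
    """Stacked 128/64/32 LSTM; 16 inputs = 13 technical + 3 sentiment features."""
    reg = regularizers.l2(l2)  # assumed L2 strength
    return keras.Sequential([
        keras.Input(shape=(WINDOW, n_features)),
        layers.LSTM(128, return_sequences=True, kernel_regularizer=reg),
        layers.BatchNormalization(),
        layers.Dropout(0.3),
        layers.LSTM(64, return_sequences=True, kernel_regularizer=reg),
        layers.BatchNormalization(),
        layers.Dropout(0.3),
        layers.LSTM(32, kernel_regularizer=reg),
        layers.Dropout(0.2),
        layers.Dense(1),
    ])

# Early stopping on validation loss; the patience value is an assumption.
early_stop = keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=10, restore_best_weights=True
)

baseline = build_baseline()
sentiment = build_sentiment()
for model in (baseline, sentiment):
    model.compile(optimizer="adam", loss="mse")
```

Training would then pass `callbacks=[early_stop]` to `model.fit` along with a validation split. Note that the two models differ in depth as well as in inputs, so a comparison between them reflects both changes.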

Validation

Applied five-fold time series cross-validation respecting temporal ordering. Statistical significance was assessed via paired t-tests comparing fold-level RMSE between the baseline and sentiment models.
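
The validation scheme above can be sketched as follows. The fold layout is an expanding-window split, which is one common way to respect temporal ordering; the source does not specify the exact scheme, so treat it as an assumption. The RMSE values in the example are made up for illustration and are not the study's results.

```python
import math
from statistics import mean, stdev

def time_series_splits(n_samples, n_splits=5):
    """Expanding-window folds: every test block comes after its training data."""
    test_size = n_samples // (n_splits + 1)
    first_test = n_samples - n_splits * test_size
    for start in range(first_test, n_samples, test_size):
        yield range(0, start), range(start, start + test_size)

def paired_t(baseline_rmse, sentiment_rmse):
    """t statistic for the mean per-fold RMSE difference (df = n_folds - 1)."""
    diffs = [s - b for b, s in zip(baseline_rmse, sentiment_rmse)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))

def pct_increase(baseline_rmse, sentiment_rmse):
    """Relative rise in mean RMSE, in percent."""
    return 100.0 * (mean(sentiment_rmse) / mean(baseline_rmse) - 1.0)

# Roughly one year of trading days, split into five chronological folds.
folds = list(time_series_splits(252))

# Hypothetical fold-level RMSEs, NOT the study's numbers.
baseline_rmse = [1.0, 1.2, 1.1, 1.3, 1.4]
sentiment_rmse = [1.3, 1.4, 1.5, 1.6, 1.6]
t_stat = paired_t(baseline_rmse, sentiment_rmse)
increase = pct_increase(baseline_rmse, sentiment_rmse)
```

A positive t statistic means the sentiment model's error is higher on average. With five folds the test has four degrees of freedom, and the two-sided p-value can be obtained from a t-distribution table or from `scipy.stats.ttest_rel`.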