Stock Market Forecasting

Tech Stack: Python, Pandas, NumPy, Matplotlib, Scikit-learn, yfinance, Statsmodels, TensorFlow/Keras

This project compares statistical time series models (ARIMA, SARIMA) with a deep learning approach (LSTM) for predicting the stock prices of three major tech companies: Google (GOOGL), Apple (AAPL), and Amazon (AMZN).

Problem Statement

Stock price prediction is among the hardest problems in financial analysis because markets are volatile and prices have complex temporal dependencies. This project tests whether a deep learning model (LSTM) captures sequential patterns better than traditional statistical approaches (ARIMA, SARIMA) for short-term price forecasting.

Overview

Goal: Compare forecasting performance across different model architectures.
Data: Daily stock price data for GOOGL, AAPL, and AMZN via yfinance API.
Models: ARIMA, SARIMA, LSTM Neural Networks.
Evaluation: RMSE, MAE across all three stocks.
Best RMSE: LSTM achieved 2.90 (AMZN), 3.37 (GOOGL), 4.38 (AAPL).
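
The two evaluation metrics listed above can be sketched as follows. This is a minimal illustration using the standard definitions, not the project's own evaluation code; the function names and the sample prices are hypothetical.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error: squaring penalizes large misses more heavily."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def mae(y_true, y_pred):
    """Mean absolute error: the average size of a miss."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))

# Illustrative prices only, not project data.
actual = [100.0, 102.0, 101.0, 105.0]
predicted = [101.0, 100.0, 101.0, 102.0]

print("RMSE:", rmse(actual, predicted))
print("MAE: ", mae(actual, predicted))
```

Both metrics are expressed in the same units as the price, so raw values are most meaningful when comparing models on the same stock rather than comparing one stock against another.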

Technical Approach

Data Engineering: Collected and preprocessed historical stock data with proper scaling and normalization for neural network training.
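
One common way to scale prices for neural network training is min-max scaling with the range learned from the training segment only. The sketch below assumes that choice; the write-up does not say which scaler was used, and the helper names and sample values are hypothetical.

```python
import numpy as np

def fit_minmax(train):
    """Learn the scaling range from the training segment only."""
    train = np.asarray(train, dtype=float)
    return float(train.min()), float(train.max())

def scale(values, lo, hi):
    """Map values so that the training range becomes [0, 1]."""
    return (np.asarray(values, dtype=float) - lo) / (hi - lo)

def unscale(scaled, lo, hi):
    """Invert the scaling so predictions can be reported as prices."""
    return np.asarray(scaled, dtype=float) * (hi - lo) + lo

# Illustrative prices only, not project data.
prices = np.array([10.0, 12.0, 11.0, 14.0, 15.0, 18.0])
train, test = prices[:4], prices[4:]

lo, hi = fit_minmax(train)            # statistics come from train alone
train_scaled = scale(train, lo, hi)
test_scaled = scale(test, lo, hi)     # may fall outside [0, 1]
```

Test values can exceed 1 when prices rise above the training range. That is expected: computing the range on the full series instead would leak information about future prices into training.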

Feature Engineering: Created lag features, rolling statistics (moving averages, volatility), and technical indicators to capture market dynamics.
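
A minimal pandas sketch of lag features and rolling statistics is shown here. The column names, lag choices, and window length are assumptions for illustration, not the project's actual feature set.

```python
import pandas as pd

def add_features(close: pd.Series, lags=(1, 2), window=3) -> pd.DataFrame:
    """Build lagged prices, a moving average, and rolling volatility."""
    df = pd.DataFrame({"close": close})

    # Lag features: the closing price k days earlier.
    for k in lags:
        df[f"lag_{k}"] = df["close"].shift(k)

    # Shift by one day first so each row only sees earlier days.
    prior_close = df["close"].shift(1)
    df[f"ma_{window}"] = prior_close.rolling(window).mean()

    # Volatility as the rolling standard deviation of daily returns.
    returns = df["close"] / df["close"].shift(1) - 1.0
    df[f"vol_{window}"] = returns.shift(1).rolling(window).std()

    # Rows without a full history contain NaN and are dropped.
    return df.dropna()

# Illustrative prices only, not project data.
close = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0, 15.0])
features = add_features(close)
print(features)
```

Shifting before rolling keeps the current day's close out of its own features, which matters when the close is also the prediction target.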

Statistical Modeling: Implemented ARIMA and SARIMA models using Statsmodels, with proper stationarity testing and parameter optimization.
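
To show what the "I" (differencing) and "AR" parts of ARIMA do, here is a hand-rolled ARIMA(1,1,0) without a constant, estimated by least squares. It is a simplified stand-in written for illustration, not the Statsmodels code used in the project, and the function names and sample series are hypothetical.

```python
import numpy as np

def fit_ar1_on_differences(prices):
    """Estimate phi in d_t = phi * d_(t-1) + e_t, where d_t = p_t - p_(t-1)."""
    d = np.diff(np.asarray(prices, dtype=float))  # first difference (d = 1)
    x, y = d[:-1], d[1:]                          # pair each change with the next
    return float(np.dot(x, y) / np.dot(x, x))     # least squares, no intercept

def forecast(prices, phi, steps):
    """Roll the AR(1) forward on changes, then add them back onto the price."""
    prices = np.asarray(prices, dtype=float)
    last_price = prices[-1]
    last_diff = prices[-1] - prices[-2]
    out = []
    for _ in range(steps):
        last_diff = phi * last_diff        # predicted next change
        last_price = last_price + last_diff  # integrate back to a price level
        out.append(last_price)
    return np.array(out)

# Illustrative series whose daily changes are 8, 4, 2, 1.
prices = [100.0, 108.0, 112.0, 114.0, 115.0]
phi = fit_ar1_on_differences(prices)
print("phi:", phi)
print("next two prices:", forecast(prices, phi, steps=2))
```

Differencing is what makes a trending price series closer to stationary; a stationarity test is typically used to decide whether it is needed and how many times to apply it.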

Deep Learning Architecture: Designed an LSTM neural network with multiple layers, dropout regularization, and a tuned sequence length for temporal pattern recognition.
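
The sequence length determines how many past days the LSTM sees for each prediction. The sketch below builds the sliding-window input that Keras recurrent layers expect, shaped (samples, timesteps, features). The helper name, the window size of 3, and the sample values are assumptions for illustration.

```python
import numpy as np

def make_windows(series, seq_len):
    """Turn a 1-D series into (X, y) pairs for next-step prediction."""
    series = np.asarray(series, dtype=float)
    n = len(series) - seq_len
    if n <= 0:
        raise ValueError("series must be longer than seq_len")
    # Each sample is seq_len consecutive values; the target is the value after it.
    X = np.stack([series[i:i + seq_len] for i in range(n)])
    y = series[seq_len:]
    # Add a trailing feature axis: (samples, timesteps, 1).
    return X[..., np.newaxis], y

# Illustrative scaled values only, not project data.
scaled = np.array([0.0, 0.1, 0.2, 0.3, 0.4, 0.5])
X, y = make_windows(scaled, seq_len=3)
print(X.shape, y.shape)
```

A longer window gives the model more context but yields fewer training samples from the same history, which is why the sequence length is usually treated as a tunable hyperparameter.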

Model Validation: Used time-based train/validation/test splits to prevent data leakage and ensure realistic evaluation.
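
A time-based split keeps the rows in chronological order and cuts them into contiguous blocks, so validation and test data always come after the training data. The sketch below assumes a 70/15/15 split of 100 rows purely for illustration; the write-up does not state the actual proportions.

```python
def chronological_split(n_rows, val_size, test_size):
    """Return slices for train, validation, and test in time order."""
    train_end = n_rows - val_size - test_size
    if train_end <= 0:
        raise ValueError("val_size + test_size must be smaller than n_rows")
    val_end = train_end + val_size
    return slice(0, train_end), slice(train_end, val_end), slice(val_end, n_rows)

# Row positions stand in for trading days, oldest first.
rows = list(range(100))
train_idx, val_idx, test_idx = chronological_split(len(rows), val_size=15, test_size=15)
train, val, test = rows[train_idx], rows[val_idx], rows[test_idx]

print(len(train), len(val), len(test))
```

Unlike a shuffled split, no training row is ever later in time than a validation or test row, which is what prevents the model from being evaluated on days it has effectively already seen.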

Key Achievements

Implemented and compared three forecasting approaches: ARIMA, SARIMA, and LSTM.
Showed that the LSTM achieved substantially lower RMSE than ARIMA, consistent with its ability to capture non-linear temporal dependencies.
Built an evaluation framework that reports multiple performance metrics (RMSE, MAE).
Created a reusable pipeline for multi-stock analysis and comparison.
Reduced RMSE by 83-91% relative to ARIMA on all three tech stocks.

Results & Analysis

The LSTM model achieved far lower RMSE than ARIMA on all three stocks:

AAPL: LSTM RMSE (4.38) vs ARIMA (26.54) - 83% improvement
AMZN: LSTM RMSE (2.90) vs ARIMA (33.42) - 91% improvement
GOOGL: LSTM RMSE (3.37) vs ARIMA (27.11) - 88% improvement

The results suggest that the LSTM captures complex temporal patterns in these price series better than ARIMA does: its RMSE was 83-91% lower on every stock. Performance varied during high-volatility periods, however, which highlights both the potential and the limitations of machine learning in financial forecasting.
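
The improvement percentages in the comparison above follow from the relative reduction in RMSE, (1 - LSTM / ARIMA) x 100. The short check below uses only the RMSE values reported in this section; the function and variable names are hypothetical.

```python
# RMSE values as reported in this section: (LSTM, ARIMA).
rmse_by_ticker = {
    "AAPL": (4.38, 26.54),
    "AMZN": (2.90, 33.42),
    "GOOGL": (3.37, 27.11),
}

def rmse_reduction_pct(model_rmse, baseline_rmse):
    """Percentage by which the model's RMSE is below the baseline's."""
    return (1.0 - model_rmse / baseline_rmse) * 100.0

reductions = {
    ticker: rmse_reduction_pct(lstm, arima)
    for ticker, (lstm, arima) in rmse_by_ticker.items()
}

for ticker, pct in reductions.items():
    print(f"{ticker}: {pct:.0f}% lower RMSE than ARIMA")
```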
