
A modular Python framework for researching and backtesting multi-factor equity strategies using classical factors (Value, Momentum, Size), Fama–MacBeth regressions, IC/IR analysis, and long–short portfolio evaluation.


z-boyi/Multi-Factor-Equity-Strategy-Backtesting


Multi-Factor Equity Strategy Backtesting

Introduction

This project implements a complete, modular research pipeline for studying multi-factor equity strategies using Python. The framework covers the full quantitative workflow used in academic finance and industry quant research: data ingestion, factor construction, predictive modeling, portfolio formation, and performance evaluation.

Three classical cross-sectional equity factors are examined—Value, Momentum, and Size—following well-established definitions from the academic literature. Using adjusted historical price data, the project builds monthly factor exposures, estimates factor premia through Fama–MacBeth regressions, evaluates predictive power with Information Coefficient (IC) and Information Ratio (IR), and constructs a long–short portfolio based on predicted returns.

Although the stock universe is intentionally small for demonstration, the structure is fully extensible and mirrors the methodology used in professional quantitative investment workflows. This project serves as a foundational template for developing scalable multi-factor models, testing alpha signals, and building systematic equity strategies.


Project Structure

Multi-Factor-Equity-Strategy-Backtesting/
│
├── src/
│   ├── data_download.py
│   ├── prepare_monthly_prices.py
│   ├── compute_monthly_returns.py
│   ├── compute_momentum_12_1.py
│   ├── compute_value_size_factors.py
│   ├── run_fama_macbeth_regression.py
│   ├── compute_factor_ic_ir.py
│   └── evaluate_performance.py
│
├── data/
│   ├── adj_close_prices.csv
│   ├── monthly_prices.csv
│   ├── monthly_returns.csv
│   ├── momentum_12_1.csv
│   ├── size_factor.csv
│   ├── value_factor.csv
│   ├── fama_macbeth_coefficients.csv
│   ├── fama_macbeth_tstats.csv
│   ├── factor_ic.csv
│   ├── factor_ir.csv
│   └── performance_summary.csv
│
├── plots/
│   └── performance_curve.png
│
└── README.md

Completed

Step 1: Download Adjusted Daily Price Data

  • Universe: AAPL, MSFT, GOOGL, AMZN
  • Data source: Yahoo Finance (yfinance)
  • Frequency: Daily
  • auto_adjust=True → adjusted prices stored in 'Close'
  • Saved as data/adj_close_prices.csv

Step 2: Convert Daily Prices to Monthly Frequency

  • Converted to month-end prices using: resample("M").last() (pandas 2.2 and later prefer the alias "ME" for month-end)
  • Saved as monthly_prices.csv

Step 3: Compute Monthly Returns

  • Computed using: pct_change
  • Saved as monthly_returns.csv
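
Steps 2 and 3 can be sketched on synthetic data as follows. This is a minimal illustration, not the repository's scripts: the tickers `AAA` and `BBB`, the prices, and the function names are hypothetical, and grouping by `to_period("M")` is used here in place of `resample("M").last()` so the sketch does not depend on which month-end alias the installed pandas accepts.

```python
import numpy as np
import pandas as pd


def to_monthly_prices(daily: pd.DataFrame) -> pd.DataFrame:
    """Keep the last available daily price within each calendar month."""
    return daily.groupby(daily.index.to_period("M")).last()


def to_monthly_returns(monthly: pd.DataFrame) -> pd.DataFrame:
    """Simple month-over-month returns; the first month has no prior price."""
    # fill_method=None avoids any implicit forward fill of missing prices.
    return monthly.pct_change(fill_method=None).dropna(how="all")


# Synthetic business-day prices for two made-up tickers (not real data).
dates = pd.bdate_range("2023-01-02", "2023-03-31")
daily = pd.DataFrame(
    {
        "AAA": np.linspace(100.0, 130.0, len(dates)),
        "BBB": np.linspace(50.0, 40.0, len(dates)),
    },
    index=dates,
)

monthly_prices = to_monthly_prices(daily)
monthly_returns = to_monthly_returns(monthly_prices)
print(monthly_returns)
```

Three months of prices yield two monthly returns per ticker, because the first month has nothing to compare against.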

Step 4: Compute 12–1 Momentum Factor

  • $\text{Momentum}_{t} = \frac{P_{t-1}}{P_{t-12}} - 1$
  • Saved as momentum_12_1.csv
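
The momentum definition above maps directly onto two shifts of the monthly price table. The sketch below assumes one row per month with no gaps; the ticker `AAA` and the 1% monthly growth path are made up for illustration.

```python
import pandas as pd


def momentum_12_1(monthly_prices: pd.DataFrame) -> pd.DataFrame:
    """P_{t-1} / P_{t-12} - 1, computed column by column."""
    return monthly_prices.shift(1) / monthly_prices.shift(12) - 1.0


# Synthetic price that grows by exactly 1% per month (not real data).
idx = pd.period_range("2022-01", periods=14, freq="M")
prices = pd.DataFrame({"AAA": [100.0 * 1.01**k for k in range(14)]}, index=idx)

mom = momentum_12_1(prices)
print(mom.dropna())
```

The first twelve rows are NaN because `shift(12)` has no price to look back to; the first valid value spans eleven months of growth, from the price at t-12 to the price at t-1.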

Step 5: Compute Value and Size Factors

  • $\text{Size} = \log(\text{MarketCap}) = \log(\text{Share Price} \times \text{Shares Outstanding})$
  • $\text{Value} = \frac{1}{PE} = \frac{1}{\left(\frac{\text{Price per share}}{\text{Earnings per share}}\right)} = \frac{\text{Earnings per share}}{\text{Price per share}}$
  • Pulled marketCap + trailingPE from Yahoo Finance
  • Saved size_factor.csv and value_factor.csv
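
A small sketch of the two transformations, applied to made-up fundamentals. The tickers, market caps, and P/E values below are hypothetical, and treating a non-positive P/E as missing is an assumption of this sketch, not something the project states.

```python
import numpy as np
import pandas as pd


def size_factor(market_cap: pd.Series) -> pd.Series:
    """Size = natural log of market capitalization."""
    return np.log(market_cap)


def value_factor(trailing_pe: pd.Series) -> pd.Series:
    """Value = earnings yield = 1 / PE."""
    # Assumption: a non-positive P/E (loss-making firm) is mapped to NaN.
    positive_pe = trailing_pe.where(trailing_pe > 0)
    return 1.0 / positive_pe


# Hypothetical fundamentals (not real data).
fundamentals = pd.DataFrame(
    {"marketCap": [1.0e12, 5.0e10, 2.0e9], "trailingPE": [20.0, 12.5, -4.0]},
    index=["AAA", "BBB", "CCC"],
)

size = size_factor(fundamentals["marketCap"])
value = value_factor(fundamentals["trailingPE"])
print(pd.DataFrame({"size": size, "value": value}))
```

A P/E of 20 corresponds to an earnings yield of 0.05, so a cheaper stock (lower P/E) receives a higher Value score.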

Step 6: Fama–MacBeth Cross-Sectional Regression

  • $R_{i,t+1} = \beta_{Value,t} \cdot Value_{i,t} + \beta_{Momentum,t} \cdot Momentum_{i,t} + \beta_{Size,t} \cdot Size_{i,t}$
  • Outputs:
    • fama_macbeth_coefficients.csv
    • fama_macbeth_tstats.csv
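
The two-pass Fama–MacBeth procedure can be sketched with plain NumPy: run one cross-sectional regression per month, then average the monthly coefficients and compute t-statistics from their time-series variation. This is an illustration on simulated data, not the project's `run_fama_macbeth_regression.py`; the array layout, the "true" premia, and the 50-stock universe are assumptions. It fits no intercept, matching the regression as written above.

```python
import numpy as np


def fama_macbeth(y: np.ndarray, X: np.ndarray):
    """
    y: (T, N) next-period returns R_{i,t+1}, aligned with exposures at t.
    X: (T, N, K) factor exposures at t.
    Returns per-period betas (T, K), their time-series mean, and t-stats.
    """
    T, _, K = X.shape
    betas = np.empty((T, K))
    for t in range(T):
        # Cross-sectional OLS for period t (no intercept).
        coef, _, _, _ = np.linalg.lstsq(X[t], y[t], rcond=None)
        betas[t] = coef
    mean_beta = betas.mean(axis=0)
    std_err = betas.std(axis=0, ddof=1) / np.sqrt(T)
    return betas, mean_beta, mean_beta / std_err


# Simulated panel: 60 months, 50 stocks, 3 factors (not real data).
rng = np.random.default_rng(0)
true_premia = np.array([0.02, -0.01, 0.005])  # hypothetical premia
X = rng.normal(size=(60, 50, 3))
y = X @ true_premia + rng.normal(scale=0.05, size=(60, 50))

betas, mean_beta, t_stats = fama_macbeth(y, X)
print("mean premia:", mean_beta)
print("t-stats:    ", t_stats)
```

With only four stocks and three factors, each monthly regression has a single residual degree of freedom, so the monthly coefficients are very noisy; the simulated panel uses a wider cross-section to make the mechanics visible.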

Step 7: Compute Factor IC and IR

  • $IC_t = \text{corr}(\text{FactorScore}_{i,t}, R_{i,t+1})$

  • $IR = \frac{\mathbb{E}[IC]}{\sigma(IC)}$

  • Save factor_ic.csv and factor_ir.csv

  • IR Results:

    Factor      IR
    Momentum   –0.07
    Size        0.03
    Value      –0.08
  • Interpretation:
    All IR values are close to zero, which is expected given the very small four-stock universe (megacap tech). The limited cross-sectional dispersion prevents meaningful factor signals from emerging.
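
The IC and IR definitions above can be sketched as a per-month cross-sectional correlation followed by a mean-over-standard-deviation ratio. The sketch uses a Pearson correlation and assumes no missing values; whether the project uses Pearson or rank (Spearman) correlation is not stated, so that choice is an assumption. The toy numbers and ticker labels are made up.

```python
import numpy as np
import pandas as pd


def information_coefficient(factor: pd.DataFrame, next_returns: pd.DataFrame) -> pd.Series:
    """Per-date Pearson correlation across stocks (rows = dates, columns = stocks)."""
    f = factor.sub(factor.mean(axis=1), axis=0)
    r = next_returns.sub(next_returns.mean(axis=1), axis=0)
    return (f * r).sum(axis=1) / np.sqrt((f**2).sum(axis=1) * (r**2).sum(axis=1))


def information_ratio(ic: pd.Series) -> float:
    """Mean IC divided by the sample standard deviation of IC."""
    return ic.mean() / ic.std(ddof=1)


# Toy panel: 3 dates x 4 stocks (not real data).
cols = ["A", "B", "C", "D"]
factor = pd.DataFrame([[1, 2, 3, 4]] * 3, columns=cols, dtype=float)
next_returns = pd.DataFrame(
    [
        [0.01, 0.02, 0.03, 0.04],  # perfectly aligned with the factor
        [0.04, 0.03, 0.02, 0.01],  # perfectly reversed
        [0.01, 0.02, 0.03, 0.04],  # perfectly aligned again
    ],
    columns=cols,
)

ic = information_coefficient(factor, next_returns)
ir = information_ratio(ic)
print(ic.tolist(), round(ir, 4))
```

With four stocks, each monthly IC is a correlation over only four points, which is why individual IC values swing widely and the resulting IR sits near zero.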

Step 8: Long–Short Portfolio Backtesting

  • Predicted returns:

    $\widehat{R}_{i,t+1} = \beta_{Value,t} \cdot Value_{i,t} + \beta_{Momentum,t} \cdot Momentum_{i,t} + \beta_{Size,t} \cdot Size_{i,t}$

  • Portfolio Rules:

    • Long the top 2 predicted-return stocks
    • Short the bottom 2 stocks
  • Saved as: long_short_returns.csv
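
The portfolio rule above can be sketched for a single month as follows. Equal weighting within each leg and defining the return as the long-leg mean minus the short-leg mean are assumptions of this sketch; the predicted and realized numbers are hypothetical and only the ticker names come from the universe in Step 1.

```python
import pandas as pd


def long_short_return(predicted: pd.Series, realized: pd.Series, n: int = 2) -> float:
    """Equal-weighted return of the top-n minus the bottom-n stocks by prediction."""
    ranked = predicted.sort_values(ascending=False)
    longs = ranked.index[:n]
    shorts = ranked.index[-n:]
    return float(realized.loc[longs].mean() - realized.loc[shorts].mean())


# Hypothetical one-month predictions and realized returns (not real data).
tickers = ["AAPL", "MSFT", "GOOGL", "AMZN"]
predicted = pd.Series([0.03, 0.01, -0.02, 0.02], index=tickers)
realized = pd.Series([0.05, -0.01, 0.00, 0.02], index=tickers)

ls = long_short_return(predicted, realized)
print(round(ls, 4))
```

Here the long leg is AAPL and AMZN and the short leg is MSFT and GOOGL. In a four-stock universe every stock is always in one of the two legs, so the strategy reduces to a ranking bet between two pairs.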

Step 9: Performance Evaluation & Visualization

  • Metrics:

    • Cumulative Return $\mathrm{Cumulative}_t = \prod_{s=1}^{t}(1 + R_{LS,s})$

    • Annualized Sharpe Ratio $\mathrm{Sharpe} = \frac{\sqrt{12}\,\mathbb{E}[R]}{\sigma(R)}$

    • Maximum Drawdown $\mathrm{MDD} = \min_t\left(\frac{\mathrm{Cumulative}_t}{\max_{s \leq t}\mathrm{Cumulative}_s} - 1\right)$

  • Saved:

    • performance_summary.csv
    • performance_curve.png
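
The three metrics above translate into a few lines of pandas. This is a sketch on a made-up return series, not the project's `evaluate_performance.py`; the function names are hypothetical, and the Sharpe ratio uses the sample standard deviation with no risk-free adjustment, as in the formula above.

```python
import numpy as np
import pandas as pd


def cumulative_curve(returns: pd.Series) -> pd.Series:
    """Growth of 1 unit: running product of (1 + R_LS)."""
    return (1.0 + returns).cumprod()


def annualized_sharpe(returns: pd.Series) -> float:
    """sqrt(12) * mean / std for monthly returns."""
    return float(np.sqrt(12.0) * returns.mean() / returns.std(ddof=1))


def max_drawdown(returns: pd.Series) -> float:
    """Worst decline of the cumulative curve from its running peak."""
    curve = cumulative_curve(returns)
    return float((curve / curve.cummax() - 1.0).min())


# Made-up monthly long-short returns (not real data).
r = pd.Series([0.10, -0.20, 0.05, 0.10])

curve = cumulative_curve(r)
sharpe = annualized_sharpe(r)
mdd = max_drawdown(r)
print(curve.round(4).tolist(), round(mdd, 4))
```

The curve peaks at 1.10 after the first month and falls to 0.88 after the second, so the maximum drawdown is 0.88 / 1.10 - 1 = -20%, even though the series ends above its starting value.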

📊 Results

The results of this project demonstrate the full workflow of a classical multi-factor equity research pipeline. Although the stock universe is intentionally small (AAPL, MSFT, GOOGL, AMZN), the methodology mirrors professional quantitative research practices, including factor construction, predictive regressions, IC/IR analysis, and long–short portfolio evaluation.

Factor Predictive Power (IC/IR)

  • Momentum IR: –0.07
  • Size IR: 0.03
  • Value IR: –0.08

These near-zero IR values reflect the limited cross-sectional dispersion in the four-stock universe rather than weaknesses in the factor definitions.

Long–Short Portfolio Performance

  • Cumulative Return: +84%
  • Annualized Sharpe Ratio: ~0.46
  • Maximum Drawdown: –26%

Cumulative Performance Plot

[Figure: cumulative performance of the long–short strategy (plots/performance_curve.png)]

The cumulative performance curve of the long–short factor strategy reveals several distinct phases:

  1. Sideways Movement (2016–2018)
    The cumulative curve (growth of 1.00) fluctuated between 1.00 and 1.12, indicating limited factor dispersion among the four megacap tech stocks.

  2. Drawdown (2018–2020)
    A decline to below 0.90 occurred. With only four stocks, idiosyncratic risk has an outsized impact.

  3. Recovery (2020–2023)
    Gradual improvement as predicted rankings aligned better with realized returns.

  4. Strong Upside (2024–2025)
    Cumulative return accelerated from ~1.30 to above 1.80.

Overall, these results illustrate the mechanics of factor modeling and backtesting rather than tradable performance. The framework can be extended to larger and more diverse universes for realistic alpha research.


Roadmap (Future Work)

  • Add Jupyter Notebook walkthrough
  • Expand universe to 100–500 stocks
  • Add more factors (Quality, Low Vol, Profitability)
  • Introduce transaction cost & turnover modeling
  • Apply portfolio optimization (e.g., risk parity, constrained optimization)

Required Packages

matplotlib numpy pandas statsmodels yfinance


Notes

  • With auto_adjust=True, adjusted prices appear under 'Close' instead of 'Adj Close'
  • The project is fully modular: every step is its own script

🎯 Conclusion

This project demonstrates the complete workflow of a classical multi-factor equity research pipeline:

data ingestion → factor construction → predictive regression → IC/IR evaluation → long–short portfolio → performance analysis.

Even though the stock universe is intentionally small, the methodology mirrors professional quant research practices used in hedge funds, asset managers, and academic finance. The framework is fully extensible to larger universes, more sophisticated factor models, and real-world portfolio construction techniques.

This project provides a solid foundation for further work in multi-factor modeling, portfolio management, and quantitative investment research.
