This project implements a complete, modular research pipeline for studying multi-factor equity strategies using Python. The framework covers the full quantitative workflow used in academic finance and industry quant research: data ingestion, factor construction, predictive modeling, portfolio formation, and performance evaluation.
Three classical cross-sectional equity factors are examined—Value, Momentum, and Size—following well-established definitions from the academic literature. Using adjusted historical price data, the project builds monthly factor exposures, estimates factor premia through Fama–MacBeth regressions, evaluates predictive power with Information Coefficient (IC) and Information Ratio (IR), and constructs a long–short portfolio based on predicted returns.
Although the stock universe is intentionally small for demonstration, the structure is fully extensible and mirrors the methodology used in professional quantitative investment workflows. This project serves as a foundational template for developing scalable multi-factor models, testing alpha signals, and building systematic equity strategies.
Multi-Factor-Equity-Strategy-Backtesting/
│
├── src/
│ └── data_download.py
│ └── prepare_monthly_prices.py
│ └── compute_monthly_returns.py
│ └── compute_momentum_12_1.py
│ └── compute_value_size_factors.py
│ └── run_fama_macbeth_regression.py
│ └── compute_factor_ic_ir.py
│ └── evaluate_performance.py
│
├── data/
│ └── adj_close_prices.csv
│ └── monthly_prices.csv
│ └── monthly_returns.csv
│ └── momentum_12_1.csv
│ └── size_factor.csv
│ └── value_factor.csv
│ └── fama_macbeth_coefficients.csv
│ └── fama_macbeth_tstats.csv
│ └── factor_ic.csv
│ └── factor_ir.csv
│ └── performance_summary.csv
│
├── plots/
│ └── performance_curve.png
│
└── README.md
- Universe: `AAPL`, `MSFT`, `GOOGL`, `AMZN`
- Data source: Yahoo Finance (`yfinance`)
- Frequency: Daily
- `auto_adjust=True` → adjusted prices stored in `'Close'`
- Saved as `data/adj_close_prices.csv`
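The download step could look roughly like the sketch below. It is not the code in `data_download.py`: the helper names, the `start`/`end` parameters, and the lazy import are assumptions made for illustration. The only behaviour taken from this README is that `auto_adjust=True` places adjusted prices under `'Close'`.

```python
import pandas as pd

TICKERS = ["AAPL", "MSFT", "GOOGL", "AMZN"]  # universe listed in this README


def extract_adjusted_close(raw):
    """Pull the 'Close' block out of a downloaded price table.

    With several tickers the columns are assumed to be a two-level index
    (field, ticker), so raw["Close"] yields one column per ticker.
    """
    close = raw["Close"]
    if isinstance(close, pd.Series):  # single-ticker case
        close = close.to_frame()
    # Drop days with no prices at all and keep dates in ascending order.
    return close.dropna(how="all").sort_index()


def download_adjusted_close(tickers, start, end=None):
    """Hypothetical wrapper around yfinance; requires network access."""
    import yfinance as yf  # imported lazily so the helper above works offline

    raw = yf.download(tickers, start=start, end=end,
                      auto_adjust=True, progress=False)
    return extract_adjusted_close(raw)
```

A call such as `download_adjusted_close(TICKERS, start=...).to_csv("data/adj_close_prices.csv")` would then write the file named above; the start date is left open here because the README does not state it.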
- Converted to month-end prices using: `resample("M").last()`
- Saved as `monthly_prices.csv`
- Computed using: `pct_change`
- Saved as `monthly_returns.csv`
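The two steps above (month-end resampling, then simple returns) can be sketched on a tiny made-up price series. The function names and the sample numbers are illustrative assumptions. The `try`/`except` is only there because recent pandas versions prefer the `"ME"` alias over `"M"`.

```python
import pandas as pd


def to_month_end(daily_prices):
    """Keep the last available daily price in each calendar month."""
    try:
        # "ME" is the month-end alias in newer pandas releases.
        return daily_prices.resample("ME").last()
    except ValueError:
        # Older pandas only knows the "M" alias quoted in this README.
        return daily_prices.resample("M").last()


def monthly_returns(monthly_prices):
    """Simple month-over-month returns; the first row has no prior price."""
    return monthly_prices.pct_change()


# Made-up prices: two observations per month, three months.
dates = pd.to_datetime([
    "2020-01-30", "2020-01-31",
    "2020-02-27", "2020-02-28",
    "2020-03-30", "2020-03-31",
])
daily = pd.DataFrame({"AAA": [100.0, 110.0, 115.0, 121.0, 130.0, 133.1]},
                     index=dates)

monthly = to_month_end(daily)       # 110.0, 121.0, 133.1
returns = monthly_returns(monthly)  # NaN, then about 10% in each month
```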
- $\text{Momentum}_{t} = \frac{P_{t-1}}{P_{t-12}} - 1$
- Saved as `momentum_12_1.csv`
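As a minimal sketch of the momentum formula, assuming month-end prices sit in a DataFrame with one column per ticker: the price one month back is divided by the price twelve months back. The function name and the sample prices are hypothetical.

```python
import pandas as pd


def momentum_12_1(monthly_prices):
    """Momentum_t = P_{t-1} / P_{t-12} - 1, computed column by column."""
    return monthly_prices.shift(1) / monthly_prices.shift(12) - 1.0


# Made-up prices rising by 1 each month: 100, 101, ..., 113.
index = pd.period_range("2020-01", periods=14, freq="M")
prices = pd.DataFrame({"AAA": [100.0 + i for i in range(14)]}, index=index)

mom = momentum_12_1(prices)
# The first 12 rows are NaN because P_{t-12} is not yet available.
# Row 13 (2021-01) uses P_{t-1} = 111 and P_{t-12} = 100.
```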
$\text{Size} = \log(\text{MarketCap}) = \log(\text{Share Price} \times \text{Shares Outstanding})$

$\text{Value} = \frac{1}{PE} = \frac{1}{\left(\frac{\text{Price per share}}{\text{Earnings per share}}\right)} = \frac{\text{Earnings per share}}{\text{Price per share}}$

- Pulled `marketCap` + `trailingPE` from Yahoo Finance
- Saved `size_factor.csv` and `value_factor.csv`
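The two definitions translate directly into code. The sketch below assumes the fundamentals have already been collected into a plain dictionary keyed by ticker, with the `marketCap` and `trailingPE` fields named above. The function name, the handling of missing values, and the sample numbers are assumptions, not the project's actual script.

```python
import math

import pandas as pd


def size_and_value(fundamentals):
    """Return a table with size = log(marketCap) and value = 1 / trailingPE."""
    rows = {}
    for ticker, info in fundamentals.items():
        cap = info.get("marketCap")
        pe = info.get("trailingPE")
        # Missing or non-positive inputs become NaN instead of raising.
        size = math.log(cap) if cap and cap > 0 else float("nan")
        value = 1.0 / pe if pe else float("nan")
        rows[ticker] = {"size": size, "value": value}
    return pd.DataFrame.from_dict(rows, orient="index")


# Made-up numbers for illustration only; not actual market data.
example = {
    "AAA": {"marketCap": 2.0e12, "trailingPE": 25.0},
    "BBB": {"marketCap": 5.0e11, "trailingPE": None},
}
factors = size_and_value(example)
```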
$R_{i,t+1} = \beta_{Value,t} \cdot Value_{i,t} + \beta_{Momentum,t} \cdot Momentum_{i,t} + \beta_{Size,t} \cdot Size_{i,t}$

- Outputs:
  - `fama_macbeth_coefficients.csv`
  - `fama_macbeth_tstats.csv`
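A Fama–MacBeth procedure runs one cross-sectional regression per month and then averages the coefficients over time. The sketch below uses plain NumPy least squares rather than `statsmodels`, omits an intercept to match the equation as written, and computes t-statistics from the plain time-series standard error of the monthly coefficients. The function name, the array layout, and the synthetic data are assumptions.

```python
import numpy as np


def fama_macbeth(exposures, next_returns):
    """Two-pass Fama-MacBeth estimate.

    exposures:    array of shape (T, N, K), factor values at month t
    next_returns: array of shape (T, N), returns realised in month t+1
    Returns (betas, mean_beta, t_stats) with betas of shape (T, K).
    """
    exposures = np.asarray(exposures, dtype=float)
    next_returns = np.asarray(next_returns, dtype=float)
    n_periods, _, n_factors = exposures.shape

    betas = np.empty((n_periods, n_factors))
    for t in range(n_periods):
        # Cross-sectional OLS without intercept for month t.
        coef, _, _, _ = np.linalg.lstsq(exposures[t], next_returns[t],
                                        rcond=None)
        betas[t] = coef

    mean_beta = betas.mean(axis=0)
    std_err = betas.std(axis=0, ddof=1) / np.sqrt(n_periods)
    return betas, mean_beta, mean_beta / std_err


# Synthetic check: returns are built exactly from known monthly betas.
rng = np.random.default_rng(0)
T, N, K = 24, 4, 3
X = rng.normal(size=(T, N, K))
true_betas = np.array([0.02, 0.01, -0.01]) + 0.005 * rng.normal(size=(T, K))
y = np.einsum("tnk,tk->tn", X, true_betas)

betas, mean_beta, t_stats = fama_macbeth(X, y)
```

Because the synthetic returns contain no noise, the recovered monthly coefficients should match `true_betas` up to floating-point error, which makes this a convenient sanity check before running on real data.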
- $IC_t = \text{corr}(\text{FactorScore}_{i,t}, R_{i,t+1})$
- $IR = \frac{\mathbb{E}[IC]}{\sigma(IC)}$
- Save `factor_ic.csv` and `factor_ir.csv`
- IR Results:

  | Factor | IR |
  |---|---|
  | Momentum | –0.07 |
  | Size | 0.03 |
  | Value | –0.08 |

- Interpretation: All IR values are close to zero, which is expected given the very small four-stock universe (megacap tech). The limited cross-sectional dispersion prevents meaningful factor signals from emerging.
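The IC and IR definitions can be illustrated with a small example. The sketch assumes factor scores and returns are stored as DataFrames with dates as rows and tickers as columns, and it uses the Pearson correlation; a rank (Spearman) correlation is a common alternative, and the README does not say which one the project uses. Function names and numbers are hypothetical.

```python
import pandas as pd


def factor_ic(factor_scores, returns):
    """IC_t: cross-sectional correlation of scores at t with returns at t+1."""
    next_returns = returns.shift(-1)  # align R_{i,t+1} with row t
    ic = factor_scores.corrwith(next_returns, axis=1)
    return ic.dropna()  # the last row has no next-month return


def information_ratio(ic):
    """IR = mean(IC) / std(IC), using the sample standard deviation."""
    return ic.mean() / ic.std()


tickers = ["A", "B", "C", "D"]
scores = pd.DataFrame([[1.0, 2.0, 3.0, 4.0]] * 4, columns=tickers)
rets = pd.DataFrame(
    [
        [0.00, 0.00, 0.00, 0.00],
        [0.01, 0.02, 0.03, 0.04],  # lines up with the scores
        [0.04, 0.03, 0.02, 0.01],  # reversed ordering
        [0.02, 0.04, 0.06, 0.08],  # lines up again
    ],
    columns=tickers,
)

ic = factor_ic(scores, rets)  # approximately 1, -1, 1
ir = information_ratio(ic)
```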
- Predicted returns: $\widehat{R}_{i,t+1} = \beta_{Value,t} \cdot Value_{i,t} + \beta_{Momentum,t} \cdot Momentum_{i,t} + \beta_{Size,t} \cdot Size_{i,t}$
- Portfolio Rules:
  - Long the top 2 predicted-return stocks
  - Short the bottom 2 stocks
- Saved as: `long_short_returns.csv`
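The portfolio rule can be sketched as follows. Equal weighting within each leg is an assumption, since the README does not state the weights, and the function name, table layout, and sample numbers are hypothetical.

```python
import pandas as pd


def long_short_returns(predicted, realized_next, n_side=2):
    """Long the n_side highest predicted returns, short the n_side lowest.

    predicted:     rows = dates, columns = tickers, predicted R_{i,t+1}
    realized_next: same layout, realised R_{i,t+1} aligned to date t
    Each leg is equally weighted (an assumption of this sketch).
    """
    results = {}
    for date, row in predicted.iterrows():
        row = row.dropna()
        if len(row) < 2 * n_side:
            continue  # not enough names to fill both legs
        ordered = row.sort_values()
        shorts = ordered.index[:n_side]
        longs = ordered.index[-n_side:]
        realised = realized_next.loc[date]
        results[date] = realised[longs].mean() - realised[shorts].mean()
    return pd.Series(results, name="long_short")


# Made-up predictions and outcomes for two months.
tickers = ["AAPL", "MSFT", "GOOGL", "AMZN"]
predicted = pd.DataFrame(
    [[0.03, 0.01, -0.02, 0.00], [-0.01, 0.02, 0.03, 0.00]],
    index=["2024-01", "2024-02"], columns=tickers,
)
realized = pd.DataFrame(
    [[0.05, 0.02, -0.01, 0.00], [0.01, -0.02, 0.04, 0.03]],
    index=["2024-01", "2024-02"], columns=tickers,
)

ls = long_short_returns(predicted, realized)
# 2024-01: long AAPL/MSFT (avg 0.035), short GOOGL/AMZN (avg -0.005) -> 0.04
# 2024-02: long GOOGL/MSFT (avg 0.01), short AAPL/AMZN (avg 0.02)   -> -0.01
```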
- Metrics:
  - Cumulative Return: $\mathrm{Cumulative}_t = \prod_{s=1}^{t}(1 + R_{LS,s})$
  - Annualized Sharpe Ratio: $\mathrm{Sharpe} = \frac{\sqrt{12}\,\mathbb{E}[R]}{\sigma(R)}$
  - Maximum Drawdown: $\mathrm{MDD} = \min_t\left(\frac{\mathrm{Cumulative}_t}{\max_{s \leq t}\mathrm{Cumulative}_s} - 1\right)$
- Saved: `performance_summary.csv`, `performance_curve.png`
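The three metrics follow directly from the formulas above. The sketch below assumes monthly long–short returns and no risk-free adjustment, as in the Sharpe formula shown; the function name and the three-month sample are made up for illustration.

```python
import numpy as np
import pandas as pd


def performance_summary(ls_returns, periods_per_year=12):
    """Cumulative curve, annualised Sharpe ratio, and maximum drawdown."""
    r = pd.Series(ls_returns, dtype=float).dropna()
    cumulative = (1.0 + r).cumprod()
    # Sample standard deviation (ddof=1), scaled by sqrt(12) for monthly data.
    sharpe = np.sqrt(periods_per_year) * r.mean() / r.std()
    # Drawdown relative to the running peak of the cumulative curve.
    drawdown = cumulative / cumulative.cummax() - 1.0
    return {
        "cumulative": cumulative,
        "sharpe": float(sharpe),
        "max_drawdown": float(drawdown.min()),
    }


summary = performance_summary([0.10, -0.20, 0.05])
# Cumulative path: 1.10 -> 0.88 -> 0.924; the deepest drawdown is -20%.
```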
The results of this project demonstrate the full workflow of a classical multi-factor equity research pipeline. Although the stock universe is intentionally small (AAPL, MSFT, GOOGL, AMZN), the methodology mirrors professional quantitative research practices, including factor construction, predictive regressions, IC/IR analysis, and long–short portfolio evaluation.
- Momentum IR: –0.07
- Size IR: 0.03
- Value IR: –0.08
These near-zero IR values reflect the limited cross-sectional dispersion in the four-stock universe rather than weaknesses in the factor definitions.
- Cumulative Return: +84%
- Annualized Sharpe Ratio: ~0.46
- Maximum Drawdown: –26%
The cumulative performance curve of the long–short factor strategy reveals several distinct phases:
- Sideways Movement (2016–2018): Cumulative returns fluctuated between 1.00 and 1.12, indicating limited factor dispersion among the four megacap tech stocks.
- Drawdown (2018–2020): A decline to below 0.90 occurred. With only four stocks, idiosyncratic risk has an outsized impact.
- Recovery (2020–2023): Gradual improvement as predicted rankings aligned better with realized returns.
- Strong Upside (2024–2025): Cumulative return accelerated from ~1.30 to above 1.80.
Overall, these results illustrate the mechanics of factor modeling and backtesting rather than tradable performance. The framework can be extended to larger and more diverse universes for realistic alpha research.
- Add Jupyter Notebook walkthrough
- Expand universe to 100–500 stocks
- Add more factors (Quality, Low Vol, Profitability)
- Introduce transaction cost & turnover modeling
- Apply portfolio optimization (e.g., risk parity, constrained optimization)
matplotlib
numpy
pandas
statsmodels
yfinance
- With `auto_adjust=True`, adjusted prices appear under `'Close'` instead of `'Adj Close'`
- The project is fully modular: every step is its own script
This project demonstrates the complete workflow of a classical multi-factor equity research pipeline:
data ingestion → factor construction → predictive regression → IC/IR evaluation → long–short portfolio → performance analysis.
Even though the stock universe is intentionally small, the methodology mirrors professional quant research practices used in hedge funds, asset managers, and academic finance. The framework is fully extensible to larger universes, more sophisticated factor models, and real-world portfolio construction techniques.
This project provides a solid foundation for further work in multi-factor modeling, portfolio management, and quantitative investment research.
