Analysis of IRL streams to identify patterns that differentiate high-performing streamers from low-performing ones.
- Python 3.8+
- ffmpeg (for audio/video processing)
python3 -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
pip install -r requirements.txt
Edit src/download.py and replace the placeholder URLs with your 4 selected streams, then run the pipeline:
python run_pipeline.py
This will automatically:
- Download stream segments (15 minutes each)
- Extract features from audio and video
- Generate statistical analysis and visualizations
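
The download step works with 15-minute segments. As a rough illustration of how a segment of that length could be cut from an already-downloaded file with ffmpeg, the sketch below builds the command line without running it. The function name `build_trim_command`, the input path `data/full_vod.mp4`, and the choice of stream copy (`-c copy`) are assumptions for illustration and are not taken from src/download.py.

```python
import shlex

SEGMENT_SECONDS = 15 * 60  # 15-minute segments, as described above


def build_trim_command(src, dst, start_seconds=0, duration=SEGMENT_SECONDS):
    """Return an ffmpeg argument list that copies `duration` seconds from `src`."""
    return [
        "ffmpeg",
        "-y",                       # overwrite the output file if it exists
        "-ss", str(start_seconds),  # seek to the segment start (input option)
        "-i", src,
        "-t", str(duration),        # keep this many seconds
        "-c", "copy",               # copy streams without re-encoding
        dst,
    ]


if __name__ == "__main__":
    # Hypothetical paths; substitute your own files.
    cmd = build_trim_command("data/full_vod.mp4", "data/high_1.mp4")
    print(shlex.join(cmd))
```

To execute the command rather than print it, the list can be passed to `subprocess.run(cmd, check=True)`. Stream copy is fast but cuts on keyframes, so the segment boundaries may be off by a few seconds.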
- results/features.csv - Extracted features
- results/comparison_boxplots.png - Visual comparison
- results/analysis_summary.txt - Statistical summary
- Vocal Energy Variance - Measures animation and reactivity in speech
- Dead Air Ratio - Percentage of time with no speech activity
- Visual Motion Intensity - Physical engagement via frame differencing
- Speaking Rate - Percentage of time actively talking
- Hook Effectiveness - Energy in first 60 seconds vs overall average
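
To make the feature definitions above concrete, here is a minimal sketch that computes each one from simple inputs: a list of per-frame audio energy values and a list of grayscale frames given as flat pixel lists. It is not the code in src/extract.py. The function names, the energy-threshold rule for detecting speech, and treating speaking rate as the complement of dead air are all assumptions made for illustration.

```python
from statistics import mean, pvariance


def vocal_energy_variance(energy):
    """Population variance of per-frame energy values."""
    return pvariance(energy)


def dead_air_ratio(energy, silence_threshold):
    """Fraction of frames whose energy falls below the silence threshold."""
    silent = sum(1 for e in energy if e < silence_threshold)
    return silent / len(energy)


def speaking_rate(energy, silence_threshold):
    """Fraction of frames at or above the threshold (assumed complement of dead air)."""
    return 1.0 - dead_air_ratio(energy, silence_threshold)


def hook_effectiveness(energy, frames_per_second, hook_seconds=60):
    """Mean energy in the first `hook_seconds` divided by the overall mean."""
    hook_frames = energy[: int(hook_seconds * frames_per_second)]
    return mean(hook_frames) / mean(energy)


def motion_intensity(frames):
    """Mean absolute pixel difference between consecutive frames (frame differencing)."""
    diffs = []
    for prev, curr in zip(frames, frames[1:]):
        diffs.append(mean(abs(a - b) for a, b in zip(prev, curr)))
    return mean(diffs)


if __name__ == "__main__":
    # Toy inputs, not real stream data.
    energy = [2.0, 2.0, 1.0, 1.0]
    print(dead_air_ratio(energy, silence_threshold=1.5))
    print(hook_effectiveness(energy, frames_per_second=1, hook_seconds=2))
    print(motion_intensity([[0, 0], [2, 4], [2, 4]]))
```

A hook effectiveness above 1 means the opening is more energetic than the stream as a whole; below 1 means the opening is quieter than average.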
- Sample representativeness: Each 15-minute segment is assumed to be representative of the full stream's quality
- Audio presence: Assumes streamers use microphones (not silent streams)
- Performance classification: Based on viewership metrics (concurrent viewers, total views)
- Niche consistency: All streams are IRL format
- Small sample size - with only 4 streams, findings are limited by the choice of samples
- No chat analysis - chat activity, a critical engagement metric for live streams, is not measured
- Manual classification - performance labels are subjective; selection criteria are documented for transparency
- Segment sampling - 15 minutes may not capture full stream dynamics
ffmpeg not found
Install via package manager: brew install ffmpeg (macOS) or apt-get install ffmpeg (Linux)
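
A quick way to confirm that ffmpeg is visible to Python before running the pipeline is to look it up on PATH. The helper name `find_ffmpeg` is an assumption for illustration and is not part of the project code.

```python
import shutil


def find_ffmpeg(executable="ffmpeg"):
    """Return the full path to the executable, or None if it is not on PATH."""
    return shutil.which(executable)


if __name__ == "__main__":
    path = find_ffmpeg()
    if path is None:
        print("ffmpeg not found on PATH; install it before running the pipeline")
    else:
        print(f"ffmpeg found at {path}")
```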
All features are NaN
Video files may be corrupted. Verify with: ffmpeg -i data/high_1.mp4
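
The same check can be scripted across several files. The sketch below asks ffprobe (installed alongside ffmpeg) for a file's duration and returns `None` when the file cannot be read. The function name `probe_duration` is hypothetical, and note that this sketch also returns `None` when ffprobe itself is missing.

```python
import shutil
import subprocess


def probe_duration(path):
    """Return the media duration in seconds, or None if it cannot be determined."""
    if shutil.which("ffprobe") is None:
        return None  # ffprobe is not installed or not on PATH
    cmd = [
        "ffprobe",
        "-v", "error",                         # only print errors
        "-show_entries", "format=duration",    # ask for the container duration
        "-of", "default=noprint_wrappers=1:nokey=1",  # print the bare value
        path,
    ]
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        return None  # missing, unreadable, or corrupted file
    try:
        return float(result.stdout.strip())
    except ValueError:
        return None  # ffprobe reported no usable duration (for example "N/A")


if __name__ == "__main__":
    duration = probe_duration("data/high_1.mp4")
    print("unreadable" if duration is None else f"{duration:.1f} seconds")
```

A duration far below 900 seconds for a file that should hold a 15-minute segment suggests a truncated download.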
No significant differences found
This is expected with a small sample size: with only 4 streams, statistical tests have little power to detect differences.
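
To see why, consider an exact permutation test on the difference in group means. The sketch below is an illustration only and is not necessarily the test used in src/analyze.py. The function name, the assumed split of 2 high-performing and 2 low-performing streams, and the feature values are all hypothetical.

```python
from itertools import combinations
from statistics import mean


def exact_permutation_p_value(group_a, group_b):
    """Two-sided exact permutation test on the absolute difference in means."""
    pooled = list(group_a) + list(group_b)
    observed = abs(mean(group_a) - mean(group_b))
    indices = range(len(pooled))
    extreme = 0
    total = 0
    # Enumerate every way of relabeling the pooled values into two groups.
    for chosen in combinations(indices, len(group_a)):
        chosen_set = set(chosen)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        # Small tolerance guards against floating-point ties.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


if __name__ == "__main__":
    # Hypothetical feature values with perfectly separated groups.
    high = [10.0, 11.0]
    low = [1.0, 2.0]
    print(exact_permutation_p_value(high, low))
```

With 2 streams per group there are only 6 possible relabelings, and 2 of them reproduce the observed gap, so the smallest achievable two-sided p-value is 1/3. Even a perfect separation between groups cannot reach the conventional 0.05 threshold at this sample size.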
stream-labs/
├── src/
│ ├── download.py # Stream downloader
│ ├── extract.py # Feature extraction
│ └── analyze.py # Statistical analysis
├── data/ # Downloaded VODs (gitignored)
├── results/ # Output files
├── requirements.txt
├── README.md
└── insight_summary.md