This project performs an in-depth exploratory data analysis (EDA) on a curated dataset of movies to uncover patterns related to runtime, genre, revenue, director impact, and IMDb ratings. Using Python-based data visualization and analysis, it aims to answer two questions: what makes a movie successful, and which factors influence a film’s critical and commercial performance? Insights from this analysis can help filmmakers, producers, and data enthusiasts understand key success factors in global cinema.
- Understand the distribution of movie runtimes, IMDb ratings, and global gross earnings.
- Identify top-performing directors by both quantity and average revenue.
- Explore how genre influences movie length and box office success.
- Analyze the relationship between IMDb scores and revenue.
- Visualize trends and extract actionable insights from the data.
- Descriptive Statistics — Summarized key measures such as mean, median, and spread for revenue, runtime, and ratings.
- Correlation Analysis — Measured relationships between IMDb ratings, runtime, and worldwide gross using correlation coefficients.
- Data Visualization (Box, Histogram, Scatter) — Explored distributions, outliers, and variable relationships visually.
- Grouping and Aggregation (Genre/Director Analysis) — Compared performance metrics by genre and director to identify top contributors.
- Poisson Modeling (Director Productivity) — Modeled the frequency of movie releases per director to understand productivity trends.
- Trend Analysis (Over Time) — Examined changes in runtime, ratings, and revenue across different years.
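As a rough illustration of the descriptive statistics and correlation steps listed above, here is a minimal sketch on a toy table. The column names (`imdb_rating`, `runtime_min`, `gross_usd_m`) and all values are assumptions made up for the example; the notebook's actual dataset and column names may differ.

```python
import pandas as pd

# Hypothetical toy data; not taken from the project's dataset.
movies = pd.DataFrame({
    "imdb_rating": [5.5, 6.5, 7.0, 8.0, 4.0],
    "runtime_min": [95, 100, 110, 150, 88],
    "gross_usd_m": [40.0, 120.0, 300.0, 800.0, 10.0],
})

# Descriptive statistics: central tendency and spread for each column.
summary = movies.agg(["mean", "median", "std"])

# Pairwise Pearson correlation between rating, runtime, and gross.
corr = movies.corr(method="pearson")

print(summary)
print(corr)
```

Pearson correlation only captures linear association, so for heavily skewed revenue figures it may be worth also checking `method="spearman"`.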
Shows a strong positive correlation between IMDb ratings (approval) and global revenue.
Highlights how different genres vary in movie duration.
IMDb ratings mostly fall between 5.5–7.5, with a central tendency near 6.5.
Most movies run between 85 and 115 minutes.
Directors like Christopher Nolan and James Cameron lead in per-film revenue, often exceeding 500M USD.
Most directors made 1–3 movies; very few directed more than 10.
Drama, Comedy, Action, and Thriller dominate genre frequency.
Fantasy, Action, and Adventure genres bring in the highest average global earnings.
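Genre and director comparisons like the ones above can be produced with a grouped aggregation. The sketch below uses placeholder directors, hypothetical column names, and invented numbers, so it shows the shape of the computation rather than the project's actual code or results.

```python
import pandas as pd

# Hypothetical toy data with placeholder directors; values are illustrative only.
movies = pd.DataFrame({
    "director": ["Director A", "Director A", "Director B",
                 "Director C", "Director C", "Director C"],
    "genre": ["Action", "Drama", "Action", "Comedy", "Comedy", "Drama"],
    "runtime_min": [120, 110, 130, 90, 95, 105],
    "gross_usd_m": [300.0, 50.0, 500.0, 40.0, 60.0, 30.0],
})

# Per-genre film count, average gross, and average runtime, ranked by gross.
by_genre = (
    movies.groupby("genre")
    .agg(
        n_movies=("gross_usd_m", "count"),
        avg_gross=("gross_usd_m", "mean"),
        avg_runtime=("runtime_min", "mean"),
    )
    .sort_values("avg_gross", ascending=False)
)

# Per-director film count and average gross, ranked by number of films.
by_director = (
    movies.groupby("director")
    .agg(n_movies=("gross_usd_m", "count"), avg_gross=("gross_usd_m", "mean"))
    .sort_values("n_movies", ascending=False)
)

print(by_genre)
print(by_director)
```

Ranking directors by count and by average gross separately keeps the two notions of "top-performing" apart, since a prolific director is not necessarily the highest earner per film.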
- Movie Length: Most movies last between 85–115 minutes, peaking around 100 minutes.
- Approval vs Success: Films rated 6–8 are most likely to gross above 1B USD; poorly rated ones (<4) usually underperform.
- Director Impact:
  - Steven Spielberg and Clint Eastwood lead in the number of movies made.
  - James Cameron and Christopher Nolan have the highest average gross (400M–600M USD per movie).
- Genre Patterns:
  - Fantasy, Adventure, and Action lead in revenue (up to 200M USD on average).
  - Comedy and Horror movies are shorter (~90 minutes); Adventure and Fantasy are longer (~130+ minutes).
- IMDb Ratings: Follow a bell-shaped curve with a peak around 6.5.
- Director Count Distribution: Fits a Poisson distribution — most directors have small filmographies.
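One way to check the Poisson claim for director filmographies is to estimate the rate from the data and compare observed frequencies with the fitted probabilities. The sketch below is an assumption about how such a check could look, not the notebook's actual method, and the counts are invented for illustration.

```python
import pandas as pd
from scipy import stats

# Hypothetical number of films per director; values are illustrative only.
films_per_director = pd.Series([1, 1, 1, 2, 2, 1, 3, 1, 2, 4, 1, 2], name="n_films")

# The maximum-likelihood estimate of the Poisson rate is the sample mean.
lam = films_per_director.mean()

# Observed number of directors at each filmography size.
observed = films_per_director.value_counts().sort_index()
ks = observed.index.to_numpy()

# Expected number of directors at each size under Poisson(lam).
expected = stats.poisson.pmf(ks, mu=lam) * len(films_per_director)

comparison = pd.DataFrame(
    {"observed": observed.to_numpy(), "expected": expected},
    index=pd.Index(ks, name="n_films"),
)
print(f"estimated rate: {lam:.2f}")
print(comparison)
```

Because every director in a movie dataset has at least one film, the counts never include zero; a plain Poisson fit is therefore only approximate, and a zero-truncated Poisson may describe the data better.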
- Python 3
- pandas
- numpy
- matplotlib / seaborn
- scipy / statsmodels
- Jupyter Notebook
Follow these steps to set up the project locally and run the analysis:
1. Clone the Repository:
git clone https://github.com/indu-explores-data/Movie-Performance-Analysis.git
2. Navigate to the Project Directory:
cd Movie-Performance-Analysis
3. Create and Activate a Virtual Environment (Recommended):
python -m venv venv
Windows:
venv\Scripts\activate
Mac/Linux:
source venv/bin/activate
4. Install Required Libraries:
pip install pandas numpy matplotlib seaborn scipy statsmodels jupyter
5. Launch Jupyter Notebook:
jupyter notebook
6. Open Movie Performance Analysis.ipynb and run all cells to reproduce the analysis.
- Open Movie Performance Analysis.ipynb in Jupyter Notebook.
- Run all cells sequentially to reproduce the visualizations and insights.
- Key insights can be found in the final sections, and visualizations are saved in the images/ folder.
Let’s connect on LinkedIn for project discussions or data-driven collaborations:
If you found this project helpful, please ⭐ star the repository and share your thoughts. Suggestions and contributions are always welcome!







