End-to-end Netflix data analysis project focusing on data cleaning, exploratory data analysis, and visualization using Python, Pandas, Seaborn and Matplotlib.

Nitisha707/Netflix-Data-Analysis-and-Visualization

🎬 Netflix Data Analysis & Visualization

📌 Project Overview

This project presents an end-to-end data analysis and visualization of Netflix movie data using Python. The aim is to extract meaningful business insights by answering key analytical questions and presenting findings through clear, well-structured visualizations.

The project follows a real-world Exploratory Data Analysis (EDA) workflow, making it a suitable portfolio piece for data analyst internships and entry-level roles, including roles at Netflix.

🔗 Project Notebook: 👉 Netflix Data Analysis.ipynb

🎯 Objectives of the Project

Perform structured data cleaning and preprocessing

Answer 5 core analytical questions

Create multiple visualizations per question

Identify trends related to popularity, ratings, genres, votes, and release years

Demonstrate data storytelling and analytical thinking

🛠️ Tools & Technologies Used

Python

Pandas – data manipulation & cleaning

Matplotlib & Seaborn – data visualization

Jupyter Notebook (PyCharm)

Git & GitHub

📂 Dataset Description

The dataset contains Netflix movie information with the following columns:

Title – Movie name

Release_Date – Year of release

Genre – Movie genre

Popularity – Popularity score

Vote_Count – Number of user votes

Vote_Average – Average user rating

Each movie may appear multiple times due to multiple genres, allowing genre-wise analysis.

🧹 Data Cleaning & Preparation

The following steps were performed before analysis:

Removed missing and inconsistent values

Standardized column formats

Ensured correct data types

Handled duplicate movie-genre records

Prepared data for visualization-ready analysis
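The cleaning steps above can be sketched in Pandas roughly as follows. This is a minimal illustration, not the notebook's actual code: the sample rows are made up, the `clean` helper is a hypothetical name, and it assumes Release_Date holds a year stored as text.

```python
import pandas as pd

# Made-up sample rows standing in for the real dataset (illustrative only).
raw = pd.DataFrame({
    "Title": ["Movie A", "Movie A", "Movie B", "Movie C", None],
    "Release_Date": ["2019", "2019", "2021", "not available", "2020"],
    "Genre": [" Action", " Action", "Drama ", "Comedy", "Drama"],
    "Popularity": [120.5, 120.5, 45.0, 30.2, 10.0],
    "Vote_Count": [1500, 1500, 300, 120, 50],
    "Vote_Average": [7.1, 7.1, 6.4, 5.9, 6.0],
})


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper mirroring the cleaning steps listed above."""
    out = df.copy()
    # Standardize column formats: trim stray whitespace in genre labels.
    out["Genre"] = out["Genre"].str.strip()
    # Ensure correct data types: unparseable years become NaN.
    out["Release_Date"] = pd.to_numeric(out["Release_Date"], errors="coerce")
    # Remove rows with missing or inconsistent key values.
    out = out.dropna(subset=["Title", "Release_Date"])
    out["Release_Date"] = out["Release_Date"].astype(int)
    # Handle duplicate movie-genre records.
    out = out.drop_duplicates(subset=["Title", "Genre"])
    return out.reset_index(drop=True)


cleaned = clean(raw)
print(cleaned)
```

Dropping duplicates on the (Title, Genre) pair rather than on Title alone keeps one row per movie per genre, which the genre-wise analysis relies on.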

📊 Analytical Questions & Insights

Question 1: What is the distribution of movie popularity on Netflix?

Identified highly popular movies

Visualized popularity distribution to understand content reach

📌 Visualizations used: Bar charts, distributions

📓 View full analysis in notebook
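A popularity distribution like the one described for Question 1 could be drawn along these lines. This is a sketch under assumptions: the data is a made-up sample, and the choice to keep one row per title before plotting (so multi-genre movies are not counted several times) is a suggestion, not necessarily what the notebook does.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up sample rows (illustrative only); Movie A appears once per genre.
df = pd.DataFrame({
    "Title": ["Movie A", "Movie A", "Movie B", "Movie C", "Movie D"],
    "Genre": ["Action", "Adventure", "Drama", "Comedy", "Drama"],
    "Popularity": [120.5, 120.5, 45.0, 30.2, 18.7],
})

# Keep one row per movie so each title contributes a single popularity value.
per_movie = df.drop_duplicates(subset="Title")

fig, ax = plt.subplots(figsize=(6, 4))
counts, bin_edges, _ = ax.hist(per_movie["Popularity"], bins=5)
ax.set_xlabel("Popularity")
ax.set_ylabel("Number of movies")
ax.set_title("Distribution of movie popularity")
fig.tight_layout()
```

In a notebook the figure renders inline, so the `matplotlib.use("Agg")` line can be dropped there.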



Movie Genre Distribution

Movies vs TV Shows Distribution

Question 2: Which genres are most common on Netflix?

Action, Adventure, and Drama appear most frequently

Shows Netflix’s focus on high-demand genres

📌 Visualizations used: Count plots, bar charts
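Because each movie appears once per genre, counting genre frequency reduces to counting rows per Genre value. A minimal sketch with made-up sample rows:

```python
import pandas as pd

# Made-up sample in the one-row-per-movie-genre layout described in the dataset section.
df = pd.DataFrame({
    "Title": ["Movie A", "Movie A", "Movie B", "Movie C", "Movie D", "Movie E"],
    "Genre": ["Action", "Adventure", "Action", "Drama", "Drama", "Action"],
})

# value_counts sorts from most to least frequent by default.
genre_counts = df["Genre"].value_counts()
print(genre_counts)
```

The resulting Series can be passed straight to a bar chart, for example with `genre_counts.plot(kind="bar")`.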

Vote average distribution

Question 3: Which genres receive the highest popularity and votes?

Action and Sci-Fi genres show strong popularity

Higher popularity correlates with higher vote counts

📌 Visualizations used: Bar plots, comparison charts
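One way to compare genres on popularity and votes is a grouped aggregation. The sketch below uses made-up numbers, and the choice of mean popularity and summed votes is an assumption; the notebook may aggregate differently.

```python
import pandas as pd

# Made-up sample rows (illustrative only).
df = pd.DataFrame({
    "Genre": ["Action", "Action", "Drama", "Drama", "Sci-Fi"],
    "Popularity": [100.0, 80.0, 40.0, 20.0, 95.0],
    "Vote_Count": [1000, 800, 300, 100, 900],
})

# Named aggregation: one output column per (input column, function) pair.
summary = (
    df.groupby("Genre")
    .agg(avg_popularity=("Popularity", "mean"), total_votes=("Vote_Count", "sum"))
    .sort_values("avg_popularity", ascending=False)
)
print(summary)
```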

Popularity

Question 4: How do ratings (Vote_Average) vary across genres and years?

Most movies fall in the 6–8 rating range

Ratings remain relatively stable over the years

📌 Visualizations used: Box plots, trend charts
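The two observations for Question 4 can be checked numerically before plotting. A sketch with made-up ratings, assuming Release_Date holds the release year:

```python
import pandas as pd

# Made-up sample rows (illustrative only).
df = pd.DataFrame({
    "Release_Date": [2018, 2018, 2019, 2019, 2020, 2020],
    "Vote_Average": [6.0, 7.0, 6.5, 8.5, 5.0, 7.0],
})

# Average rating per release year, ordered by year for a trend line.
yearly_rating = df.groupby("Release_Date")["Vote_Average"].mean().sort_index()

# Share of rows whose rating falls in the 6 to 8 range (both ends inclusive).
share_6_to_8 = df["Vote_Average"].between(6, 8).mean()

print(yearly_rating)
print(f"Share rated 6 to 8: {share_6_to_8:.0%}")
```

If the data still has one row per movie-genre pair, dropping duplicate titles first avoids weighting multi-genre movies more heavily in the yearly average.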

Low Popular

Question 5: What is the relationship between popularity, votes, and release year?

Movies with higher popularity tend to receive more votes

Recent years show increased content production

📌 Visualizations used: Scatter plots, year-wise analysis
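The relationship between popularity and votes can be quantified alongside the scatter plot. The sketch below uses made-up values and Pearson correlation, which is an assumption here; the notebook may rely on visual inspection or another measure.

```python
import pandas as pd

# Made-up sample rows, one per movie (illustrative only).
df = pd.DataFrame({
    "Title": ["Movie A", "Movie B", "Movie C", "Movie D", "Movie E"],
    "Release_Date": [2015, 2016, 2016, 2017, 2017],
    "Popularity": [10.0, 20.0, 30.0, 40.0, 50.0],
    "Vote_Count": [100, 180, 310, 390, 520],
})

# Pearson correlation coefficient between popularity and vote count.
r = df["Popularity"].corr(df["Vote_Count"])

# Number of distinct titles released per year.
movies_per_year = df.groupby("Release_Date")["Title"].nunique()

print(f"Pearson r = {r:.3f}")
print(movies_per_year)
```

Counting distinct titles with `nunique` keeps the year-wise totals correct even when a movie has several genre rows.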

Most Filmed

📌 Additional Insights & Visualizations

Beyond the core analytical questions, the following visualizations provide deeper insights into Netflix’s content strategy and audience engagement patterns.

🎥 Movies vs TV Shows on Netflix

Netflix’s catalog is dominated by movies, highlighting its strong focus on film-based content.

Distribution of Movies vs TV Shows on Netflix

📅 Content Growth Over the Years

Netflix shows a steady increase in content production, especially after 2015, reflecting platform expansion.

Netflix content release trend over the years

🔥 Genre-wise Popularity Analysis

Action, Adventure, and Sci-Fi genres attract higher popularity scores, indicating strong viewer interest.

Average popularity by genre on Netflix

📊 Popularity vs User Engagement

Movies with higher popularity tend to receive more votes, showing a positive correlation between reach and engagement.

Relationship between popularity and vote count

⭐ Rating Trends Over Time

Despite increased content volume, Netflix maintains consistent average ratings over the years.

Vote average trends across years

Most Popular Movies

📈 Key Insights Summary

• Action & Adventure genres dominate Netflix’s popular content
• Movies generate higher user engagement compared to TV Shows
• Higher popularity strongly correlates with increased vote counts
• Netflix has consistently expanded content production over the years
• Average ratings remain stable, reflecting maintained content quality
• Genre-wise analysis highlights clear audience preference patterns
• Data visualization enhances clarity and business decision-making

🚀 Why This Project Stands Out

✔ Industry-style EDA structure
✔ Clear analytical questions
✔ Multiple visualizations per insight
✔ Clean, readable, recruiter-friendly notebook
✔ Demonstrates real-world data analytics workflow

▶️ How to Run the Project

Clone the repository

git clone https://github.com/Nitisha707/Netflix-Data-Analysis-and-Visualization.git

Install required libraries

pip install pandas matplotlib seaborn

Open Netflix Data Analysis.ipynb in Jupyter or PyCharm

Run all cells sequentially

👩‍💻 Author

Nitisha Sharma

Aspiring Data Analyst | Python | Data Visualization
