A Python-based tool for analyzing movie metadata and user ratings from Kaggle datasets. Features a command-line interface (CLI) for filtering, statistics, and visualizations, plus a Streamlit web dashboard for interactive exploration. Built as part of a 6-month self-taught Python career transition plan.
- Data Loading: Imports movie metadata (titles, genres, vote averages) and user ratings from CSV into SQLite for persistent storage.
- Filtering: SQL-based filters for genres (multi-genre OR matching), rating ranges, or both.
- Statistics: Genre-based counts and average ratings using SQL aggregations (json_each for multi-genres).
- Ratings Query: Fetch user ratings for a specific movie via SQL joins, with average calculation.
- Visualizations: Bar plots for genre counts, saved as PNG (Matplotlib).
- Save/Load: Export/import filtered data to/from CSV/JSON.
- Web Dashboard: Streamlit UI for dynamic filtering, data display, stats computation, and plotting.
- Error Handling: Robust checks for file existence, invalid inputs, NaN values, and parsing errors.
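To make the Statistics and Ratings Query features more concrete, here is a minimal sketch of how json_each aggregation and a ratings join can look in SQLite. It is not the project's actual db.py: the table layout, column names, and sample rows are assumptions made for illustration, and genres are assumed to be stored as a JSON array of genre names. It also assumes the SQLite build bundled with Python includes the JSON functions, which is typical for recent versions.

```python
import json
import sqlite3

# Hypothetical schema and sample rows; the real project loads the Kaggle CSVs.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT,
                         genres TEXT, vote_average REAL);
    CREATE TABLE ratings (user_id INTEGER, movie_id INTEGER, rating REAL);
""")
sample_movies = [
    (1, "Alpha", ["Action", "Crime", "Drama"], 7.5),
    (2, "Bravo", ["Animation", "Comedy"], 8.0),
    (3, "Charlie", ["Drama", "Crime"], 8.5),
]
conn.executemany(
    "INSERT INTO movies VALUES (?, ?, ?, ?)",
    [(i, t, json.dumps(g), v) for i, t, g, v in sample_movies],
)
conn.executemany(
    "INSERT INTO ratings VALUES (?, ?, ?)",
    [(10, 1, 4.0), (11, 1, 5.0), (10, 2, 3.0)],
)

# Genre statistics: json_each expands each JSON array into one row per genre,
# so a multi-genre movie is counted once under each of its genres.
genre_stats = conn.execute("""
    SELECT g.value AS genre, COUNT(*) AS n, AVG(m.vote_average) AS avg_vote
    FROM movies AS m, json_each(m.genres) AS g
    GROUP BY g.value
    ORDER BY n DESC, genre
""").fetchall()


def ratings_for(title):
    """Return (rows, average) of user ratings for one movie title via a join."""
    rows = conn.execute("""
        SELECT r.user_id, r.rating
        FROM ratings AS r
        JOIN movies AS m ON m.id = r.movie_id
        WHERE m.title = ?
        ORDER BY r.user_id
    """, (title,)).fetchall()
    average = sum(rating for _, rating in rows) / len(rows) if rows else None
    return rows, average


for genre, n, avg_vote in genre_stats:
    print(f"{genre}: {n} movie(s), average vote {avg_vote:.2f}")
print(ratings_for("Alpha"))
```

The average is computed in Python here to keep the empty-result case explicit; doing it in SQL with AVG(r.rating) would work equally well.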
- Clone the repository:
    git clone https://github.com/HuntedCode/python-movie-analyzer.git
    cd python-movie-analyzer
- Install dependencies (Python 3.8+ required):
    pip install pandas matplotlib streamlit
  (SQLite is built-in; no additional install needed.)
- Download the datasets: Get movies_metadata.csv and ratings.csv from Kaggle's The Movies Dataset. Place them in the project folder.
Run the CLI for console-based interaction:
    python main.py
- Commands: view (display data), filter (genre/rating), stats (genre averages/counts), plot (genre bar chart), ratings (user ratings per movie), save/load (CSV/JSON), refresh (reload data), help, exit.
- Example: Filter for "Drama" movies rated 7-9, compute stats, plot genres.
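A command set like the one above is often wired up with a small dispatch table. The sketch below shows that general pattern only; it is not taken from cli.py, and the handler names, return strings, and the make_dispatcher helper are hypothetical.

```python
def make_dispatcher(handlers):
    """Build a function that routes one input line to a command handler."""
    def dispatch(line):
        parts = line.strip().split()
        if not parts:
            return "Type 'help' for a list of commands."
        name, args = parts[0].lower(), parts[1:]
        handler = handlers.get(name)
        if handler is None:
            return f"Unknown command: {name}. Type 'help' for a list of commands."
        return handler(args)
    return dispatch


# Hypothetical handlers; a real CLI would call into the database layer instead.
handlers = {
    "help": lambda args: "Commands: " + ", ".join(sorted(handlers)),
    "filter": lambda args: f"filtering with {args}",
}
dispatch = make_dispatcher(handlers)

print(dispatch("help"))
print(dispatch("filter Drama 7 9"))
print(dispatch("bogus"))
```

Keeping the command names in one dictionary means the help text and the unknown-command check stay in sync as commands are added.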
Run the Streamlit dashboard for browser-based use:
    streamlit run app.py
- Opens at http://localhost:8501.
- Sidebar: Input genres (space-separated), slider for min/max rating.
- Main page: Displays filtered data table, buttons for stats (dataframe) and plot (bar chart).
- Example: Enter "Comedy Action", set ratings 8-10, click "Compute Stats" and "Generate Plot".
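The dashboard example above (genres "Comedy Action", ratings 8-10) maps onto a multi-genre OR filter combined with a rating range. The following is a rough sketch of how such inputs could be turned into a parameterized SQLite query; build_filter, the movies table layout, and the sample rows are assumptions for illustration, not the project's actual code, and genres are assumed to be stored as a JSON array of names.

```python
import json
import sqlite3


def build_filter(genre_text="", min_rating=None, max_rating=None):
    """Return (sql, params) for a genre filter, a rating filter, or both."""
    clauses, params = [], []
    genres = genre_text.split()  # space-separated input, as in the sidebar
    if genres:
        marks = ", ".join("?" for _ in genres)
        # EXISTS keeps one row per movie even when several genres match (OR).
        clauses.append(
            "EXISTS (SELECT 1 FROM json_each(movies.genres) "
            f"WHERE value IN ({marks}))"
        )
        params += genres
    if min_rating is not None and max_rating is not None:
        clauses.append("vote_average BETWEEN ? AND ?")
        params += [min_rating, max_rating]
    where = " WHERE " + " AND ".join(clauses) if clauses else ""
    return "SELECT title FROM movies" + where + " ORDER BY title", params


# Hypothetical sample data standing in for the loaded Kaggle metadata.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (title TEXT, genres TEXT, vote_average REAL)")
conn.executemany(
    "INSERT INTO movies VALUES (?, ?, ?)",
    [
        ("Alpha", json.dumps(["Comedy"]), 8.4),
        ("Bravo", json.dumps(["Action", "Comedy"]), 9.1),
        ("Charlie", json.dumps(["Drama"]), 8.8),
        ("Delta", json.dumps(["Action"]), 6.2),
    ],
)


def run(genre_text="", min_rating=None, max_rating=None):
    sql, params = build_filter(genre_text, min_rating, max_rating)
    return [title for (title,) in conn.execute(sql, params)]


print(run("Comedy Action", 8, 10))
print(run("", 8, 10))
```

User input only ever reaches the query through `?` placeholders, which avoids SQL injection. One limitation of splitting on spaces is that a multi-word genre name would need different handling, such as a comma-separated input.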
Note: For educational use. Respects Kaggle terms—use datasets responsibly.
- main.py: CLI entry point.
- app.py: Streamlit dashboard entry point.
- cli.py: CLI commands and logic (filters, stats, plots).
- db.py: SQLite database management (loading, queries, joins, aggregations).
- movies_metadata.csv / ratings.csv: Kaggle input data (not in repo; download separately).
Feedback welcome! Fork the repo, make changes, and submit a pull request. Report issues on GitHub.
MIT License—free to use and modify.
Built by Jeffrey Lowe as part of a 6-month Python learning plan for remote coding jobs. Last updated: August 5, 2025.