A comprehensive machine learning-based recipe recommendation system that provides personalized recipe suggestions using collaborative filtering, content-based filtering, and hybrid approaches.
This project uses the Food.com Recipe and Interaction Dataset, which contains:
- 180,000+ recipes with ingredients, nutritional information, and user ratings
- 700,000+ user interactions and reviews
- Source: Food.com (originally Recipe1M dataset)
- Kaggle Dataset: Food.com Recipes and Interactions
- Original Paper: "Recipe1M+: A Dataset for Learning Cross-Modal Embeddings for Cooking Recipes and Food Images"
- Citation: If you use this dataset, please cite the original work
- RAW_recipes.csv: Recipe information including ingredients, tags, nutrition facts, and cooking time
- RAW_interactions.csv: User-recipe interactions with ratings and reviews
Note: Due to size constraints, the actual dataset files are not included in this repository. Please download them from the Kaggle link above and place them in the data/code/datasets/ directory.
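Once the files are in place, they load directly with pandas. A minimal sketch using a tiny hypothetical stand-in for `RAW_interactions.csv` (the column names follow the Kaggle dataset; the sample rows are invented):

```python
import io
import pandas as pd

# Miniature stand-in for RAW_interactions.csv (the real file has 700,000+ rows).
# Column names follow the Kaggle dataset: user_id, recipe_id, date, rating, review.
sample = io.StringIO(
    "user_id,recipe_id,date,rating,review\n"
    "38094,40893,2003-02-17,4,Great with a salad.\n"
    "1293707,40893,2011-12-21,5,Delicious!\n"
    "8937,44394,2002-12-01,3,Decent weeknight dinner.\n"
)

# In the real pipeline this would be:
# interactions = pd.read_csv("data/code/datasets/RAW_interactions.csv")
interactions = pd.read_csv(sample)
print(interactions["rating"].mean())  # 4.0
```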
- Hybrid Recommendation Engine: Combines collaborative filtering (SVD) and content-based filtering (TF-IDF)
- User Profile Management: Supports dietary preferences, cuisine choices, and cooking time constraints
- Interactive CLI Interface: Easy-to-use command-line interface for new and existing users
- Smart Categorization: Automatic recipe categorization with clustering for uncategorized recipes
- Rating System: Users can rate recipes to improve future recommendations
- Data Pipeline: Complete preprocessing and model training pipeline
recipe-recommendation-system/
├── data/
│ ├── Scripts/
│ │ ├── recommender_app.py # Main recommendation application
│ │ ├── preprocess_recipes_and_build_initial_models.py # Data preprocessing & model training
│ │ ├── preprocess_interactions.py # Interaction data preprocessing
│ │ └── retrain_models.py # Model retraining utilities
│ ├── code/
│ │ ├── datasets/ # Raw data files (user must download)
│ │ │ ├── RAW_recipes.csv # [DOWNLOAD REQUIRED]
│ │ │ ├── RAW_interactions.csv # [DOWNLOAD REQUIRED]
│ │ │ └── README.md
│ │ └── *.ipynb # Jupyter notebooks for analysis
│ └── processed/ # Generated processed data files
│ └── README.md
├── models/ # Generated trained ML models
│ └── README.md
├── reports/ # Generated visualization outputs
├── requirements.txt # Python dependencies
├── setup.py # Project setup script
├── QUICKSTART.md # Quick start guide
├── test_recommendations.py # Test suite
├── project_health_check.py # System validation
├── LICENSE # MIT License
└── README.md # This file
- recommender_app.py: Main CLI application for getting recommendations
- preprocess_recipes_and_build_initial_models.py: Complete data pipeline from raw data to trained models
- setup.py: Automated setup script for dependencies and NLTK data
- QUICKSTART.md: Step-by-step getting started guide
- Python 3.8 or higher
- At least 4GB RAM (8GB recommended for preprocessing)
- 2GB free disk space
- Internet connection for downloading dependencies and dataset
1. Clone the repository:

   ```bash
   git clone https://github.com/yourusername/recipe-recommendation-system.git
   cd recipe-recommendation-system
   ```

2. Create and activate a virtual environment (recommended):

   ```bash
   # Create virtual environment
   python -m venv venv
   # Activate (Windows)
   venv\Scripts\activate
   # Activate (macOS/Linux)
   source venv/bin/activate
   ```

3. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

4. Download NLTK data (required for text processing):

   ```bash
   python -c "import nltk; nltk.download('stopwords'); nltk.download('wordnet'); nltk.download('punkt')"
   ```

5. Download the Food.com dataset:
   - Visit Food.com Recipes and Interactions on Kaggle
   - Download the dataset (requires a free Kaggle account)
   - Extract the files

6. Place the data files in `data/code/datasets/`:

   ```
   data/code/datasets/
   ├── RAW_recipes.csv       (~500MB)
   └── RAW_interactions.csv  (~300MB)
   ```

7. Run preprocessing to prepare the data and build the initial models:

   ```bash
   python data/Scripts/preprocess_recipes_and_build_initial_models.py
   ```
This step will:
- Clean and process the recipe data
- Extract features for content-based filtering
- Filter and prepare interaction data for collaborative filtering
- Train initial SVD and TF-IDF models
- Create recipe clusters and categories
- Generate processed files in
data/processed/andmodels/
python data/Scripts/recommender_app.py

- New User: Create a profile with dietary preferences and get personalized recommendations
- Existing User: Get recommendations based on past ratings and preferences
- Rating System: Rate recommended recipes to improve future suggestions
- For New Users:
  - Choose the "New User" option
  - Create a profile with dietary preferences (vegetarian, vegan, etc.)
  - Specify preferred cuisines and favorite ingredients
  - Set a maximum cooking time
  - Receive personalized recommendations
  - Rate recipes to improve future suggestions
- For Existing Users:
  - Choose the "Existing User" option
  - Enter your User ID
  - Receive hybrid recommendations based on your history
  - Rate new recipes to update your profile
- Collaborative Filtering:
  - Algorithm: SVD (Singular Value Decomposition) using scikit-surprise
  - Purpose: Predict user ratings based on similar users' preferences
  - Features: User-item interaction matrix with ratings 1-5
- Content-Based Filtering:
  - Algorithm: TF-IDF vectorization with cosine similarity
  - Purpose: Find recipes similar to those a user has liked
  - Features: Recipe text (name, description, ingredients, tags)
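A minimal sketch of the content-based step with scikit-learn, using three hypothetical recipe "documents" (name, ingredients, and tags concatenated into one string each):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical recipe documents: name + ingredients + tags flattened to text.
docs = [
    "spicy chicken curry chicken onion garlic curry indian",
    "mild chicken soup chicken carrot celery comfort",
    "chocolate brownie chocolate butter sugar dessert",
]

vec = TfidfVectorizer(stop_words="english")
tfidf = vec.fit_transform(docs)
sim = cosine_similarity(tfidf)

# The two chicken dishes should score as more similar to each other
# than either is to the brownie.
print(sim[0, 1] > sim[0, 2])  # True
```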
- Clustering:
  - Algorithm: K-Means clustering with optimal k selection
  - Purpose: Group similar recipes for better categorization
  - Features: Ingredient vectors, tag vectors, and nutritional data
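One common way to pick an "optimal k" is the silhouette score; the sketch below uses synthetic stand-in feature vectors and may differ from the selection criterion the pipeline actually uses:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Synthetic stand-in for recipe feature vectors: three well-separated blobs in 4-D.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(center, 0.3, size=(30, 4)) for center in (0.0, 3.0, 6.0)])

# Try several k values and keep the one with the best silhouette score.
best_k, best_score = None, -1.0
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    score = silhouette_score(X, labels)
    if score > best_score:
        best_k, best_score = k, score

print(best_k)  # 3
```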
- Hybrid Approach:
  - Method: Weighted combination of CF and content-based scores
  - Weights: 70% collaborative filtering + 30% content-based (configurable)
  - Fallback: Content-based and popularity-based for new users
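The weighted blend amounts to a few lines; this sketch assumes both score vectors are already normalized to [0, 1], and the 70/30 split mirrors the default above:

```python
import numpy as np

def hybrid_scores(cf, content, cf_weight=0.7):
    """Blend collaborative-filtering and content-based scores.

    Both inputs are assumed pre-normalized to [0, 1]; cf_weight is configurable.
    """
    cf = np.asarray(cf, dtype=float)
    content = np.asarray(content, dtype=float)
    return cf_weight * cf + (1.0 - cf_weight) * content

# Hypothetical per-recipe scores for one user.
cf_scores = [0.9, 0.2, 0.6]
content_scores = [0.1, 0.8, 0.6]
blended = [round(s, 2) for s in hybrid_scores(cf_scores, content_scores)]
print(blended)  # [0.66, 0.38, 0.6]
```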
- Machine Learning: scikit-learn, scikit-surprise
- Data Processing: pandas, numpy
- Text Processing: NLTK, TF-IDF vectorization
- Similarity: Cosine similarity, RapidFuzz for fuzzy matching
- Database: SQLite for user profiles and ratings
- Visualization: matplotlib, seaborn
- Recipe Preprocessing:
  - Nutritional information extraction
  - Dietary restriction detection
  - Text feature preparation for content-based filtering
  - Recipe categorization and clustering
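As a sketch of the nutrition-extraction step: in RAW_recipes.csv the `nutrition` column is a stringified list, and the column order below follows the Kaggle dataset description (percent daily values, except calories). Treat the exact ordering as an assumption worth verifying against the dataset page:

```python
import ast
import pandas as pd

# Assumed field order per the Kaggle dataset description.
NUTRITION_COLS = [
    "calories", "total_fat", "sugar", "sodium",
    "protein", "saturated_fat", "carbohydrates",
]

# One-row stand-in for the recipes frame.
df = pd.DataFrame({"nutrition": ["[138.4, 10.0, 50.0, 3.0, 3.0, 19.0, 6.0]"]})

# Parse the stringified list and expand it into named columns.
parsed = df["nutrition"].apply(ast.literal_eval)
nutrition = pd.DataFrame(parsed.tolist(), columns=NUTRITION_COLS, index=df.index)
print(nutrition.loc[0, "calories"])  # 138.4
```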
- Interaction Preprocessing:
  - Data cleaning and deduplication
  - User/recipe activity filtering
  - Rating normalization
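The deduplication and activity-filtering steps can be sketched with pandas; the interactions frame and the minimum-activity threshold here are illustrative:

```python
import pandas as pd

# Hypothetical raw interactions, including one exact duplicate row.
interactions = pd.DataFrame({
    "user_id":   [1, 1, 1, 2, 3, 3],
    "recipe_id": [10, 10, 11, 10, 12, 13],
    "rating":    [5, 5, 3, 4, 2, 5],
})

# Drop duplicate (user, recipe) pairs, keeping the first rating.
clean = interactions.drop_duplicates(subset=["user_id", "recipe_id"])

# Keep only users with at least 2 ratings (threshold chosen for illustration).
active = clean.groupby("user_id")["recipe_id"].transform("count") >= 2
clean = clean[active]
print(sorted(clean["user_id"].unique()))  # [1, 3]
```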
- Model Training:
  - SVD model for collaborative filtering
  - TF-IDF vectorizer for content similarity
  - K-Means clustering for recipe grouping
Run the test suite to verify everything is working:
python test_recommendations.py
python project_health_check.py

- "Dataset files not found"
  - Ensure you've downloaded and placed RAW_recipes.csv and RAW_interactions.csv in data/code/datasets/
  - Check that file names match exactly (case-sensitive)
- "scikit-surprise import error"

  pip install scikit-surprise
- NLTK data missing

  python -c "import nltk; nltk.download('stopwords'); nltk.download('wordnet')"

- Memory issues during preprocessing
  - Close other applications
  - Use a machine with at least 8GB RAM
  - Consider processing a subset of the data first
- Long preprocessing time
  - This is normal: preprocessing the full dataset takes 15-30 minutes
  - You can monitor progress through the console output
- Use SSD storage for faster data loading
- Ensure sufficient RAM (8GB recommended)
- Close unnecessary applications during preprocessing
- Fork the repository
- Create a feature branch (git checkout -b feature/new-feature)
- Commit your changes (git commit -am 'Add new feature')
- Push to the branch (git push origin feature/new-feature)
- Create a Pull Request
- Web-based user interface
- Deep learning models for better recommendations
- Real-time recommendation updates
- Advanced user preference learning
- Recipe image analysis
- Social features (sharing, reviews)
This project is licensed under the MIT License - see the LICENSE file for details.
- Food.com and Kaggle for providing the comprehensive recipe and interaction dataset
- scikit-surprise library developers for excellent collaborative filtering tools
- scikit-learn community for robust machine learning algorithms
- NLTK team for natural language processing capabilities
- Original researchers of the Recipe1M+ dataset for inspiring this work
Please ensure compliance with the Food.com dataset license terms available on Kaggle. This project is for educational and research purposes.
This recommendation system is built for educational purposes. The dataset used belongs to Food.com and is distributed through Kaggle under their respective licenses.