Open-Food-Facts-Data-Analysis

We analyze the Open Food Facts dataset as part of the EPFL Applied Machine Learning course (Project 2). The purpose of this project is to exercise and demonstrate the skills acquired in the course. If you have any questions or suggestions about this analysis, do not hesitate to contact me.

The Data

The actual file is too large (>1 GB) to be added to this repository. However, it can easily be found and downloaded here: https://www.kaggle.com/openfoodfacts/world-food-facts/downloads/en.openfoodfacts.org.products.tsv/5
Open Food Facts is a non-profit association of volunteers. More than 5,000 contributors have added more than 600,000 products from 150 countries. The dataset is a free, open, collaborative database of food products from all around the world, with ingredients, allergens, nutrition facts and other information found on product labels. The dataset contains a single table, FoodFacts, in CSV (TSV) form in FoodFacts.csv and in SQLite form in database.sqlite. There are more than 300,000 rows and 163 columns. However, as we will see, there are a lot of missing or obviously incorrect values.
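To make the "missing values" remark concrete, here is a minimal sketch of how the share of empty cells per column could be measured in a tab-separated file. It uses only the Python standard library and a tiny made-up sample in place of the real download. The helper name `missing_share` and the sample rows are assumptions for illustration, not code or data taken from the notebook.

```python
import csv
import io


def missing_share(tsv_file):
    """Return {column: fraction of empty cells} for a tab-separated file object."""
    reader = csv.DictReader(tsv_file, delimiter="\t")
    empty = {name: 0 for name in reader.fieldnames}
    rows = 0
    for row in reader:
        rows += 1
        for name in reader.fieldnames:
            # Short rows yield None for absent fields, so treat None as empty too.
            if not (row[name] or "").strip():
                empty[name] += 1
    return {name: (count / rows if rows else 0.0) for name, count in empty.items()}


# Tiny made-up sample standing in for the real (>1 GB) file.
sample = io.StringIO(
    "product_name\tserving_size\tsugars_100g\n"
    "Orange juice\t250 ml\t9.0\n"
    "Oat biscuits\t30 g\t\n"
    "\t\t\n"
    "Plain yogurt\t\t4.5\n"
)
shares = missing_share(sample)
```

For the real file, the same function could be called on `open(path, newline="", encoding="utf-8")`, where `path` points to the downloaded TSV.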

Data Wrangling

We will take a closer look at our data and eliminate all columns that are either (mostly) empty or irrelevant for our investigation. We will also remove any duplicates in our data. We will subsequently try to bring the remaining data into a more useful format and clean up any invalid entries. We will use the "serving_size" column to try to extract information on whether the product is liquid or solid.
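One possible way to derive a liquid/solid flag from "serving_size" is to look at the unit that follows the first quantity in the text. The sketch below assumes that volume units (ml, cl, dl, l, fl oz) indicate a liquid and mass units (g, kg, mg, oz) indicate a solid. The function name `classify_serving` and this unit rule are assumptions for illustration; the notebook may use a different approach.

```python
import re

# A number followed by a recognised unit, e.g. "250 ml", "30g", "8 fl oz".
_UNIT_PATTERN = re.compile(
    r"\d+(?:[.,]\d+)?\s*(fl\.?\s*oz|ml|cl|dl|kg|mg|oz|g|l)\b",
    re.IGNORECASE,
)

_VOLUME_UNITS = {"ml", "cl", "dl", "l"}


def classify_serving(serving_size):
    """Return 'liquid', 'solid' or 'unknown' for a free-text serving size."""
    if not serving_size:
        return "unknown"
    match = _UNIT_PATTERN.search(serving_size)
    if match is None:
        return "unknown"
    unit = match.group(1).lower()
    # Any spelling of "fl oz" is a volume unit; plain "oz" is treated as mass.
    if unit.startswith("fl") or unit in _VOLUME_UNITS:
        return "liquid"
    return "solid"


examples = ["250 ml", "30 g (2 biscuits)", "1 cup (8 fl oz)", "2 large eggs", ""]
labels = [classify_serving(text) for text in examples]
```

Entries without a recognisable unit are labelled "unknown" rather than guessed, so they can be inspected or dropped separately.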

Exploration/Investigation

In the first half of our investigation, we will focus primarily on the following three areas:

Frequencies (how are our variables distributed and what are the most common additives and ingredients?)
Characteristics (what are the characteristics of the different categories and countries?)
Relationships (how are the variables related to each other?)
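As an illustration of the frequency question, the following sketch counts how many products mention each additive. It assumes the additives of a product are stored as a comma-separated string of tags in a single cell. That storage format, the helper name `most_common_tags`, and the sample values are assumptions, not taken from the notebook.

```python
from collections import Counter


def most_common_tags(tag_cells, top_n=3):
    """Count in how many products each tag occurs and return the top_n tags."""
    counts = Counter()
    for cell in tag_cells:
        if not cell:
            continue  # skip missing or empty cells
        # A set ensures a tag is counted at most once per product.
        tags = {tag.strip().lower() for tag in cell.split(",") if tag.strip()}
        counts.update(tags)
    return counts.most_common(top_n)


# Made-up cells, one per product.
cells = ["en:e330,en:e322", "en:e330", "", None, "en:e322, en:e330", "en:e100"]
top = most_common_tags(cells)
```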

We will subsequently investigate our data with correlations and regression analysis.

We will finish this project by summarizing our findings.
