com-480-data-visualization/Medalytics

Student's name | SCIPER
Hugo Jeannin   | 329220
Albert Fares   | 341018
Daniel Polka   | 326800
COM-480 - Data Visualization

École Polytechnique Fédérale de Lausanne (EPFL)
Authors: Albert Fares, Hugo Jeannin, Daniel Polka

📌 Project Milestones


Milestone 3 – Final Delivery

🖥️ ACCESS THE WEBSITE

(If the main domain is unavailable, use the backup: medalytics.netlify.app)

🧭 Note: For the best experience, we recommend viewing the website on Safari.


📁 Project Artifacts

  • 🔧 Codebase:
    The full source code for the website is located in the code/ folder of the repository.

  • 📄 Documentation:
    The README.md file inside the code/ folder contains detailed information on how the website works, including setup instructions and explanations of its features.

  • 📘 Process Book:
    You can find the process book in the root directory under the filename ProcessBook.pdf. It outlines our design decisions and implementation strategy.

  • 🎥 Showcase Video:
    A video walkthrough of the website is available here:
    🔗 Watch the video


Milestone 2 – Development Progress

You can find the full project goal description in the Medalytics_M2.pdf file located in the root directory of this repository.

The current prototype of the Medalytics website is available at:
🌐 www.medalytics.ch (or www.medalytics.netlify.app if the first link doesn't work).


Milestone 1 – Project Proposal

Medalytics, a data-driven exploration of the Olympic Games


Datasets

For this project, we will use several datasets from a GitHub repository owned by tech YouTuber and MIT graduate Keith Galli. The data originally comes from Olympedia.org and covers the Summer and Winter Olympic Games from 1896 to 2022.

The repository contains both the raw data and the data cleaned by Galli, so we will mostly use the already-processed data. The cleaning process is detailed in a Jupyter notebook provided in the repository and consists mostly of parsing dates, removing bullet points from name fields, and similar steps. The cleaned data is fairly well organized but still has a few issues. For instance, the NOC column is named noc in one of the cleaned .csv files and NOC in the others, and certain country names are not consistent across files (for example, USA and United States are used interchangeably).
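The two inconsistencies mentioned above could be handled with a small pandas helper. This is only a minimal sketch on toy data, not the code we actually ran: the `harmonize` function, the `COUNTRY_ALIASES` table, and the `country` and `medal` column names are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical alias table: the real one would be built by inspecting the files.
COUNTRY_ALIASES = {"USA": "United States"}


def harmonize(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with a single 'NOC' column name and canonical country names."""
    # rename() returns a new DataFrame and ignores labels that are absent.
    out = df.rename(columns={"noc": "NOC"})
    if "country" in out.columns:
        out["country"] = out["country"].replace(COUNTRY_ALIASES)
    return out


# Toy frames standing in for two of the cleaned .csv files (assumed columns).
results = pd.DataFrame({"noc": ["USA", "FRA"], "medal": ["Gold", None]})
countries = pd.DataFrame({"NOC": ["USA", "FRA"], "country": ["USA", "France"]})

# Once both frames use the same key name, they can be joined on it.
merged = harmonize(results).merge(harmonize(countries), on="NOC", how="left")
print(merged)
```

Normalizing every file through the same helper before merging keeps the join key and the country labels uniform, whichever file a row came from.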

The file bios_locs.csv contains biographical information on each athlete such as their name, country of origin, date of birth, NOC (National Olympic Committee), height, weight, and so on. Details of each event athletes competed in and their results in that event can be found in results.csv.

The repository also contains additional .csv files with information on each country's population and its corresponding NOC.


Problem Statement

Overview

The Olympic Games stand as the ultimate stage for international sports competition, uniting athletes from across the world to showcase their skills and determination. Since the first modern Games in 1896, the Olympics have undergone profound transformations—not only in terms of athletic performance but also in their social, political, and cultural impact.

Among other things, we would like to show the number of athletes and medals per country, ranked by the medals/num_athletes ratio, and the evolution of the proportion of men to women from the first modern Games in 1896 up to the 2022 Games (the most recent in the dataset). We would also like to build a few interactive maps: one showing which countries hosted the Games and in which year, and another showing each country's statistics (number of medals and athletes).

We chose this dataset for two simple reasons: Lausanne is the Olympic capital, and the most recent Summer Games were held next door to Switzerland, in Paris.

Target Audience

The Olympics captivate a diverse audience, from sports enthusiasts and historians to analysts and policymakers. Our project is designed to engage casual fans, who may be drawn to the visualization for fun facts, medal counts, and record-breaking moments. It could also interest sports historians and researchers looking to analyze long-term trends in performance, participation, and shifting global dominance.

Key Questions & Objectives

Through our data visualization, we seek to explore fundamental questions such as:

  • Who are the best athletes by Games, country, age, and so on?
  • What trends emerge in gender participation, new sports, and shifting national dominance?
  • How do different nations compare in their Olympic journeys?
  • Does population size or economic strength play a significant role in medal counts?

Exploratory Data Analysis

To help visualize the data at this stage, we plotted three graphs using pandas and Matplotlib (you can see the code used to generate them in the analyze_data.ipynb notebook).

The first graph shows how the proportion of men to women evolved over the years (blue line); the bars show the absolute numbers of male and female athletes.

Evolution of the proportion of men to women
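The numbers behind a graph like this one can be derived with a single group-by. The sketch below is illustrative only and is not the code from analyze_data.ipynb: the `year` and `sex` column names and the toy rows are assumptions. It computes the share of women among all athletes rather than a men/women ratio, because that ratio is undefined for a Games with no female athletes.

```python
import pandas as pd

# Toy data: one row per athlete per Games. 'year' and 'sex' are assumed names.
athletes = pd.DataFrame({
    "year": [1896, 1896, 1900, 1900, 1900, 2020, 2020],
    "sex": ["M", "M", "M", "M", "F", "M", "F"],
})


def gender_counts(df: pd.DataFrame) -> pd.DataFrame:
    """Count men and women per Games and add the share of women."""
    counts = df.groupby(["year", "sex"]).size().unstack(fill_value=0)
    # Make sure both columns exist even if one sex is missing from the data.
    counts = counts.reindex(columns=["M", "F"], fill_value=0)
    counts["women_share"] = counts["F"] / (counts["M"] + counts["F"])
    return counts


by_year = gender_counts(athletes)
print(by_year)
```

The `M` and `F` columns correspond to the bars and `women_share` to the line; both can then be passed to Matplotlib on a shared x-axis.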

The second and third graphs show the cumulative number of athletes and medals over all Games, per country and per birthplace of US athletes respectively. Each graph shows the top 20 countries or cities, ranked by the medals/num_athletes ratio.

Number of athletes and medals per country

Number of athletes and medals per birthplace
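The medals/num_athletes ranking used in the last two graphs can be sketched as follows. This is a hedged illustration on toy data, not our notebook code: the `medals_per_athlete` helper and the `athlete_id`, `NOC`, and `medal` column names are assumptions. It also counts medal rows, so under this assumption a team medal is counted once per team member.

```python
import pandas as pd

# Toy results: one row per athlete-event entry (assumed column names).
results = pd.DataFrame({
    "athlete_id": [1, 1, 2, 3, 4, 5, 5],
    "NOC": ["USA", "USA", "USA", "FRA", "FRA", "SUI", "SUI"],
    "medal": ["Gold", None, None, "Silver", None, "Gold", "Bronze"],
})


def medals_per_athlete(df: pd.DataFrame, group_col: str = "NOC", top: int = 20) -> pd.DataFrame:
    """Rank groups by medals won per distinct athlete and keep the top entries."""
    summary = df.groupby(group_col).agg(
        num_athletes=("athlete_id", "nunique"),  # distinct athletes in the group
        medals=("medal", "count"),               # count() skips missing values
    )
    summary["ratio"] = summary["medals"] / summary["num_athletes"]
    return summary.sort_values("ratio", ascending=False).head(top)


ranking = medals_per_athlete(results)
print(ranking)
```

Passing a different `group_col` (for example a birthplace column, if one is available) gives the per-city version of the same ranking.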


Related Work

Keith Galli published a video explaining how he cleaned the datasets we are using, but we are not aware of any further use of the data.

We want to design a dynamic experience that includes a clickable world map for host countries and medal-winning nations, inspired by implementations we have seen in previous COM-480 projects.

Unlike traditional medal tables or charts, our project will provide an interactive experience where users will be able to explore Olympic history dynamically.

Additionally, we will create sortable rankings for athletes based on gender, age, and sport, allowing users to uncover patterns in Olympic history while simultaneously personalizing their exploration. To further enhance the experience, we plan to implement a bar chart race animation for the most decorated athletes, drawing inspiration from similar visualizations often seen on social media, particularly in music billboard rankings.

We also aim to highlight emerging trends in Olympic sports by exploring the introduction of new disciplines and identifying the sports most frequently contested throughout the Games. This could be visualized as a timeline showing when each sport was introduced.
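The data for such a timeline could be prepared roughly as below. This is a sketch under stated assumptions rather than a planned implementation: the `year` and `discipline` column names and the toy rows are hypothetical, and "most frequently contested" is read here as the number of distinct Games in which a discipline appears.

```python
import pandas as pd

# Toy data: one row per (Games year, discipline) entry (assumed column names).
events = pd.DataFrame({
    "year": [1896, 1900, 1896, 1924, 2020, 2020],
    "discipline": ["Athletics", "Athletics", "Fencing",
                   "Bobsleigh", "Skateboarding", "Athletics"],
})


def first_appearance(df: pd.DataFrame) -> pd.DataFrame:
    """Return each discipline with the first year it appears, oldest first."""
    first = df.groupby("discipline")["year"].min().rename("first_year")
    return first.sort_values().reset_index()


timeline = first_appearance(events)

# Number of distinct Games in which each discipline appears.
most_frequent = (
    events.groupby("discipline")["year"].nunique().sort_values(ascending=False)
)
print(timeline)
print(most_frequent)
```

Each row of `timeline` would become one marker on the timeline, while `most_frequent` could feed a simple ranking of long-running sports.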
