This repository contains the code we wrote during the Data4Good Challenge 2020, organized by EMERGENT Leuven on November 4. Out of 32 teams, we won first place overall.
The event gathers 150 students, together with experts in data science, machine learning and business strategy, around the common goal of solving a complex social problem. Teams work together to extract insights from a number of datasets and use them to formulate a strategy for addressing the issue. Pitches are evaluated by corporate jury members.
You can find out more here.
We were challenged to come up with a solution to eliminate racial bias in a recidivism algorithm used by the US Justice System:
"The US justice system makes use of algorithms that assign risk scores to offenders of various
crimes. These scores are then used by judges to determine the type and length of the sentence. One
such algorithm used by the states of New York, Wisconsin, California, Florida’s Broward County,
and other jurisdictions is the Compas algorithm developed by Equivant (formerly known as
NorthPointe Inc.).
In recent years, more and more data about these algorithms became publicly available and now it is
the perfect time to properly study these algorithms. Your team’s job is to study whether or not
these algorithms cause bias in the US Justice System and how this affects various
stakeholders."
Our main idea was not to improve the algorithm itself, but to focus on the bias present in the training data. We used additional resources to show that the data was indeed biased and proposed ways to reduce this bias, so that the output would be fairer.
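As a rough illustration of what checking training data for bias can look like, the sketch below compares how often a positive label occurs in each group. It is not the code from this repository: the function names, the `group` and `label` fields, and the tiny synthetic sample are all assumptions made up for the example.

```python
from collections import defaultdict


def positive_rate_by_group(records, group_key="group", label_key="label"):
    """Return the share of positive labels per group (hypothetical helper)."""
    counts = defaultdict(lambda: [0, 0])  # group -> [positives, total]
    for row in records:
        counts[row[group_key]][0] += int(row[label_key])
        counts[row[group_key]][1] += 1
    return {group: pos / total for group, (pos, total) in counts.items()}


def rate_gap(rates):
    """Difference between the highest and lowest group rate."""
    return max(rates.values()) - min(rates.values())


# Synthetic toy records, invented for illustration only (not real data).
toy_records = [
    {"group": "A", "label": 1},
    {"group": "A", "label": 1},
    {"group": "A", "label": 0},
    {"group": "A", "label": 0},
    {"group": "B", "label": 1},
    {"group": "B", "label": 0},
    {"group": "B", "label": 0},
    {"group": "B", "label": 0},
]

rates = positive_rate_by_group(toy_records)
gap = rate_gap(rates)
print(rates, gap)
```

A large gap between groups does not prove bias on its own, since the groups may differ for other reasons, but it is a cheap first signal that the labels deserve a closer look before any model is trained on them.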
Raphaël Widdershoven, Maarten Wens & Stijn Verpoest.