- Introduction
- Project Overview
- Installation
- Project Structure
- Results
- Acknowledgements
- Contact
This project aims to examine the gender representation among current state leaders. It does so by constructing a Resource Description Framework (RDF) graph, which enables visualisations such as the distribution of leader characteristics.
The RDF graph can be found in the data folder. The results and insights of this project highlight inequalities in representation, such as the dominance of male leaders. Additionally, this project can serve as a baseline for further knowledge graph construction and Wikipedia information scraping.
The data stems from the World Leaders Wikipedia page, which yields an analysis of 196 world leaders as listed on that page. The following research questions were asked:
- What is the gender distribution of current state leaders globally?
- Are there differences in Wikipedia visibility between male and female leaders (e.g., page length, references, cross-language presence)?
The following steps were taken to construct the RDF graph:
- Collection of data from the World Leaders page.
- Collection of data from each respective world leader page.
- Cleaning and normalisation of data.
- Extraction of subjects, predicates and objects.
- Construction of the knowledge graph.
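The cleaning, extraction and construction steps above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the project's actual notebook code: the `http://example.org/` namespace, the field names and the example record are all placeholders invented for this sketch. The output is written as N-Triples, which is also valid Turtle.

```python
import re
from urllib.parse import quote

# Hypothetical namespace; the project's real URIs may differ.
BASE = "http://example.org/"


def clean(value: str) -> str:
    """Remove bracketed footnote markers such as [1] and collapse whitespace."""
    value = re.sub(r"\[[^\]]*\]", "", value)
    return re.sub(r"\s+", " ", value).strip()


def to_uri(label: str) -> str:
    """Turn a cleaned label into a URI under the placeholder namespace."""
    return f"<{BASE}{quote(label.replace(' ', '_'))}>"


def record_to_triples(record: dict[str, str]) -> list[tuple[str, str, str]]:
    """Extract (subject, predicate, object) triples from one scraped record."""
    subject = to_uri(clean(record["name"]))
    triples = []
    for field, raw in record.items():
        if field == "name":
            continue  # the name is the subject, not a property
        value = clean(raw)
        if value:  # skip fields that are empty after cleaning
            triples.append((subject, to_uri(field), to_uri(value)))
    return triples


def to_ntriples(triples: list[tuple[str, str, str]]) -> str:
    """Serialise triples one per line, each terminated by ' .'."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


# Placeholder record, not real data.
record = {"name": "Example  Leader[1]", "gender": " female ", "country": "Exampleland[a]"}
triples = record_to_triples(record)
print(to_ntriples(triples))
```

In practice a dedicated RDF library would usually handle namespaces, literals and serialisation; the hand-written serialiser here only keeps the sketch dependency-free.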
Clone the repository:

    git clone https://github.com/11907223/lesroosters

Navigate to the directory.

Install the required dependencies:

    conda env create -f environment.yml

Install uv:

Windows:

    irm https://astral.sh/uv/install.ps1 | iex

macOS:

    curl -LsSf https://astral.sh/uv/install.sh | sh

Linux:

    curl -LsSf https://astral.sh/uv/install.sh | sh

In VS Code, take the following steps:
- Run Task
- Setup Python env (uv)
- Continue without scanning the task output
- Wait
- A folder named .venv will appear, which indicates a successful installation.
- notebooks: Contains all Jupyter notebooks used to create the RDF.
- data: Contains the raw data in CSV files and the relational data in an RDF Turtle file.
- output: Contains image files of the results and an HTML file of the knowledge graph.
- environment.yml: File specifying the installation of the Python environment through Conda.
- uv.lock: File specifying the installation of the Python environment through uv.
There are far more male than female world leaders, as can be seen in Figure 1.

Figure 1. Gender distribution of world leaders
Within those Wikipedia pages, further differences are visible: male world leaders have longer Wikipedia pages than female world leaders, and have a lower representation across different languages as well. Possibly as a natural consequence of their shorter pages, female world leaders also have fewer references on their pages (see Figure 2).

Figure 2. Visibility of world leaders by gender
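A comparison like the one in Figure 2 could be computed by averaging each visibility metric per gender. The sketch below is an assumption about how that might look, not the project's code: the field names `page_length`, `references` and `languages` are hypothetical, and the rows are made-up placeholder values chosen to be neutral, not results from the data.

```python
from collections import defaultdict
from statistics import mean

# Made-up placeholder rows; NOT the project's data.
rows = [
    {"gender": "female", "page_length": 1000, "references": 10, "languages": 20},
    {"gender": "female", "page_length": 3000, "references": 30, "languages": 40},
    {"gender": "male", "page_length": 2000, "references": 20, "languages": 30},
]

# Hypothetical metric names for the visibility measures.
METRICS = ("page_length", "references", "languages")


def mean_by_gender(rows, metrics=METRICS):
    """Group rows by gender and return the mean of each metric per group."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["gender"]].append(row)
    return {
        gender: {m: mean(r[m] for r in members) for m in metrics}
        for gender, members in grouped.items()
    }


summary = mean_by_gender(rows)
for gender, stats in summary.items():
    print(gender, stats)
```

With only a handful of female leaders in a sample, group means can be sensitive to single long pages, so medians may be worth reporting alongside them.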
This project was written by Elean Huang, Yajing Hazel Wang and Yiran Lilly Zi, and is part of the Knowledge Extraction group project of the course Big Data and Automated Content Analysis II of the University of Amsterdam.
Inquiries about this project can be emailed to elean.huang@student.uva.nl