Wikipedia scraper

A BeCode learning project that queries an API and scrapes Wikipedia.

Mission Objectives

In this project, we will guide you step by step through the process of:

Creating a self-contained development environment (virtual environment)
Retrieving some information from an API
Leveraging your knowledge to scrape a website that does not provide an API
Saving the output for later processing

More specifically, in this project we will query an API to obtain a list of countries and their past political leaders. We then extract and sanitize their short bio from Wikipedia. Finally, we save the data.
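As an illustration of the sanitizing step, here is a minimal sketch that cleans a paragraph taken from a Wikipedia article. The function name `sanitize_bio` and the exact patterns are assumptions made for this example; the notebook may clean the text differently.

```python
import re


def sanitize_bio(paragraph: str) -> str:
    """Return a cleaned-up copy of a Wikipedia paragraph (hypothetical helper)."""
    # Drop bracketed footnote markers without spaces, such as [1], [12] or [a].
    text = re.sub(r"\[[0-9a-zA-Z]+\]", "", paragraph)
    # Replace non-breaking spaces, which are common in scraped HTML text.
    text = text.replace("\xa0", " ")
    # Collapse runs of whitespace (including newlines) into a single space.
    text = re.sub(r"\s+", " ", text)
    # Remove a space left behind in front of punctuation.
    text = re.sub(r"\s+([,.;:])", r"\1", text)
    return text.strip()


if __name__ == "__main__":
    raw = "Jane Doe[1] was  president\xa0of Examplia [a] .\n"
    print(sanitize_bio(raw))
```

The sample sentence is made up. In the project, the input would be the first paragraph extracted from a leader's Wikipedia page.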

Scraping is often the first coding step of a data science project (the data collection step), so you will likely come back to it in the future.

Installation

Clone the repository:

```
git clone https://github.com/CharlyHo/wikipedia-scraper.git
cd wikipedia-scraper
```

Make sure you have Python installed. Then install all required libraries by running:

```
pip install -r requirements.txt
```

Must-have features (MVP)

A working wikipedia_scraper.ipynb notebook that calls the API and creates a JSON file
A custom exception class used for proper exception handling
A clear README that explains the project
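The first two features can be sketched together as follows. This is only one possible approach: the exception name `LeaderScraperError`, the function `save_leaders`, the default file name `leaders.json`, and the shape of the data (a dictionary mapping each country to a list of leaders) are all assumptions made for this example.

```python
import json
from pathlib import Path


class LeaderScraperError(Exception):
    """Raised when leader data cannot be saved (hypothetical name)."""


def save_leaders(leaders_per_country: dict, path: str = "leaders.json") -> Path:
    """Write the scraped data to a JSON file and return the file's path."""
    target = Path(path)
    try:
        with target.open("w", encoding="utf-8") as handle:
            # ensure_ascii=False keeps accented names readable in the file.
            json.dump(leaders_per_country, handle, ensure_ascii=False, indent=2)
    except (OSError, TypeError, ValueError) as exc:
        # Re-raise as the project's own exception and keep the original cause.
        raise LeaderScraperError(f"Could not save data to {target}") from exc
    return target


if __name__ == "__main__":
    # Made-up sample data, only to show the call.
    sample = {"be": [{"name": "Jane Doe", "bio": "Jane Doe was a politician."}]}
    print(save_leaders(sample))
```

Wrapping low-level errors in a single project-specific exception lets the notebook catch one exception type, while `from exc` keeps the original error available for debugging.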
