Web-Scraping

In this project I will try to show how to scrape information from a website so that it can later be used to train a machine learning model for educational purposes.

I will collect company text reviews along with their ratings from indeed.com and save them to a local disk. This data can be used to train a natural language processing model that predicts a rating from the review text.
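As a rough illustration of the "save to local disk" step, here is a minimal sketch that writes review/rating pairs to a CSV file and reads them back. The function names `save_reviews` and `load_reviews`, the column names, and the file name are assumptions made for this example, and the sample rows are made-up placeholders, not real scraped data.

```python
import csv
from pathlib import Path


def save_reviews(reviews, path):
    """Write (review_text, rating) pairs to a CSV file with a header row."""
    path = Path(path)
    # newline="" lets the csv module handle line endings and quoted newlines.
    with path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["review_text", "rating"])
        for text, rating in reviews:
            writer.writerow([text, rating])
    return path


def load_reviews(path):
    """Read the CSV back as a list of (review_text, rating) pairs."""
    with Path(path).open(newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        return [(row["review_text"], float(row["rating"])) for row in reader]


if __name__ == "__main__":
    # Placeholder rows standing in for scraped reviews.
    sample = [("Friendly team, good benefits", 5.0), ("Long hours", 2.0)]
    save_reviews(sample, "reviews.csv")
    print(load_reviews("reviews.csv"))
```

Keeping the text and the rating in the same row makes the file easy to load later as labelled training data.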

The Selenium package for Python will be used to scrape the data.

Scraping data from a website puts additional load on the server. To limit this load, I will introduce a wait time between requests to the server and capture only a limited volume of data, just enough to introduce the idea of web scraping.
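The two safeguards described above (a pause between requests and a cap on how much is collected) could be sketched roughly as follows. The helper name `scrape_politely`, its parameters, and the default values are assumptions for illustration, not the project's actual code; `fetch` stands for whatever function loads one page.

```python
import time
from itertools import islice


def scrape_politely(urls, fetch, delay_seconds=2.0, max_pages=5, sleep=time.sleep):
    """Fetch at most `max_pages` URLs, pausing `delay_seconds` between requests."""
    results = []
    for index, url in enumerate(islice(urls, max_pages)):
        if index > 0:
            # Wait before every request except the first one.
            sleep(delay_seconds)
        results.append(fetch(url))
    return results


if __name__ == "__main__":
    # Stand-in fetch function so the sketch runs without a browser.
    def fake_fetch(url):
        return f"<html>content of {url}</html>"

    pages = scrape_politely(
        [f"https://example.com/page/{n}" for n in range(10)],
        fake_fetch,
        delay_seconds=0.1,
        max_pages=3,
    )
    print(len(pages))
```

With Selenium, `fetch` could be a small function that calls `driver.get(url)` and returns `driver.page_source`. The `sleep` parameter is only there so the delay can be replaced in tests.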