When you are thinking of moving within Toronto, scrolling through countless pages of listings is tiring, especially with new properties constantly being added to the market. This Python scraper goes through the 600 most recent postings in Toronto, making local real-estate listings much easier to browse.
This Python script uses Selenium and undetected-chromedriver to scrape property data from Realtor.ca. It extracts key information such as address, price, bedrooms, and bathrooms, and exports all the data to an Excel spreadsheet.
Before running the script, make sure the following are installed:
- Python (ideally 3.9 or newer)
- Openpyxl
- Selenium / Selenium-Wire
- Undetected-Chromedriver
- Pandas
You can install the required packages using the following:
pip install openpyxl selenium selenium-wire undetected-chromedriver pandas
- Clone the repository:
git clone https://github.com/rayhant2/Realtor.ca-Web-Scraper.git
- Run the script:
python app.py
The script will open a Chrome browser, scrape the property data, and export it to "properties.xlsx".
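To give a rough idea of the export step, here is a minimal sketch of how scraped strings could be cleaned and written to a spreadsheet. It is not taken from app.py: the helper names (`parse_int`, `clean_listing`, `export_rows`), the column names, and the raw text formats such as "$1,250,000" are all assumptions made for illustration.

```python
import re
from typing import Optional


def parse_int(text: str) -> Optional[int]:
    """Return the first integer found in text such as '$1,250,000', else None."""
    match = re.search(r"\d[\d,]*", text)
    if match is None:
        return None
    return int(match.group().replace(",", ""))


def clean_listing(address: str, price: str, beds: str, baths: str) -> dict:
    """Turn raw scraped strings into one spreadsheet row (column names are assumed)."""
    return {
        "Address": address.strip(),
        "Price": parse_int(price),
        "Bedrooms": parse_int(beds),
        "Bathrooms": parse_int(baths),
    }


def export_rows(rows: list, path: str = "properties.xlsx") -> None:
    """Write the rows to an Excel file; pandas uses openpyxl for .xlsx output."""
    import pandas as pd  # imported here so the helpers above work without pandas

    pd.DataFrame(rows).to_excel(path, index=False)
```

Calling `export_rows([clean_listing(...), ...])` after the scraping loop would produce one row per listing. Note that `parse_int` keeps only the first number it finds, so a bedroom value written as "3 + 1" would be stored as 3.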
- The scraper only goes through the 600 most recent properties (50 pages x 12 listings); any listings on later pages are skipped. It still needs a way to scrape all listings based on the number of pages available.
- For now, the scraper only works for Toronto, ON. It still needs to be generalized to work for any specified city or region.
- I couldn't find a way to use the button to navigate between listing pages; I'm not sure if it's because it wasn't an HTML element. As a workaround, each page is opened in a new tab and scraped from there. Hopefully there is a better way to achieve this, since opening 50 tabs before closing the driver is not an efficient way to scrape data.
- Other future updates include automation, filtering results by price/beds/baths, including links, lot size, etc. in the spreadsheet, and solutions to the issues mentioned above.
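Two of the items above can be sketched without touching the browser: working out how many pages to visit from a total listing count, and filtering rows by price/beds/baths. This is only a sketch under assumptions: the function names are hypothetical, the rows are assumed to be dicts with "Price", "Bedrooms", and "Bathrooms" keys, and how the total listing count would be read from the site is left open.

```python
import math

LISTINGS_PER_PAGE = 12  # matches the 50 pages x 12 listings noted above


def pages_needed(total_listings: int, per_page: int = LISTINGS_PER_PAGE) -> int:
    """Number of result pages required to cover every listing."""
    return math.ceil(total_listings / per_page)


def filter_listings(rows, max_price=None, min_beds=None, min_baths=None):
    """Keep rows that satisfy every filter that was given.

    A row with a missing value is dropped whenever a filter applies to that value.
    """
    kept = []
    for row in rows:
        price = row.get("Price")
        beds = row.get("Bedrooms")
        baths = row.get("Bathrooms")
        if max_price is not None and (price is None or price > max_price):
            continue
        if min_beds is not None and (beds is None or beds < min_beds):
            continue
        if min_baths is not None and (baths is None or baths < min_baths):
            continue
        kept.append(row)
    return kept
```

With a page count in hand, the hard-coded limit of 50 pages could be replaced by a loop over `range(1, pages_needed(total) + 1)`, and the filter could run on the rows just before they are written to the spreadsheet.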
If you encounter any issues or have suggestions for improvements, feel free to open an issue or submit a pull request.
An example of how the property data is stored can be found here
MIT