# WgGesuchtScraper

A Python scraper for extracting property listings from wg-gesucht.de using the ScrapingAnt API.

## Features
- Scrapes WG rooms, 1-room apartments, apartments, and houses
- Supports 50+ German cities
- Parallel scraping for improved performance
- Extracts 28 property attributes including rent, size, location, amenities
- Exports data to CSV format
- Rate limiting and retry logic for reliability
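The retry behaviour listed above can be sketched as a small backoff helper. This is an illustrative sketch only; the function name, signature, and delay schedule are assumptions, not the scraper's actual code:

```python
import time

def fetch_with_retry(fetch, url, max_retries=3, base_delay=1.0):
    """Call fetch(url) up to max_retries times with exponential backoff.

    Hypothetical helper: `fetch` stands in for whatever request
    function the scraper uses internally.
    """
    for attempt in range(max_retries):
        try:
            return fetch(url)
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries: propagate the last error
            time.sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
```

Exponential backoff spaces retries further apart on repeated failures, which plays well with API rate limits.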
## Installation

1. Clone the repository:

   ```bash
   git clone https://github.com/kami4ka/WgGesuchtScraper.git
   cd WgGesuchtScraper
   ```

2. Create a virtual environment and install dependencies:

   ```bash
   python -m venv .venv
   source .venv/bin/activate  # On Windows: .venv\Scripts\activate
   pip install -r requirements.txt
   ```

3. Set your ScrapingAnt API key in `config.py`:

   ```python
   SCRAPINGANT_API_KEY = "your-api-key-here"
   ```

## Usage

```bash
# Scrape WG rooms in Hamburg
python main.py hamburg

# Scrape 1-room apartments in Berlin
python main.py berlin --type 1-room

# Scrape apartments in Munich with limit
python main.py munich --type apartment --limit 50 --output munich_apartments.csv

# Enable verbose logging
python main.py hamburg -v
```

### Options

| Option | Description |
|---|---|
| `city` | City name (required); see supported cities below |
| `--type`, `-t` | Property type: `wg`, `room`, `1-room`, `apartment`, `flat`, `house` (default: `wg`) |
| `--output`, `-o` | Output CSV file path |
| `--limit`, `-l` | Limit number of properties to scrape |
| `--max-workers`, `-w` | Maximum parallel requests (default: 10) |
| `--api-key`, `-k` | ScrapingAnt API key (overrides config) |
| `--verbose`, `-v` | Enable verbose logging |
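The options above could be declared with `argparse` roughly as follows. This is a sketch of what such a parser might look like, not necessarily how `main.py` actually builds it:

```python
import argparse

def build_parser():
    """Hypothetical CLI parser mirroring the documented options."""
    parser = argparse.ArgumentParser(
        description="Scrape wg-gesucht.de property listings")
    parser.add_argument("city", help="City name (required)")
    parser.add_argument("--type", "-t", default="wg",
                        choices=["wg", "room", "1-room", "apartment", "flat", "house"],
                        help="Property type (default: wg)")
    parser.add_argument("--output", "-o", help="Output CSV file path")
    parser.add_argument("--limit", "-l", type=int,
                        help="Limit number of properties to scrape")
    parser.add_argument("--max-workers", "-w", type=int, default=10,
                        help="Maximum parallel requests (default: 10)")
    parser.add_argument("--api-key", "-k",
                        help="ScrapingAnt API key (overrides config)")
    parser.add_argument("--verbose", "-v", action="store_true",
                        help="Enable verbose logging")
    return parser
```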
## Supported Cities

Berlin, Hamburg, Munich/München, Cologne/Köln, Frankfurt, Stuttgart, Dusseldorf/Düsseldorf, Dresden, Leipzig, Hanover/Hannover, Nuremberg/Nürnberg, Bremen, Bonn, Braunschweig, Darmstadt, Giessen, Göttingen, Heidelberg, Jena, Lüneburg, Mainz, Mannheim, Tübingen, Freiburg, Karlsruhe, Augsburg, Münster, Aachen, Wiesbaden, Kiel, Magdeburg, Rostock, Potsdam, Erfurt, Wuppertal, Bielefeld, Bochum, Dortmund, Essen, Duisburg, Regensburg, Würzburg
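Since several cities accept both English and German spellings, the scraper presumably normalizes the name before building the search URL. A hypothetical alias map (the table and function below are illustrative, not the scraper's actual internals) could look like:

```python
# Hypothetical alias table: English/accented spellings mapped to one
# canonical lowercase key.
CITY_ALIASES = {
    "munich": "muenchen",
    "münchen": "muenchen",
    "cologne": "koeln",
    "köln": "koeln",
    "dusseldorf": "duesseldorf",
    "düsseldorf": "duesseldorf",
    "hanover": "hannover",
    "nuremberg": "nuernberg",
    "nürnberg": "nuernberg",
}

def normalize_city(name: str) -> str:
    """Lower-case the input and resolve known aliases to one canonical key."""
    key = name.strip().lower()
    return CITY_ALIASES.get(key, key)
```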
## Output Fields

The scraper exports data to CSV with the following fields:
| Field | Description |
|---|---|
| url | Property listing URL |
| title | Property title/description |
| listing_id | WG-Gesucht listing ID |
| total_rent | Total monthly rent in EUR |
| base_rent | Base rent (Kaltmiete) in EUR |
| additional_costs | Additional costs (Nebenkosten) |
| deposit | Security deposit (Kaution) |
| room_size | Room size in m² |
| apartment_size | Total apartment size in m² |
| address | Full address |
| postal_code | German postal code (5 digits) |
| city | City name |
| district | District/neighborhood |
| available_from | Availability start date |
| available_until | Availability end date |
| wg_size | WG size (e.g., "2er WG") |
| flatmates_age | Age of current flatmates |
| sought_gender | Gender preference |
| sought_age | Age preference for new flatmate |
| building_type | Building type (Altbau, Neubau) |
| floor | Floor level |
| furnished | Whether furnished (true/false) |
| smoking | Smoking policy |
| wg_type | WG type (Berufstätigen-WG, etc.) |
| languages | Languages spoken |
| internet | Internet connection details |
| amenities | Available amenities |
| online_since | Time since listing was posted |
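Once exported, the CSV can be post-processed with the standard library alone. The example below filters listings by total rent; the column name `total_rent` comes from the field table above, while the helper itself is just an illustration:

```python
import csv
import io

def listings_under(csv_text: str, max_rent: float):
    """Yield CSV rows whose total_rent parses as a number <= max_rent."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            rent = float(row["total_rent"])
        except (KeyError, ValueError):
            continue  # skip rows with a missing or non-numeric rent
        if rent <= max_rent:
            yield row
```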
## Configuration

This scraper uses the ScrapingAnt API for web scraping. Configuration options in `config.py`:

- `SCRAPINGANT_API_KEY`: Your API key
- `DEFAULT_MAX_WORKERS`: Parallel request limit (default: 10)
- `DEFAULT_TIMEOUT`: Request timeout in seconds (default: 60)
- `MAX_RETRIES`: Number of retry attempts (default: 3)
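A minimal sketch of how the `DEFAULT_MAX_WORKERS` setting might drive the parallel scraping, using `concurrent.futures` (the helper name and structure are assumptions; only the worker-count default mirrors the config above):

```python
from concurrent.futures import ThreadPoolExecutor

DEFAULT_MAX_WORKERS = 10  # mirrors the config default

def scrape_all(urls, fetch, max_workers=DEFAULT_MAX_WORKERS):
    """Fetch every URL concurrently and return results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the order of `urls` in its results
        return list(pool.map(fetch, urls))
```

Threads suit this workload because each request spends most of its time waiting on the network rather than on the CPU.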
## License

MIT License