WG-Gesucht.de Scraper

A Python scraper for extracting property listings from wg-gesucht.de using the ScrapingAnt API.

Features

  • Scrapes WG rooms, 1-room apartments, apartments, and houses
  • Supports 50+ German cities
  • Parallel scraping for improved performance
  • Extracts 28 property attributes including rent, size, location, amenities
  • Exports data to CSV format
  • Rate limiting and retry logic for reliability
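The parallel scraping mentioned above can be sketched with a standard thread pool. This is an illustrative sketch, not the project's actual code: `fetch_listing` is a placeholder for the real ScrapingAnt request.

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_listing(url: str) -> str:
    # Placeholder for a real HTTP call through the ScrapingAnt API.
    return f"scraped:{url}"

def scrape_parallel(urls, max_workers=10):
    # Fan requests out over a bounded pool of worker threads,
    # preserving the input order of the results.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch_listing, urls))

results = scrape_parallel([f"https://example.org/listing/{i}" for i in range(5)])
```

Bounding the pool size (here `max_workers=10`, matching the scraper's default) keeps the request rate within polite limits.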

Installation

  1. Clone the repository:

```shell
git clone https://github.com/kami4ka/WgGesuchtScraper.git
cd WgGesuchtScraper
```

  2. Create a virtual environment and install dependencies:

```shell
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
pip install -r requirements.txt
```

  3. Set your ScrapingAnt API key in config.py:

```python
SCRAPINGANT_API_KEY = "your-api-key-here"
```
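To keep the key out of version control, a common alternative is reading it from the environment. This is a hypothetical helper, not part of the repository:

```python
import os

def get_api_key(default: str = "your-api-key-here") -> str:
    # Prefer the SCRAPINGANT_API_KEY environment variable so the key
    # never lands in version control; fall back to the config placeholder.
    return os.environ.get("SCRAPINGANT_API_KEY", default)
```

The `--api-key` command-line flag (described below) also overrides whatever is in config.py.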

Usage

Command Line

```shell
# Scrape WG rooms in Hamburg
python main.py hamburg

# Scrape 1-room apartments in Berlin
python main.py berlin --type 1-room

# Scrape apartments in Munich with a limit and a custom output file
python main.py munich --type apartment --limit 50 --output munich_apartments.csv

# Enable verbose logging
python main.py hamburg -v
```

Available Options

| Option | Description |
|--------|-------------|
| `city` | City name (required); see supported cities below |
| `--type`, `-t` | Property type: `wg`, `room`, `1-room`, `apartment`, `flat`, `house` (default: `wg`) |
| `--output`, `-o` | Output CSV file path |
| `--limit`, `-l` | Limit the number of properties to scrape |
| `--max-workers`, `-w` | Maximum parallel requests (default: 10) |
| `--api-key`, `-k` | ScrapingAnt API key (overrides config) |
| `--verbose`, `-v` | Enable verbose logging |
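A minimal argparse sketch of the documented interface looks roughly like this; the real main.py may wire these options differently:

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Mirrors the documented CLI options; names and defaults are taken
    # from the table above, everything else is an assumption.
    p = argparse.ArgumentParser(description="wg-gesucht.de scraper")
    p.add_argument("city", help="city name, e.g. hamburg")
    p.add_argument("--type", "-t", default="wg",
                   choices=["wg", "room", "1-room", "apartment", "flat", "house"])
    p.add_argument("--output", "-o", help="output CSV path")
    p.add_argument("--limit", "-l", type=int, help="max properties to scrape")
    p.add_argument("--max-workers", "-w", type=int, default=10)
    p.add_argument("--api-key", "-k", help="ScrapingAnt API key")
    p.add_argument("--verbose", "-v", action="store_true")
    return p

args = build_parser().parse_args(["munich", "--type", "apartment", "--limit", "50"])
```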

Supported Cities

Berlin, Hamburg, Munich/München, Cologne/Köln, Frankfurt, Stuttgart, Dusseldorf/Düsseldorf, Dresden, Leipzig, Hanover/Hannover, Nuremberg/Nürnberg, Bremen, Bonn, Braunschweig, Darmstadt, Giessen, Göttingen, Heidelberg, Jena, Lüneburg, Mainz, Mannheim, Tübingen, Freiburg, Karlsruhe, Augsburg, Münster, Aachen, Wiesbaden, Kiel, Magdeburg, Rostock, Potsdam, Erfurt, Wuppertal, Bielefeld, Bochum, Dortmund, Essen, Duisburg, Regensburg, Würzburg

Output Format

The scraper exports data to CSV with the following fields:

| Field | Description |
|-------|-------------|
| `url` | Property listing URL |
| `title` | Property title/description |
| `listing_id` | WG-Gesucht listing ID |
| `total_rent` | Total monthly rent in EUR |
| `base_rent` | Base rent (Kaltmiete) in EUR |
| `additional_costs` | Additional costs (Nebenkosten) |
| `deposit` | Security deposit (Kaution) |
| `room_size` | Room size in m² |
| `apartment_size` | Total apartment size in m² |
| `address` | Full address |
| `postal_code` | German postal code (5 digits) |
| `city` | City name |
| `district` | District/neighborhood |
| `available_from` | Availability start date |
| `available_until` | Availability end date |
| `wg_size` | WG size (e.g., "2er WG") |
| `flatmates_age` | Age of current flatmates |
| `sought_gender` | Gender preference |
| `sought_age` | Age preference for new flatmate |
| `building_type` | Building type (Altbau, Neubau) |
| `floor` | Floor level |
| `furnished` | Whether furnished (true/false) |
| `smoking` | Smoking policy |
| `wg_type` | WG type (Berufstätigen-WG, etc.) |
| `languages` | Languages spoken |
| `internet` | Internet connection details |
| `amenities` | Available amenities |
| `online_since` | Time since listing was posted |
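Once exported, the CSV can be post-processed with the standard library alone. The two sample rows below are fabricated for illustration; only the column names come from the table above.

```python
import csv
import io

# Two made-up rows using field names from the output format table.
sample = io.StringIO(
    "city,total_rent,room_size\n"
    "hamburg,550,14\n"
    "hamburg,720,20\n"
)
rows = list(csv.DictReader(sample))

# Average total rent across the scraped listings.
avg_rent = sum(float(r["total_rent"]) for r in rows) / len(rows)
```

In practice you would pass the exported file path to `open()` instead of the in-memory `StringIO` sample.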

API Configuration

This scraper uses the ScrapingAnt API for web scraping. Configuration options in config.py:

  • SCRAPINGANT_API_KEY: Your API key
  • DEFAULT_MAX_WORKERS: Parallel request limit (default: 10)
  • DEFAULT_TIMEOUT: Request timeout in seconds (default: 60)
  • MAX_RETRIES: Number of retry attempts (default: 3)
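The retry behaviour governed by MAX_RETRIES could look roughly like the sketch below, with exponential backoff between attempts. This is an assumption about the implementation, not the project's actual code; `request_fn` stands in for the real ScrapingAnt call.

```python
import time

def with_retries(request_fn, max_retries=3, base_delay=1.0):
    # Retry a flaky request up to max_retries times, doubling the
    # pause between attempts (base_delay, 2x, 4x, ...).
    for attempt in range(max_retries):
        try:
            return request_fn()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of attempts: propagate the last error
            time.sleep(base_delay * 2 ** attempt)
```

Exponential backoff spaces out retries so a temporarily overloaded endpoint is not hammered at a fixed rate.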

License

MIT License
