GitHub - SoCloseSociety/DoctolibDataScraper: 🏥 Scrape doctor data from Doctolib — names, specialties, addresses, availability. Python + Selenium. Free & open source.

Extract doctor profiles, skills & contacts from Doctolib.fr — automated scraping with clean CSV export.


What is Doctolib Data Scraper?

Doctolib Data Scraper is a free, open-source Doctolib.fr web scraper built with Python and Selenium. It automates the extraction of doctor profiles from any Doctolib search URL into clean, analysis-ready CSV files.

Manually collecting doctor information from Doctolib is time-consuming. This scraper handles the entire process: give it a search URL, and it crawls all paginated results, then visits each profile to extract structured data — names, addresses, skills, degrees, and contact information.

Who is this for?

Healthcare Recruiters looking to build prospect lists of medical professionals
Market Researchers studying the healthcare landscape in France
Data Analysts collecting public health data for analysis
Startup Founders building healthcare-related products and services
Researchers studying medical specialization distribution
Developers learning web scraping with Selenium and Python

Key Features

Two-Phase Extraction - Phase 1 crawls paginated results, Phase 2 scrapes each profile
Full Pagination - Automatically navigates all search result pages
Multi-Location - Extracts every practice location per doctor
VPN Rotation - Built-in NordVPN CLI support to avoid rate limiting (optional)
Progressive Saving - Data is written to disk every 5 profiles, so a crash loses at most the last few unsaved profiles
Auto-Recovery - Handles connection drops with smart retry logic
Cross-Platform - Works on Windows, macOS, and Linux
Clean CSV Output - Ready for Excel, Google Sheets, or any data tool
Free & Open Source - MIT license, no API key required
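
The progressive saving described above can be sketched as follows. This is a minimal illustration, not the code in main.py: the function name `save_progressively`, the column names, and the append-to-CSV strategy are assumptions. Only the batch size of 5 comes from the feature list.

```python
import csv
from pathlib import Path

SAVE_EVERY = 5  # batch size taken from the feature list; everything else is illustrative


def save_progressively(profiles, out_path, fieldnames, save_every=SAVE_EVERY):
    """Append rows to a CSV in batches so a crash loses at most one partial batch."""
    out_path = Path(out_path)
    buffer = []
    written = 0

    def flush():
        nonlocal written
        if not buffer:
            return
        # Write the header only when the file is new or empty.
        write_header = not out_path.exists() or out_path.stat().st_size == 0
        with out_path.open("a", newline="", encoding="utf-8") as fh:
            writer = csv.DictWriter(fh, fieldnames=fieldnames)
            if write_header:
                writer.writeheader()
            writer.writerows(buffer)
        written += len(buffer)
        buffer.clear()

    for profile in profiles:
        buffer.append(profile)
        if len(buffer) >= save_every:
            flush()
    flush()  # write whatever is left after the last full batch
    return written
```

Because `profiles` can be a generator, each batch reaches disk before the next profile is fetched.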

Quick Start

Prerequisites

Requirement	Details
Python	Version 3.9 or higher
Google Chrome	Latest version
NordVPN	Optional — for IP rotation during large scrapes

Installation

# 1. Clone the repository
git clone https://github.com/SoCloseSociety/DoctolibDataScraper.git
cd DoctolibDataScraper

# 2. (Recommended) Create a virtual environment
python -m venv venv

# Activate it (run only the line for your OS):
# Windows:
venv\Scripts\activate
# macOS / Linux:
source venv/bin/activate

# 3. Install dependencies
pip install -r requirements.txt

Usage

python main.py

Enter a Doctolib search URL when prompted:

============================================================
  DoctolibDataScraper
  by SoClose Society - https://soclose.co
============================================================

Enter Doctolib search URL: https://www.doctolib.fr/medecin-generaliste/paris
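
Before starting a long crawl, it can help to check that the pasted URL really points at a Doctolib search page. The helper below is a hypothetical sketch (the name `is_doctolib_search_url` and its rules are assumptions, not part of main.py). It only checks the scheme, the host, and that a path is present.

```python
from urllib.parse import urlparse

ALLOWED_HOSTS = {"doctolib.fr", "www.doctolib.fr"}  # assumption: only the French site is targeted


def is_doctolib_search_url(url: str) -> bool:
    """Return True for an https URL on doctolib.fr that has a non-empty path."""
    parts = urlparse(url.strip())
    host = (parts.hostname or "").lower()
    has_path = bool(parts.path.strip("/"))
    return parts.scheme == "https" and host in ALLOWED_HOSTS and has_path
```

For example, `is_doctolib_search_url("https://www.doctolib.fr/medecin-generaliste/paris")` returns `True`, while the bare home page or another domain returns `False`.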

How It Works

Doctolib Search URL
        │
        ▼
┌─────────────────────┐
│  Phase 1: Crawl     │──→ doctolib_profile_link.csv
│  Paginated results  │
└─────────────────────┘
        │
        ▼
┌─────────────────────┐
│  Phase 2: Scrape    │──→ doctolib_profile_details.csv
│ Each doctor profile │
└─────────────────────┘
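
The two phases in the diagram can be sketched as two small functions. This is an illustration under stated assumptions: `fetch_page` and `fetch_profile` are hypothetical callables standing in for the Selenium logic, and the `link` column name is invented. Only the two output file names come from this README.

```python
import csv


def crawl_links(fetch_page, max_pages=500):
    """Phase 1: walk result pages until one is empty; keep unique links in first-seen order."""
    unique = {}
    for page in range(1, max_pages + 1):
        links = list(fetch_page(page))
        if not links:
            break  # an empty page is treated as the end of the results
        for link in links:
            unique.setdefault(link, None)
    return list(unique)


def scrape_profiles(links, fetch_profile):
    """Phase 2: visit each profile link and collect one record (a dict) per profile."""
    return [fetch_profile(link) for link in links]


def write_rows(path, rows, fieldnames):
    """Write a list of dicts to a CSV file with a header row."""
    with open(path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


def run_pipeline(fetch_page, fetch_profile,
                 links_csv="doctolib_profile_link.csv",
                 details_csv="doctolib_profile_details.csv"):
    """Run both phases and write one CSV per phase."""
    links = crawl_links(fetch_page)
    write_rows(links_csv, [{"link": link} for link in links], ["link"])
    profiles = scrape_profiles(links, fetch_profile)
    if profiles:
        write_rows(details_csv, profiles, list(profiles[0].keys()))
    return links, profiles
```

Writing the link file before Phase 2 starts means a failed profile scrape does not force the paginated crawl to be repeated.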

What It Extracts

Data Field	Example
Name	Dr. Marie Dupont
Addresses	All practice locations with full addresses
Skills	Medical specializations, competencies
Degrees	Diplomas, certifications, education history
Contacts	Phone numbers, additional contact details
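
One way to model a record with these fields is a small dataclass that flattens multi-valued fields (such as several practice locations) into single CSV cells. The class name `DoctorProfile`, the lowercase column names, and the ` | ` separator are assumptions for illustration. The actual columns in doctolib_profile_details.csv may differ.

```python
from dataclasses import dataclass, field


@dataclass
class DoctorProfile:
    """Hypothetical record mirroring the fields in the table above."""
    name: str
    addresses: list = field(default_factory=list)
    skills: list = field(default_factory=list)
    degrees: list = field(default_factory=list)
    contacts: list = field(default_factory=list)

    def to_row(self, sep=" | "):
        """Flatten list fields into strings so each profile fits on one CSV row."""
        return {
            "name": self.name,
            "addresses": sep.join(self.addresses),
            "skills": sep.join(self.skills),
            "degrees": sep.join(self.degrees),
            "contacts": sep.join(self.contacts),
        }
```

A row built this way can be passed directly to `csv.DictWriter.writerow`.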

Output Files

File	Content
`doctolib_profile_link.csv`	All unique doctor profile links
`doctolib_profile_details.csv`	Full structured profile data
`scraper.log`	Timestamped execution log

Configuration

NordVPN Setup (Optional)

VPN support helps avoid rate limiting during large scraping sessions.

Windows

Install NordVPN
Add the NordVPN install directory to your PATH (default: C:\Program Files\NordVPN\)
Verify in Command Prompt: nordvpn -c (this connects to a server)

macOS

Install NordVPN via nordvpn.com or brew install nordvpn
Verify in Terminal: nordvpn connect

Linux

Install: sh <(curl -sSf https://downloads.nordcdn.com/apps/linux/install.sh)
Login: nordvpn login
Verify: nordvpn connect
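
The verification commands above can also be driven from Python. The sketch below is an assumption about how rotation might be wired up, not the code in main.py: the function names are hypothetical, and whether a reconnect actually yields a new IP address depends on the NordVPN client. The only commands used are the ones listed in this section.

```python
import platform
import subprocess


def vpn_connect_command(system=None):
    """Pick the connect command for the current OS, as listed in the setup steps."""
    system = system or platform.system()
    if system == "Windows":
        return ["nordvpn", "-c"]
    return ["nordvpn", "connect"]  # macOS and Linux


def rotate_vpn(run=subprocess.run, system=None, timeout=60):
    """Ask the NordVPN CLI to (re)connect; return True if the command exits with 0."""
    cmd = vpn_connect_command(system)
    try:
        result = run(cmd, capture_output=True, text=True, timeout=timeout)
    except (FileNotFoundError, subprocess.TimeoutExpired):
        # CLI not on PATH, or the connection attempt hung
        return False
    return result.returncode == 0
```

Passing `run` as a parameter keeps the function testable without a VPN client installed.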

Tech Stack

Technology	Role
Python 3.9+	Core language
Selenium 4.15+	Browser automation & page interaction
BeautifulSoup4	HTML parsing & data extraction
Pandas	Data structuring & CSV export
webdriver-manager	Automatic ChromeDriver management

Project Structure

DoctolibDataScraper/
├── main.py              # Main scraper application
├── requirements.txt     # Python dependencies
├── assets/
│   └── banner.svg       # Project banner
├── LICENSE              # MIT License
├── README.md            # This file
├── CONTRIBUTING.md      # Contribution guidelines
└── .gitignore           # Git ignore rules

Troubleshooting

Chrome driver issues

The scraper uses webdriver-manager to automatically download the correct ChromeDriver. If you encounter issues:

pip install --upgrade webdriver-manager
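
To check the driver setup in isolation, a short script along these lines can help. It is a sketch assuming Selenium 4 and webdriver-manager are installed from requirements.txt. The helper names and the specific Chrome arguments are illustrative choices, not necessarily what main.py uses.

```python
def chrome_arguments(headless=True):
    """Return Chrome command-line arguments; the values are illustrative defaults."""
    args = ["--window-size=1280,900", "--disable-gpu"]
    if headless:
        args.append("--headless=new")
    return args


def build_driver(headless=True):
    """Create a Chrome WebDriver, letting webdriver-manager fetch a matching ChromeDriver."""
    # Imports are local so that this module loads even if Selenium is not installed yet.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service
    from webdriver_manager.chrome import ChromeDriverManager

    options = Options()
    for arg in chrome_arguments(headless):
        options.add_argument(arg)
    service = Service(ChromeDriverManager().install())
    return webdriver.Chrome(service=service, options=options)
```

If `build_driver()` returns without error and `driver.get("https://www.doctolib.fr")` loads, the driver setup is probably not the cause of the problem. Call `driver.quit()` when done.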

Rate limiting / IP blocks

If Doctolib blocks your requests:

Enable NordVPN rotation (see Configuration)
Increase delays between requests
Reduce the number of profiles per session
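
"Increase delays between requests" can be combined with retries. A common pattern is exponential backoff with jitter, sketched below. The function names, the base delay of 2 seconds, and the 60-second cap are assumptions chosen for illustration and are not taken from main.py.

```python
import random
import time


def polite_delay(attempt, base=2.0, cap=60.0, rng=random.random):
    """Exponential backoff with jitter: base * 2**attempt, capped, then scaled by 0.5 to 1.0."""
    return min(cap, base * (2 ** attempt)) * (0.5 + rng() / 2)


def fetch_with_retry(fetch, url, retries=4, sleep=time.sleep):
    """Call fetch(url); on failure wait with growing delays, then re-raise the last error."""
    if retries < 1:
        raise ValueError("retries must be at least 1")
    last_error = None
    for attempt in range(retries):
        try:
            return fetch(url)
        except Exception as exc:  # broad on purpose for a sketch; narrow this in real code
            last_error = exc
            if attempt < retries - 1:
                sleep(polite_delay(attempt))
    raise last_error
```

The random factor spreads requests out so that retries do not arrive at fixed, easily detected intervals.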

Doctolib UI changes

Doctolib occasionally updates its web interface. If the scraper stops working:

Check the Issues page for known problems
Open a new issue with the error message

Permission denied errors (macOS/Linux)

chmod +x main.py

FAQ

Q: Is this free? A: Yes. Doctolib Data Scraper is 100% free and open source under the MIT license.

Q: Do I need an API key? A: No. This tool uses browser automation (Selenium), so no API key or developer account is needed.

Q: How many profiles can I scrape at once? A: There is no hard limit. The scraper processes profiles one by one with progressive saving. Just be mindful of Doctolib's usage policies and use VPN rotation for large scrapes.

Q: Does it comply with GDPR? A: The tool extracts publicly available data. You are responsible for handling any collected data in compliance with GDPR and applicable laws.

Q: Does it work on Mac / Linux? A: Yes. The scraper is fully cross-platform and works on Windows, macOS, and Linux.

Alternatives Comparison

Feature	Doctolib Data Scraper	Manual Copy-Paste	Paid Scraping APIs
Price	Free	Free	$50-200/mo
Automated pagination	Yes	No	Yes
Multi-location scraping	Yes	Manual	Varies
Open source	Yes	N/A	No
API key required	No	No	Yes
VPN rotation	Built-in	N/A	Varies
Cross-platform	Yes	Yes	Web only

Contributing

Contributions are welcome! Please read the Contributing Guide before submitting a pull request.

License

This project is licensed under the MIT License.

Disclaimer

This tool is provided for educational and research purposes only. Use it responsibly and in compliance with Doctolib's Terms of Service and applicable data protection laws (GDPR). The authors are not responsible for any misuse or consequences arising from the use of this software.

If this project helps you, please give it a star!
It helps others discover this tool.

Built with purpose by SoClose: Digital Innovation Through Automation & AI
