This project aims to scrape data from the French Bridge Federation (FFB) website to build a comprehensive and structured database using Python, Selenium, and Supabase.
The project is composed of two main Python scripts:
- scripts/scraper.py: A robust, interactive web scraper that logs into the FFB "espace métier", navigates to the different entity pages (FFB, Zone, Ligue, Comité, Club), and extracts detailed information. It features an interactive menu to select specific entities or run batch scraping operations.
- scripts/create_database_schema.py: A schema generator that analyzes the structure of the scraped data and produces a complete database schema in both JSON and SQL formats. This script is currently being refactored to align with a new, unified entity model.
The project is under active development. Here is the current status:
- Scraper (scraper.py):
  - Successfully logs in and navigates to the "espace métier".
  - Provides an interactive menu to choose which entity type to scrape.
  - The core scraping logic is being refactored to handle different entity types and their specific data layouts.
  - A bug related to scraping "Zone" entities has been fixed.
- Schema Generator (create_database_schema.py):
  - The script is being overhauled to generate a more unified and coherent database schema based on user feedback.
  - Upcoming Changes:
    - Consolidate all direct entity attributes into a single entities table.
    - Use explicit, English, snake_case names for all tables and fields, derived from UI labels.
    - Create separate tables for related lists (e.g., actors, roles) linked by a foreign key.
    - Ensure standard metadata fields (created_at, created_by, etc.) are correctly ordered at the end of each table.
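The upcoming schema changes can be illustrated with a minimal sketch. It is not the code of create_database_schema.py: the helper names `to_snake_case` and `create_table_sql`, the column types, the `id` primary key, and the `entity_id` foreign key name are all assumptions made for illustration. Translating French UI labels into English is not covered here; the sketch only normalizes a label into snake_case and places the metadata columns last.

```python
from __future__ import annotations

import re
import unicodedata

# Metadata columns are always emitted last. Only the names created_at and
# created_by come from the README; the SQL types are assumptions.
METADATA_COLUMNS = [("created_at", "TIMESTAMPTZ"), ("created_by", "TEXT")]


def to_snake_case(label: str) -> str:
    """Turn a UI label into an ASCII snake_case identifier."""
    # Strip accents, e.g. "Comité" -> "Comite"
    ascii_label = (
        unicodedata.normalize("NFKD", label)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Collapse each run of non-alphanumeric characters into one underscore
    return re.sub(r"[^0-9a-zA-Z]+", "_", ascii_label).strip("_").lower()


def create_table_sql(
    table: str,
    columns: list[tuple[str, str]],
    foreign_key: tuple[str, str] | None = None,
) -> str:
    """Build a CREATE TABLE statement with metadata columns at the end.

    foreign_key is a hypothetical (column, referenced_table) pair used by
    related-list tables to point back at their owning entity.
    """
    lines = ["id BIGINT PRIMARY KEY"]
    if foreign_key is not None:
        column, referenced = foreign_key
        lines.append(f"{column} BIGINT NOT NULL REFERENCES {referenced}(id)")
    lines += [f"{to_snake_case(name)} {sql_type}" for name, sql_type in columns]
    lines += [f"{name} {sql_type}" for name, sql_type in METADATA_COLUMNS]
    body = ",\n    ".join(lines)
    return f"CREATE TABLE {table} (\n    {body}\n);"


# One table for direct entity attributes, one for a related list.
entities_sql = create_table_sql("entities", [("Entity Name", "TEXT")])
actors_sql = create_table_sql(
    "actors", [("Full Name", "TEXT")], foreign_key=("entity_id", "entities")
)
```

Keeping the metadata columns in a single constant means every generated table ends with the same fields in the same order.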
- Ensure you have Python 3 installed.
- Create and activate a virtual environment:
      python3 -m venv venv
      source venv/bin/activate
- Install the required dependencies:
      pip install -r requirements.txt
- Make sure you have a scripts/config.py file with your FFB_USERNAME and FFB_PASSWORD.
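As a rough sketch, the configuration file could look like the following. Only the two variable names come from this README; reading them from environment variables with placeholder fallbacks is an assumption, chosen so that real credentials need not be written into the file.

```python
# scripts/config.py (sketch): keep this file out of version control.
import os

# Assumption: take the credentials from the environment when available,
# otherwise fall back to placeholders that must be replaced by hand.
FFB_USERNAME = os.environ.get("FFB_USERNAME", "your-username")
FFB_PASSWORD = os.environ.get("FFB_PASSWORD", "your-password")
```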
To run the interactive scraper:
    python3 scripts/scraper.py
You will be presented with a menu to choose which entity to scrape.
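The menu mentioned above might be structured along these lines. This is a sketch, not the code in scraper.py: the function names, the numbering, and the "0. Quit" entry are assumptions, and only the five entity types are taken from this README. Input and output are injected as parameters so the selection logic can be exercised without a terminal.

```python
from typing import Callable, Optional

# Entity types listed in this README
ENTITY_TYPES = ["FFB", "Zone", "Ligue", "Comité", "Club"]


def render_menu() -> str:
    """Return the menu text, one numbered line per entity type."""
    lines = [f"{i}. {name}" for i, name in enumerate(ENTITY_TYPES, start=1)]
    lines.append("0. Quit")
    return "\n".join(lines)


def parse_choice(raw: str) -> Optional[str]:
    """Map user input to an entity type; None means quit.

    Raises ValueError for non-numeric or out-of-range input.
    """
    choice = int(raw.strip())  # int() raises ValueError on non-numeric input
    if choice == 0:
        return None
    if not 1 <= choice <= len(ENTITY_TYPES):
        raise ValueError(f"choice out of range: {choice}")
    return ENTITY_TYPES[choice - 1]


def run_menu(
    read: Callable[[str], str] = input,
    write: Callable[[str], None] = print,
) -> Optional[str]:
    """Show the menu until the user enters a valid choice."""
    while True:
        write(render_menu())
        try:
            return parse_choice(read("Entity to scrape: "))
        except ValueError:
            write("Invalid choice, please try again.")
```

Calling `run_menu()` with its defaults prompts on the terminal and returns the selected entity type, or `None` when the user quits.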
FFB_Database/
├── scripts/
│ ├── scraper.py # The main interactive scraper
│ ├── create_database_schema.py # The database schema generator
│ ├── config.py # Configuration file (credentials)
│ └── ...
├── FFB_Scraped_Data/
│ ├── database_schema.json # Generated JSON schema
│ └── database_schema.sql # Generated SQL schema
├── docs/
│ └── DB_Schema_Builder_by_Web_Navigation.txt # Detailed scraping instructions
└── README.md # This file
- The scraper is designed to be resilient, with explicit waits and error handling for navigation and data extraction.
- The schema generation process is evolving to produce a cleaner, more normalized database structure.
- Regular commits are made to ensure progress is saved and the project history is clean.
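The explicit waits and error handling noted above follow a poll-until-ready pattern. In Selenium this role is played by `WebDriverWait`; the helper below is a library-free stand-in that shows the same idea, not the scraper's actual code. The names `wait_until` and `WaitTimeout` and the default timings are assumptions.

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition never became truthy within the timeout."""


def wait_until(condition, timeout=10.0, poll_interval=0.5, ignored=(Exception,)):
    """Call condition() repeatedly until it returns a truthy value.

    Exceptions listed in `ignored` are treated as "not ready yet" and
    retried. When the timeout expires, WaitTimeout is raised, chained to
    the last ignored exception if there was one.
    """
    deadline = time.monotonic() + timeout
    last_error = None
    while True:
        try:
            result = condition()
            if result:
                return result
        except ignored as exc:
            last_error = exc
        if time.monotonic() >= deadline:
            raise WaitTimeout(f"condition not met within {timeout}s") from last_error
        time.sleep(poll_interval)
```

A caller would wrap a page lookup in `condition` and catch `WaitTimeout` to log the failure and move on to the next entity instead of aborting the whole run.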
The project previously contained more complex setup instructions related to Node.js and Prisma, which are no longer the primary focus. The current approach is centered around the Python scraping and schema generation scripts. The docs folder contains historical and specification files that guide the development of the scraper.