Background
We evaluated whether adopting Scrapling with StealthyFetcher would benefit
the project.
Finding
The only source where adoption makes sense is Planespotters.
Current pipeline:
- Playwright → browser visit → harvest Cloudflare cookies (cached 1h)
- curl_cffi → reuse cookies for HTML page requests
- BeautifulSoup → 6-method fallback chain to extract aircraft data
StealthyFetcher uses Camoufox (modified Firefox) with engine-level
fingerprint spoofing (stronger than current JS injection) and a built-in
session class that can replace all three steps.
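For reference, the two pieces of hand-written logic that a session class would absorb (the 1h cookie cache and the ordered parsing fallback chain) can be sketched roughly as below. This is a minimal illustration, not the code in planespotters.py: `CookieCache`, `extract_with_fallbacks`, the `harvest` callable and the cookie contents are all hypothetical names chosen for the example.

```python
import time
from typing import Callable, Iterable, Optional

COOKIE_TTL_SECONDS = 3600  # the current pipeline caches cookies for 1h


class CookieCache:
    """Hypothetical cache: re-harvests cookies once the TTL has elapsed."""

    def __init__(self, harvest: Callable[[], dict],
                 ttl: float = COOKIE_TTL_SECONDS,
                 clock: Callable[[], float] = time.monotonic):
        self._harvest = harvest  # stands in for the Playwright browser visit
        self._ttl = ttl
        self._clock = clock
        self._cookies: Optional[dict] = None
        self._fetched_at = 0.0

    def get(self) -> dict:
        now = self._clock()
        if self._cookies is None or now - self._fetched_at >= self._ttl:
            self._cookies = self._harvest()
            self._fetched_at = now
        return self._cookies


def extract_with_fallbacks(html: str,
                           extractors: Iterable[Callable[[str], dict]]) -> Optional[dict]:
    """Tries each extractor in order and returns the first non-empty result."""
    for extractor in extractors:
        try:
            result = extractor(html)
        except Exception:
            continue  # a failing strategy falls through to the next one
        if result:
            return result
    return None
```

Injecting the clock keeps the expiry logic testable without waiting an hour; the real pipeline may handle expiry differently.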
Expected Benefits
- Remove beautifulsoup4 dependency (only used in planespotters.py)
- Eliminate ~120 lines of Playwright boilerplate + 6-method HTML parsing fallback
- Stronger Cloudflare resilience (C++ engine spoofing vs JS injection)
- Scrapling's selector API replaces manual BeautifulSoup traversal
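As a rough idea of what the replacement could look like, the sketch below fetches a page and reads element text through a CSS selector. It assumes that `scrapling.fetchers.StealthyFetcher.fetch` accepts `headless` and `network_idle`, and that the returned page exposes `css()` yielding elements with a `text` attribute; these should be checked against the installed Scrapling version. `fetch_texts` and the injectable `fetch` parameter are hypothetical helpers for illustration.

```python
from typing import Callable, List, Optional


def fetch_texts(url: str, selector: str,
                fetch: Optional[Callable[[str], object]] = None) -> List[str]:
    """Fetches a page and returns the stripped text of each element matching the selector."""
    if fetch is None:
        # Imported lazily so this module loads without Scrapling installed.
        # Assumption: this import path and these keyword arguments match
        # the installed Scrapling release.
        from scrapling.fetchers import StealthyFetcher

        def fetch(target: str):
            return StealthyFetcher.fetch(target, headless=True, network_idle=True)

    page = fetch(url)
    return [str(element.text).strip() for element in page.css(selector)]
```

Passing a stub `fetch` lets the parsing path be exercised in tests without launching a browser.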
Non-benefits
- No Docker image size reduction — StealthyFetcher still needs a browser binary, comparable size to current Chromium
- ADSBexchange: not worth migrating — JSON API + custom TokenBucketRateLimiter; cookie-harvest-once + curl_cffi pattern is more efficient
- Flightradar24: no change needed
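For context on the ADSBexchange point, a token bucket limiter generally works as sketched below. This is a generic illustration of the technique, not the project's TokenBucketRateLimiter; the class name `TokenBucket` and its parameters are assumptions.

```python
import time
from typing import Callable


class TokenBucket:
    """Generic token bucket: refills at `rate` tokens per second up to `capacity`."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic):
        self._rate = rate
        self._capacity = capacity
        self._clock = clock
        self._tokens = capacity  # start with a full bucket
        self._updated = clock()

    def try_acquire(self, cost: float = 1.0) -> bool:
        """Spends `cost` tokens if available; returns False without blocking otherwise."""
        now = self._clock()
        elapsed = now - self._updated
        self._tokens = min(self._capacity, self._tokens + elapsed * self._rate)
        self._updated = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

A limiter like this sits in front of plain HTTP calls, which is why a JSON API does not gain much from a full browser session.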
Scope (if implemented)
- planespotters.py: replace PlanespottersCookieManager + curl_cffi session + BeautifulSoup parsing
- pyproject.toml: add scrapling[fetchers], remove beautifulsoup4
- Dockerfile: add scrapling install step; verify Firefox system deps
Notes
- Scrapling recently migrated StealthyFetcher from Camoufox to Patchright — verify current engine before implementing
- Scrapling API is still evolving; check stability before adopting
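A small first step for that verification is to record which Scrapling release is actually installed, so it can be compared against the release notes. The helper below uses only the standard library; `installed_version` is a hypothetical name, and it reports the package version only, not which browser engine StealthyFetcher uses.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Returns the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints the version string, or None when Scrapling is not installed.
    print(installed_version("scrapling"))
```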