Move TikTok scraper to standalone API with multi-service architecture #74
Open
karilaa-dev wants to merge 10 commits into main
Transient failures (network hiccups, 429 rate limits, 5xx errors) now retry up to 3 attempts with 3s/5s delays, using a tighter per-request timeout (10s total, 3s connect). Non-retryable errors (404) still fail immediately.
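The retry policy described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `HttpError`, `is_retryable`, and `fetch_with_retry` are hypothetical names, and the timeout constants only record the stated values for a caller to pass to whatever HTTP client it uses.

```python
import time
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")

# Values taken from the commit message; the names are illustrative.
RETRY_DELAYS = (3.0, 5.0)   # wait 3s before attempt 2, 5s before attempt 3
CONNECT_TIMEOUT = 3.0       # seconds allowed for connecting
TOTAL_TIMEOUT = 10.0        # seconds allowed for the whole request


class HttpError(Exception):
    """Hypothetical error type carrying an HTTP status code."""

    def __init__(self, status: int) -> None:
        super().__init__(f"HTTP {status}")
        self.status = status


def is_retryable(exc: Exception) -> bool:
    """Treat 429, 5xx, and network/timeout errors as transient."""
    if isinstance(exc, HttpError):
        return exc.status == 429 or 500 <= exc.status < 600
    return isinstance(exc, (ConnectionError, TimeoutError))


def fetch_with_retry(
    fetch: Callable[[], T],
    sleep: Callable[[float], None] = time.sleep,
    delays: Sequence[float] = RETRY_DELAYS,
) -> T:
    """Call fetch() up to len(delays) + 1 times, sleeping between attempts."""
    for attempt in range(len(delays) + 1):
        try:
            return fetch()
        except Exception as exc:
            # Non-retryable errors (e.g. 404) and the final attempt propagate.
            if not is_retryable(exc) or attempt == len(delays):
                raise
            sleep(delays[attempt])
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter is a design choice made here so the delays can be checked in tests without actually waiting.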
Show 👨‍💻 reaction after fetching media info and before uploading, matching the existing TikTok download behavior.
Reorganize TikTok scraper functionality into a reusable `tiktok_scrapper` package with the following changes:
- Move client, models, exceptions, and proxy_manager from tiktok_api/ to tiktok_scrapper/
- Create standalone config system based on environment variables
- Add FastAPI REST API server (app.py) with endpoints:
  * GET /video - Extract video/slideshow metadata and CDN URLs
  * GET /music - Extract music metadata
  * GET /check - Validate TikTok URLs via regex
  * GET /health - Health check
- Add two new client methods for metadata extraction without downloading:
  * extract_video_info() - Get raw video data from TikTok API
  * extract_music_info() - Get raw music data from TikTok API
- Add Pydantic models for JSON API responses
- Add Dockerfile for containerized API deployment
- Keep tiktok_api as backward-compatible shim re-exporting from tiktok_scrapper
- Move yt-dlp and curl_cffi dependencies to tiktok_scrapper package
Extract core TikTok client functionality into a new tiktok_api library package. The tiktok_scrapper service now uses tiktok_api as a dependency, separating library code from REST API implementation.
- Move client, models, exceptions, proxy_manager to tiktok_api/
- Remove Pydantic response models from core library (API-specific)
- Restructure tiktok_scrapper/ as standalone FastAPI service
- Update package imports and module structure
- Remove PROXY_DATA_ONLY, MAX_VIDEO_DURATION, STREAMING_DURATION_THRESHOLD, HOST, and PORT config options
- Remove data_only_proxy parameter and feature from TikTokClient
- Remove python-dotenv dependency
- Update fastapi, uvicorn, yt-dlp, curl-cffi, and pydantic-settings to latest versions
…er to app
- Move application code from tiktok_scrapper/tiktok_scrapper/ to tiktok_scrapper/app/
- Remove editable package configuration and build-system requirements
- Update Docker and documentation to reference new module path (tiktok_scrapper.app → app.app)
- Simplify dependency management by treating tiktok_scrapper as an application, not a library package
Rename the package directory and all references from tiktok_scrapper to tt-scrap. Updates project name in pyproject.toml, documentation, Docker configuration, and dependency lock files. Removes tt-scrap as a local dependency from the main project's uv.lock.
…pattern
- Move TikTok client, routes, and config into dedicated service module
- Introduce ServiceRegistry for managing multiple scraper services
- Create BaseClient protocol for service implementations
- Generalize exceptions from TikTok-specific to service-agnostic names
- Replace dependency injection with service registry initialization
- Support dynamic service registration and initialization at startup
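As a rough illustration of the registry pattern this commit describes, the sketch below pairs a `BaseClient` protocol with a `ServiceRegistry`. Only the names `ServiceRegistry`, `BaseClient`, and `ScraperError` come from this PR; every method name (`startup`, `register`, `get`, `startup_all`) and the `FakeTikTokClient` stand-in are assumptions made for the example, and the real interfaces may differ.

```python
import asyncio
from typing import Dict, Protocol


class ScraperError(Exception):
    """Service-agnostic base error (name from the PR, body assumed)."""


class BaseClient(Protocol):
    """Structural interface each service client is assumed to satisfy."""

    async def startup(self) -> None: ...


class ServiceRegistry:
    """Maps a service name (e.g. "tiktok") to its client instance."""

    def __init__(self) -> None:
        self._services: Dict[str, BaseClient] = {}

    def register(self, name: str, client: BaseClient) -> None:
        if name in self._services:
            raise ValueError(f"service already registered: {name}")
        self._services[name] = client

    def get(self, name: str) -> BaseClient:
        try:
            return self._services[name]
        except KeyError:
            raise ScraperError(f"unknown service: {name}") from None

    async def startup_all(self) -> None:
        # Initialize every registered service once at application startup.
        for client in self._services.values():
            await client.startup()


class FakeTikTokClient:
    """Hypothetical stand-in; satisfies BaseClient without inheriting it."""

    def __init__(self) -> None:
        self.started = False

    async def startup(self) -> None:
        self.started = True


registry = ServiceRegistry()
tiktok = FakeTikTokClient()
registry.register("tiktok", tiktok)
asyncio.run(registry.startup_all())
```

Using a `Protocol` rather than an abstract base class means a new service only has to provide the expected methods, which fits the stated goal of adding services without touching core app code.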
Update endpoints to reflect /{service}/... routing, document the TIKTOK_ env prefix, and add instructions for adding new services.
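To make the `TIKTOK_` prefix convention concrete, here is a minimal stdlib-only sketch of prefix-scoped settings. It is not the PR's config code (the dependency list suggests the project uses pydantic-settings); `TikTokSettings`, `load_tiktok_settings`, and the default value of 3 are assumptions for illustration. Only the two variable names appear in this PR.

```python
import os
from dataclasses import dataclass, fields
from typing import Mapping


@dataclass(frozen=True)
class TikTokSettings:
    # Defaults are placeholders, not the project's real defaults.
    url_resolve_max_retries: int = 3
    video_info_max_retries: int = 3


def load_tiktok_settings(
    env: Mapping[str, str] = os.environ, prefix: str = "TIKTOK_"
) -> TikTokSettings:
    """Read each field from <prefix><FIELD_NAME>, falling back to defaults."""
    values = {}
    for field in fields(TikTokSettings):
        raw = env.get(prefix + field.name.upper())
        if raw is not None:
            values[field.name] = int(raw)
    return TikTokSettings(**values)


# Only the prefixed variable is picked up; the bare name is ignored.
settings = load_tiktok_settings(
    {"TIKTOK_URL_RESOLVE_MAX_RETRIES": "5", "URL_RESOLVE_MAX_RETRIES": "9"}
)
```

Scoping variables by prefix lets a second service define its own retry settings under a different prefix without colliding with TikTok's.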
Summary
Closes #134
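- Each new service lives in its own `tt-scrap/services/<name>/` directory, with no changes to core app code
- Endpoints are routed per service: `GET /tiktok/video`, `GET /tiktok/music`, etc.
- Service-agnostic exceptions (`ScraperError`, `ContentDeletedError`, etc.) shared across all services
- TikTok config uses the `TIKTOK_` prefix (e.g., `TIKTOK_URL_RESOLVE_MAX_RETRIES`)

Architecture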
Test plan
- `uvicorn app.app:app` starts without errors
- `GET /health` → `{"status": "ok"}`
- `GET /tiktok/video?url=<tiktok_url>` → video/slideshow response
- `GET /tiktok/music?video_id=<id>` → music response
- `GET /docs` → Swagger UI shows `/tiktok/video`, `/tiktok/music`, `/health`
- `TIKTOK_URL_RESOLVE_MAX_RETRIES` and `TIKTOK_VIDEO_INFO_MAX_RETRIES` respected
- Old unprefixed routes (`/video`, `/music`) no longer exist (404)