@iibabyy (Contributor) commented Jan 12, 2026

1. Summary of Changes

This PR introduces support for TikTok scraping (including handling of shortened URLs). It also includes a significant refactor of the core error handling system to provide better user feedback and cleaner logs, alongside architectural improvements to the Telegram bot structure and to task management using tokio::task::JoinSet.

2. Details of Changes

New Features

  • TikTok Support: Added src/tiktok/ module with TikTokScraper to handle tiktok.com, vm.tiktok.com, and vt.tiktok.com links. Implemented logic to resolve redirections and extract video IDs.
  • New Command: Added /tiktok (alias /tk) command to the Telegram bot.
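
As a rough illustration of the link handling described above, the sketch below classifies a link by host and extracts either the video ID or the short code. It is not the code in src/tiktok/: `TikTokLink`, `classify`, and the assumed `/@user/video/<digits>` path shape are hypothetical, and the host and path are passed in as already-split strings so the example needs only the standard library.

```rust
/// Kinds of TikTok links named in this PR (type and variant names are illustrative).
#[derive(Debug, PartialEq)]
enum TikTokLink {
    /// Canonical link that already carries the numeric video ID.
    Full { video_id: String },
    /// vm.tiktok.com / vt.tiktok.com link that must be resolved through a redirect first.
    Short { code: String },
}

/// Classify a link by host and pull out the video ID or the short code.
/// Returns `None` for unknown hosts or unexpected path shapes.
fn classify(host: &str, path: &str) -> Option<TikTokLink> {
    // Drop the empty segments produced by leading, trailing, or doubled slashes.
    let segments: Vec<&str> = path.split('/').filter(|s| !s.is_empty()).collect();
    match host {
        "vm.tiktok.com" | "vt.tiktok.com" => match segments.as_slice() {
            [code] if code.chars().all(|c| c.is_ascii_alphanumeric()) => {
                Some(TikTokLink::Short { code: (*code).to_owned() })
            }
            _ => None,
        },
        "tiktok.com" | "www.tiktok.com" => match segments.as_slice() {
            // Assumed canonical shape: /@user/video/<digits>
            [user, "video", id]
                if user.starts_with('@') && id.chars().all(|c| c.is_ascii_digit()) =>
            {
                Some(TikTokLink::Full { video_id: (*id).to_owned() })
            }
            _ => None,
        },
        _ => None,
    }
}
```

A short link classified this way would still need an HTTP request to follow the redirect before the video ID is known; that step is left out here.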

Architecture & Refactor

  • Error Handling:
    • Renamed Error to BotError in src/core/error.rs.
    • Expanded BotError variants (CommandNotFound, NoMediaFound, etc.) to provide distinct, user-friendly error messages.
    • Downgraded logging severity from error! to warn! for expected failures (e.g., invalid user input).
  • Traits: Renamed MediaScraper::scrape to get_medias to better reflect its purpose of retrieving metadata vs downloading.
  • Telegram Bot:
    • Converted TelegramBot to a static dispatcher (enum) instead of an instance struct.
    • Refactored request handling to separate command_handler and default_handler.
    • Improved response logic to report specific errors back to the user.
  • Twitter Scraper: Updated TwitterScraper to align with the new MediaScraper trait and error handling.
  • Data Models: Simplified MediaMetadata by removing the unused id field.
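
To make the error handling changes concrete, here is a minimal sketch of an error enum with one variant per user-visible failure and a severity hint for logging. `CommandNotFound`, `NoMediaFound`, `Custom`, and `Unknown` are names used in this PR; the `InvalidUrl` variant, the message texts, and the `Severity` type with its mapping are assumptions made for illustration, not the contents of src/core/error.rs.

```rust
use std::fmt;

/// Illustrative error type: each variant maps to one user-facing message.
#[derive(Debug)]
enum BotError {
    CommandNotFound,
    NoMediaFound,
    /// Assumed variant carrying the reason the URL was rejected.
    InvalidUrl(String),
    Custom(String),
    Unknown,
}

impl fmt::Display for BotError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            BotError::CommandNotFound => write!(f, "Unknown command"),
            BotError::NoMediaFound => write!(f, "No media found"),
            BotError::InvalidUrl(reason) => write!(f, "Invalid URL: {reason}"),
            BotError::Custom(msg) => write!(f, "{msg}"),
            BotError::Unknown => write!(f, "Something went wrong"),
        }
    }
}

impl std::error::Error for BotError {}

/// Hypothetical log level hint: expected failures warn, the rest are errors.
#[derive(Debug, PartialEq)]
enum Severity {
    Warn,
    Error,
}

impl BotError {
    /// Failures caused by user input are expected and only deserve a warning.
    fn severity(&self) -> Severity {
        match self {
            BotError::Unknown => Severity::Error,
            _ => Severity::Warn,
        }
    }
}
```

Keeping the message in `Display` means the same value can be sent to the user and written to the log without a second lookup table.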

Chores & Cleanup

  • Main: Refactored src/main.rs to use tokio::task::JoinSet for managing bot tasks.
  • Tracing: Extracted tracing subscriber initialization to a helper function.
  • Config: Moved constants to config submodules in tiktok and twitter.

3. Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Refactor (code change that neither fixes a bug nor adds a feature)
  • Documentation (updates to documentation only)
  • Chore (build process, dependency changes)

4. Verification & Testing

Manual Verification:

  1. TikTok: Tested with vm.tiktok.com and www.tiktok.com links; verified video ID extraction and metadata retrieval.
  2. Twitter: Verified existing /twitter command still works with the refactored scraper.
  3. Error Handling: Verified that invalid URLs return an "Invalid URL" message to the user instead of a generic error, and logs show WARN instead of ERROR.
  4. Startup: Confirmed main starts the bot successfully with the new JoinSet logic.

ibaby and others added 20 commits January 12, 2026 21:32
- Replace old modules (handler.rs, insta.rs, main.rs, tiktok.rs, x.rs)
- Add new discord integration module
- Add scraper functionality with twitter support
- Update dependencies in Cargo.toml and Cargo.lock
- Add utility macros for error handling

Add tracing subscriber and implement parallel media download/sending for Discord bot

Refactor architecture with dependency injection pattern
- Implement service container pattern for better dependency management
- Create dedicated service modules for scraping and messaging functionality
- Use enum-based polymorphism for service implementations
- Improve separation of concerns and testability
- Maintain backward compatibility with existing functionality

Enhance logging with structured tracing
- Add more detailed logging with different log levels (info, debug, warn, error)
- Include contextual information like user_id, chat_id, and command in spans
- Add progress tracking for media download and send operations
- Improve error reporting with detailed error messages
- Add success/failure counts for better monitoring
- Use structured logging fields for better log analysis

Refactor error handling with custom BotError enum
- Create custom BotError enum with specific error variants
- Replace generic anyhow::Error with BotError in core services
- Add specific error variants for different error conditions
- Implement user-friendly error messages for each error type
- Remove old macros and error modules, consolidate error handling
- Use structured error handling throughout the application

fixed twitter photo json parsing

Refactor scraper service with improved error handling and type safety

save

moved error builder macros declaration to error.rs

tweaked error handling for scraper response deserializing (impl TryFrom<&serde_json::Value> for TwitterMediaMetadata)

sorted dependencies with cargo-sort

Refactor error handling and cleanup logging

- Added BotError::Custom and custom! macro for flexible error messages.
- Simplified error! macros to use ::tracing directly.
- Refactored Twitter scraper to handle API error responses and internalize URL parsing.
- Cleaned up Telegram bot logging and command handling.
- Simplified VSCode launch configurations.
- Improved error reporting in message service.

Refactor project structure: reorganize modules and remove services
- Migrate `bots/telegram` and `senders/telegram` to `src/telegram/`
- Move `scrapers/twitter` to `src/twitter/`
- Consolidate global error handling into `src/telegram/error.rs`
- Update `lib.rs` exports and internal imports to match new paths
- Introduce `main.rs` and `prelude.rs` for cleaner entry points
- Remove `src/telegram/prelude.rs` and replace with explicit exports in `src/telegram/mod.rs`.
- Decouple `TelegramBot` from `BotTrait` and move `run` logic directly into the struct.
- Optimize `TelegramSender` to reuse `Bot` instance via `Arc` for concurrent dispatch.
- Feature-gate `telegram` and `twitter` modules in `src/lib.rs`.
- Update `TwitterScraper` imports to align with new module paths.
- Transition main entry point to tokio::task::JoinSet to enable multiple bots to run at once (for future bots).
- Introduce feature-gated initialization for the Telegram bot in main.rs.
- Reorganize twitter module visibility and move re-exports to twitter/mod.rs.
- Use fully qualified paths for tracing components and simplify internal imports.
- Update `TelegramBot::run` signature to return `()`.
- Simplify `JoinSet` task management in `main.rs`.
- Add `#[allow(dead_code)]` to currently unused core types.

These changes reduce boilerplate in the main startup sequence and clean up
compiler warnings for future-use types.
- Moved `Error` enum and `BotResult` to `src/core/error.rs`
- Moved and genericized error macros to `src/core/error.rs`
- Deleted `src/telegram/error.rs`
- Updated `TelegramSender` and `TwitterScraper` to use core errors
- Renamed `Error::Other` to `Error::Unknown`
- Restrict visibility of internal types, traits, and errors to `pub(crate)` in `core`, `telegram`, and `twitter` modules.
- Encapsulate `TelegramBot` fields and `Command` enum.
- Remove unnecessary public re-exports from `lib.rs`.
- Reorder file contents (macros, types, impls) and sort imports to adhere to project style standards.
Updates visibility modifiers from `pub(crate)` to `pub` for core
traits, errors, and types to ensure they are accessible to the
binary target. Simplifies internal module imports by leveraging
unified usage of the `core` module and updates `main.rs` to use
the library prelude.

Changes:
- Make `Error`, `BotResult`, `MediaMetadata`, and traits public.
- Make `TwitterScraper` public.
- Replace verbose imports in submodules with `crate::core::*`.
- Update `main.rs` to use `media_bot::prelude::*`.
Refactor `TelegramSender` and `TwitterScraper` to move their core logic into inherent `impl` blocks. The `MediaSender` and `MediaScraper` trait implementations now delegate to these inherent methods.

- Improves ergonomics by removing the need to import traits for usage.
- Hides trait implementations from documentation using `#[doc(hidden)]`.
Renames the central `Error` enum to `BotError` to reduce ambiguity with
the standard library `std::error::Error` and improve type clarity.
Changes include:
- Renaming the enum definition and `BotResult` alias in `src/core/error.rs`.
- Updating error generation macros (`custom!`, `helper_error_macro!`).
- Updating `Display` and `Error` trait implementations.
- Propagating the name change to `MediaSender` and `MediaScraper` trait
  implementations in `src/telegram/sender.rs` and `src/twitter/scraper.rs`.
- Updating public re-exports in `src/lib.rs`.
Includes minor code formatting adjustments in `src/telegram/sender.rs`.
…tures

- introduce `send_msg!` macro to simplify message sending and error logging
- add `FeatureNotEnabled` error to handle disabled feature flags gracefully
- add `CommandNotFound` error for unknown commands
- refactor `answer` function to use new macro and error types
…nitions

Refactor the `MediaSender` trait to replace `Result` return types with a generic `Output`, allowing flexible result handling.
Key changes:
- Update `TelegramSender` to process media downloads and sends concurrently using `JoinSet`.
- Change `send_medias` to return a collection of individual message results instead of a single result.
- Simplify `TelegramBot` command handling and adapt to new sender interface.
- Remove unused `BotError::FeatureNotEnabled` and duplicate error macros.

Remove telegram feature on import from main.rs

Add conditional compilation attributes for feature-gated modules

replaced 'X_LINK' env variable with const value

updated Readme.md

Changed twitter module visibility to public (pub mod)

fixed features dependencies

Modified MediaScraper trait

moved X_LINK variable to twitter::config

Changed TwitterScraper struct to an empty enum

tweak scraper link variable management
- Implement `TikTokScraper` in `src/tiktok` with URL redirection support.
- Add `/tiktok` (alias `/tk`) command to the Telegram bot, guarded by feature flag.
- Refactor `TwitterScraper` to explicitly implement `MediaScraper` trait.
- Remove unused `id` field from `MediaMetadata` and add constructor.
- Extract tracing initialization into helper and fix `.env` loading order.
- Reorganize `src/core/error.rs` types and macros.
- Instrument `TelegramSender` tasks with tracing spans for better observability.
- Rename  to  for better semantic clarity.
- Update , , and  implementations to match new trait definitions.
- Extract  method in  to decouple redirection logic.
- Integrate comprehensive  instrumentation across scrapers and senders to improve runtime observability.
- Implement strict validation for TikTok URLs to reject malformed inputs (domains, IDs) early.
- Add error notification logic in Telegram sender to inform users when media delivery fails.
- Improve observability by promoting logs to 'warn' level for partial scraping or sending failures.
- Fix typo in bot response logging.
- Convert `TelegramBot` to a stateless execution model using static run methods.
- Add `default_handler` to gracefully catch and report unknown commands.
- Downgrade operational log levels from `error!` to `warn!` and remove verbose debug logs.
- Offload error message sending to async tasks in `TelegramSender` to avoid blocking.
- Optimize `get_tiktok_url` signature and fix reference passing in scraper.

Copilot AI left a comment

Pull request overview

This PR adds TikTok scraping support and refactors the error handling system to provide better user feedback and cleaner logging. The changes include a new TikTok scraper module, improved error messages, architectural improvements to the Telegram bot, and renaming of core traits for clarity.

Changes:

  • Added TikTok scraper with support for standard and shortened URLs (vm.tiktok.com, vt.tiktok.com)
  • Refactored error handling to use distinct error variants with user-friendly messages and downgraded log levels from error to warn for expected failures
  • Renamed MediaScraper::scrape to get_medias and refactored Telegram bot to use static dispatcher pattern
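
The static dispatcher pattern mentioned in the last bullet can be sketched as follows. This is a simplified, synchronous stand-in under stated assumptions: the `Command` enum, `parse`, `dispatch`, and the reply strings are hypothetical, and only the handler names `command_handler` and `default_handler`, the `/twitter` command, and the `/tiktok` (alias `/tk`) command come from the PR.

```rust
/// Commands the bot understands; `/tk` is an alias of `/tiktok`.
#[derive(Debug, PartialEq)]
enum Command {
    Twitter(String),
    TikTok(String),
}

/// Uninhabited enum used purely as a namespace: no instance can ever be created,
/// so every function on it is necessarily static.
enum TelegramBot {}

impl TelegramBot {
    /// Parse "/cmd arg" into a `Command`; `None` for anything unrecognized.
    fn parse(text: &str) -> Option<Command> {
        let mut parts = text.trim().splitn(2, char::is_whitespace);
        let name = parts.next()?;
        let arg = parts.next().unwrap_or("").trim().to_owned();
        match name {
            "/twitter" => Some(Command::Twitter(arg)),
            "/tiktok" | "/tk" => Some(Command::TikTok(arg)),
            _ => None,
        }
    }

    /// Handle a recognized command (placeholder reply instead of real scraping).
    fn command_handler(cmd: Command) -> String {
        match cmd {
            Command::Twitter(url) => format!("Fetching Twitter media for {url}"),
            Command::TikTok(url) => format!("Fetching TikTok media for {url}"),
        }
    }

    /// Catch everything else and report it back to the user.
    fn default_handler(text: &str) -> String {
        format!("Unknown command: {text}")
    }

    /// Route a message to the matching handler.
    fn dispatch(text: &str) -> String {
        match Self::parse(text) {
            Some(cmd) => Self::command_handler(cmd),
            None => Self::default_handler(text),
        }
    }
}
```

Because `TelegramBot` has no variants, callers write `TelegramBot::dispatch(text)` directly and there is no shared instance state to construct or pass around.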

Reviewed changes

Copilot reviewed 14 out of 15 changed files in this pull request and generated 18 comments.

| File | Description |
| --- | --- |
| src/tiktok/scraper.rs | New TikTok scraper implementation with URL validation and redirection handling |
| src/tiktok/mod.rs | Module exports and configuration constants for TikTok scraper |
| src/twitter/scraper.rs | Refactored to align with new trait signature and error handling patterns |
| src/twitter/mod.rs | Added config submodule for scraper link constant |
| src/telegram/bot.rs | Converted to static dispatcher with separate command and default handlers |
| src/telegram/sender.rs | Updated error handling and added tracing spans |
| src/core/error.rs | Expanded error variants and changed logging from error to warn |
| src/core/traits.rs | Renamed scrape method to get_medias |
| src/core/types.rs | Removed unused id field and added constructor method |
| src/core/mod.rs | Added module-level unused attribute |
| src/main.rs | Refactored to use JoinSet and extracted tracing initialization |
| src/lib.rs | Added tiktok module export |
| Cargo.toml | Added tiktok feature |
| Cargo.lock | Updated url dependency |
| README.md | Updated documentation with TikTok command |

pub const TIKTOK_SCRAPER_LINK_END: &str = ".mp4";
#[expect(unused)]
pub const BROWSER_UA: &str = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36";
pub const MINIMAL_USER_AGENT: &str = "curl/8.7.1"; // Use the exact version from your output

Copilot AI Jan 13, 2026

The comment mentions this is a minimal user agent from 'your output', but the version 'curl/8.7.1' is hardcoded. This comment should either be removed or clarified, as it suggests this was copied from specific test output rather than being a deliberate choice. If the exact curl version matters for the TikTok API, this should be documented.

Suggested change
- pub const MINIMAL_USER_AGENT: &str = "curl/8.7.1"; // Use the exact version from your output
+ pub const MINIMAL_USER_AGENT: &str = "curl/8.7.1"; // Minimal user agent string used for TikTok requests; update if TikTok starts requiring a different value

Copilot uses AI. Check for mistakes.
src/main.rs Outdated
.with_target(true) // Include module target in logs
.init();
// enable tracing logs
tracing_subcriber().init();

Copilot AI Jan 13, 2026

Typo in function name: 'tracing_subcriber' should be 'tracing_subscriber'.

Comment on lines 188 to 190
info!("TikTok media Metadata retrieving results: 1 total, 1 successfull, 0 failed");
} else {
warn!("TikTok media Metadata retrieving results: 1 total, 0 successfull, 1 failed");

Copilot AI Jan 13, 2026

Typo in log message: 'successfull' should be 'successful'.

Suggested change
- info!("TikTok media Metadata retrieving results: 1 total, 1 successfull, 0 failed");
- } else {
- warn!("TikTok media Metadata retrieving results: 1 total, 0 successfull, 1 failed");
+ info!("TikTok media Metadata retrieving results: 1 total, 1 successful, 0 failed");
+ } else {
+ warn!("TikTok media Metadata retrieving results: 1 total, 0 successful, 1 failed");

Comment on lines 154 to 161
let path_segments = url.path().split('/').collect::<Vec<_>>();
if path_segments.len() != 1 {
return Err(invalid_url!(
"url path should look like 'https://vm.tiktok.com/ABC123'"
));
}

if path_segments[0].len() < 6 || !path_segments[0].chars().all(|c| c.is_alphanumeric()) {

Copilot AI Jan 13, 2026

The validation logic for the shortened URL path is flawed. When the path is split by '/', the leading '/' makes the first element an empty string, so the check path_segments[0].len() < 6 always inspects an empty string. The actual path segment is at index 1; either filter out empty segments or use a proper path-parsing method.

Suggested change
- let path_segments = url.path().split('/').collect::<Vec<_>>();
- if path_segments.len() != 1 {
-     return Err(invalid_url!(
-         "url path should look like 'https://vm.tiktok.com/ABC123'"
-     ));
- }
- if path_segments[0].len() < 6 || !path_segments[0].chars().all(|c| c.is_alphanumeric()) {
+ let mut segments = url
+     .path_segments()
+     .ok_or_else(|| invalid_url!(
+         "url path should look like 'https://vm.tiktok.com/ABC123'"
+     ))?;
+ let segment = segments
+     .next()
+     .ok_or_else(|| invalid_url!(
+         "url path should look like 'https://vm.tiktok.com/ABC123'"
+     ))?;
+ // Ensure there is exactly one non-empty path segment
+ if segments.next().is_some()
+     || segment.len() < 6
+     || !segment.chars().all(|c| c.is_alphanumeric())
+ {
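
The splitting behaviour behind this comment can be checked with the standard library alone. The sketch below is a hypothetical `short_code` helper, not the scraper's code: it works on a plain path string instead of a parsed `Url`, keeps the minimum length of 6 from the original check, and narrows the character test to ASCII alphanumerics as an assumption about what short codes look like.

```rust
/// Return the single short-link code from a path such as "/ABC123".
/// Yields `None` when the path has zero or several segments, or when the
/// segment is too short or contains non-alphanumeric characters.
fn short_code(path: &str) -> Option<&str> {
    // "/ABC123".split('/') yields ["", "ABC123"]: the leading slash produces
    // an empty first element, so index 0 is never the code itself.
    let mut segments = path.split('/').filter(|s| !s.is_empty());
    let code = segments.next()?;
    // Anything after the first non-empty segment means the shape is wrong.
    if segments.next().is_some() {
        return None;
    }
    let valid = code.len() >= 6 && code.chars().all(|c| c.is_ascii_alphanumeric());
    valid.then_some(code)
}
```

Filtering out empty segments also accepts a trailing slash ("/ABC123/"), which is a deliberate leniency in this sketch rather than something the review asked for.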

src/core/mod.rs Outdated
Comment on lines 1 to 2
#![allow(unused)]


Copilot AI Jan 13, 2026

This module-level #![allow(unused)] attribute is too broad and suppresses all unused code warnings in the entire core module. This can hide legitimate issues such as unused functions, variables, or imports that should be removed. Consider using more specific attributes like #[allow(dead_code)] on individual items that are intentionally unused, or removing genuinely unused code.

Suggested change
- #![allow(unused)]

Comment on lines 76 to 77
.map(ToString::to_string)
.unwrap_or(BotError::Unknown.to_string());

Copilot AI Jan 13, 2026

Using ToString::to_string() on a JSON Value will produce a JSON representation (e.g., "\"error message\"" with quotes), not a plain string. Use .as_str().unwrap_or("unknown error") instead to extract the actual string value from the JSON.

Suggested change
- .map(ToString::to_string)
- .unwrap_or(BotError::Unknown.to_string());
+ .and_then(|v| v.as_str())
+ .map(str::to_owned)
+ .unwrap_or_else(|| BotError::Unknown.to_string());

Comment on lines 44 to 46
if let Err(err) = result {
warn!("Failed to send error message to chat: {err}");
return Err(err);

Copilot AI Jan 13, 2026

The error handling logic is redundant. Lines 44-47 check whether the result is an error, log it, and return it; line 49 then returns the result again. Because of the explicit return on line 46, line 49 is only ever reached in the Ok case, so both paths end up returning the same value. Consider logging through a borrowed reference and returning the result once.

Suggested change
- if let Err(err) = result {
- warn!("Failed to send error message to chat: {err}");
- return Err(err);
+ if let Err(err) = &result {
+ warn!("Failed to send error message to chat: {err}");
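
The shape of this suggestion, logging through a borrow and returning the result once, can be sketched in isolation. `log_send_failure` is a hypothetical helper, and `eprintln!` stands in for the `warn!` macro so the example needs no external crates.

```rust
use std::fmt::Display;

/// Log a failed send, then hand the result back unchanged.
fn log_send_failure<T, E: Display>(result: Result<T, E>) -> Result<T, E> {
    // Matching on `&result` only borrows the error, so `result` is still
    // intact and can be returned once for both the Ok and Err paths.
    if let Err(err) = &result {
        eprintln!("Failed to send error message to chat: {err}");
    }
    result
}
```

On recent toolchains `Result::inspect_err` expresses the same idea as a method chain; the explicit `if let` is kept here because it mirrors the suggested change.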

));
}

let path_segments = url.path().split('/').collect::<Vec<_>>();

Copilot AI Jan 13, 2026

The path validation logic is incorrect. For an https URL the path starts with '/', so splitting on '/' yields a leading empty element: '/video/123' becomes ['', 'video', '123'] and '/ABC123' becomes ['', 'ABC123']. A valid shortened URL such as 'https://vm.tiktok.com/ABC123' therefore produces two elements, and the check path_segments.len() != 1 rejects every valid shortened URL.

Suggested change
- let path_segments = url.path().split('/').collect::<Vec<_>>();
+ let path = url.path().trim_matches('/');
+ if path.is_empty() {
+     return Err(invalid_url!(
+         "url path should look like 'https://vm.tiktok.com/ABC123'"
+     ));
+ }
+ let path_segments = path.split('/').collect::<Vec<_>>();

- fix(typo): correct typos in `tracing_subscriber`, `successful`, `occurred`
- refactor(tiktok): improve URL validation and update User Agent comment
- refactor(core): narrow `allow(unused)` to `allow(unused_imports)`
- refactor(telegram): improve error classification and remove redundant error handling
- refactor(telegram): simplify username extraction in tracing
- fix(twitter): correct JSON error message extraction
- style: normalize capitalization in logs