X.com Tweet Extraction Project

Overview

The X.com Tweet Extraction Project extracts and stores tweets related to specific topics (e.g., technology, AI, innovation, and science) using asynchronous requests. It collects tweets that match a search query and, when X.com rate-limits the session, waits for the limit to reset before continuing. Extracted tweet data is saved in a CSV file for further analysis.

Features

Search Query: Extracts tweets related to "technology", "innovation", "AI", and "science", excluding links and replies.
Asynchronous Processing: Utilizes Python's asyncio for non-blocking operations, enabling efficient tweet extraction.
Rate-Limiting Handling: Gracefully handles rate-limiting by waiting for the reset time when the API limit is reached.
Text Normalization: Sanitizes tweet text by removing or replacing problematic non-ASCII characters.
CSV Storage: Saves tweet data such as tweet count, username, text, creation time, retweet count, and like count into a CSV file.
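
As an illustration of the Search Query and Text Normalization features above, here is a minimal sketch. It is not taken from twitter_extract.py: the function names build_query and normalize_text are hypothetical, the query assumes X's standard search operators (OR, -filter:links, -filter:replies), and collapsing whitespace is an added assumption to keep each tweet on one CSV line.

```python
import unicodedata


def build_query(keywords):
    """Join keywords with OR and exclude links and replies.

    Assumes X's standard search operators; the real script's query
    string may differ.
    """
    terms = " OR ".join(keywords)
    return f"({terms}) -filter:links -filter:replies"


def normalize_text(text):
    """Return an ASCII-only, single-line version of `text`."""
    # NFKD splits accented letters into a base letter plus a combining
    # mark, so "é" becomes "e" followed by U+0301.
    decomposed = unicodedata.normalize("NFKD", text)
    # Encoding with errors="ignore" drops whatever still has no ASCII
    # form (combining marks, emoji, CJK characters, curly quotes).
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Collapse runs of whitespace, including newlines, into one space.
    return " ".join(ascii_only.split())


if __name__ == "__main__":
    print(build_query(["technology", "innovation", "AI", "science"]))
    print(normalize_text("Café déjà vu\nnext line"))
```

One trade-off of this approach: characters with no ASCII decomposition are dropped rather than replaced, so a curly apostrophe disappears instead of becoming a straight one.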

Tools and Technologies

Python: Programming language used for the project.
Twikit: A Python client for interacting with X.com.
asyncio: Used for handling asynchronous tasks and non-blocking I/O operations.
configparser: Standard-library module whose ConfigParser class reads the configuration file containing login credentials.
csv: Standard-library module used to save the extracted tweet data in a structured format.
unicodedata: For normalizing and handling non-ASCII characters in tweet text.
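
The two standard-library modules used for file handling can be combined roughly as follows. This is a sketch under assumptions, not the project's actual code: the section name "X", the helper names load_credentials and write_tweets, and the CSV header names are all hypothetical; only the field list itself comes from this README.

```python
import csv
from configparser import ConfigParser

# Column order follows the fields listed in this README; the exact
# header names are assumptions.
FIELDS = ["tweet_count", "username", "text", "created_at", "retweets", "likes"]


def load_credentials(path, section="X"):
    """Read username, email, and password from an INI file."""
    # interpolation=None keeps a literal "%" in a password from being
    # treated as interpolation syntax.
    config = ConfigParser(interpolation=None)
    # read() returns the list of files it parsed; empty means not found.
    if not config.read(path, encoding="utf-8"):
        raise FileNotFoundError(path)
    return {key: config[section][key] for key in ("username", "email", "password")}


def write_tweets(path, rows):
    """Write a header row followed by one CSV row per tweet dict."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
```

Under these assumptions, config.ini would hold a single [X] section with username, email, and password keys.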

How It Works

Credentials Loading: The project loads login credentials (e.g., username, email, password) from a configuration file (config.ini) to authenticate with X.com.
Client Initialization: Initializes the twikit.Client and loads necessary cookies from a file (cookies.json) for authentication.
Tweet Extraction: Tweets are fetched asynchronously based on the specified search query. The program waits for the rate limit to reset when the limit is reached.
Text Normalization: Non-ASCII characters in tweet text are removed or replaced to ensure compatibility and clean data.
CSV Output: Extracted tweet details, including tweet count, username, tweet text, creation time, retweets, and likes, are written to a CSV file (tweets.csv).
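
The wait-and-retry behavior in the Tweet Extraction step can be sketched as below. To stay runnable without network access, the example does not call Twikit: RateLimited is a hypothetical stand-in for whatever rate-limit error the client raises, reset_at is an assumed Unix timestamp for the reset time, and fetch_page is any coroutine function that returns one page of tweets.

```python
import asyncio
import time


class RateLimited(Exception):
    """Hypothetical stand-in for the client's rate-limit error."""

    def __init__(self, reset_at):
        super().__init__("rate limit reached")
        self.reset_at = reset_at  # Unix timestamp when the limit resets


async def fetch_with_wait(fetch_page, max_attempts=5, sleep=asyncio.sleep):
    """Await fetch_page(); on RateLimited, sleep until the reset and retry."""
    for _ in range(max_attempts):
        try:
            return await fetch_page()
        except RateLimited as err:
            # Clamp to zero in case the reset time has already passed.
            wait_seconds = max(0.0, err.reset_at - time.time())
            # asyncio.sleep yields to the event loop, so other tasks
            # keep running while this one waits.
            await sleep(wait_seconds)
    raise RuntimeError(f"still rate limited after {max_attempts} attempts")


async def demo():
    attempts = 0

    async def fake_fetch():
        # Fails once with a reset time one second away, then succeeds.
        nonlocal attempts
        attempts += 1
        if attempts == 1:
            raise RateLimited(time.time() + 1)
        return ["tweet 1", "tweet 2"]

    return await fetch_with_wait(fake_fetch)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The sleep parameter exists only so the waiting can be replaced in tests; capping the number of attempts is a design choice that keeps the script from looping forever if the limit never clears.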
