A simple, powerful Python tool to download Reddit posts, comments, images, and videos using Reddit's public JSON API. No login or API keys required!
- 🔥 Download from subreddits - Get posts sorted by hot, new, top, rising, or controversial
- 👤 Download user posts - Archive all posts from any Reddit user
- 💬 Full comment extraction - Save comment threads with nested replies (up to 5 levels)
- 🖼️ All media types - Images (JPG, PNG), videos (MP4), GIFs, and more
- 🔗 Link preservation - Save link posts as HTML redirects
- ⚡ Fast and efficient - Built-in rate limiting to avoid IP bans
- 🎯 No authentication - Uses Reddit's public JSON API
- 📝 Markdown format - Posts and comments saved in readable Markdown
- 🔧 CLI and Python API - Use from the command line or import as a library
1. Clone this repository

```bash
git clone https://github.com/0anxt/reddit-json-scraper.git
cd reddit-json-scraper
```

2. Create a virtual environment (recommended)

```bash
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
```

3. Install dependencies

```bash
pip install -r requirements.txt
```

That's it! You're ready to go. 🎉
💡 Having installation issues? See the detailed INSTALL.md guide for:
- Virtual environment setup
- Platform-specific instructions (Linux/Mac/Windows)
- Troubleshooting common errors
- Docker installation
Just run the interactive script and answer the questions:

```bash
python interactive.py
```

The script will guide you through:
- What to scrape (subreddit, user, or single post)
- Sorting options
- Number of posts
- Whether to include comments
- Confirmation before downloading
Perfect for first-time users! No need to remember command-line arguments.
Download top posts from a subreddit:
```bash
python reddit_scraper.py --subreddit python --sort top --time week --limit 25
```

Download posts from a user:

```bash
python reddit_scraper.py --user username --limit 10
```

Download a specific post:

```bash
python reddit_scraper.py --url "https://reddit.com/r/pics/comments/abc123/"
```

| Option | Short | Description | Example |
|---|---|---|---|
| `--subreddit` | `-s` | Subreddit to download (without r/) | `--subreddit python` |
| `--user` | `-u` | User to download (without u/) | `--user spez` |
| `--url` | | Single post URL to download | `--url "https://..."` |
| `--sort` | | Sort: hot, new, top, rising, controversial | `--sort top` |
| `--time` | | Time: hour, day, week, month, year, all | `--time week` |
| `--limit` | `-l` | Max posts to download | `--limit 50` |
| `--output` | `-o` | Output directory | `--output downloads` |
| `--no-comments` | | Skip downloading comments (faster) | `--no-comments` |
1. Archive a subreddit's top posts this month

```bash
python reddit_scraper.py --subreddit AskReddit --sort top --time month --limit 100
```

2. Download new posts (good for monitoring)

```bash
python reddit_scraper.py --subreddit news --sort new --limit 25
```

3. Get images from r/pics (skip comments for speed)

```bash
python reddit_scraper.py --subreddit pics --sort hot --limit 50 --no-comments
```

4. Download a specific post with all comments

```bash
python reddit_scraper.py --url "https://reddit.com/r/nextfuckinglevel/comments/1o8ze9o/"
```

5. Custom output directory

```bash
python reddit_scraper.py --subreddit python --output my_python_archive --limit 30
```

You can also use the scraper in your Python scripts:
```python
from reddit_scraper import RedditJSONScraper

# Create scraper instance
scraper = RedditJSONScraper(output_dir="downloads")

# Download from subreddit
scraper.download_subreddit(
    subreddit="python",
    sort="top",
    time_filter="week",
    limit=50,
    include_comments=True,
)

# Download from user
scraper.download_user(username="spez", limit=25)

# Download single post
scraper.download_from_url("https://reddit.com/r/pics/comments/abc123/")

# Fetch posts without downloading (for analysis)
posts = scraper.fetch_subreddit("python", sort="hot", limit=100)
for post in posts:
    print(f"{post['title']} - {post['score']} upvotes")
```

See `examples.py` for more usage examples!
Downloaded content is organized like this:
```
reddit_downloads/
├── r_python/
│   ├── 1abc123_Amazing_Python_Tutorial.md     # Text post with comments
│   ├── 1def456_Cool_Screenshot.png            # Image
│   ├── 1ghi789_Funny_Video.mp4                # Video
│   └── 1jkl012_Interesting_Article_link.html  # Link post
├── r_pics/
│   └── ...
└── u_username/
    └── ...
```
Files are named: `{post_id}_{sanitized_title}.{extension}`
- Post ID - Reddit's unique ID (prevents duplicates)
- Title - Post title with special characters removed
- Extension - Based on content type (.md, .jpg, .mp4, .html)
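The sanitization step can be done with a couple of regex passes. Below is a minimal sketch of the idea; the function name `make_filename` and the 50-character title cap are assumptions for illustration, not the scraper's actual code:

```python
import re

def make_filename(post_id: str, title: str, extension: str) -> str:
    """Illustrative sketch: build {post_id}_{sanitized_title}.{extension}."""
    # Keep letters, digits, whitespace, and hyphens; drop everything else
    sanitized = re.sub(r"[^\w\s-]", "", title)
    # Collapse whitespace runs to underscores and cap the length (cap assumed)
    sanitized = re.sub(r"\s+", "_", sanitized.strip())[:50]
    return f"{post_id}_{sanitized}.{extension}"

print(make_filename("1abc123", "Amazing Python Tutorial!", "md"))
# -> 1abc123_Amazing_Python_Tutorial.md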
Text Posts (.md) - Markdown files with:
- Post title, author, score, timestamp
- Full post content
- All comments with nested replies
- Formatted for easy reading
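In Reddit's JSON, each comment's `replies` field is itself a nested listing, so the comment tree falls out of a simple recursion. Here is a sketch of such a traversal over the raw JSON; `walk_comments` is an illustrative name, not the scraper's API, and the depth cap mirrors the 5-level limit mentioned in the features:

```python
def walk_comments(listing, depth=0, max_depth=5):
    """Illustrative sketch: print a nested Reddit comment tree.

    `listing` is a Reddit JSON Listing object, i.e. the second element
    of a post's .json response. Not the scraper's actual code.
    """
    if not isinstance(listing, dict) or depth >= max_depth:
        return
    for child in listing["data"]["children"]:
        if child["kind"] != "t1":  # skip "more" stubs; t1 = comment
            continue
        comment = child["data"]
        print("    " * depth + f"{comment['author']}: {comment['body'][:60]}")
        # Empty replies come back as "" rather than a Listing dict
        walk_comments(comment.get("replies"), depth + 1, max_depth)
```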
Images - Downloaded in original format (JPG, PNG, etc.)
Videos - MP4 format, including Reddit-hosted videos
Link Posts (.html) - HTML files that auto-redirect to the original URL
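A redirect file like that needs nothing more than a meta refresh. A minimal sketch of what such a file can contain (`save_link_post` is an illustrative helper, and the exact markup the scraper writes may differ):

```python
def save_link_post(path: str, url: str, title: str) -> None:
    """Illustrative sketch: write an HTML file that redirects to the link target."""
    html = (
        "<!DOCTYPE html>\n"
        f"<html><head><title>{title}</title>\n"
        f'<meta http-equiv="refresh" content="0; url={url}">\n'
        f'</head><body><a href="{url}">{title}</a></body></html>\n'
    )
    with open(path, "w", encoding="utf-8") as f:
        f.write(html)

save_link_post("1jkl012_Interesting_Article_link.html",
               "https://example.com/article", "Interesting Article")
```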
This scraper uses Reddit's public JSON API - no authentication needed!
Simply append `.json` to any Reddit URL:

- `https://reddit.com/r/python/top.json` - Top posts
- `https://reddit.com/r/python/comments/abc123.json` - Post with comments
The scraper:
- Fetches JSON data from Reddit
- Parses posts and comments
- Downloads media files
- Saves everything in organized folders
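A minimal sketch of the fetch-and-parse steps using the `requests` library (the User-Agent string here is an arbitrary example; Reddit tends to throttle or block the default `requests` one):

```python
import requests

# Reddit blocks the default python-requests User-Agent, so send a custom one
headers = {"User-Agent": "reddit-json-scraper (personal archiving script)"}

resp = requests.get(
    "https://www.reddit.com/r/python/top.json",
    params={"t": "week", "limit": 25},
    headers=headers,
    timeout=30,
)
resp.raise_for_status()

# Posts live under data.children; each child's "data" holds the post fields
for child in resp.json()["data"]["children"]:
    post = child["data"]
    print(post["id"], post["score"], post["title"])
```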
Rate Limiting: Automatically waits 2 seconds between requests to avoid IP bans. If rate-limited (429 error), waits 60 seconds and retries.
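Expressed as code, the retry policy described above might look like this sketch (`fetch_json` is an illustrative wrapper, not the scraper's actual function):

```python
import time
import requests

def fetch_json(url: str, headers: dict, max_retries: int = 3) -> dict:
    """Sketch of the rate-limit policy described above: pause 2 seconds
    between requests, and on a 429 wait 60 seconds before retrying."""
    for attempt in range(max_retries):
        time.sleep(2)  # baseline delay between all requests
        resp = requests.get(url, headers=headers, timeout=30)
        if resp.status_code == 429:  # rate limited: back off and retry
            time.sleep(60)
            continue
        resp.raise_for_status()
        return resp.json()
    raise RuntimeError(f"Still rate-limited after {max_retries} attempts: {url}")
```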
- ✅ Any public subreddit
- ✅ Any public user's posts
- ✅ Any public post with comments
- ✅ Images, videos, GIFs
- ✅ Text posts with full formatting
- ✅ Link posts
- ❌ Private subreddits (requires login)
- ❌ Deleted or removed posts
- ❌ User-specific content (saved posts, messages)
- ❌ Content from banned/quarantined subreddits
- 📊 Use reasonable limits (don't download 10,000 posts at once)
- ⏱️ The scraper has built-in rate limiting - don't modify it
- 💾 Large downloads can take significant time and disk space
- 🤝 Be respectful of Reddit's servers
| Feature | This Scraper | easy-reddit-downloader | Gallery-dl |
|---|---|---|---|
| Language | Python | Node.js | Python |
| Auth Required | ❌ No | ❌ No | ❌ No |
| Comments | ✅ Nested | ✅ Yes | ❌ No |
| CLI | ✅ Args | ✅ Interactive | ✅ Args |
| Python API | ✅ Yes | ❌ No | |
| Dependencies | 1 (requests) | Multiple | Multiple |
| Setup Time | < 1 min | ~2 min | ~2 min |
"No items returned" or "403 Forbidden"
- Subreddit might be private or doesn't exist
- Try a different subreddit or check the spelling
- Reddit might be temporarily blocking requests
"Rate limited (429)"
- The scraper will automatically wait and retry
- If it persists, wait a few minutes before running again
"No module named 'requests'"
- Install the dependency: `pip install requests`
Downloads are slow
- This is normal - the scraper waits 2 seconds between requests
- Use `--no-comments` to skip comments for faster downloads
- Reduce `--limit` for smaller batches
Some posts didn't download
- Posts may be deleted, removed, or failed to load
- Check the output for error messages
- This is expected - not all posts will download successfully
Check out examples.py for 8 different usage examples:
- Basic subreddit download
- Multiple subreddits
- User posts
- Single post with comments
- Fetch and analyze (no download)
- Filter by score (see the sketch below)
- Media-only downloads
- Custom data processing
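For instance, the score filter boils down to one comprehension over `fetch_subreddit()` results (the 500-point cutoff here is arbitrary):

```python
from reddit_scraper import RedditJSONScraper

scraper = RedditJSONScraper(output_dir="downloads")
posts = scraper.fetch_subreddit("python", sort="top", limit=100)

# Keep only well-received posts; the 500-point cutoff is arbitrary
popular = [p for p in posts if p["score"] >= 500]
for post in popular:
    print(f"{post['score']:>6}  {post['title']}")
```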
Run examples:
```bash
python examples.py
```

Contributions are welcome! Feel free to:
- 🐛 Report bugs
- 💡 Suggest features
- 🔧 Submit pull requests
- 📖 Improve documentation
MIT License - See LICENSE file for details.
This tool is for educational and personal use only. Please:
- Respect Reddit's Terms of Service
- Don't use for commercial purposes without permission
- Don't abuse Reddit's servers with excessive requests
- Be a good internet citizen! 🌍
- Inspired by easy-reddit-downloader
- Uses Reddit's public JSON API
- Built with Python and the `requests` library
- 📖 Check the Quick Start Guide
- 🐛 Open an issue
- 💡 View examples
Made with ❤️ for the Reddit community
If you find this useful, give it a ⭐ on GitHub!