Headline Scraper CLI

A command-line tool that scrapes top headlines from Hacker News (news.ycombinator.com), allowing users to filter, sort, save/load feeds, and view details. Built in Python as part of a self-taught career transition project.

Features

Fetch live headlines with details like title, link, score, author, and date.
Filter by keywords. Multiple words are combined with OR logic, so a headline is kept if its title matches any of them.
Sort by date, score, or title (ascending/descending).
Save filtered/sorted feeds to JSON files and load them back.
Ethical scraping: sends a User-Agent header and respects the site's structure.
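
The keyword filtering and sorting described above can be sketched roughly as follows. This is an illustration only, not the project's actual code: the function names `filter_by_keywords` and `sort_headlines`, the dict-based headline shape, and the choice of case-insensitive substring matching are all assumptions.

```python
from typing import Dict, List

# Hypothetical headline shape: plain dicts with "title" and "score" keys.
Headline = Dict[str, object]


def filter_by_keywords(headlines: List[Headline], query: str) -> List[Headline]:
    """Keep headlines whose title contains ANY of the keywords (OR logic)."""
    keywords = [word.lower() for word in query.split()]
    if not keywords:
        # An empty query filters nothing out.
        return list(headlines)
    # Assumption: case-insensitive substring match, so "ai" also matches "email".
    return [
        h for h in headlines
        if any(k in str(h["title"]).lower() for k in keywords)
    ]


def sort_headlines(headlines: List[Headline], key: str = "score",
                   descending: bool = False) -> List[Headline]:
    """Return a new list sorted by the given field."""
    return sorted(headlines, key=lambda h: h[key], reverse=descending)


sample = [
    {"title": "New AI model released", "score": 120},
    {"title": "Rust 2.0 rumors", "score": 300},
    {"title": "Big data pipelines at scale", "score": 85},
]

matches = filter_by_keywords(sample, "AI data")
ranked = sort_headlines(matches, key="score", descending=True)
for h in ranked:
    print(h["score"], h["title"])
```

With the sample data, the query "AI data" keeps the first and third headlines, and sorting by score in descending order lists the 120-point headline before the 85-point one.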

Installation

Clone the repository:

git clone https://github.com/HuntedCode/python-headline-scraper.git
cd python-headline-scraper

Install dependencies (Python 3.8+ required):

pip install requests beautifulsoup4

(No other external libraries needed.)

Usage

Run the app from the command line:

python main.py

Once the app is running, you'll see the initial feed. Use these commands:

view: Display the current feed.
filter: Enter keywords to filter titles.
sort: Choose criteria (date, score, title) and direction (asc/desc).
save: Export the current feed to a JSON file.
load: Import a saved JSON file and append its headlines to the current feed.
refresh: Fetch fresh headlines from Hacker News.
help: List all commands.
exit: Quit the app.
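
As a rough idea of what the save and load commands do, here is a minimal sketch using only the standard library. The function names `save_feed` and `load_feed` and the list-of-dicts JSON layout are assumptions for illustration; the project's real file format may differ.

```python
import json
import os
import tempfile


def save_feed(headlines, path):
    """Write the feed (a list of dicts) to a JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(headlines, f, indent=2)


def load_feed(current, path):
    """Return the current feed with the saved headlines appended."""
    with open(path, "r", encoding="utf-8") as f:
        loaded = json.load(f)
    # Assumption: a plain append with no duplicate removal.
    return current + loaded


feed = [{"title": "Example headline", "link": "https://example.com", "score": 42}]

# Use a temporary directory so the example does not touch real files.
path = os.path.join(tempfile.mkdtemp(), "my_feed.json")
save_feed(feed, path)
combined = load_feed(feed, path)
print(len(combined))
```

Because load appends rather than replaces, loading a file that was saved from the current feed produces duplicate entries, as the example shows.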

Example session:

Type refresh to get the latest headlines.
Type filter and enter "AI data" to keep titles that match "AI" or "data".
Type sort and choose "score", descending.
Type save and enter "my_feed.json" as the file name.
Type view to see the results.

Note: This tool is for educational purposes. Respect Hacker News' terms and avoid excessive requests (for example, add a delay between requests if you automate it). No API key is needed.
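
One way to add such a delay is a small throttle that enforces a minimum gap between requests. This is a sketch under stated assumptions, not part of the project: the `Throttle` class, the `polite_get` helper, the User-Agent string, and the two-second delay are all hypothetical.

```python
import time

# Hypothetical values; not taken from the project.
USER_AGENT = "headline-scraper-example/0.1 (educational use)"
MIN_DELAY_SECONDS = 2.0


class Throttle:
    """Ensure at least `min_delay` seconds pass between consecutive calls."""

    def __init__(self, min_delay, clock=time.monotonic, sleep=time.sleep):
        self.min_delay = min_delay
        self._clock = clock    # injectable so the class can be tested
        self._sleep = sleep
        self._last = None      # time of the previous call, if any

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_delay - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def polite_get(fetch, url, throttle):
    """Wait out the throttle, then call fetch(url, headers=...)."""
    throttle.wait()
    return fetch(url, headers={"User-Agent": USER_AGENT})
```

With Requests installed, `polite_get(requests.get, "https://news.ycombinator.com", Throttle(MIN_DELAY_SECONDS))` would send the request with the header set, since `requests.get` accepts a `headers` keyword argument. Reuse the same `Throttle` instance across calls so the delay is actually enforced.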

Project Structure

main.py: Entry point.
cli.py: Handles user interface and commands.
scraper.py: Fetches and parses web data using Requests and BeautifulSoup.
headline.py: Data model for individual headlines.
headline_feed.py: Manages feed operations (filter, sort, save/load).
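
For orientation, a data model like the one in headline.py could look something like the sketch below. The field names follow the details listed under Features (title, link, score, author, date), but the class layout, the `to_dict`/`from_dict` helpers, and the date format are assumptions rather than the project's actual code.

```python
from dataclasses import asdict, dataclass


@dataclass
class Headline:
    """Hypothetical model of a single headline."""
    title: str
    link: str
    score: int
    author: str
    date: str  # assumed to be an ISO 8601 string; the real format may differ

    def to_dict(self):
        """Convert to a plain dict suitable for json.dump."""
        return asdict(self)

    @classmethod
    def from_dict(cls, data):
        """Rebuild a Headline from a dict produced by to_dict."""
        return cls(**data)


h = Headline(
    title="Example headline",
    link="https://example.com",
    score=42,
    author="someuser",
    date="2025-07-27T12:00:00",
)
restored = Headline.from_dict(h.to_dict())
print(restored == h)
```

Keeping the model as a dataclass gives equality comparison and a readable repr for free, which is convenient when round-tripping feeds through JSON.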

Contributing

Feedback welcome! Fork the repo, make changes, and submit a pull request. Issues can be reported on GitHub.

License

MIT License—feel free to use and modify.

Built by Jeffrey Lowe as part of a 6-month Python learning plan for remote coding jobs. Last updated: July 27, 2025.
