Extract clean sentences from news articles with one click, right from your browser.
This tool is a Chrome extension that talks to a local Flask server. The server uses newspaper3k to extract an article's content and NLTK to split it into sentences, which are exported as a JSON file.
- One-click extraction: Download sentences from any news/web article instantly.
- NLP-grade results: Robust parsing and sentence tokenization using NLTK punkt.
- Local & private: All processing is local; no article data leaves your machine.
- Simple UI: Clean status updates and download flow.
git clone <your-repo-url>
cd <repo-folder>
Python 3.8+ recommended
pip install flask flask-cors newspaper3k nltk==3.8.1
Run this once to get NLTK's sentence splitter:
python download_nltk_data.py
- This downloads the tokenizer to your user nltk_data folder (cross-platform, no hardcoding).
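If NLTK later reports a missing resource, it can help to confirm that the tokenizer actually landed on disk. The following is a minimal sketch, assuming the download went to the nltk_data folder in your home directory; NLTK may choose a different location on some systems. The helper name punkt_installed is hypothetical and not part of this repo.

```python
from pathlib import Path


def punkt_installed(data_dir: Path) -> bool:
    """Return True if an unpacked punkt tokenizer exists under data_dir."""
    # NLTK unpacks the resource to <data_dir>/tokenizers/punkt/
    return (data_dir / "tokenizers" / "punkt").is_dir()


if __name__ == "__main__":
    # Assumption: the data lives in ~/nltk_data, one of NLTK's default search paths
    default_dir = Path.home() / "nltk_data"
    print(f"punkt found in {default_dir}: {punkt_installed(default_dir)}")
```

If this prints False even though the download succeeded, the data is probably in another directory on NLTK's search path.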
python app.py
- The server listens at: http://127.0.0.1:5000/extract
- Leave this terminal running while using the extension.
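To confirm the server is reachable before opening Chrome, a quick port check can be enough. This is a rough sketch using only the standard library; server_is_up is a hypothetical helper, and it only shows that something accepts connections on the port, not that it is this Flask app.

```python
import socket


def server_is_up(host: str = "127.0.0.1", port: int = 5000, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection raises an OSError subclass on refusal or timeout
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("Flask server reachable:", server_is_up())
```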
- Go to chrome://extensions in Chrome
- Enable Developer Mode (top right)
- Click Load unpacked and select the repo folder
- Approve downloads and localhost access if prompted.
- Open any article in Chrome
- Click the extension icon, then click Extract Text & Download JSON
- A .json file will be downloaded containing the article’s sentences
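Once you have a downloaded file, you may want to load it into a script. The exact layout of the export is not documented here, so the sketch below assumes it is either a JSON array of strings or an object with a "sentences" key holding such an array. Both the key name and the helper load_sentences are assumptions for illustration.

```python
import json
from pathlib import Path
from typing import List


def load_sentences(path: Path) -> List[str]:
    """Load sentences from an exported JSON file."""
    data = json.loads(path.read_text(encoding="utf-8"))
    # Assumption: the export is a JSON array of strings, or an object
    # whose "sentences" key holds such an array.
    if isinstance(data, dict):
        data = data.get("sentences", [])
    # Keep only non-empty strings, trimmed of surrounding whitespace
    return [s.strip() for s in data if isinstance(s, str) and s.strip()]
```

For example, load_sentences(Path("article.json")) would return the sentences as a list of strings, where article.json stands in for whatever file name the extension saved.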
- app.py – Flask server (Python)
- download_nltk_data.py – NLTK model downloader (Python)
- background.js – Extension background logic (JS)
- popup.js – Extension popup logic (JS)
- popup.html – UI for the extension popup
- manifest.json – Chrome extension config (MV3)
- No download/status?
  - Ensure the Flask server is running, with no errors
  - Reload the extension in chrome://extensions after edits
  - Approve permissions for downloads and localhost requests
  - Some browsers may block http://127.0.0.1; see Chrome flags or try Edge/Firefox
  - If you see NLTK errors, re-run download_nltk_data.py or check the nltk_data path in app.py
- Still stuck?
  - Check the extension background page console for errors
  - Test Flask directly with curl/Postman
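If curl or Postman is not at hand, the same direct test can be done from Python's standard library. This is a sketch under one important assumption: that the /extract endpoint accepts a POST with a JSON body of the form {"url": "..."}. Check app.py for the real request format. The helper build_extract_request and the example article URL are hypothetical.

```python
import json
import urllib.request

ENDPOINT = "http://127.0.0.1:5000/extract"


def build_extract_request(article_url: str, endpoint: str = ENDPOINT) -> urllib.request.Request:
    """Build a POST request for the local extraction endpoint."""
    # Assumption: the server expects a JSON body like {"url": "<article url>"}
    body = json.dumps({"url": article_url}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_extract_request("https://example.com/some-article")
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            # Print the status code and the start of the response body
            print(resp.status, resp.read().decode("utf-8")[:500])
    except OSError as exc:
        # Covers connection refused, timeouts, and HTTP error responses
        print("Request failed:", exc)
```

A connection error here points at the server, while a successful response suggests the problem is on the extension side.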
- Article extraction: newspaper3k
- Sentence splitting: NLTK (nltk 3.8.1 on PyPI)
- Chrome Extension: Manifest V3
- Author: Avijit Roy
Built with ❤️ by Avijit Roy.