
FaroutYLq/whatsup


ArXiv Daily Digest

A personalized daily arXiv paper recommendation system powered by AI. Get relevant papers delivered to your inbox based on your research interests and Zotero library.

Features

  • 🔍 Smart Filtering: Pre-filters papers by category and keywords
  • 🤖 LLM Evaluation: Uses OpenAI GPT models to intelligently match papers to your interests
  • 📚 Zotero Integration: Learns from your existing research library
  • 📧 Email Delivery: Sends daily digest with relevant papers, abstracts, and links
  • ⏰ Automated: Set it up once, get digests daily via cron/launchd
  • ⚙️ Configurable: Customize categories, keywords, relevance threshold

Quick Start

Prerequisites

  • Python 3.7+
  • pip
  • OpenAI API key
  • Gmail account (or other SMTP email)
  • Zotero library export (optional but recommended)

Installation

  1. Clone the repository
git clone https://github.com/FaroutYLq/whatsup.git
cd whatsup
  2. Install dependencies
pip install -r requirements.txt
  3. Configure the system
cp config.yaml.example config.yaml
nano config.yaml  # Edit with your settings

Required configuration:

  4. Run manually (test)
python src/main.py
  5. Set up automation (optional)

See docs/CRON_SETUP.md for instructions on scheduling daily runs.

Configuration

ArXiv Categories

Find categories at https://arxiv.org/category_taxonomy

Common physics/astronomy categories:

  • cond-mat.supr-con - Superconductivity
  • cond-mat.mes-hall - Mesoscale and Nanoscale Physics
  • physics.ins-det - Instrumentation and Detectors
  • astro-ph.IM - Instrumentation and Methods
  • quant-ph - Quantum Physics
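
As an illustration of how category names like these could be turned into a request, here is a minimal sketch that builds a query URL for the public arXiv API. Whether `arxiv_client.py` calls this endpoint directly or goes through a wrapper library is not stated in this README, so the function name `build_query_url` and the `max_results` default are assumptions.

```python
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"


def build_query_url(categories, max_results=100):
    """Build an arXiv API URL listing the newest papers in any given category."""
    # Categories are combined with OR, so a paper in any one of them matches.
    search_query = " OR ".join(f"cat:{category}" for category in categories)
    params = {
        "search_query": search_query,
        "sortBy": "submittedDate",
        "sortOrder": "descending",
        "max_results": max_results,
    }
    return f"{ARXIV_API}?{urlencode(params)}"


url = build_query_url(["cond-mat.supr-con", "quant-ph"])
print(url)
```

The sketch only constructs the URL and does not make a network request.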

Keywords

Use keywords for pre-filtering to improve speed and reduce OpenAI API costs. Papers matching ANY keyword will be evaluated by the LLM.

Example:

keywords:
  - superconductor
  - quasiparticle
  - THz detector
  - single photon
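
The "ANY keyword" rule can be sketched as a case-insensitive substring match over each paper's title and abstract. This is an illustration only: the function name `matches_any_keyword`, the `title`/`abstract` dictionary keys, and the treatment of an empty keyword list are assumptions, not the project's actual code.

```python
def matches_any_keyword(paper, keywords):
    """Return True if any keyword occurs in the paper's title or abstract."""
    # Assumption: an empty keyword list disables pre-filtering.
    if not keywords:
        return True
    text = f"{paper['title']} {paper['abstract']}".lower()
    return any(keyword.lower() in text for keyword in keywords)


keywords = ["superconductor", "quasiparticle", "THz detector", "single photon"]
papers = [
    {"title": "Quasiparticle Dynamics in Disordered Superconductors",
     "abstract": "We study the dynamics of athermal quasiparticles."},
    {"title": "Galaxy Clustering at High Redshift",
     "abstract": "We measure the two-point correlation function."},
]

# Only papers that pass this cheap check would be sent on to the LLM.
candidates = [paper for paper in papers if matches_any_keyword(paper, keywords)]
print([paper["title"] for paper in candidates])
```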

Relevance Threshold

Papers are scored 0-10 by the LLM. The default threshold is 7.0. Adjust based on your needs:

  • 5.0-6.0: More papers, some less relevant
  • 7.0: Balanced (recommended)
  • 8.0-9.0: Fewer papers, highly relevant only
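
To make the effect of the threshold concrete, the following sketch filters a few made-up scores at three different thresholds. The function name `select_relevant`, the `score` key, and the sorting by score are assumptions for illustration.

```python
def select_relevant(scored_papers, threshold=7.0):
    """Keep papers with score >= threshold, highest score first."""
    kept = [paper for paper in scored_papers if paper["score"] >= threshold]
    return sorted(kept, key=lambda paper: paper["score"], reverse=True)


# Made-up scores, for illustration only.
scored = [
    {"title": "A", "score": 8.5},
    {"title": "B", "score": 6.5},
    {"title": "C", "score": 7.0},
]

for threshold in (6.0, 7.0, 8.0):
    titles = [paper["title"] for paper in select_relevant(scored, threshold)]
    print(threshold, titles)
```

A paper scoring exactly at the threshold is kept, matching the "score >= threshold" rule.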

Research Interests

Be specific about your current interests. The LLM uses this along with your Zotero library to evaluate papers.

Example:

interests:
  description: |
    I study superconducting quantum devices, specifically:
    1. Athermal quasiparticle dynamics and trapping
    2. THz single-photon detectors for cosmology
    3. Microwave kinetic inductance detectors (MKIDs)
    4. Quantum sensing applications
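
One way such a description might be combined with a paper before it is sent to the LLM is sketched below. The prompt wording, the `Score: <number>` reply format, and the helper names `build_prompt` and `parse_score` are hypothetical; the real prompt lives in `llm_evaluator.py` and may differ. No API call is made here.

```python
import re

INTERESTS = """I study superconducting quantum devices, specifically:
1. Athermal quasiparticle dynamics and trapping
2. THz single-photon detectors for cosmology"""


def build_prompt(interests, title, abstract):
    """Combine the interests description and one paper into a single prompt."""
    return (
        "You are helping a researcher triage new arXiv papers.\n\n"
        f"Research interests:\n{interests}\n\n"
        f"Paper title: {title}\n"
        f"Abstract: {abstract}\n\n"
        "Rate the paper's relevance from 0 to 10. "
        "Answer as 'Score: <number>' followed by a one-line reason."
    )


def parse_score(reply):
    """Extract the score from a reply such as 'Score: 8.5 - ...', or None."""
    match = re.search(r"Score:\s*(\d+(?:\.\d+)?)", reply)
    if match is None:
        return None
    # Clamp to the documented 0-10 range.
    return min(10.0, max(0.0, float(match.group(1))))


prompt = build_prompt(
    INTERESTS,
    "Quasiparticle Dynamics in Disordered Superconductors",
    "We study the dynamics of athermal quasiparticles...",
)
print(prompt)
print(parse_score("Score: 8.5 - Directly addresses athermal quasiparticle dynamics"))
```

The more specific the description, the more the model has to compare each abstract against.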

Project Structure

whatsup/
├── config.yaml             # Your configuration
├── config.yaml.example     # Configuration template
├── requirements.txt        # Python dependencies
├── README.md               # This file
├── src/
│   ├── __init__.py
│   ├── config_parser.py    # Load and validate config
│   ├── zotero_parser.py    # Parse Zotero exports
│   ├── arxiv_client.py     # Fetch papers from arXiv
│   ├── llm_evaluator.py    # OpenAI-based evaluation
│   ├── email_sender.py     # Send digest emails
│   └── main.py             # Main orchestrator
└── docs/
    ├── ZOTERO_EXPORT.md    # Zotero export guide
    ├── GMAIL_SETUP.md      # Gmail app password setup
    └── CRON_SETUP.md       # Automation scheduling

Usage

Manual Run

Run the digest manually:

python src/main.py

With custom config file:

python src/main.py /path/to/custom_config.yaml
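
The optional path argument could be handled along the following lines. This is a sketch, not the actual code in `main.py`: the helper name `resolve_config_path` and the fallback to `config.yaml` in the current directory are assumptions.

```python
from pathlib import Path


def resolve_config_path(argv, default="config.yaml"):
    """Return the config path from argv[1] if given, else the default."""
    raw = argv[1] if len(argv) > 1 else default
    # Resolving to an absolute path makes any later error message unambiguous.
    return Path(raw).expanduser().resolve()


def load_text(path):
    """Read the config file, failing early with a clear message."""
    if not path.is_file():
        raise FileNotFoundError(f"Config file not found: {path}")
    return path.read_text(encoding="utf-8")


print(resolve_config_path(["src/main.py"]))
print(resolve_config_path(["src/main.py", "/path/to/custom_config.yaml"]))
```

A relative default resolves against the current working directory, which is why running from the project directory matters when no path is given.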

Automated Daily Run

See docs/CRON_SETUP.md for full instructions.

Quick example (runs daily at 8 AM):

crontab -e

Add:

0 8 * * * cd /path/to/whatsup && python src/main.py

Output

The system will:

  1. Load your configuration
  2. Parse your Zotero library
  3. Fetch papers from arXiv (last 24 hours by default)
  4. Evaluate each paper with OpenAI
  5. Send email with relevant papers (score >= threshold)
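
The five steps above can be sketched as a small orchestration function. Each step is passed in as a callable so that the example runs on its own with stand-in lambdas; the name `run_digest`, the signatures, and the `threshold` config key are assumptions, and the real `main.py` may be organised differently (for example, it may skip the email when nothing is relevant).

```python
def run_digest(load_config, parse_library, fetch_papers, evaluate, send_email):
    """Run the five steps in order and return the (paper, score) pairs sent."""
    config = load_config()                      # 1. load configuration
    library = parse_library(config)             # 2. parse Zotero library
    papers = fetch_papers(config)               # 3. fetch recent arXiv papers
    scored = [(paper, evaluate(paper, library, config))  # 4. score each paper
              for paper in papers]
    relevant = [(paper, score) for paper, score in scored
                if score >= config["threshold"]]
    send_email(relevant, config)                # 5. email the relevant ones
    return relevant


# Stand-in steps, for illustration only.
sent = []
relevant = run_digest(
    load_config=lambda: {"threshold": 7.0},
    parse_library=lambda config: ["Zotero item"],
    fetch_papers=lambda config: ["paper A", "paper B"],
    evaluate=lambda paper, library, config: 8.5 if paper == "paper A" else 4.0,
    send_email=lambda relevant, config: sent.append(relevant),
)
print(relevant)
```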

Example email:

ArXiv Digest: 3 relevant papers - 2025-11-01

======================================================================

1. Quasiparticle Dynamics in Disordered Superconductors
   Authors: Smith, J., Doe, A.
   Published: 2025-10-31
   Relevance: 8.5/10 - Directly addresses athermal 
              quasiparticle dynamics
   URL: https://arxiv.org/abs/2510.12345
   PDF: https://arxiv.org/pdf/2510.12345.pdf
   
   Abstract:
   We study the dynamics of athermal quasiparticles...
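
A digest in roughly this layout could be assembled with Python's standard `email.message.EmailMessage`. The sketch below only builds the message; sending it (for example over SMTP) is left out. The function name `build_digest`, the dictionary keys, and the example addresses are assumptions rather than the project's actual `email_sender.py`.

```python
from email.message import EmailMessage


def build_digest(papers, date, sender, recipient):
    """Build a plain-text digest email from a list of scored papers."""
    noun = "paper" if len(papers) == 1 else "papers"
    subject = f"ArXiv Digest: {len(papers)} relevant {noun} - {date}"
    lines = [subject, "", "=" * 70, ""]
    for index, paper in enumerate(papers, start=1):
        lines += [
            f"{index}. {paper['title']}",
            f"   Authors: {', '.join(paper['authors'])}",
            f"   Published: {paper['published']}",
            f"   Relevance: {paper['score']}/10 - {paper['reason']}",
            f"   URL: {paper['url']}",
            f"   PDF: {paper['pdf']}",
            "",
            "   Abstract:",
            f"   {paper['abstract']}",
            "",
        ]
    message = EmailMessage()
    message["Subject"] = subject
    message["From"] = sender
    message["To"] = recipient
    message.set_content("\n".join(lines))
    return message


paper = {
    "title": "Quasiparticle Dynamics in Disordered Superconductors",
    "authors": ["Smith, J.", "Doe, A."],
    "published": "2025-10-31",
    "score": 8.5,
    "reason": "Directly addresses athermal quasiparticle dynamics",
    "url": "https://arxiv.org/abs/2510.12345",
    "pdf": "https://arxiv.org/pdf/2510.12345.pdf",
    "abstract": "We study the dynamics of athermal quasiparticles...",
}
message = build_digest([paper], "2025-11-01", "me@example.com", "me@example.com")
print(message.get_content())
```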

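Documentation

  • docs/ZOTERO_EXPORT.md: Zotero export guide
  • docs/GMAIL_SETUP.md: Gmail app password setup
  • docs/CRON_SETUP.md: Automation scheduling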

Cost Estimation

OpenAI API Costs

Using gpt-4o-mini (recommended):

  • ~$0.15 per 1M input tokens
  • ~$0.60 per 1M output tokens

Typical daily run:

  • 10-50 papers to evaluate
  • ~1000 tokens per paper evaluation
  • Estimated cost: $0.01-0.05 per day (roughly $0.30-1.50/month)
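
The arithmetic behind this kind of estimate can be checked with a few lines of Python. The prices are the gpt-4o-mini figures listed above; the split of the ~1000 tokens per paper into 900 input and 100 output tokens is an assumption made only for this sketch.

```python
INPUT_PRICE_PER_M = 0.15   # USD per 1M input tokens (gpt-4o-mini, as listed above)
OUTPUT_PRICE_PER_M = 0.60  # USD per 1M output tokens


def daily_cost(num_papers, input_tokens=900, output_tokens=100):
    """Estimate the USD cost of evaluating num_papers in one run."""
    # Assumption: each evaluation uses 900 input + 100 output tokens.
    input_cost = num_papers * input_tokens * INPUT_PRICE_PER_M / 1_000_000
    output_cost = num_papers * output_tokens * OUTPUT_PRICE_PER_M / 1_000_000
    return input_cost + output_cost


for num_papers in (10, 50):
    per_day = daily_cost(num_papers)
    print(f"{num_papers} papers: ${per_day:.4f}/day, ${per_day * 30:.2f}/month")
```

Under these assumptions the raw token cost is about $0.002 to $0.01 per day, at or below the lower end of the range quoted above, so that range is best read as a conservative estimate. Longer prompts (for example, ones that include Zotero library context) would push the figure up.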

Using gpt-4o:

  • ~10x more expensive
  • Better quality, but usually unnecessary

Reducing Costs

  1. Use narrow arXiv categories
  2. Add specific keywords for pre-filtering
  3. Reduce max_days_back to 1
  4. Increase relevance threshold

Troubleshooting

No papers found

  • Check arXiv categories are correct
  • Verify papers were published recently (within max_days_back)
  • Remove or broaden keyword filters

No relevant papers

  • Lower the relevance threshold (try 6.0)
  • Update your interests description to be more specific
  • Check OpenAI API key is valid

Email not received

  • Verify Gmail app password (see docs/GMAIL_SETUP.md)
  • Check spam folder
  • Review email configuration in config.yaml
  • Check logs for errors

"Config file not found"

  • Ensure config.yaml exists (copy from config.yaml.example)
  • Use absolute path or run from project directory

OpenAI API errors

  • Verify API key is correct
  • Check you have API credits
  • Check OpenAI service status

Development

Adding New Features

The code is modular:

  • config_parser.py: Add new config sections
  • zotero_parser.py: Support new export formats
  • arxiv_client.py: Change arXiv query logic
  • llm_evaluator.py: Modify evaluation prompt/scoring
  • email_sender.py: Change email format

Testing

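Test individual components from the project root (so that the src package is importable):

from src.config_parser import ConfigParser

# Load the configuration and print the arXiv section to check that it parsed
config = ConfigParser('config.yaml')
print(config.get_arxiv_config())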

Contributing

Contributions welcome! Please:

  1. Fork the repository
  2. Create a feature branch
  3. Add tests if applicable
  4. Submit a pull request

License

See LICENSE file for details.

Acknowledgments

Support

For issues, questions, or feature requests:

  • Open an issue on GitHub
  • Check existing documentation in docs/

Version

Current version: 0.0.0


Happy paper hunting! 📚🔬
