ArXiv Paper Recommender

A personalized daily paper recommendation system that learns from your preferences and feedback to suggest relevant papers from ArXiv. Built with FastAPI (backend) and React (frontend).


Features

  • 🎯 Personalized Recommendations: Machine learning algorithm that learns from your likes, dislikes, and preferences
  • 📚 Customizable Categories: Select from 123+ ArXiv categories and subcategories (cs.AI, cs.LG, math.*, physics.*, etc.)
  • 🔍 Smart Search: Search through all papers in your database
  • 💾 Save Papers: Bookmark papers for later reading
  • 📊 Learning System: Improves recommendations based on:
    • Your initial keywords
    • Papers you like/dislike
    • Semantic similarity using TF-IDF
    • Category preferences (for filtering, not scoring)
  • 📄 Export: Export saved papers as JSON or CSV
  • ⚡ Fast & Responsive: Paginated results (100 papers per page) for quick loading

Screenshots

Your personalized paper recommendations, ranked by relevance to your interests

Quick Start

Prerequisites

  • Python 3.8 or higher
  • Node.js 16 or higher
  • npm or yarn

Installation

  1. Clone the repository:

    git clone https://github.com/jonstraveladventures/arxivrecommendations.git
    cd arxivrecommendations
  2. Install Python dependencies:

    pip install -r requirements.txt
  3. Install frontend dependencies:

    cd frontend
    npm install
    cd ..

Running the Application

  1. Start the backend server:

    cd backend
    python main.py

    The API will be available at http://localhost:8000

  2. Start the frontend (in a new terminal):

    cd frontend
    npm run dev

    The web interface will be available at http://localhost:3000

  3. Open your browser: Navigate to http://localhost:3000 to start using the recommender!

Usage

Initial Setup

  1. Set Your Preferences:

    • Click "⚙️ Preferences" in the header
    • Add keywords that describe your research interests (comma-separated)
    • Select ArXiv categories you want to follow
    • Click "Save Preferences"
  2. View Today's Papers:

    • Papers are automatically fetched from your selected categories
    • They're ranked by relevance score (shown as a percentage badge)
    • Click on a paper title to expand and see the full abstract
  3. Provide Feedback:

    • 👍 Like: Improves recommendations for similar papers
    • 👎 Dislike: Reduces similar papers in future recommendations
    • 💾 Save: Bookmark to read later (doesn't affect recommendations)
    • ⏭️ Skip: No opinion, just move on

    You can click both Like and Save on the same paper!

  4. Load More Papers:

    • Click "Load More" at the bottom to see additional papers
    • Papers are paginated (100 per page) for fast loading
  5. Search Papers:

    • Click "🔍 Search" to search through all papers in the database
    • Search by title, abstract, or authors
  6. View Saved Papers:

    • Click "💾 Saved Papers" to see all your bookmarked papers
    • Export as JSON or CSV for reference
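
The JSON/CSV export from step 6 could be produced along the following lines. This is a minimal sketch, not the project's actual export code: the function name `export_papers`, the `FIELDS` list, and the dictionary keys are assumptions made for illustration.

```python
import csv
import io
import json

# Hypothetical column set; the real export may include other fields.
FIELDS = ["arxiv_id", "title", "authors"]


def export_papers(papers, fmt="json"):
    """Serialize a list of paper dicts as a JSON or CSV string."""
    if fmt == "json":
        return json.dumps(papers, indent=2)
    if fmt == "csv":
        buffer = io.StringIO()
        # extrasaction="ignore" drops keys that are not listed in FIELDS.
        writer = csv.DictWriter(buffer, fieldnames=FIELDS, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(papers)
        return buffer.getvalue()
    raise ValueError(f"unsupported format: {fmt!r}")
```

CSV quoting is left to the `csv` module, so titles that contain commas stay in a single column.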

How It Works

Recommendation Algorithm

The system uses a multi-factor scoring approach:

  1. Base Score: 0.5 (neutral starting point)

  2. Keyword Matching (+0.05 per keyword, up to +0.3):

    • Matches your provided keywords against paper titles/abstracts
    • Also extracts keywords from papers you've liked
  3. Semantic Similarity (+0.3 max):

    • Uses TF-IDF vectorization to convert papers to numerical vectors
    • Builds a "user profile" from the average of all your liked papers
    • Computes cosine similarity between each paper and your profile
    • Papers similar to ones you like get higher scores
  4. Negative Feedback (-0.2 max):

    • Penalizes papers similar to ones you've disliked
    • Helps filter out unwanted content
  5. Ranking Boost (+0.1 max):

    • Top-ranked papers get a slight boost to maintain ordering

Note: Category matching is NOT used in scoring. Categories are only used to filter which papers are fetched.
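
The scoring steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the code in backend/recommender.py: the function names, the whitespace tokenizer, the hand-rolled TF-IDF with a smoothed inverse document frequency, averaging the disliked papers into a single profile, and clamping the result to [0, 1] are all assumptions. The ranking boost is left out because the text does not say how it is computed.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_vector(vectors):
    """Average sparse vectors into a single profile vector."""
    if not vectors:
        return {}
    total = Counter()
    for vector in vectors:
        for term, weight in vector.items():
            total[term] += weight
    return {term: weight / len(vectors) for term, weight in total.items()}


def score_paper(text, keywords, liked_texts, disliked_texts):
    """Combine the documented factors into one relevance score."""
    score = 0.5  # base score
    lowered = text.lower()
    matches = sum(1 for kw in keywords if kw.lower() in lowered)
    score += min(0.05 * matches, 0.3)  # keyword matching, capped at +0.3

    vectors = tfidf_vectors([text] + liked_texts + disliked_texts)
    paper_vec = vectors[0]
    liked_vecs = vectors[1:1 + len(liked_texts)]
    disliked_vecs = vectors[1 + len(liked_texts):]
    score += 0.3 * cosine(paper_vec, mean_vector(liked_vecs))     # up to +0.3
    score -= 0.2 * cosine(paper_vec, mean_vector(disliked_vecs))  # up to -0.2
    return max(0.0, min(1.0, score))  # clamping is an assumption
```

With no keywords and no feedback the sketch returns the base score of 0.5; likes pull similar papers above that value and dislikes push similar papers below it.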

Data Storage

The system uses SQLite to store:

  • Papers: Fetched papers with metadata
  • User Preferences: Your keywords and selected categories
  • User Feedback: Your likes, dislikes, saves, and skips

The database file (arxiv_recommender.db) is created automatically in the backend directory.
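
As a rough picture of what those three stores might look like, here is a sketch using Python's built-in `sqlite3` module. The table names, column names, and helper functions are hypothetical; the project defines its real schema through the SQLAlchemy models in backend/models.py, which may differ.

```python
import sqlite3

# Hypothetical schema, for illustration only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS papers (
    arxiv_id   TEXT PRIMARY KEY,
    title      TEXT NOT NULL,
    abstract   TEXT,
    authors    TEXT,
    categories TEXT,
    published  TEXT
);
CREATE TABLE IF NOT EXISTS user_preferences (
    id         INTEGER PRIMARY KEY CHECK (id = 1),
    keywords   TEXT,
    categories TEXT
);
CREATE TABLE IF NOT EXISTS user_feedback (
    arxiv_id   TEXT NOT NULL REFERENCES papers(arxiv_id),
    action     TEXT NOT NULL
               CHECK (action IN ('like', 'dislike', 'save', 'skip')),
    created_at TEXT DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (arxiv_id, action)
);
"""


def open_db(path=":memory:"):
    """Open (or create) the database and make sure the tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def record_feedback(conn, arxiv_id, action):
    """Store one feedback action; repeating the same action overwrites it."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO user_feedback (arxiv_id, action) VALUES (?, ?)",
            (arxiv_id, action),
        )
```

Keying feedback on the pair (paper, action) lets one paper carry both a like and a save, which matches the behaviour described under Usage.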

Configuration

Selecting Categories

  1. Click "⚙️ Preferences"
  2. Scroll to "Select ArXiv Categories"
  3. Check the categories you want to include
  4. Click "Save Preferences"
  5. Click "🔄 Force Refresh" to fetch papers from your new categories

Available Categories

The system supports 123+ ArXiv categories organized by subject:

  • Computer Science: cs.AI, cs.LG, cs.CV, cs.CL, cs.RO, etc.
  • Mathematics: math.AG, math.AT, math.CO, etc.
  • Statistics: stat.ML, stat.AP, stat.CO, etc.
  • Physics: physics.optics, quant-ph, etc.
  • Quantitative Biology: q-bio.*
  • Quantitative Finance: q-fin.*
  • Economics: econ.*
  • Electrical Engineering: eess.*

See the full list in the Preferences panel!

Project Structure

arxivrecommendations/
├── backend/
│   ├── main.py              # FastAPI application
│   ├── database.py          # Database setup
│   ├── models.py            # SQLAlchemy models
│   ├── arxiv_fetcher.py     # ArXiv API integration
│   ├── arxiv_categories.py  # ArXiv category definitions
│   └── recommender.py       # Recommendation algorithm
├── frontend/
│   ├── src/
│   │   ├── App.jsx          # Main React component
│   │   ├── App.css          # Styles
│   │   ├── main.jsx         # React entry point
│   │   └── index.css        # Global styles
│   ├── package.json
│   └── vite.config.js
├── requirements.txt         # Python dependencies
├── .gitignore
└── README.md

API Endpoints

  • GET /api/health - Health check
  • GET /api/papers/today?page=1&per_page=100 - Get today's papers (paginated)
  • GET /api/papers/search?q=query - Search papers
  • GET /api/papers/saved - Get saved papers
  • GET /api/papers/saved/export?format=json|csv - Export saved papers
  • GET /api/categories - Get all available categories
  • GET /api/categories/selected - Get user's selected categories
  • GET /api/preferences - Get user preferences
  • POST /api/preferences - Update preferences
  • POST /api/feedback - Submit feedback on a paper
  • GET /api/stats - Get user statistics
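
The GET endpoints above can be called from a script as well as from the web interface. The sketch below uses only the standard library and only the query parameters listed above; `api_url` and `get_json` are hypothetical helper names, and the shape of the JSON responses is not documented here, so the sketch does not assume one.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"  # backend address from "Running the Application"


def api_url(path, **params):
    """Build a full URL for an API path plus optional query parameters."""
    query = urlencode(params)
    return f"{BASE_URL}{path}" + (f"?{query}" if query else "")


def get_json(path, **params):
    """Send a GET request and decode the JSON body (needs a running backend)."""
    with urlopen(api_url(path, **params), timeout=10) as response:
        return json.load(response)
```

For example, `get_json("/api/papers/today", page=1, per_page=100)` requests the first page of today's papers, and `get_json("/api/health")` is a quick way to confirm the backend is reachable.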

Customization

Changing Default Categories

Edit backend/arxiv_categories.py and modify DEFAULT_SELECTED.

Adjusting Recommendation Weights

Edit backend/recommender.py to adjust how different factors contribute to the relevance score.

Changing Paper Fetch Settings

Edit backend/main.py to modify:

  • days_back: How many days of papers to fetch (default: 1 for last 24 hours)
  • max_results: Max papers per category (default: 2000, the ArXiv API's per-request limit)
  • per_page: Papers per page (default: 100)
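
To show how `days_back` and `max_results` might come into play, here is a sketch of a per-category request to the public ArXiv query API. The query parameters (`search_query`, `start`, `max_results`, `sortBy`, `sortOrder`) belong to that API; the function names and the client-side date filter are assumptions, and backend/arxiv_fetcher.py may work differently.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"


def build_query_url(category, max_results=2000, start=0):
    """Build a query URL for the newest submissions in one category."""
    params = {
        "search_query": f"cat:{category}",
        "start": start,
        "max_results": max_results,
        "sortBy": "submittedDate",
        "sortOrder": "descending",
    }
    return f"{ARXIV_API}?{urlencode(params)}"


def is_recent(published, days_back=1, now=None):
    """Return True if `published` falls within the last `days_back` days."""
    now = now or datetime.now(timezone.utc)
    return published >= now - timedelta(days=days_back)
```

Sorting by submission date in descending order means the newest papers arrive first, so a fetcher can stop reading results once `is_recent` returns False.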

Troubleshooting

Backend won't start

  • Check Python version: python --version (should be 3.8+)
  • Install dependencies: pip install -r requirements.txt
  • Check if port 8000 is available
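
One way to check whether port 8000 is already taken is to attempt a local connection to it. The helper below is a small standalone sketch (the name `port_in_use` is made up for this example) and is not part of the project.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something is already accepting connections on the port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0
```

If `port_in_use(8000)` returns True before the backend has been started, another process is holding the port.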

Frontend won't load

  • Make sure backend is running on port 8000
  • Install frontend dependencies: cd frontend && npm install
  • Check browser console for errors (F12)
  • Try hard refresh: Cmd+Shift+R (Mac) or Ctrl+Shift+R (Windows/Linux)

No papers showing

  • Click "🔄 Force Refresh" to fetch new papers
  • Check that you've selected at least one category in Preferences
  • Check backend terminal for error messages
  • Verify ArXiv API is accessible

Categories not loading

  • Check browser console for errors
  • Verify backend is running: curl http://localhost:8000/api/categories
  • Try refreshing the page

Development

Running with Auto-Reload

Backend:

cd backend
uvicorn main:app --reload --host 0.0.0.0 --port 8000

Frontend: The Vite dev server already auto-reloads on file changes.

Database Reset

To reset your database and start fresh:

rm backend/arxiv_recommender.db
# Restart the backend - it will create a new database

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

MIT License - feel free to use this project for your own ArXiv paper recommendations!

Acknowledgments

Future Enhancements

Potential improvements:

  • Email notifications for daily papers
  • More sophisticated ML models (e.g., transformer-based embeddings)
  • User accounts and cloud sync
  • Paper summaries using AI
  • Collaborative filtering (recommendations based on similar users)
  • Paper clustering and topic modeling

Enjoy discovering new papers! 📚✨
