
Add Ollama-backed async article analysis (summary + keywords) #157

Open
xenacode-art wants to merge 5 commits into m2b3:test from xenacode-art:feat/ollama-ai-article-analysis

Conversation

@xenacode-art

Overview

This is a proof-of-concept for the AI-assisted literature discovery
feature discussed in the GSoC 2026 possibilities thread. It uses a
locally hosted Ollama model — no external API keys or costs required.

What it does

Given an article's abstract, the system:

  1. Generates a 2–3 sentence plain-language summary
  2. Extracts 5–8 key topic keywords as structured JSON

Both are returned as a Celery task result so the UI can poll
asynchronously without blocking the request.

New files

myapp/services/ai_tasks.py

  • analyse_article_task, a @shared_task that calls the Ollama
    /api/generate endpoint with two sequential prompts (sketched
    after this list)
  • Ollama base URL and model are configurable via OLLAMA_BASE_URL
    and OLLAMA_MODEL Django settings (defaults: localhost:11434,
    llama3.2)
  • ConnectionError and Timeout trigger Celery retries with
    backoff; other failures propagate so they're visible in the
    task state
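
For reviewers who want the shape of this without opening the diff, here is a
minimal sketch of the task. The prompt wording and the _generate helper are
illustrative, not the exact code in the PR:

# Sketch of myapp/services/ai_tasks.py; prompts and helper names illustrative
import json
import requests
from celery import shared_task
from django.conf import settings

def _generate(prompt: str) -> str:
    base = getattr(settings, "OLLAMA_BASE_URL", "http://localhost:11434")
    model = getattr(settings, "OLLAMA_MODEL", "llama3.2")
    resp = requests.post(
        f"{base}/api/generate",
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["response"]

@shared_task(
    bind=True,
    autoretry_for=(requests.ConnectionError, requests.Timeout),
    retry_backoff=True,
    max_retries=3,
)
def analyse_article_task(self, article_id, abstract):
    # Prompt 1: plain-language summary; prompt 2: keywords as a JSON array
    summary = _generate(f"Summarize this abstract in 2-3 plain sentences:\n{abstract}")
    keywords = _generate(
        f"Extract 5-8 topic keywords from this abstract as a JSON array of strings:\n{abstract}"
    )
    # Anything other than connection/timeout errors propagates, so it stays
    # visible in the task state instead of being silently retried
    return {
        "article_id": article_id,
        "summary": summary.strip(),
        "keywords": json.loads(keywords),
    }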

articles/ai_api.py

  • POST /articles/{slug}/ai-summarize — validates that the article
    exists and has an abstract, queues the task, and returns a task_id
  • GET /articles/ai-task/{task_id} — polls task state and returns the
    result when complete (both endpoints are sketched below)
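
The "articles router" wiring suggests a django-ninja-style Router; under that
assumption (and with an assumed Article import path), the endpoints look
roughly like this:

# Sketch of articles/ai_api.py; router style and import paths are assumptions
from celery.result import AsyncResult
from django.shortcuts import get_object_or_404
from ninja import Router
from ninja.errors import HttpError

from articles.models import Article  # assumed location
from myapp.services.ai_tasks import analyse_article_task

router = Router()

@router.post("/{slug}/ai-summarize")
def ai_summarize(request, slug: str):
    article = get_object_or_404(Article, slug=slug)
    if not article.abstract:
        raise HttpError(400, "Article has no abstract to analyse")
    task = analyse_article_task.delay(article.id, article.abstract)
    return {"task_id": task.id}

@router.get("/ai-task/{task_id}")
def ai_task_status(request, task_id: str):
    result = AsyncResult(task_id)
    payload = {"state": result.state}
    if result.successful():
        payload["result"] = result.result
    return payload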

Running locally

# Install and start Ollama
ollama pull llama3.2
ollama serve

# Start Celery worker (Redis must be running)
celery -A myapp worker -l info

Then hit POST /api/articles/{slug}/ai-summarize with a valid JWT.
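
To exercise the whole flow end to end, something like this works (the base
URL and token are placeholders for a local dev setup):

import time
import requests

BASE = "http://localhost:8000/api/articles"  # assumed mount point
HEADERS = {"Authorization": "Bearer <your JWT>"}

# Queue the analysis for one article
task_id = requests.post(
    f"{BASE}/some-article-slug/ai-summarize", headers=HEADERS
).json()["task_id"]

# Poll until the Celery task finishes
while True:
    status = requests.get(f"{BASE}/ai-task/{task_id}", headers=HEADERS).json()
    if status["state"] in ("SUCCESS", "FAILURE"):
        print(status)
        break
    time.sleep(2)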

@xenacode-art xenacode-art changed the base branch from main to test March 22, 2026 20:38
@armanalam03
Collaborator

This is a great initiative @xenacode-art! If possible, please attach some working examples (images/videos) of integrating it with the frontend. I have a few questions on this:

  • Who sends this task to Celery, and how?
  • How do we generate summaries on existing articles? Does this run every time on the get-article API call, or is it a scheduled job?
  • What hardware resources are needed to run this model on the cloud?
  • How accurate is it at generating a summary of an article and fetching the relevant keywords that best describe the article?
  • In the future, we would like to build a recommendation system for articles based on the user. How can we extend this AI workflow to fetch the most relevant keywords for an article?

Community names containing special characters (e.g. '+', spaces) were
being inserted raw into notification link paths, causing 404s when
users clicked through. Apply urllib.parse.quote(..., safe='') to all
in-app notification links that include the community name in the path.

Fixes m2b3#119
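
A minimal illustration of the quoting fix described above (the community name
and link path are made up):

from urllib.parse import quote

name = "Maths + Physics"  # community name with characters that broke links
link = f"/community/{quote(name, safe='')}"
# link == "/community/Maths%20%2B%20Physics", so '+' and spaces no longer 404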
Introduces two new components:

- myapp/services/ai_tasks.py: a Celery shared_task that sends an
  article abstract to a locally running Ollama instance, retrieves
  a 2-3 sentence summary and a JSON array of keywords, and returns
  them as a structured result. The Ollama base URL and model are
  configurable via OLLAMA_BASE_URL and OLLAMA_MODEL settings so no
  external API keys are required.

- articles/ai_api.py: two endpoints wired into the articles router:
    POST /{slug}/ai-summarize  — queues the task, returns task_id
    GET  /ai-task/{task_id}    — polls task state and result

Connection and timeout errors trigger automatic Celery retries so
transient Ollama unavailability does not surface as hard failures.
@xenacode-art xenacode-art force-pushed the feat/ollama-ai-article-analysis branch from 88b39cf to 95c6817 on March 22, 2026 21:07
@xenacode-art
Author

xenacode-art commented Mar 22, 2026

Thanks for the review, @armanalam03! Happy to answer each of these.

  • Who sends this task to Celery, and how?

The POST /articles/{slug}/ai-summarize endpoint does — it calls analyse_article_task.delay(article.id, article.abstract) which queues the job. The frontend (or any authenticated client) hits that endpoint to trigger it. The result is then polled via GET /articles/ai-task/{task_id} using the task ID returned.

  • How do we generate summaries on existing articles? On every get-article call or scheduled?

Right now it's fully on-demand — nothing runs automatically. The next step I'd suggest is caching the result: add ai_summary and ai_keywords fields to the Article model so after the first run the result is stored and returned immediately on subsequent calls.
We could also wire a post-save signal to auto-queue new articles. A scheduled job for backfilling existing articles is also straightforward with Celery Beat.
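
A sketch of that caching and auto-queueing step; the field names match the
suggestion above, but none of this is in the current diff:

# Hypothetical follow-up, not part of this PR
from django.contrib.postgres.fields import ArrayField
from django.db import models
from django.db.models.signals import post_save
from django.dispatch import receiver

from myapp.services.ai_tasks import analyse_article_task

class Article(models.Model):
    # ...existing fields...
    ai_summary = models.TextField(blank=True, default="")
    ai_keywords = ArrayField(models.CharField(max_length=64), blank=True, default=list)

@receiver(post_save, sender=Article)
def queue_ai_analysis(sender, instance, created, **kwargs):
    # Auto-queue newly created articles that have an abstract
    if created and instance.abstract:
        analyse_article_task.delay(instance.id, instance.abstract)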

  • What hardware resources are needed on the cloud?
    llama3.2 (3B parameters) needs roughly 4–6 GB of RAM and runs fine on CPU, just slower. On cloud, a small GPU instance (e.g., AWS g4dn.xlarge) gives good throughput. For a budget option, the quantized llama3.2:1b variant cuts that in half. For production I'd recommend deploying Ollama on a dedicated instance and pointing OLLAMA_BASE_URL at it — the Django/Celery side doesn't need any GPU.

  • How accurate is the summarization and keyword extraction?
    For well-written scientific abstracts, llama3.2 is quite solid — summaries are coherent and keywords are topically relevant. Accuracy drops on very domain-specific jargon (e.g. niche chemistry or genomics). We could improve this by switching to a larger model (llama3.1:8b) or by prompt-tuning for scientific text. I can add some example outputs on a few real SciCommons articles if that helps.

  • How to extend this for a recommendation system?
    The keywords extracted per article are the natural starting point. Store them on the Article model, then GET /articles/{slug}/related can query Article.objects.filter(keywords__overlap=[...]) ranked by overlap count — no vector DB needed for MVP. When we want semantic similarity (not just keyword overlap), we'd generate embeddings using a lightweight model like nomic-embed-text via Ollama and store them in pgvector. That's a clean extension of this same Ollama infrastructure. I actually have this scoped out in my GSoC proposal.
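
Roughly, assuming the ai_keywords ArrayField from the caching suggestion above:

def related_articles(article, limit=5):
    # Keyword-overlap MVP: Postgres && via the ArrayField __overlap lookup
    candidates = (
        Article.objects
        .filter(ai_keywords__overlap=article.ai_keywords)  # shares >= 1 keyword
        .exclude(pk=article.pk)
    )
    # Rank by overlap count in Python; fine at MVP scale, push into SQL later
    mine = set(article.ai_keywords)
    return sorted(
        candidates,
        key=lambda a: len(mine & set(a.ai_keywords)),
        reverse=True,
    )[:limit]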
