43 changes: 43 additions & 0 deletions examples/pdf_research_agent/.env.example
@@ -0,0 +1,43 @@
# ─────────────────────────────────────────────────────────────────────────────
# PDF Research Agent — Environment Variables
# Copy this file to .env and fill in your values before running the agent.
#
# cp .env.example .env
# ─────────────────────────────────────────────────────────────────────────────


# ── LLM Provider ─────────────────────────────────────────────────────────────
# Required. Get your free key at https://openrouter.ai
OPENROUTER_API_KEY=your_openrouter_api_key_here


# ── Bindu Deployment ─────────────────────────────────────────────────────────
# URL the agent server binds to. Change port if 3775 is already in use.
BINDU_DEPLOYMENT_URL=http://localhost:3775


# ── Storage Backend ──────────────────────────────────────────────────────────
# "memory" — default, no external dependency, data lost on restart
# "postgres" — persistent storage, requires DATABASE_URL below
STORAGE_TYPE=memory

# Required only when STORAGE_TYPE=postgres
# Example: postgresql+asyncpg://user:password@localhost:5432/bindu
DATABASE_URL=


# ── Scheduler Backend ────────────────────────────────────────────────────────
# "memory" — default, in-process scheduler
# "redis" — distributed scheduler, requires REDIS_URL below
SCHEDULER_TYPE=memory

# Required only when SCHEDULER_TYPE=redis
# Example: redis://localhost:6379/0
REDIS_URL=


# ── Observability / Error Tracking (optional) ────────────────────────────────
# Leave blank to disable Sentry error tracking.
SENTRY_DSN=
SENTRY_ENVIRONMENT=development
SENTRY_TRACES_SAMPLE_RATE=0.1
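
The comments in this file describe two cross-field requirements: `DATABASE_URL` is needed only when `STORAGE_TYPE=postgres`, and `REDIS_URL` only when `SCHEDULER_TYPE=redis`. A minimal sketch of checking those rules at startup follows. The function name `load_settings` and the returned dictionary keys are hypothetical and not part of Bindu; how Bindu itself consumes these variables is not shown in this diff.

```python
import os


def load_settings(env=None):
    """Read the variables from .env.example and check cross-field rules.

    Hypothetical helper for illustration; not part of Bindu.
    """
    env = os.environ if env is None else env
    settings = {
        # Blank values (e.g. "STORAGE_TYPE=") fall back to the documented defaults.
        "deployment_url": env.get("BINDU_DEPLOYMENT_URL") or "http://localhost:3775",
        "storage_type": env.get("STORAGE_TYPE") or "memory",
        "scheduler_type": env.get("SCHEDULER_TYPE") or "memory",
        "database_url": env.get("DATABASE_URL") or "",
        "redis_url": env.get("REDIS_URL") or "",
    }

    errors = []
    if not env.get("OPENROUTER_API_KEY"):
        errors.append("OPENROUTER_API_KEY is required")
    if settings["storage_type"] not in ("memory", "postgres"):
        errors.append("STORAGE_TYPE must be 'memory' or 'postgres'")
    if settings["scheduler_type"] not in ("memory", "redis"):
        errors.append("SCHEDULER_TYPE must be 'memory' or 'redis'")
    if settings["storage_type"] == "postgres" and not settings["database_url"]:
        errors.append("DATABASE_URL is required when STORAGE_TYPE=postgres")
    if settings["scheduler_type"] == "redis" and not settings["redis_url"]:
        errors.append("REDIS_URL is required when SCHEDULER_TYPE=redis")

    if errors:
        raise ValueError("; ".join(errors))
    return settings
```

Failing fast with one combined message lets a misconfigured `.env` surface before the server starts rather than on the first request.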
129 changes: 129 additions & 0 deletions examples/pdf_research_agent/pdf_research_agent.py
@@ -0,0 +1,129 @@
"""
PDF Research Agent Example for Bindu

This example agent accepts either a PDF file path or raw text and
returns a structured summary. It demonstrates how to wrap a simple
document-processing workflow using `bindufy()` so the agent becomes
a live service.

Prerequisites
-------------
    uv add bindu agno pypdf python-dotenv

Usage
-----
    export OPENROUTER_API_KEY="your_api_key_here"
    python pdf_research_agent.py

The agent will be live at http://localhost:3775
Send it a message like:
    {"role": "user", "content": "/path/to/paper.pdf"}
or paste raw text directly as the message content.
"""
import os

from agno.agent import Agent
from agno.models.openrouter import OpenRouter
from bindu.penguin.bindufy import bindufy
from dotenv import load_dotenv

# Load variables from a local .env file (if present) into the environment.
load_dotenv()

# ---------------------------------------------------------------------------
# 1. Helper — extract text from a PDF path or pass raw text straight through
# ---------------------------------------------------------------------------

def _read_content(source: str) -> str:
    """Return plain text from a PDF file path, or the source string itself."""
    path = source.strip()
    # Treat the input as a PDF only if it names an existing .pdf file
    # (suffix matched case-insensitively, so "paper.PDF" also works).
    if path.lower().endswith(".pdf") and os.path.isfile(path):
        try:
            from pypdf import PdfReader  # optional dependency
        except ImportError:
            return (
                f"[pypdf not installed; cannot read '{path}'. "
                "Run: uv add pypdf]"
            )
        reader = PdfReader(path)
        # extract_text() may yield nothing for image-only pages.
        pages = [page.extract_text() or "" for page in reader.pages]
        return "\n\n".join(pages)
    return source  # treat as raw document text


# ---------------------------------------------------------------------------
# 2. Agent definition
# ---------------------------------------------------------------------------

agent = Agent(
    instructions=(
        "You are a research assistant that reads documents and produces clear, "
        "concise summaries. When given document text:\n"
        "  1. Identify the main topic or thesis.\n"
        "  2. List the key findings or arguments (3-5 bullet points).\n"
        "  3. Note any important conclusions or recommendations.\n"
        "Be factual and brief. If the text is too short or unclear, say so."
    ),
    model=OpenRouter(
        id="openai/gpt-4o-mini",
        api_key=os.getenv("OPENROUTER_API_KEY"),
    ),
)


# ---------------------------------------------------------------------------
# 3. Bindu configuration
# ---------------------------------------------------------------------------

config = {
    "author": "your.email@example.com",
    "name": "pdf_research_agent",
    "description": "Summarises PDF files and document text using OpenRouter.",
    "version": "1.0.0",
    "capabilities": {},
    "auth": {"enabled": False},
    "storage": {"type": "memory"},
    "scheduler": {"type": "memory"},
    "deployment": {
        "url": "http://localhost:3775",
        "expose": True,
    },
}


# ---------------------------------------------------------------------------
# 4. Handler — the bridge between Bindu messages and the agent
# ---------------------------------------------------------------------------

def handler(messages: list[dict[str, str]]):
    """
    Receive a conversation history from Bindu, extract the latest user
    message, read its content (PDF or raw text), and return a summary.

    Args:
        messages: Standard A2A message list, e.g.
            [{"role": "user", "content": "/path/to/doc.pdf"}]

    Returns:
        Agent response with the document summary, or a plain-text notice
        when there is nothing usable to summarise.
    """
    # Grab the most recent user message
    user_messages = [m for m in messages if m.get("role") == "user"]
    if not user_messages:
        return "No user message found. Please send a PDF path or document text."

    user_input = user_messages[-1].get("content") or ""
    document_text = _read_content(user_input)

    # Return early rather than asking the model to summarise an empty
    # document (blank message, or a PDF with no extractable text).
    if not document_text.strip():
        return "No readable text found. Please send a PDF path or document text."
    # Pass the pypdf installation hint back to the caller unchanged.
    if document_text.startswith("[pypdf not installed"):
        return document_text

    # Build a prompt that includes the full document text
    prompt = f"Summarize the following document and highlight the key insights:\n\n{document_text}"
    enriched_messages = [{"role": "user", "content": prompt}]

    result = agent.run(input=enriched_messages)
    return result


# ---------------------------------------------------------------------------
# 5. Bindu-fy the agent — one call turns it into a live microservice
# ---------------------------------------------------------------------------

if __name__ == "__main__":
    # Print the URL from the config so the message stays accurate if it changes.
    print(f"PDF Research Agent running at {config['deployment']['url']}")
    bindufy(config, handler)
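
The handler's input routing (pick the latest user message, then decide whether it names a PDF on disk) can be checked without calling the model. The sketch below restates those two checks as standalone functions. The names `latest_user_content` and `looks_like_pdf_path` are hypothetical and do not appear in the example file, and the case-insensitive suffix match is a choice made for this sketch.

```python
import os
import tempfile


def latest_user_content(messages):
    """Return the content of the most recent user message, or None if absent."""
    user_messages = [m for m in messages if m.get("role") == "user"]
    if not user_messages:
        return None
    return user_messages[-1].get("content") or ""


def looks_like_pdf_path(source):
    """True only when the stripped input ends in .pdf and names an existing file."""
    path = source.strip()
    return path.lower().endswith(".pdf") and os.path.isfile(path)


if __name__ == "__main__":
    history = [
        {"role": "user", "content": "first question"},
        {"role": "assistant", "content": "first answer"},
        {"role": "user", "content": "Some raw text about a file called notes.pdf"},
    ]
    latest = latest_user_content(history)
    # Raw text that merely mentions a .pdf name is not treated as a path.
    print(latest, looks_like_pdf_path(latest))

    # A real file with a .pdf suffix is recognised as a path.
    with tempfile.NamedTemporaryFile(suffix=".pdf", delete=False) as tmp:
        tmp_path = tmp.name
    print(looks_like_pdf_path(tmp_path))
    os.remove(tmp_path)
```

Because the path is checked with `os.path.isfile`, it is resolved on the machine running the agent, not on the client that sent the message.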
80 changes: 80 additions & 0 deletions examples/pdf_research_agent/skills/pdf-research-skill/skill.yaml
@@ -0,0 +1,80 @@
id: pdf-research-skill
name: PDF Research Skill
version: 1.0.0
author: your.email@example.com
description: >
  Processes PDF files and raw document text to produce structured, LLM-generated
  summaries. Identifies the main thesis, extracts key findings, and surfaces
  conclusions or recommendations. Powered by OpenRouter and exposed as a live
  A2A-compliant Bindu microservice.

features:
- Accept a local PDF file path or raw document text as input
- Extract full text from multi-page PDFs using pypdf
- Generate structured summaries with thesis, key findings, and conclusions
- Return a clear installation hint when pypdf is unavailable
- Deployed as a live Bindu microservice via bindufy()
- A2A / AP2 / X402 protocol-compliant out of the box

tags:
- pdf
- research
- summarisation
- document-analysis
- openrouter
- nlp

input_modes:
- text/plain # Raw document text pasted directly
- application/pdf # Local PDF file path resolved server-side

output_modes:
- text/plain # Structured summary returned as plain text

examples:
  - input: "/home/user/papers/attention_is_all_you_need.pdf"
    output: |
      **Topic:** Transformer architecture for sequence-to-sequence tasks.
      **Key Findings:**
      - Self-attention replaces recurrence and convolution entirely.
      - Multi-head attention allows the model to attend to different positions jointly.
      - The model achieves state-of-the-art BLEU on WMT 2014 EN-DE and EN-FR.
      - Training time is significantly reduced compared to RNN-based models.
      - Positional encodings preserve sequence order without recurrence.
      **Conclusions:** The Transformer generalises well to other tasks beyond MT and
      enables highly parallelisable training at scale.

  - input: |
      Climate change is the long-term shift in global temperatures and weather
      patterns. While natural factors play a role, human activity since the 1800s
      has been the dominant driver through greenhouse gas emissions...
    output: |
      **Topic:** Causes and impacts of climate change.
      **Key Findings:**
      - Human activity is the primary driver since the industrial revolution.
      - Greenhouse gas emissions are the main mechanism of warming.
      - Weather patterns and sea levels are measurably shifting.
      **Conclusions:** Urgent systemic action is required to limit warming below 1.5°C.

capabilities_detail:
  pdf_extraction:
    description: Reads multi-page PDFs page-by-page using pypdf and joins extracted text.
    fallback: If pypdf is not installed, returns a clear installation hint to the caller.
    max_pages: unlimited  # all pages extracted; very large PDFs may increase latency

  summarisation:
    model: openrouter/openai/gpt-4o-mini
    prompt_strategy: |
      Instruction-tuned prompt asks the model to:
      1. Identify the main topic or thesis.
      2. List 3–5 key findings or arguments.
      3. Surface important conclusions or recommendations.
    fallback_behaviour: If input text is too short or unclear the model says so explicitly.

  transport:
    protocol: A2A (JSON-RPC over HTTP)
    endpoint: POST /
    port: 3775
    auth: disabled (configurable via config.auth.enabled)
    storage: in-memory (configurable; see .env.example for the Postgres option)
    scheduler: in-memory (configurable; see .env.example for the Redis option)
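
The transport section above names A2A over JSON-RPC with a `POST /` endpoint on port 3775. As a rough illustration, the sketch below builds a request body for that endpoint without sending it. The `message/send` method name and the message shape follow the public A2A specification as an assumption; this diff does not confirm them for Bindu, which may require additional fields such as task or context identifiers. The helper name `build_send_request` is hypothetical.

```python
import json
import uuid


def build_send_request(text, request_id=None):
    """Build a JSON-RPC 2.0 body carrying one user text message.

    Assumption: the method name and message shape follow the A2A
    specification; Bindu may expect extra fields not shown here.
    """
    return {
        "jsonrpc": "2.0",
        "id": request_id or str(uuid.uuid4()),
        "method": "message/send",
        "params": {
            "message": {
                "kind": "message",
                "role": "user",
                "messageId": str(uuid.uuid4()),
                # The agent treats the text as a PDF path or as raw document text.
                "parts": [{"kind": "text", "text": text}],
            }
        },
    }


if __name__ == "__main__":
    body = build_send_request("/path/to/paper.pdf")
    # This JSON would be POSTed to http://localhost:3775/ by an HTTP client.
    print(json.dumps(body, indent=2))
```

Generating a fresh `messageId` per call keeps each request distinguishable from the others.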