13 changes: 13 additions & 0 deletions .gitignore
@@ -0,0 +1,13 @@
__pycache__/
*.py[cod]
*$py.class
*.so
.env
*.egg-info/
dist/
build/
.eggs/
venv/
.venv/
node_modules/
.pytest_cache/
1 change: 1 addition & 0 deletions gemini-pdf-reader/.env.example
@@ -0,0 +1 @@
GEMINI_API_KEY=your_key_here
95 changes: 95 additions & 0 deletions gemini-pdf-reader/README.md
@@ -0,0 +1,95 @@
# Gemini PDF Reader

A dual-pane desktop PDF reader with integrated Google Gemini Pro contextual explanations. Built with PyQt6, PyMuPDF, and the Google Generative AI SDK.

## Features

- **Dual-Pane Layout**: PDF viewer on the left, explanation panel on the right
- **Gemini Integration**: Right-click selected text to get AI-powered explanations, summaries, or definitions
- **Smart Context**: Automatically chunks large PDFs to fit within token limits, prioritizing pages near your selection
- **Cross-Reference Highlighting**: Gemini's references to the document are highlighted in both the explanation panel and the PDF viewer
- **Conversation History**: All explanations are preserved in a scrollable history for the current session
- **Dark Mode**: Toggle between light and dark themes with `Ctrl+D`
- **Export**: Save all explanations as a Markdown file
- **Keyboard Shortcuts**: `Ctrl+E` to explain selected text, `Ctrl+O` to open a PDF
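
The Smart Context behaviour can be sketched roughly as follows. This is a minimal illustration, not the code in `core/pdf_processor.py` (which is not shown here): the function name `build_context`, the `max_chars` budget, and the use of character counts as a stand-in for tokens are all assumptions.

```python
from typing import List


def build_context(pages: List[str], selected_page: int, max_chars: int = 12000) -> str:
    """Keep the pages nearest to selected_page until the character budget is spent.

    Hypothetical helper: max_chars is a rough proxy for a token limit.
    """
    # Visit pages by distance from the selection (ties go to the earlier page).
    order = sorted(range(len(pages)), key=lambda i: (abs(i - selected_page), i))
    chosen: List[int] = []
    used = 0
    for i in order:
        cost = len(pages[i])
        # The selected page is always kept, even if it alone exceeds the budget.
        if chosen and used + cost > max_chars:
            break  # stop at the first page that does not fit, keeping a contiguous window
        chosen.append(i)
        used += cost
    # Emit in document order so the model reads the text top to bottom.
    return "\n\n".join(f"[Page {i + 1}]\n{pages[i]}" for i in sorted(chosen))
```

Stopping at the first page that does not fit, rather than skipping it and trying farther pages, keeps the context a single contiguous window around the selection.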

## Installation

```bash
cd gemini-pdf-reader
pip install -r requirements.txt
```

## Configuration

1. Copy `.env.example` to `.env`:
   ```bash
   cp .env.example .env
   ```
2. Edit `.env` and add your Google Gemini API key:
   ```
   GEMINI_API_KEY=your_actual_key_here
   ```

Alternatively, launch the app and use **Help → Set API Key** to enter your key. It will be saved to `~/.gemini-pdf-reader/config.env`.
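
For orientation, a key lookup across these locations might look like the sketch below. It uses only the standard library and is not the `load_api_key` implementation in `utils/config.py`, which is not shown here. The names `read_env_file` and `find_api_key`, and the lookup order (process environment, then the project `.env`, then `~/.gemini-pdf-reader/config.env`), are assumptions.

```python
import os
from pathlib import Path
from typing import Dict, Optional

CONFIG_FILE = Path.home() / ".gemini-pdf-reader" / "config.env"


def read_env_file(path: Path) -> Dict[str, str]:
    """Parse KEY=value lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    if not path.is_file():
        return values
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def find_api_key(project_dir: Path = Path("."),
                 config_file: Path = CONFIG_FILE) -> Optional[str]:
    """Return the first non-empty GEMINI_API_KEY found, or None (assumed order)."""
    candidates = [
        os.environ.get("GEMINI_API_KEY"),
        read_env_file(project_dir / ".env").get("GEMINI_API_KEY"),
        read_env_file(config_file).get("GEMINI_API_KEY"),
    ]
    for value in candidates:
        if value:
            return value
    return None
```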

## Usage

```bash
python main.py
```

1. **Open a PDF**: `File → Open PDF...` or `Ctrl+O`
2. **Select text**: Click and drag on the PDF to select a passage
3. **Get explanation**: Right-click and choose "Explain with Gemini", "Summarize Selection", or "Define Term" — or press `Ctrl+E`
4. **View highlights**: Referenced text from the document is highlighted in yellow in both the explanation panel and the PDF viewer
5. **Export**: Click "Export Explanations" to save all session explanations as Markdown
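
The highlights in step 4 come from `<highlight>...</highlight>` tags that the prompts ask Gemini to wrap around quoted document text. The sketch below mirrors the regular expression used by `extract_highlights` in `core/highlighter.py`, but returns plain strings instead of `HighlightSegment` objects for brevity; the name `highlighted_passages` and the sample reply are illustrative only.

```python
import re
from typing import List

# Non-greedy match so each tag pair is captured separately;
# DOTALL lets a highlighted passage span line breaks.
HIGHLIGHT_RE = re.compile(r"<highlight>(.*?)</highlight>", re.DOTALL)


def highlighted_passages(response_text: str) -> List[str]:
    """Return the text inside each <highlight> tag, with outer whitespace trimmed."""
    return [m.group(1).strip() for m in HIGHLIGHT_RE.finditer(response_text)]


reply = (
    "The author defines the term early: <highlight>entropy measures "
    "uncertainty</highlight>. It recurs in <highlight>\nSection 3\n</highlight>."
)
print(highlighted_passages(reply))
# ['entropy measures uncertainty', 'Section 3']
```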

## Keyboard Shortcuts

| Shortcut | Action |
|----------|--------|
| `Ctrl+O` | Open PDF |
| `Ctrl+E` | Explain selected text |
| `Ctrl+D` | Toggle dark mode |
| `Ctrl+Q` | Quit |

## Project Structure

```
gemini-pdf-reader/
├── main.py                      # Entry point
├── ui/
│   ├── main_window.py           # Main window layout
│   ├── pdf_viewer.py            # PDF rendering + selection + highlighting
│   └── explanation_panel.py     # Right-side explanation display
├── core/
│   ├── gemini_client.py         # Gemini API wrapper
│   ├── pdf_processor.py         # PDF text extraction + chunking
│   └── highlighter.py           # Cross-reference highlighting logic
├── utils/
│   ├── config.py                # API key management
│   └── themes.py                # Dark/light theme definitions
├── tests/
│   └── test_core.py             # Unit tests for core modules
├── requirements.txt
├── .env.example
└── README.md
```

## Error Handling

- **No API key**: Shows a clear message in the explanation panel with instructions
- **API rate limits**: Displays the error message from the API
- **PDF load failures**: Shows an error dialog
- **Empty selections**: Warns the user to select text first
- **Image-only PDFs**: Warns that no extractable text was found and suggests OCR
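
One way to turn these cases into user-facing messages is sketched below. It relies on two facts from `core/gemini_client.py`: `GeminiClient` raises `ValueError` when no API key is available and wraps API failures in `RuntimeError`. The function name `run_action`, the message wording, and the idea of passing the request as a callable are assumptions, not the app's actual UI code.

```python
from typing import Callable


def run_action(selected_text: str, action: Callable[[str], str]) -> str:
    """Return the model's answer, or a message for the explanation panel.

    `action` is a hypothetical callable that builds the client if needed
    and sends the request for the selected text.
    """
    if not selected_text.strip():
        return "Select some text in the PDF first."
    try:
        return action(selected_text)
    except ValueError as exc:
        # Raised by GeminiClient when no API key is configured.
        return f"API key problem: {exc}"
    except RuntimeError as exc:
        # GeminiClient wraps API failures (for example rate limits) in RuntimeError.
        return f"Request failed: {exc}"
```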

## Requirements

- Python 3.9+
- PyQt6
- PyMuPDF (fitz)
- google-generativeai
- python-dotenv
- markdown
Empty file.
117 changes: 117 additions & 0 deletions gemini-pdf-reader/core/gemini_client.py
@@ -0,0 +1,117 @@
"""Google Gemini API client wrapper."""

from typing import Optional

import google.generativeai as genai

from utils.config import load_api_key


EXPLAIN_PROMPT = (
    "You are a document reading assistant. The user is reading the following "
    "document:\n\n{context}\n\n"
    "They have selected the following passage and want it explained in the "
    "context of this document:\n\n\"{selected}\"\n\n"
    "Explain this passage clearly. Reference specific parts of the document "
    "to support your explanation. When you reference text from the document, "
    "wrap it in <highlight>...</highlight> tags so the app can identify and "
    "highlight those sections in the PDF viewer."
)

SUMMARIZE_PROMPT = (
    "You are a document reading assistant. The user is reading the following "
    "document:\n\n{context}\n\n"
    "They have selected the following passage and want a concise summary:\n\n"
    "\"{selected}\"\n\n"
    "Provide a clear, concise summary. When you reference text from the "
    "document, wrap it in <highlight>...</highlight> tags."
)

DEFINE_PROMPT = (
    "You are a document reading assistant. The user is reading the following "
    "document:\n\n{context}\n\n"
    "They want you to define the following term/phrase in the context of this "
    "document:\n\n\"{selected}\"\n\n"
    "Provide a clear definition. When you reference text from the document, "
    "wrap it in <highlight>...</highlight> tags."
)


class GeminiClient:
    """Wrapper around the Google Generative AI SDK for Gemini Pro."""

    def __init__(self, api_key: Optional[str] = None) -> None:
        """Initialize the Gemini client.

        Args:
            api_key: Optional API key. If not provided, attempts to load
                from environment/config.

        Raises:
            ValueError: If no API key is available.
        """
        self._api_key = api_key or load_api_key()
        if not self._api_key:
            raise ValueError(
                "No Gemini API key found. Please set GEMINI_API_KEY in your "
                ".env file or enter it in the application settings."
            )
        genai.configure(api_key=self._api_key)
        self._model = genai.GenerativeModel("gemini-1.5-pro")

    def explain(self, context: str, selected_text: str) -> str:
        """Ask Gemini to explain selected text in context.

        Args:
            context: The full (or chunked) document text.
            selected_text: The user-selected passage.

        Returns:
            Gemini's explanation as a string.
        """
        return self._query(EXPLAIN_PROMPT, context, selected_text)

    def summarize(self, context: str, selected_text: str) -> str:
        """Ask Gemini to summarize selected text.

        Args:
            context: The full (or chunked) document text.
            selected_text: The user-selected passage.

        Returns:
            Gemini's summary as a string.
        """
        return self._query(SUMMARIZE_PROMPT, context, selected_text)

    def define(self, context: str, selected_text: str) -> str:
        """Ask Gemini to define a term from the document.

        Args:
            context: The full (or chunked) document text.
            selected_text: The term or phrase to define.

        Returns:
            Gemini's definition as a string.
        """
        return self._query(DEFINE_PROMPT, context, selected_text)

    def _query(self, prompt_template: str, context: str, selected: str) -> str:
        """Send a prompt to Gemini and return the response text.

        Args:
            prompt_template: Prompt template with {context} and {selected}.
            context: Document context.
            selected: Selected text.

        Returns:
            The model's response text.

        Raises:
            RuntimeError: On API errors.
        """
        prompt = prompt_template.format(context=context, selected=selected)
        try:
            response = self._model.generate_content(prompt)
            return response.text
        except Exception as exc:
            raise RuntimeError(f"Gemini API error: {exc}") from exc
76 changes: 76 additions & 0 deletions gemini-pdf-reader/core/highlighter.py
@@ -0,0 +1,76 @@
"""Cross-reference highlighting logic.

Parses <highlight>...</highlight> tags from Gemini responses and provides
utilities for rendering highlighted text in both the explanation panel
and the PDF viewer.
"""

import re
from dataclasses import dataclass
from typing import List


@dataclass
class HighlightSegment:
    """A segment of text extracted from <highlight> tags."""
    text: str


def extract_highlights(response_text: str) -> List[HighlightSegment]:
    """Extract all <highlight>-tagged text from a Gemini response.

    Args:
        response_text: The raw response from Gemini.

    Returns:
        List of HighlightSegment objects.
    """
    pattern = re.compile(r"<highlight>(.*?)</highlight>", re.DOTALL)
    return [HighlightSegment(text=m.group(1).strip()) for m in pattern.finditer(response_text)]


def response_to_html(response_text: str) -> str:
    """Convert a Gemini response to HTML with highlighted segments styled.

    Replaces <highlight>...</highlight> tags with styled <span> elements
    and converts basic Markdown formatting to HTML.

    Args:
        response_text: The raw response from Gemini.

    Returns:
        HTML string suitable for rendering in a QTextBrowser.
    """
    import markdown

    # Temporarily replace highlight tags with placeholders
    placeholder_map: dict[str, str] = {}
    counter = 0

    def _replace_highlight(m: re.Match) -> str:
        nonlocal counter
        key = f"GEMINIHIGHLIGHT{counter}ENDHIGHLIGHT"
        text = m.group(1).strip()
        placeholder_map[key] = (
            f'<span style="background-color: #fff3cd; padding: 2px 4px; '
            f'border-radius: 3px; cursor: pointer;" '
            f'class="highlight-ref">{text}</span>'
        )
        counter += 1
        return key

    text = re.sub(
        r"<highlight>(.*?)</highlight>",
        _replace_highlight,
        response_text,
        flags=re.DOTALL,
    )

    # Convert Markdown to HTML
    html = markdown.markdown(text, extensions=["extra", "nl2br"])

    # Restore highlights
    for key, replacement in placeholder_map.items():
        html = html.replace(key, replacement)

    return html