Merged
89 changes: 89 additions & 0 deletions AGENTS.md
@@ -0,0 +1,89 @@
# Project Agents.md guide for OpenAI Codex

This AGENTS.md file describes the project layout, coding conventions, and required checks for OpenAI Codex and other AI agents working with this codebase.

## Project structure for OpenAI Codex navigation

- `/api`: FastAPI route handlers and request/response schemas
- `/core`: Application core (auth helpers, configuration, celery setup, logging)
- `/database`: Database layer, async SQLAlchemy engine setup, and CRUD utilities
- `/models`: SQLAlchemy ORM models and domain models
- `/parsers`: File parsers for supported book formats (e.g., PDF, EPUB)
- `/services`: Business logic and storage backends (e.g., filesystem, MinIO)
- `/config`: Project configuration files (e.g., Redis settings)
- `/alembic`: Migration environment and versioned migration scripts
- `/storage`: Runtime file storage for uploaded/processed books (do not edit directly)
- `main.py`: FastAPI application entrypoint
- `pyproject.toml`: Tooling and dependency configuration
- `/tests`: Test modules (`test_*.py` / `tests_*.py`) for validating project functionality

## Coding conventions for OpenAI Codex

### General conventions for Agents.md implementation

- Use **Python 3.11+** for all code
- Follow **PEP 8** style, formatted with **Black** (line length 120) and **isort**
- Lint with **ruff**; fix reported issues before committing
- Write expressive function and variable names and include docstrings
- Add comments explaining non-trivial or performance-critical logic

### FastAPI and service module guidelines for OpenAI Codex

- Prefer `async`/`await` in endpoints and service functions
- Keep route handlers thin; delegate heavy logic to service modules
- Define request/response models with **Pydantic**
- Use snake_case filenames for modules (e.g., `book_service.py`)

### Database and migration guidelines for OpenAI Codex

- Define models using SQLAlchemy's declarative syntax
- Generate an Alembic migration for each schema change
- Keep migrations idempotent and clearly documented

### Parser and storage guidelines for OpenAI Codex

- Implement new parsers by extending `parsers.base_parser.BaseParser`
- Keep parser classes small and format-agnostic
- For storage backends, implement the `services.storage.storage_backend.StorageBackend` interface
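
As a rough illustration of the first bullet, a new parser might look like the sketch below. The real `parsers.base_parser.BaseParser` is not shown in this guide, so the `BaseParser` stand-in, its two abstract methods (named after methods that appear in the existing PDF and EPUB parsers), and the `TxtParser` class are all assumptions made for illustration.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from pathlib import Path
from typing import Any


class BaseParser(ABC):
    """Stand-in for parsers.base_parser.BaseParser (assumed interface)."""

    @abstractmethod
    def parse_metadata(self, file_path: Path) -> dict[str, Any]:
        """Return a metadata dictionary for the given book file."""

    @abstractmethod
    def extract_cover_image_data(self, file_path: Path) -> tuple[bytes, str] | None:
        """Return cover bytes plus a type string, or None when there is no cover."""


class TxtParser(BaseParser):
    """Hypothetical parser for plain-text books."""

    def parse_metadata(self, file_path: Path) -> dict[str, Any]:
        # Use the first non-empty line as the title; fall back to the file name.
        with file_path.open(encoding="utf-8", errors="replace") as handle:
            title = next((line.strip() for line in handle if line.strip()), file_path.stem)
        return {"title": title, "authors": [], "format": "TXT"}

    def extract_cover_image_data(self, file_path: Path) -> tuple[bytes, str] | None:
        # Plain-text files carry no embedded cover image.
        return None
```

Keeping format-specific logic inside the subclass leaves callers free to depend only on the base interface.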

## Testing requirements for OpenAI Codex

Run tests with the following commands:

```
# Run all tests
pytest

# Run a specific test file
pytest path/to/test_file.py

# Run tests with coverage
pytest --cov
```

## Pull request guidelines for OpenAI Codex

1. Provide a clear description of the change
2. Reference related issues or tickets
3. Ensure all tests pass and code is formatted and linted
4. Include API docs or example requests and responses for new endpoints
5. Keep each pull request focused on a single feature or fix

## Programmatic checks for OpenAI Codex

Before submitting changes, run:

```
# Lint check
ruff check .

# Format check
black --check .

# Type check
mypy .

# Run tests
pytest
```

All checks must pass before OpenAI Codex-generated code can be merged.
4 changes: 1 addition & 3 deletions api/v1/routes/auth.py
@@ -108,9 +108,7 @@ async def update_user_preferences(
db: AsyncSession = Depends(get_database),
current_user: User = Security(get_current_user),
):
-current_preferences = (
-current_user.preferences if isinstance(current_user.preferences, dict) else {}
-)
+current_preferences = current_user.preferences if isinstance(current_user.preferences, dict) else {}

new_preferences = {
**current_preferences,
4 changes: 1 addition & 3 deletions api/v1/routes/books.py
@@ -40,9 +40,7 @@ def construct_book_display(book_data: dict, request: Request) -> BookDisplay:
book_data = book_data.copy()

for cover in book_data["covers"]:
-cover["url"] = (
-f"{base_url}api/v1/books/{book_data['id']}/cover?variant={cover['variant']}"
-)
+cover["url"] = f"{base_url}api/v1/books/{book_data['id']}/cover?variant={cover['variant']}"

if "file_path" in book_data:
book_data["download_url"] = f"{base_url}api/v1/books/{book_data['id']}/download"
2 changes: 1 addition & 1 deletion api/v1/schemas/storage_schemas.py
@@ -1,6 +1,6 @@
from typing import Any

-from pydantic import BaseModel, ValidationError, field_validator
+from pydantic import BaseModel, field_validator, ValidationError


class FileStorageConfig(BaseModel):
5 changes: 2 additions & 3 deletions core/auth.py
@@ -4,6 +4,7 @@
from fastapi import Depends, HTTPException, status
from fastapi.security import APIKeyHeader, OAuth2PasswordBearer
from jose import JWTError, jwt
+
from passlib.context import CryptContext
from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy.future import select
@@ -42,9 +43,7 @@ def get_password_hash(password):
def create_access_token(data: dict, expires_delta: timedelta | None = None):
to_encode = data.copy()

-expire = datetime.now(UTC) + (
-expires_delta or timedelta(minutes=ACCESS_TOKEN_EXPIRE_MINUTES)
-)
+expire = datetime.now(UTC) + (expires_delta or timedelta(minutes=ACCESS_TOKEN_EXPIRE_MINUTES))

to_encode.update({"exp": expire})
encoded_jwt = jwt.encode(to_encode, SECRET_KEY, algorithm=ALGORITHM)
2 changes: 1 addition & 1 deletion models/storage.py
@@ -1,6 +1,6 @@
from typing import Any

-from sqlalchemy import JSON, Boolean, ForeignKey, String
+from sqlalchemy import Boolean, ForeignKey, JSON, String
from sqlalchemy.orm import Mapped, mapped_column

from database.base import Base
14 changes: 3 additions & 11 deletions parsers/epub_parser.py
@@ -92,20 +92,15 @@ def extract_cover_image_data(self, file_path: Path) -> tuple[bytes, str] | None:
cover_item = None

for item in book.get_items():
-if item.get_type() == ebooklib.ITEM_IMAGE and "cover-image" in (
-item.properties or []
-):
+if item.get_type() == ebooklib.ITEM_IMAGE and "cover-image" in (item.properties or []):
cover_item = item
break

if not cover_item:
for meta_info in book.get_metadata("OPF", "meta"):
if isinstance(meta_info, tuple) and len(meta_info) > 1:
attributes = meta_info[1]
-if (
-isinstance(attributes, dict)
-and attributes.get("name") == "cover"
-):
+if isinstance(attributes, dict) and attributes.get("name") == "cover":
cover_id = attributes.get("content")

if cover_id:
@@ -116,10 +111,7 @@ def extract_cover_image_data(self, file_path: Path) -> tuple[bytes, str] | None:
common_cover_names = ["cover.jpg", "cover.jpeg", "cover.png"]

for item in book.get_items_of_type(ebooklib.ITEM_IMAGE):
-if (
-item.get_name().lower() in common_cover_names
-or "cover" in item.get_name().lower()
-):
+if item.get_name().lower() in common_cover_names or "cover" in item.get_name().lower():
cover_item = item
break

16 changes: 4 additions & 12 deletions parsers/pdf_parser.py
@@ -20,9 +20,7 @@ def parse_metadata(self, file_path: Path) -> dict[str, Any]:
metadata["title"] = meta.get("title")

metadata["authors"] = [
-{"name": author}
-for author in meta.get("author", "").split(";")
-if author.strip()
+{"name": author} for author in meta.get("author", "").split(";") if author.strip()
]

metadata["publisher"] = meta.get("producer")
@@ -32,15 +30,9 @@ def parse_metadata(self, file_path: Path) -> dict[str, Any]:
from contextlib import suppress

with suppress(Exception):
-metadata["publication_date"] = (
-f"{pub_date[2:6]}-{pub_date[6:8]}-{pub_date[8:10]}"
-)
-
-metadata["tags"] = [
-tag.strip()
-for tag in meta.get("keywords", "").split(",")
-if tag.strip()
-]
+metadata["publication_date"] = f"{pub_date[2:6]}-{pub_date[6:8]}-{pub_date[8:10]}"
+
+metadata["tags"] = [tag.strip() for tag in meta.get("keywords", "").split(",") if tag.strip()]

metadata["format"] = "PDF"
metadata["page_count"] = doc.page_count
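
The `publication_date` slicing in this hunk appears to assume that `pub_date` is a PDF date string of the form `D:YYYYMMDDHHmmSS...`: after the two-character `D:` prefix, characters 2 to 6 are the year, 6 to 8 the month, and 8 to 10 the day. A minimal standalone sketch of that conversion follows. The helper name `pdf_date_to_iso` and the sample values are hypothetical, and unlike the parser code it validates the digits instead of suppressing exceptions.

```python
from __future__ import annotations


def pdf_date_to_iso(pub_date: str) -> str | None:
    """Convert a PDF date string such as "D:20230115093000Z" to "2023-01-15"."""
    if not pub_date.startswith("D:"):
        return None
    # Same slice positions as the parser: year, month, day after the "D:" prefix.
    year, month, day = pub_date[2:6], pub_date[6:8], pub_date[8:10]
    # Reject truncated or non-numeric input instead of emitting a partial date.
    if not (len(day) == 2 and (year + month + day).isdigit()):
        return None
    return f"{year}-{month}-{day}"


print(pdf_date_to_iso("D:20230115093000Z"))  # 2023-01-15
```

String slicing does not raise on short input, so without a check like this a truncated value such as `D:2023` would produce `2023--` rather than an error.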
8 changes: 5 additions & 3 deletions pyproject.toml
@@ -79,7 +79,7 @@ python_files = "test_*.py tests_*.py"
asyncio_mode = "auto"

[tool.ruff]
-line-length = 88
+line-length = 120
indent-width = 4
target-version = "py311"

@@ -125,6 +125,7 @@ fixable = ["ALL"]
unfixable = []

[tool.ruff.lint.isort]
+order-by-type = false
known-first-party = ["api", "core", "database", "models", "parsers", "services", "storage"]
force-sort-within-sections = true

@@ -135,12 +136,13 @@ skip-magic-trailing-comma = false
line-ending = "auto"

[tool.black]
-line-length = 88
+line-length = 120
target-version = ["py311"]

[tool.isort]
+order_by_type = false
profile = "black"
-line_length = 88
+line_length = 120
known_first_party = ["api", "core", "config", "database", "models", "parsers", "services", "storage"]

[tool.mypy]
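
The import reorderings in this diff (for example `BaseModel, field_validator, ValidationError`) follow from `order_by_type = false`: with the option off, imported names are sorted case-insensitively instead of being grouped as constants, then classes, then functions. The sketch below approximates both orderings in plain Python. It is an illustration of that behavior, not isort's or ruff's actual implementation, and `sort_imported_names` is a hypothetical helper.

```python
from __future__ import annotations


def sort_imported_names(names: list[str], order_by_type: bool) -> list[str]:
    """Approximate how imported names are ordered with and without order_by_type."""

    def key(name: str) -> tuple[int, str]:
        if not order_by_type:
            # Plain case-insensitive alphabetical order.
            return (0, name.lower())
        if name.isupper() and len(name) > 1:
            rank = 0  # ALL_CAPS names are treated as constants
        elif name[:1].isupper():
            rank = 1  # CamelCase names are treated as classes
        else:
            rank = 2  # everything else: functions and variables
        return (rank, name.lower())

    return sorted(names, key=key)


names = ["String", "JSON", "ForeignKey", "Boolean"]
print(sort_imported_names(names, order_by_type=True))
print(sort_imported_names(names, order_by_type=False))
```

With `order_by_type=True` the names from `models/storage.py` come out as `JSON, Boolean, ForeignKey, String`; with `False` they come out as `Boolean, ForeignKey, JSON, String`, which matches the change shown for that file.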
4 changes: 1 addition & 3 deletions services/book_service.py
@@ -343,9 +343,7 @@ async def get_books(
sort_order,
)

-return [
-BookInDB.model_validate(book.__dict__).model_dump() for book in books
-], int(count or 0)
+return [BookInDB.model_validate(book.__dict__).model_dump() for book in books], int(count or 0)

async def get_book_by_id(
self,