diff --git a/.github/workflows/deploy-docs.yml b/.github/workflows/deploy-docs.yml new file mode 100644 index 0000000..e297d54 --- /dev/null +++ b/.github/workflows/deploy-docs.yml @@ -0,0 +1,39 @@ +name: Deploy Documentation + +on: + push: + branches: [main] + paths: + - 'docs/**' + - '.github/workflows/deploy-docs.yml' + +permissions: + contents: read + pages: write + id-token: write + +jobs: + deploy: + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v4 + with: + submodules: true + fetch-depth: 0 + + - name: Setup Hugo + uses: peaceiris/actions-hugo@v2 + with: + hugo-version: 'latest' + extended: true + + - name: Build + run: cd docs && hugo --minify + + - name: Upload artifact + uses: actions/upload-pages-artifact@v3 + with: + path: ./docs/public + + - name: Deploy to GitHub Pages + uses: actions/deploy-pages@v4 diff --git a/.gitignore b/.gitignore index f1469b9..f86ba5b 100644 --- a/.gitignore +++ b/.gitignore @@ -24,3 +24,8 @@ go.work # Files with secrets .env tern.conf + +# Hugo build output +docs/public/ +docs/resources/ +docs/.hugo_build.lock diff --git a/.gitmodules b/.gitmodules new file mode 100644 index 0000000..6cb7e5c --- /dev/null +++ b/.gitmodules @@ -0,0 +1,3 @@ +[submodule "docs/themes/book"] + path = docs/themes/book + url = https://github.com/alex-shpak/hugo-book diff --git a/DOCUMENTATION_SUMMARY.md b/DOCUMENTATION_SUMMARY.md new file mode 100644 index 0000000..84c6f4d --- /dev/null +++ b/DOCUMENTATION_SUMMARY.md @@ -0,0 +1,192 @@ +# Hugo Documentation Structure - Complete + +## Summary + +Created **50 markdown files** with **17,102 lines** of comprehensive Hugo documentation for dhamps-vdb. 
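+
+Every page uses the same minimal Hugo front matter (`title` plus `weight` for ordering; see "Hugo Compatibility" below). As an illustrative sketch only (titles and weights vary per page), a file such as `getting-started/installation.md` begins like this:
+
+```yaml
+---
+title: "Installation"
+weight: 10
+---
+```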
+ +## Directory Structure + +``` +docs/content/ +├── _index.md (main landing page) +├── getting-started/ +│ ├── _index.md +│ ├── installation.md +│ ├── docker.md +│ ├── configuration.md +│ ├── quick-start.md +│ └── first-project.md +├── concepts/ +│ ├── _index.md +│ ├── architecture.md +│ ├── users-and-auth.md +│ ├── projects.md +│ ├── embeddings.md +│ ├── llm-services.md +│ ├── similarity-search.md +│ └── metadata.md +├── guides/ +│ ├── _index.md +│ ├── rag-workflow.md +│ ├── project-sharing.md +│ ├── public-projects.md +│ ├── ownership-transfer.md +│ ├── metadata-validation.md +│ ├── metadata-filtering.md +│ ├── batch-operations.md +│ └── instance-management.md +├── api/ +│ ├── _index.md +│ ├── authentication.md +│ ├── query-parameters.md +│ ├── patch-updates.md +│ ├── error-handling.md +│ └── endpoints/ +│ ├── _index.md +│ ├── users.md +│ ├── projects.md +│ ├── llm-services.md +│ ├── api-standards.md +│ ├── embeddings.md +│ └── similars.md +├── deployment/ +│ ├── _index.md +│ ├── docker.md +│ ├── database.md +│ ├── environment-variables.md +│ └── security.md +├── development/ +│ ├── _index.md +│ ├── testing.md +│ ├── contributing.md +│ ├── architecture.md +│ └── performance.md +└── reference/ + ├── _index.md + ├── configuration.md + ├── database-schema.md + └── roadmap.md +``` + +## Content Sources + +Documentation was migrated and organized from: +- README.md - Main content, API overview, features +- DOCKER.md - Complete Docker deployment guide +- docs/PUBLIC_ACCESS.md - Public project access (current state) +- docs/METADATA_SCHEMA_EXAMPLES.md - Validation examples +- docs/PERFORMANCE_OPTIMIZATION.md - Performance notes +- docs/LLM_SERVICE_REFACTORING.md - Current state only (no history) +- internal/models/options.go - Configuration options + +## Quality Standards Met + +✅ **Present tense** - All content describes current state +✅ **User-focused** - Written for end users, not implementers +✅ **Practical examples** - Extensive curl command examples throughout +✅ 
**Hugo front matter** - All files have proper title and weight
+✅ **Information preserved** - No content lost from source files
+✅ **No implementation history** - Technical evolution removed
+✅ **Metadata filtering** - Correctly documented as EXCLUDE (negative matching)
+
+## Key Features Documented
+
+### Getting Started (6 files)
+- Installation from source
+- Docker deployment (comprehensive guide)
+- Configuration with environment variables
+- Quick start curl tutorial
+- Complete first project walkthrough
+
+### Concepts (8 files)
+- System architecture overview
+- Users and authentication
+- Project management
+- Embeddings storage
+- LLM services (definitions vs instances)
+- Similarity search algorithms
+- Metadata handling
+
+### Guides (9 files)
+- RAG workflow implementation
+- Project sharing and collaboration
+- Public project access
+- Project ownership transfer
+- Metadata schema validation
+- Metadata filtering (exclusion)
+- Batch operations
+- LLM service instance management
+
+### API Reference (12 files)
+- Authentication methods
+- Complete endpoint documentation:
+  - Users management
+  - Projects CRUD
+  - LLM services configuration
+  - API standards definitions
+  - Embeddings storage
+  - Similarity search (with metadata filtering)
+- Query parameters
+- PATCH updates
+- Error handling
+
+### Deployment (5 files)
+- Docker deployment guide
+- PostgreSQL database setup
+- Environment variables reference
+- Security best practices
+
+### Development (5 files)
+- Testing guide
+- Contributing guidelines
+- Architecture deep-dive
+- Performance optimization notes
+
+### Reference (4 files)
+- Configuration complete reference
+- Database schema documentation
+- Product roadmap
+
+## Special Notes
+
+1. **LLM Services Documentation**: Extracted CURRENT STATE from LLM_SERVICE_REFACTORING.md, removed all implementation history and migration notes.
+
+2. 
**Metadata Filtering**: Correctly documented that `metadata_path` and `metadata_value` perform NEGATIVE MATCHING (exclude documents with matching values). + +3. **Public Projects**: Focused on user functionality from PUBLIC_ACCESS.md, removed implementation details. + +4. **Docker Guide**: Comprehensive guide created from DOCKER.md with added troubleshooting and verification steps. + +5. **Configuration**: Complete documentation of all environment variables from options.go plus ENCRYPTION_KEY. + +## Hugo Compatibility + +All files include: +- Front matter with `title` field +- Front matter with `weight` field for ordering +- Markdown headers starting at level 1 (#) +- Valid markdown syntax +- Relative links between pages + +## Next Steps + +1. Configure Hugo site with theme +2. Set up navigation menus +3. Configure search functionality +4. Add site configuration (config.toml/yaml) +5. Deploy to hosting platform +6. Set up CI/CD for automatic deployment + +## Verification + +All 50 files verified present: +```bash +find docs/content -name "*.md" | wc -l +# Output: 50 +``` + +Total lines of documentation: +```bash +wc -l docs/content/**/*.md | tail -1 +# Output: 17102 total +``` diff --git a/README.md b/README.md index 874ae5f..6a96dcc 100644 --- a/README.md +++ b/README.md @@ -3,707 +3,193 @@ Vector Database for the DH at Max Planck Society initiative [![Go Report Card](https://goreportcard.com/badge/github.com/mpilhlt/dhamps-vdb?style=flat-square)](https://goreportcard.com/report/github.com/mpilhlt/dhamps-vdb) [![Release](https://img.shields.io/github/v/release/mpilhlt/dhamps-vdb.svg?style=flat-square&include_prereleases)](https://github.com/mpilhlt/dhamps-vdb/releases/latest) - - ## Introduction -This is an application serving an API to handle embeddings. It stores embeddings in a PostgreSQL backend and uses its vector support, but allows you to manage different users, projects, and LLM configurations via a simple Restful API. 
- -The typical use case is as a component of a Retrieval Augmented Generation (RAG) workflow: You create embeddings for a collection of text snippets and upload them to this API. For each text snippet, you upload a text identifier, the embeddings vector and, optionally, metadata or the text itself. Then, you can +dhamps-vdb is a PostgreSQL-backed vector database with pgvector support, providing a RESTful API for managing embeddings in Retrieval Augmented Generation (RAG) workflows. Store embeddings for text snippets with metadata, then find similar content using cosine similarity search. -- `GET` the most similar texts for a text that is already in the database by specifying the text's identifier in a URL -- `POST` raw embeddings to find similar texts without storing the query embeddings in the database - -In both cases, the service returns a list of text identifiers along with their similarity scores that you can then use in your own processing, perhaps based on other means of providing the respective texts. +The typical use case is as a RAG component: Create embeddings for your text collection, upload them with identifiers and optional metadata, then query for similar texts either by identifier (`GET`) or by posting raw embeddings (`POST`). The service returns text identifiers with similarity scores for use in your application. ## Features -- OpenAPI documentation -- Supports different embeddings configurations (e.g. 
dimensions) -- Rights management (authentication via API token) -- Automatic validation of embeddings and metadata +### Core Capabilities +- **PostgreSQL with pgvector backend** - Reliable, scalable vector storage +- **RESTful API** - OpenAPI-documented endpoints +- **Docker deployment ready** - Includes PostgreSQL with pgvector +- **Comprehensive test coverage** - Integration tests with testcontainers + +### Multi-User & Access Control +- **Multi-user support** - Role-based access control (admin, owner, reader, editor) +- **Project sharing** - Collaborate with specific users +- **Public access mode** - Enable unauthenticated read access for projects +- **Project ownership transfer** - Transfer projects between users -## Getting started +### LLM & Embedding Management +- **LLM service management** - Service definitions and instances with encrypted API keys +- **Multiple embedding configurations** - Support for different dimensions +- **Automatic dimension validation** - Ensures vector consistency +- **Flexible instance sharing** - Share LLM service instances across users -### Docker Deployment (Recommended) +### Data Validation & Search +- **JSON Schema-based metadata validation** - Enforce metadata structure +- **Metadata filtering in similarity search** - Exclude documents by metadata field values +- **PATCH support** - Partial updates for projects and embeddings +- **Configurable thresholds** - Control similarity search results -The easiest way to run dhamps-vdb is using Docker. See [DOCKER.md](./DOCKER.md) for comprehensive Docker deployment instructions. +## Quick Start -Quick start with Docker: +### 1. 
Start with Docker ```bash # Automated setup (generates secure keys) ./docker-setup.sh -# Start with docker-compose (includes PostgreSQL with pgvector) +# Start services (includes PostgreSQL with pgvector) docker-compose up -d -# Access the API +# Access the API documentation curl http://localhost:8880/docs ``` -### Compiling from Source - -For **compiling**, you should run `go get ./... ; sqlc generate --no-remote ; go build -o build/dhamps-vdb main.go` (or in place of the last command you can also run it directly with `go run main.go`). - -#### Run tests - -Actual (mostly integration) tests are run like this: - -```bash -$> systemctl --user start podman.socket -$> export DOCKER_HOST=unix://$XDG_RUNTIME_DIR/podman/podman.sock -$> go test -v ./... -``` - -These tests do not contact any separately installed/launched backend, and instead have a container managed by the testing itself (via [testcontainers](https://testcontainers.com/guides/getting-started-with-testcontainers-for-go/)). - -### Running - -For **running**, you need to set an admin key and a couple of other variables. Have a look in [options.go](./internal/models/options.go) for the full list and documentation. - -You can specify them as command line options, with environment values or in an `.env` file. This last option is recommended because this way, the sensitive information will not be in the command history; but you have to take care of this file's security. For instance, it will/should not be synced to your versioning platform, it is thus listed in `.gitignore` by default. - -If you authenticate with this key, you can create users by `POST`ing to the `/v1/users` endpoint. (The response to user creation will contain an API key for the new user. **Keep it safe, it is not stored anywhere and cannot be recovered.**) Then, either the admin or the user can create projects, llm services and finally post embeddings and get similar elements. 
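+
+Before the next step, note that `YOUR_ADMIN_KEY` is the admin key you configured when starting the server (via command-line options, environment variables, or an `.env` file; see `internal/models/options.go` for the full list of options). Purely as a convenience, you can keep it in a shell variable for the following commands (the variable name here is just an illustration):
+
+```bash
+# Illustrative: export the configured admin key so later commands can reuse it.
+export ADMIN_KEY="your-admin-key"
+```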
- -When launching, the application checks and migrates the database schema to the appropriate version if possible. It presupposes however, that a suitable database and user (with appropriate privileges) have been created beforehand. SQL commands to prepare the database are listed below. For other ways of running, e.g. for tests or with a container instead of an external database, see below as well. - -#### Run with local container - -A local container with a pg_vector-enabled postgresql can be run like this: - -```bash -$> podman run -p 8888:5432 -e POSTGRES_PASSWORD=password pgvector/pgvector:0.7.4-pg16 -``` - -But be aware that the filesystem is not persisted if you run it like this. That means that when you stop and remove the container, you will have to repeat the following database setup when you run it again later on. (And of course any data you may have saved inside the container is lost, too.) - -You can connect to it from a second terminal like so: - -```bash -$> psql -p 8888 -h localhost -U postgres -d postgres -``` - -And then set up the database like this: - -```sql -postgres=# CREATE DATABASE my_vectors; -postgres=# CREATE USER my_user WITH PASSWORD 'my-password'; -postgres=# GRANT ALL PRIVILEGES ON DATABASE "my_vectors" to my_user; -postgres=# \c my_vectors -postgres=# GRANT ALL ON SCHEMA public TO my_user; -postgres=# CREATE EXTENSION IF NOT EXISTS vector; -``` - -### Client Authentication - -Clients should communicate the API key in the `Authorization` header with a `Bearer` prefix, e.g. `Bearer 024v2013621509245f2e24`. Most operations can only be done by the (admin or the) owner of the resource in question. - -**Project Sharing**: Projects can be shared with specific users for read or edit access. See the [Project Sharing](#project-sharing) section below for details on sharing projects with individual users. 
- -**Public Access**: Projects can also be made publicly accessible (allowing unauthenticated read access to embeddings and similars) by setting the `public_read` field to `true` when creating or updating the project. See [docs/PUBLIC_ACCESS.md](./docs/PUBLIC_ACCESS.md) for details. - -## Project Sharing - -Projects can be shared with other users to enable collaboration. The project owner can grant two levels of access: - -- **reader**: Read-only access to embeddings and similar documents -- **editor**: Read and write access to embeddings (can add/modify/delete embeddings) - -### Sharing During Project Creation - -When creating a project, you can specify users to share with using the `shared_with` field: - -```json -{ - "project_handle": "my-project", - "description": "A collaborative project", - "instance_owner": "alice", - "instance_handle": "embedding1", - "shared_with": [ - { - "user_handle": "bob", - "role": "reader" - }, - { - "user_handle": "charlie", - "role": "editor" - } - ] -} -``` - -### Managing Sharing After Creation - -The project owner can manage sharing through dedicated endpoints: - -**Share a project with a user:** +### 2. Create a User ```bash -POST /v1/projects/{owner}/{project}/share -{ - "share_with_handle": "bob", - "role": "reader" -} -``` - -**Unshare a project from a user:** +curl -X POST http://localhost:8880/v1/users \ + -H "Authorization: Bearer YOUR_ADMIN_KEY" \ + -H "Content-Type: application/json" \ + -d '{"user_handle": "alice", "name": "Alice Smith"}' -```bash -DELETE /v1/projects/{owner}/{project}/share/{user_handle} +# Response includes: {"vdb_key": "alice_abc123..."} +# ⚠️ Save the vdb_key! It cannot be recovered. ``` -**List users a project is shared with:** +### 3. Create an LLM Service Instance ```bash -GET /v1/projects/{owner}/{project}/shared-with +# Use a system-provided definition (openai-large, openai-small, etc.) 
+curl -X PUT http://localhost:8880/v1/llm-instances/alice/my-openai \ + -H "Authorization: Bearer ALICE_VDB_KEY" \ + -H "Content-Type: application/json" \ + -d '{ + "definition_owner": "_system", + "definition_handle": "openai-large", + "api_key_encrypted": "YOUR_OPENAI_API_KEY" + }' ``` -Only the project owner can view the list of shared users and manage sharing. Users who have been granted access to a project cannot see which other users also have access. - -## Project Ownership Transfer - -The project owner can transfer ownership of a project to another user. This is useful when: -- A project maintainer is leaving and wants to hand over control -- Organizational changes require reassigning project ownership -- Consolidating projects under a different user account - -**Transfer ownership:** +### 4. Create a Project ```bash -POST /v1/projects/{owner}/{project}/transfer-ownership -{ - "new_owner_handle": "new_owner" -} +curl -X POST http://localhost:8880/v1/projects/alice \ + -H "Authorization: Bearer ALICE_VDB_KEY" \ + -H "Content-Type: application/json" \ + -d '{ + "project_handle": "my-texts", + "description": "My text embeddings", + "instance_owner": "alice", + "instance_handle": "my-openai" + }' ``` -**Important notes:** -- Only the current owner can transfer ownership -- The new owner must be an existing user -- The new owner cannot already have a project with the same handle -- After transfer, the old owner will lose all access to the project -- If the new owner was previously shared with the project, their sharing role will be upgraded to owner -- All embeddings and other project data remain intact during transfer - -**Example:** +### 5. 
Upload Embeddings ```bash -# Alice transfers her project to Bob -curl -X POST "https://api.example.com/v1/projects/alice/my-project/transfer-ownership" \ - -H "Authorization: Bearer alice_api_key" \ +curl -X POST http://localhost:8880/v1/embeddings/alice/my-texts \ + -H "Authorization: Bearer ALICE_VDB_KEY" \ -H "Content-Type: application/json" \ -d '{ - "new_owner_handle": "bob" + "embeddings": [{ + "text_id": "doc1", + "instance_handle": "my-openai", + "vector": [0.1, 0.2, 0.3, ...], + "vector_dim": 3072, + "metadata": {"author": "John Doe", "year": 2024} + }] }' - -# After transfer, the project is accessible at /v1/projects/bob/my-project ``` -### Shared User Access - -Once a project is shared with a user, they can: -- View project metadata -- Retrieve embeddings (reader and editor) -- Search for similar documents (reader and editor) -- Add, modify, or delete embeddings (editor only) - -Shared users **cannot**: -- Delete the project -- Change project settings -- Manage sharing (add/remove other users) -- View the list of other users the project is shared with - -## Data Validation - -The API provides automatic validation to ensure data quality and consistency: - -### Embeddings Dimension Validation - -When uploading embeddings, the system automatically validates: - -1. **Vector dimension consistency**: The `vector_dim` field in your embeddings must match the `dimensions` configured in the LLM service being used. -2. **Vector length verification**: The actual number of elements in the `vector` array must match the declared `vector_dim`. - -If validation fails, you'll receive a `400 Bad Request` response with a detailed error message explaining the mismatch. 
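+
+Uploads are validated automatically: the declared `vector_dim` must match the dimensions configured for the project's LLM service instance, and the `vector` array must contain exactly that many elements. A mismatch is rejected with `400 Bad Request`; the error body looks roughly like this (the `detail` string is illustrative):
+
+```json
+{
+  "title": "Bad Request",
+  "status": 400,
+  "detail": "vector dimension mismatch: expected 3072 dimensions, got 3"
+}
+```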
- -**Example error response:** - -```json -{ - "title": "Bad Request", - "status": 400, - "detail": "dimension validation failed: vector dimension mismatch: embedding declares 3072 dimensions but LLM service 'openai-large' expects 5 dimensions" -} -``` - -### Similarity Query Dimension Filtering - -When querying for similar embeddings, the system automatically filters results to only include embeddings with matching dimensions. This ensures that similarity comparisons are only made between vectors of the same dimensionality, preventing invalid comparisons. - -The similarity queries enforce: -- Only embeddings with matching `vector_dim` are compared -- Only embeddings from the same project are considered -- Vector similarity is calculated using cosine distance on compatible dimensions - -### Metadata Schema Validation - -Projects can optionally define a JSON Schema to validate metadata attached to embeddings. This ensures that all embeddings in a project have consistent, well-structured metadata. - -#### Defining a Metadata Schema - -Include a `metadataScheme` field when creating or updating a project with a valid JSON Schema: - -```json -{ - "project_handle": "my-project", - "description": "Project with metadata validation", - "metadataScheme": "{\"type\":\"object\",\"properties\":{\"author\":{\"type\":\"string\"},\"year\":{\"type\":\"integer\"}},\"required\":[\"author\"]}" -} -``` - -The schema above requires an `author` field (string) and allows an optional `year` field (integer). - -#### Schema Validation on Upload - -When uploading embeddings to a project with a metadata schema, the API validates each embedding's metadata against the schema. 
If validation fails, you'll receive a detailed error message: - -**Example error response:** - -```json -{ - "title": "Bad Request", - "status": 400, - "detail": "metadata validation failed for text_id 'doc123': metadata validation failed:\n - author: author is required" -} -``` - -### Admin Sanity Check - -Administrators can verify database integrity using the `/v1/admin/sanity-check` endpoint. This endpoint: - -- Checks all embeddings have dimensions matching their LLM service -- Validates all metadata against project schemas (if defined) -- Reports issues and warnings in a structured format - -**Example sanity check request:** +### 6. Find Similar Documents ```bash -curl -X GET http://localhost:8080/v1/admin/sanity-check \ - -H "Authorization: ******" -``` - -**Example response:** - -```json -{ - "status": "PASSED", - "total_projects": 5, - "issues_count": 0, - "warnings_count": 1, - "warnings": [ - "Project alice/project1 has 100 embeddings but no metadata schema defined" - ] -} -``` - -Status values: -- `PASSED`: No issues or warnings found -- `WARNING`: No critical issues, but warnings exist -- `FAILED`: Validation issues found that need attention - -#### Example Metadata Schemas - -**Simple schema with required fields:** - -```json -{ - "type": "object", - "properties": { - "author": {"type": "string"}, - "year": {"type": "integer"}, - "language": {"type": "string"} - }, - "required": ["author", "year"] -} -``` - -**Schema with nested objects:** - -```json -{ - "type": "object", - "properties": { - "author": { - "type": "object", - "properties": { - "name": {"type": "string"}, - "id": {"type": "string"} - }, - "required": ["name"] - }, - "publication": { - "type": "object", - "properties": { - "year": {"type": "integer"}, - "title": {"type": "string"} - } - } - }, - "required": ["author"] -} -``` - -**Schema with enums and constraints:** - -```json -{ - "type": "object", - "properties": { - "genre": { - "type": "string", - "enum": ["fiction", "non-fiction", 
"poetry", "drama"] - }, - "rating": { - "type": "number", - "minimum": 0, - "maximum": 5 - }, - "tags": { - "type": "array", - "items": {"type": "string"} - } - } -} +# Get documents similar to doc1 +curl "http://localhost:8880/v1/similars/alice/my-texts/doc1?threshold=0.7&limit=5" \ + -H "Authorization: Bearer ALICE_VDB_KEY" ``` -For more information on JSON Schema syntax, see [json-schema.org](https://json-schema.org/). - -## API documentation - -### API Versioning - -We are at `v1`. The first path component to all the endpoints (except for the OpenAPI file) is the version number, e.g. `POST https:///v1/embeddings//`. - -### Endpoints - -In the following table, the version number is skipped for readibility reasons. Nevertheless, it is the first component of all these endpoints. - -For a more detailed, and always up-to-date documentation of the endpoints, including query parameters, return values and data schemes, see the automatically generated live OpenAPI document at `/openapi.yaml` or the browsable version at `/docs`. 
- -| Endpoint | Method | Description | Allowed Users | -| -------- | ------ | ----------- | ------------- | -| /admin/footgun | GET | Reset Database: Remove all records from database and reset serials/counters | admin | -| /admin/sanity-check | GET | Verify all data in database conforms to schemas and dimension requirements | admin | -| /users | GET | Get all users (list of handles) registered with the Db | admin | -| /users | POST | Register a new user with the Db | admin | -| /users/\ | GET | Get information about user \ | admin, \ | -| /users/\ | PUT | Register a new user with the Db | admin | -| /users/\ | DELETE | Delete a user and all their projects/llm services from the Db | admin, \ | -| /projects/\ | GET | Get all projects (objects) for user \ | admin, \ | -| /projects/\ | POST | Register a new project for user \ | admin, \ | -| /projects/\/\ | GET | Get project information for \'s project \ | admin, \, authorized readers | -| /projects/\/\ | PUT | Register a new project calles \ for user \ | admin, \ | -| /projects/\/\ | DELETE | Delete \'s project \ | admin, \ | -| /projects/\/\/share | POST | Share \'s project \ with another user | admin, \ | -| /projects/\/\/share/\ | DELETE | Unshare \'s project \ from \ | admin, \ | -| /projects/\/\/shared-with | GET | Get list of users \'s project \ is shared with | admin, \ | -| /llm-services/\ | GET | Get all LLM services (objects) for user \ | admin, \ | -| /llm-services/\ | POST | Register a new LLM service for user \ | admin, \ | -| /llm-services/\/ | GET | Get information about LLM service of user \ | admin, \ | -| /llm-services/\/ | PUT | Register a new LLM service called for user \ | admin, \ | -| /llm-services/\/ | DELETE | Delete \'s LLM service | admin, \ | -| /api-standards | GET | Get all defined API standards* | public | -| /api-standards | POST | Register a new API standard* | admin | -| /api-standards/\ | GET | Get information about API standard* \ | public | -| /api-standards/\ | PUT | Register a 
new API standard* \ | admin | -| /api-standards/\ | DELETE | Delete API standard* \ | admin | -| /embeddings/\/\ | GET | Get all embeddings for \'s project \ (use `limit` and `offset` for paging) | admin, \, authorized readers | -| /embeddings/\/\ | POST | Register a new record with an embeddings vector for \'s project \ | admin, \ | -| /embeddings/\/\ | DELETE | Delete ***all*** embeddings for \'s project \ | admin, \ | -| /embeddings/\/\/\ | GET | Get embeddings and other information about text \ from \'s project \ | admin, \, authorized readers | -| /embeddings/\/\/\ | DELETE | Delete record \ from \'s project \ | admin, \ | -| /similars/\/\/\ | GET | Get a list of documents similar to the text \ in \'s project \, with similarity scores | admin, \, authorized readers | -| /similars/\/\ | POST | Find similar documents using raw embeddings without storing them, with similarity scores | admin, \, authorized readers | - -\* API standards are definitions of how to access an LLM Service: API endpoints, authentication mechanism etc. They are referred to from LLM Service definitions. When LLM Processing will be attempted, this is what will be implemented. Examples are the Cohere Embed API, Version 2, as documented in , or the OpenAI Embeddings API, Version 1, as documented in . You can find these examples in the [valid_api_standard\*.json](./testdata/) files in the `testdata` directory. - -### Similarity Search - -The API provides two endpoints for finding similar documents using vector similarity: - -#### GET Similar Documents (from stored embeddings) - -Find documents similar to an already-stored document by its identifier: +### 7. 
Filter by Metadata ```bash -GET /v1/similars/{username}/{projectname}/{identifier} +# Exclude documents from the same author +curl "http://localhost:8880/v1/similars/alice/my-texts/doc1?threshold=0.7&metadata_path=author&metadata_value=John%20Doe" \ + -H "Authorization: Bearer ALICE_VDB_KEY" ``` -**Query Parameters:** -- `count` (optional, default: 10, max: 200): Number of similar documents to return -- `threshold` (optional, default: 0.5, range: 0-1): Minimum similarity score threshold -- `limit` (optional, default: 10, max: 200): Maximum number of results to return -- `offset` (optional, default: 0): Pagination offset -- `metadata_path` (optional): Filter results by metadata field path (must be used with `metadata_value`) -- `metadata_value` (optional): Metadata value to exclude from results (must be used with `metadata_path`) +## Getting Started -**Example:** +📚 **[Read the Full Documentation](https://mpilhlt.github.io/dhamps-vdb/)** -```bash -curl -X GET "https:///v1/similars/alice/myproject/doc123?count=5&threshold=0.7" \ - -H "Authorization: Bearer " -``` - -#### POST Similar Documents (from raw embeddings) - -Find similar documents by submitting a raw embedding vector without storing it in the database: +- **[Installation Guide](https://mpilhlt.github.io/dhamps-vdb/getting-started/installation/)** - Build from source +- **[Docker Guide](https://mpilhlt.github.io/dhamps-vdb/getting-started/docker/)** - Detailed Docker deployment (also see [DOCKER.md](./DOCKER.md)) +- **[Configuration](https://mpilhlt.github.io/dhamps-vdb/getting-started/configuration/)** - Environment variables and options +- **[Your First Project](https://mpilhlt.github.io/dhamps-vdb/getting-started/first-project/)** - Complete walkthrough +- **[API Reference](https://mpilhlt.github.io/dhamps-vdb/api/)** - Complete API documentation -```bash -POST /v1/similars/{username}/{projectname} -``` +## Key Concepts -**Request Body:** +- 
**[Architecture](https://mpilhlt.github.io/dhamps-vdb/concepts/architecture/)** - How dhamps-vdb works +- **[Users & Authentication](https://mpilhlt.github.io/dhamps-vdb/concepts/users-and-auth/)** - API keys and roles +- **[Projects](https://mpilhlt.github.io/dhamps-vdb/concepts/projects/)** - Organizing embeddings +- **[LLM Services](https://mpilhlt.github.io/dhamps-vdb/concepts/llm-services/)** - Definitions vs. instances +- **[Similarity Search](https://mpilhlt.github.io/dhamps-vdb/concepts/similarity-search/)** - How it works -```json -{ - "vector": [0.1, 0.2, 0.3, ...] -} -``` +## Common Tasks -The vector must be an array of float32 values with dimensions matching the project's LLM service instance configuration. +- **[RAG Workflow](https://mpilhlt.github.io/dhamps-vdb/guides/rag-workflow/)** - End-to-end example +- **[Project Sharing](https://mpilhlt.github.io/dhamps-vdb/guides/project-sharing/)** - Collaborate with others +- **[Metadata Filtering](https://mpilhlt.github.io/dhamps-vdb/guides/metadata-filtering/)** - Exclude documents in search +- **[Metadata Validation](https://mpilhlt.github.io/dhamps-vdb/guides/metadata-validation/)** - Enforce schemas +- **[Public Projects](https://mpilhlt.github.io/dhamps-vdb/guides/public-projects/)** - Unauthenticated access -**Query Parameters:** Same as GET endpoint above. 
+## Development -**Example:** +### Building from Source ```bash -curl -X POST "https:///v1/similars/alice/myproject?count=10&threshold=0.8" \ - -H "Authorization: Bearer " \ - -H "Content-Type: application/json" \ - -d '{ - "vector": [-0.020850, 0.018522, 0.053270, 0.071384, 0.020003] - }' -``` - -#### Response Format - -Both similarity endpoints return the same response format with document identifiers and their similarity scores: - -```json -{ - "$schema": "http://localhost:8080/schemas/SimilarResponseBody.json", - "user_handle": "alice", - "project_handle": "myproject", - "results": [ - { - "id": "doc456", - "similarity": 0.95 - }, - { - "id": "doc789", - "similarity": 0.87 - }, - { - "id": "doc321", - "similarity": 0.82 - } - ] -} -``` - -**Response Fields:** - -- `user_handle`: The project owner's username -- `project_handle`: The project identifier -- `results`: Array of similar documents, ordered by similarity (highest first) - - `id`: Document identifier - - `similarity`: Cosine similarity score (0-1, where 1 is most similar) - -#### Dimension Validation +# Install dependencies and generate code +go get ./... +sqlc generate --no-remote -When using the POST endpoint, the API automatically validates that: -1. The project has an associated LLM service instance -2. The submitted vector dimensions match the LLM service instance's configured dimensions -3. If dimensions don't match, a `400 Bad Request` error is returned with details +# Build +go build -o build/dhamps-vdb main.go -**Example error:** - -```json -{ - "title": "Bad Request", - "status": 400, - "detail": "vector dimension mismatch: expected 1536 dimensions, got 768" -} +# Or run directly +go run main.go ``` -#### Metadata Filtering +### Running Tests -Both endpoints support filtering results by metadata fields. 
The filter uses negative matching (excludes documents where the metadata field matches the specified value): +Tests use testcontainers for integration testing: ```bash -# Exclude documents with author="John Doe" -curl -X GET "https:///v1/similars/alice/myproject/doc123?metadata_path=author&metadata_value=John%20Doe" \ - -H "Authorization: Bearer " -``` - -This is useful for excluding documents from the same source, author, or category when finding similar content. - -### Partial Updates with PATCH - -For resources that support both GET and PUT operations, PATCH requests are automatically available for partial updates. You only need to include the fields you want to change. This is particularly useful for updating single fields without having to provide all resource data. - -#### **Supported resources:** - -- Users: `/v1/users/{username}` -- Projects: `/v1/projects/{username}/{projectname}` -- LLM Services: `/v1/llm-services/{username}/{llm_servicename}` -- API Standards: `/v1/api-standards/{standardname}` - -#### **Example: Enable world-readable access for a project** +# Start container runtime (if using podman) +systemctl --user start podman.socket +export DOCKER_HOST=unix://$XDG_RUNTIME_DIR/podman/podman.sock -```bash -curl -X PATCH https:///v1/projects/alice/myproject \ - -H "Authorization: Bearer " \ - -H "Content-Type: application/json" \ - -d '{"shared_with": ["*"]}' +# Run tests +go test -v ./... ``` -#### **Example: Update project description** +For more details, see the **[Testing Guide](https://mpilhlt.github.io/dhamps-vdb/development/testing/)**. -```bash -curl -X PATCH https:///v1/projects/alice/myproject \ - -H "Authorization: Bearer " \ - -H "Content-Type: application/json" \ - -d '{"description": "Updated project description"}' -``` - -The PATCH endpoint merges your changes with the existing resource data retrieved via GET, then applies the update via PUT. 
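To run the server locally (e.g. with `go run main.go`), configuration is read from the environment; the repository ships a `template.env` to copy from, and `.env` itself is gitignored. A minimal `.env` sketch — the variable names follow the deployment documentation, and all values are placeholders:

```bash
# .env — minimal local configuration (all values are placeholders)
DATABASE_URL=postgresql://user:password@localhost:5432/dhamps_vdb
ADMIN_KEY=replace-with-a-strong-admin-key
# Needed if API keys of LLM service instances should be stored encrypted
ENCRYPTION_KEY=replace-with-a-random-string-of-at-least-32-characters
```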
- -## Code creation and structure - -This API is programmed in go and uses the [huma](https://huma.rocks/) framework with go's stock `http.ServeMux()` routing. - -Some initial code and some later bugfixes have been developed in dialogue with [ChatGPT](./docs/ChatGPT.md). After manual inspection and correction, this is the project structure: - -```default -dhamps-vdb/ -├── .env // This is not distributed because it's in .gitignore -├── .gitignore -├── .repopackignore -├── LICENSE -├── README.md -├── go.mod -├── go.sum -├── main.go -├── repopack-output.xml -├── repopack.config.json -├── sqlc.yaml -├── template.env -├── api/ -│ └── openapi.yml // OpenAPI spec file, not up to date -├── docs/ -│ └── ChatGPT.md // Code as suggested by ChatGPT (GPT4 turbo and GPT4o) on 2024-06-09 -├── internal/ -│ ├── auth/ -│ │ └── authenticate.go -│ ├── database/ -│ │ ├── migrations/ -│ │ │ ├── 001_create_initial_scheme.sql -│ │ │ ├── 002_create_emb_index.sql -│ │ │ ├── tern.conf // This is not distributed because it's in .gitignore -│ │ │ └── tern.conf.tpl -│ │ ├── queries/ -│ │ │ └── queries.sql -│ │ ├── database.go -│ │ ├── db.go // This is auto-generated by sqlc -│ │ ├── migrations.go -│ │ ├── models.go // This is auto-generated by sqlc -│ │ └── queries.sql.go // This is auto-generated by sqlc -│ ├── handlers/ -│ │ ├── admin.go -│ │ ├── admin_test.go -│ │ ├── api_standards.go -│ │ ├── api_standards_test.go -│ │ ├── embeddings.go -│ │ ├── embeddings_test.go -│ │ ├── handlers.go -│ │ ├── handlers_test.go -│ │ ├── llm_processes.go -│ │ ├── instances.go -│ │ ├── llm_services_test.go -│ │ ├── projects.go -│ │ ├── projects_test.go -│ │ ├── similars.go -│ │ ├── users.go -│ │ └── users_test.go -│ └── models/ -│ ├── admin.go -│ ├── api_standards.go -│ ├── embeddings.go -│ ├── llm_processes.go -│ ├── instances.go -│ ├── options.go -│ ├── projects.go -│ ├── similars.go -│ └── users.go -├── testdata/ -│ ├── postgres/ -│ │ ├── enable-vector.sql -│ │ └── users.yml -│ ├── invalid_api_standard.json 
-│ ├── invalid_embeddings.json -│ ├── ... -│ ├── valid_api_standard_cohere_v2.json -│ ├── valid_api_standard_ollama.json -│ ├── valid_api_standard_openai_v1.json -│ ├── valid_embeddings.json -│ ├── valid_llm_service_cohere-multilingual-3.json -│ ├── valid_llm_service_openai-large-full.json -│ ├── ... -│ └── valid_user.json -└── web/ // Web resources for the html response (in the future) -``` +## Contributing -## Roadmap - -- [ ] Revisit all **documentation** - - [ ] Add documentation for metadata filtering of similars (the GET query parameters are called `metadata_path` and `metadata_value` as in: `https://xy.org/vdb-api/v1/similars/sal/sal-openai-large/https%3A%2F%2Fid.myproject.net%2Ftexts%2FW0011%3A1.3.1.3.1?threshold=0.7&limit=5&metadata_path=author_id&metadata_value=A0083`) -- [ ] Network connectivity - - [ ] Implement and make consequent use of **max_idle** (5), **max_concurr** (5), **timeouts**, and **cancellations** - - [ ] **Concurrency** (leaky bucket approach) and **Rate limiting** (redis, sliding window, implement headers). See for limiting. - - [ ] Caching -- [ ] Add API standards for anthropic, mistral, llama.cpp, ollama, vllm, llmstudio -- [ ] HTML UI? -- [ ] Allow to request verbose information even in list outputs (with a verbose=yes query parameter?) 
-- [ ] Add possiblity to use PATCH method to change existing resources -- [ ] Proper logging with `--verbose` and `--quiet` modes -- [x] Transfer of projects from one owner to another as new operation -- [x] Make sure pagination is supported consistently -- [x] Dockerization -- [x] Prevent acceptance of requests as user "_system" -- [x] Tests - - [x] When testing, check cleanup by adding a new query/function to see if all tables are empty - - [x] Make sure input is validated consistently -- [x] Catch POST to existing resources -- [x] User authentication & restrictions on some API calls -- [x] API versioning -- [x] better **options** handling () -- [x] handle **metadata** - - [x] Validation with metadata schema -- [x] Allow to filter similar passages by metadata field (so as to exclude e.g. documents from the same author) -- [x] Use **transactions** (most importantly, when an action requires several queries, e.g. projects being added and then linked to several read-authorized users) -- [x] Always use specific error messages -- [x] Add project sharing/unsharing functions & API paths -- [x] Add definition creation/listing/deletion functions & paths -- [x] Add mechanism to allow anonymous/public reading access to embeddings (via `"*"` in `shared_with`) +Contributions are welcome! Please see our **[Contributing Guide](https://mpilhlt.github.io/dhamps-vdb/development/contributing/)** for details. ## License -[MIT License](./LICENSE) +This project is licensed under the [MIT License](./LICENSE). -## Versions +## Support -- 2026-02-08 **v0.1.0**: Fix many things, add many things, still API v1 on the way to stable...
-- 2024-12-10 **v0.0.1**: Initial public release (still work in progress) of API v1 +- 📖 [Documentation](https://mpilhlt.github.io/dhamps-vdb/) +- 🐛 [Report Issues](https://github.com/mpilhlt/dhamps-vdb/issues) +- 💬 [GitHub Discussions](https://github.com/mpilhlt/dhamps-vdb/discussions) diff --git a/docs/ChatGPT.md b/docs/ChatGPT.md deleted file mode 100644 index 4dcf142..0000000 --- a/docs/ChatGPT.md +++ /dev/null @@ -1,486 +0,0 @@ -# ChatGPT Output - -I had ChatGPT (4o) write code implementing an OpenAPI spec developed in dialogue with it. - -## OpenAPI spec - -```yaml -openapi: 3.0.0 -info: - title: Example API - version: 1.0.0 - -servers: - - url: 'https://example.com/' - -paths: - /admin/users: - put: - summary: Create or update a user - operationId: putAdminUsers - tags: - - Admin - security: - - bearerAuth: [] - requestBody: - required: true - content: - application/json: - schema: - type: object - properties: - username: - type: string - email: - type: string - apiKey: - type: string - responses: - '200': - description: User created or updated - content: - application/json: - schema: - type: object - properties: - id: - type: string - username: - type: string - email: - type: string - '401': - description: Unauthorized - '400': - description: Bad Request - - /admin/users/{user}: - patch: - summary: Update a specific user - operationId: patchAdminUser - tags: - - Admin - security: - - bearerAuth: [] - parameters: - - name: user - in: path - required: true - schema: - type: string - requestBody: - required: true - content: - application/json: - schema: - type: object - properties: - username: - type: string - email: - type: string - apiKey: - type: string - responses: - '200': - description: User updated - '401': - description: Unauthorized - '404': - description: User not found - - # Repeating this pattern for other methods like GET and DELETE - - /{user}/projects: - put: - summary: Create or update a project - operationId: putUserProjects - tags: - - 
Projects - security: - - bearerAuth: [] - # Parameters and request body definition goes here - responses: - '200': - description: Project created or updated - '401': - description: Unauthorized - - # Similar for PATCH, GET, DELETE on /{user}/projects/:project - - /{user}/{project}/embeddings: - put: - summary: Create or update embeddings for a project - operationId: putProjectEmbeddings - tags: - - Embeddings - security: - - bearerAuth: [] - # Define request and response structure - post: - summary: Create embeddings for provided text - operationId: postProjectEmbeddings - tags: - - Embeddings - security: - - bearerAuth: [] - # Define request and response structure - - # Similar for GET, DELETE on /{user}/{project}/embeddings/:id - - /{user}/{project}/similars/{id}: - get: - summary: Retrieve similar items - operationId: getProjectSimilars - tags: - - Similars - security: - - bearerAuth: [] - parameters: - - name: count - in: query - schema: - type: integer - - name: threshold - in: query - schema: - type: number - # Define response structure - - /llm-process: - post: - summary: Process text with LLM service - operationId: llmProcess - tags: - - LLM - security: - - bearerAuth: [] - requestBody: - required: true - content: - application/json: - schema: - type: object - properties: - serviceId: - type: string - projectId: - type: string - contextId: - type: string - textFields: - type: array - items: - type: string - responses: - '200': - description: Processed successfully - '401': - description: Unauthorized - -components: - securitySchemes: - bearerAuth: - type: http - scheme: bearer - bearerFormat: JWT -``` - -> Missing information and assumptions: -> -> - Detailed request and response schemas for each endpoint are needed. -> - Error response codes need to be defined for each operation. -> - Authentication method is assumed to be Bearer Token (JWT), which needs to be confirmed. -> - Path and query parameters need to be fully specified with types and descriptions. 
-> - For the llm-process endpoint, the exact nature of "multiple text fields or text-id parameters" needs clarification for accurate schema definition. -> This is a basic structure, and you will need to fill in the details based on the actual requirements and data structures of your API. - -Then, requesting PASETO tokens: - -```yaml -openapi: 3.0.0 -info: - title: Example API - version: 1.0.0 - -servers: - - url: 'https://example.com/' - -paths: - /admin/users: - put: - summary: Create or update a user - operationId: putAdminUsers - tags: - - Admin - security: - - pasetoAuth: [] - requestBody: - required: true - content: - application/json: - schema: - type: object - properties: - username: - type: string - email: - type: string - apiKey: - type: string - responses: - '200': - description: User created or updated - '401': - description: Unauthorized - # Define other responses as needed - - # Other endpoints definitions go here... - -components: - securitySchemes: - pasetoAuth: - type: http - scheme: bearer - bearerFormat: PASETO -``` - -> Please note that while OpenAPI doesn't natively distinguish between JWT and PASETO as they both use Bearer tokens, specifying bearerFormat helps indicate the expected token format. It's also important to document this in the API documentation so that consumers know that the API uses PASETO tokens for authentication. 
- -## Golang implementation - -Here is the golang implementation using the huma framework and go's own `http.ServeMux()` router: - -Project structure: - -```default -example-api/ -├── main.go -├── handlers/ -│ ├── admin.go -│ ├── projects.go -│ ├── embeddings.go -│ ├── similars.go -│ ├── llm_process.go -└── models/ - ├── user.go - ├── project.go - ├── embedding.go - ├── similar.go - ├── llm_process.go -``` - -```{code} go -:filename: main.go -:linenos: -package main - -import ( - "net/http" - - "github.com/danielgtaylor/huma/v2" - "example-api/handlers" -) - -func main() { - router := http.NewServeMux() - api := huma.NewRouter("Example API", "1.0.0") - - handlers.RegisterAdminRoutes(api) - handlers.RegisterProjectRoutes(api) - handlers.RegisterEmbeddingRoutes(api) - handlers.RegisterSimilarRoutes(api) - handlers.RegisterLLMProcessRoutes(api) - - router.Handle("/", api) - http.ListenAndServe(":8080", router) -} -``` - -```{code} go -:filename: handlers/admin.go -:linenos: -package handlers - -import ( - "github.com/danielgtaylor/huma/v2" - "example-api/models" -) - -func RegisterAdminRoutes(api *huma.Router) { - api.Resource("/admin/users"). - Put("Create or update a user", func(ctx huma.Context, input models.User) { - // Implement your logic here - }) - - api.Resource("/admin/users/{user}"). - Patch("Update a specific user", func(ctx huma.Context, input models.User) { - // Implement your logic here - }). - Get("Get a specific user", func(ctx huma.Context) { - // Implement your logic here - }). - Delete("Delete a specific user", func(ctx huma.Context) { - // Implement your logic here - }) -} -``` - -```{code} go -:filename: handlers/projects.go -:linenos: -package handlers - -import ( - "github.com/danielgtaylor/huma/v2" - "example-api/models" -) - -func RegisterProjectRoutes(api *huma.Router) { - api.Resource("/{user}/projects"). 
- Put("Create or update a project", func(ctx huma.Context, input models.Project) { - // Implement your logic here - }) - - api.Resource("/{user}/projects/{project}"). - Patch("Update a specific project", func(ctx huma.Context, input models.Project) { - // Implement your logic here - }). - Get("Get a specific project", func(ctx huma.Context) { - // Implement your logic here - }). - Delete("Delete a specific project", func(ctx huma.Context) { - // Implement your logic here - }) -} -``` - -```{code} go -:filename: handlers/embeddings.go -:linenos: -package handlers - -import ( - "github.com/danielgtaylor/huma/v2" - "example-api/models" -) - -func RegisterEmbeddingRoutes(api *huma.Router) { - api.Resource("/{user}/{project}/embeddings"). - Put("Create or update embeddings for a project", func(ctx huma.Context, input models.Embedding) { - // Implement your logic here - }). - Post("Create embeddings for provided text", func(ctx huma.Context, input models.Embedding) { - // Implement your logic here - }) - - api.Resource("/{user}/{project}/embeddings/{id}"). - Get("Get a specific embedding", func(ctx huma.Context) { - // Implement your logic here - }). - Delete("Delete a specific embedding", func(ctx huma.Context) { - // Implement your logic here - }) -} -``` - -```{code} go -:filename: handlers/similars.go -:linenos: -package handlers - -import ( - "github.com/danielgtaylor/huma/v2" - "example-api/models" -) - -func RegisterSimilarRoutes(api *huma.Router) { - api.Resource("/{user}/{project}/similars/{id}"). - Get("Retrieve similar items", func(ctx huma.Context) { - // Implement your logic here - }) - - api.Resource("/{user}/{project}/similars"). 
- Post("Retrieve similar items with query", func(ctx huma.Context, input models.SimilarQuery) { - // Implement your logic here - }) -} -``` - -```{code} go -:filename: handlers/llm_process.go -:linenos: -package handlers - -import ( - "github.com/danielgtaylor/huma/v2" - "example-api/models" -) - -func RegisterLLMProcessRoutes(api *huma.Router) { - api.Resource("/llm-process"). - Post("Process text with LLM service", func(ctx huma.Context, input models.LLMProcessRequest) { - // Implement your logic here - }) -} -``` - -```{code} go -:filename: models/user.go -:linenos: -package models - -type User struct { - Username string `json:"username"` - Email string `json:"email"` - APIKey string `json:"apiKey"` -} -``` - -```{code} go -:filename: models/project.go -:linenos: -package models - -type Project struct { - // Define project fields here -} -``` - -```{code} go -:filename: models/embedding.go -:linenos: -package models - -type Embedding struct { - // Define embedding fields here -} -``` - -```{code} go -:filename: models/similar.go -:linenos: -package models - -type SimilarQuery struct { - Count int `json:"count"` - Threshold float64 `json:"threshold"` -} -``` - -```{code} go -:filename: models/llm_process.go -:linenos: -package models - -type LLMProcessRequest struct { - ServiceID string `json:"serviceId"` - ProjectID string `json:"projectId"` - ContextID string `json:"contextId"` - TextFields []string `json:"textFields"` -} -``` - -> This code sets up the basic structure and routing for your API using the Huma v2 framework and Go's http.ServeMux. Each handler file contains the route definitions for a specific part of the API, and each model file defines the data structures used in the API. -> -> You will need to fill in the implementation details for each handler function and define the full structure of the models as required by your API. 
diff --git a/docs/LLM_SERVICE_REFACTORING.md b/docs/LLM_SERVICE_REFACTORING.md deleted file mode 100644 index 6febdf4..0000000 --- a/docs/LLM_SERVICE_REFACTORING.md +++ /dev/null @@ -1,751 +0,0 @@ -# LLM Service Architecture Refactoring - Complete Documentation - -## Table of Contents - -1. [Overview](#overview) -2. [Implementation Summary](#implementation-summary) -3. [Architecture](#architecture) -4. [Completed Work](#completed-work) -5. [Usage Guide](#usage-guide) -6. [Security Features](#security-features) -7. [Migration Guide](#migration-guide) -8. [Testing](#testing) -9. [Remaining Optional Work](#remaining-optional-work) - -## Overview - -This refactoring separates LLM services into two distinct concepts: - -1. **LLM Service Definitions** - Reusable templates owned by `_system` or users - - Contain configuration templates (endpoint, model, dimensions, API standard) - - Can be owned by `_system` (global templates) or individual users - - Used as templates for creating instances - -2. **LLM Service Instances** - User-specific configurations with encrypted API keys - - Contain actual service configurations and credentials - - Owned by individual users - - Can optionally reference a definition - - Support API key encryption - - Can be shared with other users - -## Implementation Summary - -### ✅ All Core Requirements Completed - -1. **Admin can manage _system definitions** - - `_system` user created in migration - - 4 default definitions seeded (openai-large, openai-small, cohere-v4, gemini-embedding-001) - - API standards (openai, cohere, gemini) created before definitions - -2. **Users can list all accessible instances** - - `GetAllAccessibleInstances` query returns owned + shared instances - - Users see all instances they own or have been granted access to - -3. **Handle-based instance references** - - Shared instances identified as `owner/handle` - - Own instances identified as `handle` - - Queries support handle-based lookups - -4. 
**API keys hidden from shared instances** - - API keys NEVER returned in GET/list responses (security) - - Write-only field in API - - Shared users can use instances but cannot see API keys - -5. **Multiple ways to create instances** - - From own definitions - - From _system definitions - - Standalone (all fields specified) - -6. **1:1 project-instance relationship** - - Projects must reference exactly one instance - - Enforced at database level - -### Build & Test Status - -- ✅ Code compiles successfully -- ✅ All tests passing (100% success rate) -- ✅ Migration tested and verified -- ✅ Encryption module tested - -## Architecture - -### Database Schema - -``` -definitions (templates) -├── definition_id (PK) -├── definition_handle -├── owner (FK → users, can be '_system') -├── endpoint, description, api_standard, model, dimensions -└── UNIQUE(owner, definition_handle) -└── Indexes: (owner, definition_handle), (definition_handle) - -instances (user-specific) -├── instance_id (PK) -├── instance_handle -├── owner (FK → users) -├── definition_id (FK → definitions, nullable) -├── endpoint, description, model, dimensions, api_standard -├── api_key_encrypted (BYTEA, AES-256-GCM encrypted) -└── UNIQUE(owner, instance_handle) - -instances_shared_with (n:m sharing) -├── user_handle (FK → users) -├── instance_id (FK → instances) -├── role (reader/editor/owner) -└── PRIMARY KEY(user_handle, instance_id) - -projects (1:1 with instances) -├── project_id (PK) -├── instance_id (FK → instances) -└── One project → One instance -``` - -### Key Tables Removed - -- `users_llm_services` - Redundant (ownership tracked via `instances.owner`) -- `projects_llm_services` - Replaced by 1:1 FK in projects table - -## Completed Work - -### 1. 
Database Migration (004) - -**File:** `internal/database/migrations/004_refactor_llm_services_architecture.sql` - -**Changes:** -- Created `definitions` table -- Renamed `instances` → `instances` -- Added `api_key_encrypted` BYTEA column -- Created `_system` user -- Dropped `users_llm_services` table (redundant) -- Modified `projects` table: removed many-to-many, added `instance_id` FK -- Created `instances_shared_with` table -- Seeded 3 API standards with documentation URLs: - - openai: https://platform.openai.com/docs/api-reference/embeddings - - cohere: https://docs.cohere.com/reference/embed - - gemini: https://ai.google.dev/gemini-api/docs/embeddings -- Seeded 4 default LLM service definitions: - - openai-large (3072 dimensions) - - openai-small (1536 dimensions) - - cohere-v4 (1536 dimensions) - - gemini-embedding-001 (3072 dimensions, default size) - -**Data Migration:** -- First linked LLM service per project → `project.instance_id` -- Rollback support included - -### 2. Encryption Module - -**File:** `internal/crypto/encryption.go` - -**Features:** -- AES-256-GCM encryption for API keys -- Uses `ENCRYPTION_KEY` environment variable (SHA256-hashed to ensure 32-byte key) -- Functions: - - `NewEncryptionKey(keyString)` - Create key from string - - `GenerateEncryptionKey()` - Generate random key - - `GetEncryptionKeyFromEnv()` - Read from environment - - `Encrypt(plaintext) → []byte` - - `Decrypt(ciphertext) → string` - - `EncryptToBase64(plaintext) → string` - - `DecryptFromBase64(base64) → string` - -**Testing:** Full test coverage in `internal/crypto/encryption_test.go` ✅ - -### 3. 
Database Queries (SQLC) - -**File:** `internal/database/queries/queries.sql` - -**Definitions:** -- `UpsertDefinition` - Create/update definition -- `DeleteDefinition` - Delete definition -- `RetrieveDefinition` - Get single definition -- `GetDefinitionsByUser` - List user's definitions -- `GetAllDefinitions` - List all definitions -- `GetSystemDefinitions` - List _system definitions - -**Instances:** -- `UpsertInstance` - Create/update instance (with encryption support) -- `CreateInstanceFromDefinition` - Create instance from definition template -- `DeleteInstance` - Delete instance -- `RetrieveInstance` - Get single instance -- `RetrieveInstanceByID` - Get instance by ID -- `RetrieveInstanceByOwnerHandle` - Get by owner/handle (supports both formats) -- `ShareInstance` - Share instance with another user -- `UnshareInstance` - Remove instance sharing -- `GetSharedUsersForInstance` - List users instance is shared with -- `GetInstanceByProject` - Get instance for project (1:1, renamed from plural) -- `GetInstancesByUser` - List user's owned instances -- `GetAllAccessibleInstances` - List owned + shared instances -- `GetSharedInstances` - List instances shared with user (sorted by role, owner, handle) - -**Updated Queries:** -- `UpsertProject` - Includes `instance_id` -- `UpsertEmbeddings` - Uses `instance_id` -- All embeddings queries - Updated to use instances table - -**SQLC Code Generated:** ✅ (`internal/database/models.go`, `internal/database/queries.sql.go`) - -### 4. Go Models - -**File:** `internal/models/instances.go` - -**Models:** -- `Definition` - For definitions -- `Instance` - For instances -- `LLMService` - Kept for backward API compatibility (maps to Instance) - -**Field Updates:** -- `InstanceHandle` (was `InstanceHandle`) -- `InstanceOwner` (was `LLMServiceOwner`) -- API keys marked as write-only (never returned in responses) - -### 5. 
Handlers - -**Updated Files:** -- `internal/handlers/instances.go` - All functions renamed with "Instance" suffix -- `internal/handlers/projects.go` - 1:1 instance relationship -- `internal/handlers/embeddings.go` - Uses instance from project -- `internal/handlers/admin.go` - Updated field names -- `internal/handlers/users.go` - Lists accessible instances -- `internal/handlers/validation.go` - Updated to InstanceHandle - -**Function Naming:** -- `putInstanceFunc` (was `putLLMFunc`) -- `getInstanceFunc` (was `getLLMFunc`) -- `deleteInstanceFunc` (was `deleteLLMFunc`) -- `getUserLLMsFunc` - Now returns all accessible instances (own + shared) - -**API Key Handling:** -- Encrypted on write if `ENCRYPTION_KEY` is set -- Never returned on read (security) -- Uses `Valid: true` consistently for nullable fields - -### 6. Environment Configuration - -**File:** `template.env` - -Added: -```bash -# Required for API key encryption (32+ characters recommended) -ENCRYPTION_KEY=your-secret-encryption-key-here-must-be-kept-secure -``` - -## Usage Guide - -### Creating an LLM Service Instance - -**Option A: Standalone (no definition)** -```bash -PUT /v1/llm-services/jdoe/my-openai -{ - "endpoint": "https://api.openai.com/v1/embeddings", - "api_standard": "openai", - "model": "text-embedding-3-large", - "dimensions": 3072, - "api_key_encypted": "sk-..." -} -``` - -**Option B: From _system definition** -```bash -# Use CreateInstanceFromDefinition query -# Handler would accept: -POST /v1/llm-services/jdoe/my-openai-instance -{ - "definition_owner": "_system", - "definition_handle": "openai-large", - "api_key_encrypted": "sk-..." -} -``` - -**Option C: From user's own definition** -```bash -# Similar to Option B, but with user as definition_owner -POST /v1/llm-services/jdoe/my-custom-instance -{ - "definition_owner": "jdoe", - "definition_handle": "my-custom-config", - "api_key_encrypted": "sk-..." 
-} -``` - -### Listing Accessible Instances - -```bash -GET /v1/llm-services/jdoe -# Returns all instances jdoe owns OR has been granted access to -# API keys are NOT included in response -``` - -### Creating a Project with Instance - -```bash -POST /v1/projects/jdoe/my-project -{ - "instance_id": 123, # or use handle-based reference - "description": "My project" -} -``` - -## Security Features - -### 1. API Key Encryption - -- **Algorithm:** AES-256-GCM -- **Key Source:** `ENCRYPTION_KEY` environment variable -- **Key Derivation:** SHA256 hash of environment variable -- **Storage:** `api_key_encrypted` BYTEA column - -### 2. Write-Only API Keys - -API keys are never returned in GET/list responses: -```json -GET /v1/llm-services/jdoe/my-openai -{ - "instance_id": 1, - "instance_handle": "my-openai", - "owner": "jdoe", - "endpoint": "...", - "model": "text-embedding-3-large", - "dimensions": 3072 - // Note: "api_key_encrypted" field is NOT present -} -``` - -### 3. Shared Instance Protection - -When an instance is shared: -- Shared users can USE the instance (e.g., for projects, embeddings) -- Shared users CANNOT see the API key -- Shared users CANNOT modify the instance (owner-only operation) -- Sharing is tracked in `instances_shared_with` table with role - -### 4. Admin-Only System Definitions - -- Only admin users can create/modify `_system` definitions -- Regular users can read `_system` definitions -- Regular users can create their own definitions -- No one can log in as `_system` - -## Migration Guide - -### For New Installations - -1. Run migrations: `make migrate-up` (or equivalent) -2. Set `ENCRYPTION_KEY` environment variable -3. 
Service is ready to use - -### For Existing Installations - -The migration (004) handles data migration automatically: - -**Automatic Changes:** -- `instances` table renamed to `instances` -- `users_llm_services` table dropped (ownership via owner column) -- `projects_llm_services` table dropped (replaced by FK) -- First linked instance per project → `project.instance_id` - -**Post-Migration Steps:** - -1. **Set Environment Variable:** - ```bash - export ENCRYPTION_KEY="your-secure-random-string-at-least-32-chars" - ``` - -### Breaking Changes - -**API Changes:** -- `GET /v1/llm-services/{user}` - No longer returns API keys -- `GET /v1/llm-services/{user}/{handle}` - No longer returns API keys -- Projects now require single instance (many-to-many removed) - -**Database:** -- `instances` → `instances` -- `users_llm_services` table removed -- `projects_llm_services` table removed - -**Backward Compatibility:** -- Existing endpoints continue to work -- Field names preserved in JSON responses (for API compatibility) -- Old plaintext API keys continue to work - -## Client Migration Guide - -This section explains what API clients need to change after the refactoring. - -### Summary of Changes for Clients - -**Good News:** Most API endpoints remain unchanged! The main changes are: -1. Projects must be created with a valid LLM service instance (1:1 relationship) -2. 
Embeddings JSON uses `instance_handle` instead of `llm_service_handle` - -### API Endpoints - No Changes Required - -All existing API endpoints continue to work with the same paths: - -``` -✅ PUT /v1/llm-services/{user}/{handle} # Create/update instance -✅ GET /v1/llm-services/{user} # List user's instances -✅ GET /v1/llm-services/{user}/{handle} # Get specific instance -✅ DELETE /v1/llm-services/{user}/{handle} # Delete instance - -✅ PUT /v1/projects/{user}/{project} # Create/update project -✅ GET /v1/projects/{user}/{project} # Get project -✅ DELETE /v1/projects/{user}/{project} # Delete project - -✅ POST /v1/embeddings/{user}/{project} # Upload embeddings -✅ GET /v1/embeddings/{user}/{project} # List embeddings -✅ DELETE /v1/embeddings/{user}/{project} # Delete embeddings -``` - -### Change #2: Embeddings Field Name Update - -**Before:** Used `llm_service_handle` in embeddings JSON -```json -POST /v1/embeddings/alice/my-project -{ - "text_id": "doc1", - "llm_service_handle": "my-openai", ← Old field name - "embedding": [0.1, 0.2, ...], - "metadata": {...} -} -``` - -**After:** Use `instance_handle` instead -```json -POST /v1/embeddings/alice/my-project -{ - "text_id": "doc1", - "instance_handle": "my-openai", ← New field name - "embedding": [0.1, 0.2, ...], - "metadata": {...} -} -``` - -**Action Required:** -- ⚠️ Update embedding upload code to use `instance_handle` field -- ⚠️ Update code that reads embeddings to expect `instance_handle` in responses - -### Change #3: Projects Must Have LLM Service Instance - -**Before:** Projects could be created without specifying an LLM service -```json -PUT /v1/projects/alice/my-project -{ - "description": "My project" -} -``` - -**After:** Projects require a valid `instance_id` -```json -PUT /v1/projects/alice/my-project -{ - "description": "My project", - "instance_id": 123 ← Required -} -``` - -**Action Required:** -- ⚠️ Create an LLM service instance BEFORE creating projects -- ⚠️ Include `instance_id` in project 
creation requests -- ℹ️ You can find the instance_id from the GET instances response - -### Complete Migration Workflow - -Here's the recommended workflow for clients: - -#### Step 1: Create LLM Service Instance (if not exists) - -```bash -# Check if instance exists -GET /v1/llm-services/alice - -# If not, create one -PUT /v1/llm-services/alice/my-openai -Content-Type: application/json - -{ - "endpoint": "https://api.openai.com/v1/embeddings", - "api_standard": "openai", - "model": "text-embedding-3-large", - "dimensions": 3072, - "api_key_encrypted": "sk-proj-your-key-here" -} - -# Response includes instance_id -{ - "instance_id": 123, - "instance_handle": "my-openai", - "owner": "alice", - "endpoint": "https://api.openai.com/v1/embeddings", - "api_standard": "openai", - "model": "text-embedding-3-large", - "dimensions": 3072 - // Note: api_key_encrypted not returned -} -``` - -#### Step 2: Create Project with Instance ID - -```bash -PUT /v1/projects/alice/my-project -Content-Type: application/json - -{ - "description": "My research project", - "instance_id": 123 // From step 1 -} -``` - -#### Step 3: Upload Embeddings - -```bash -POST /v1/embeddings/alice/my-project -Content-Type: application/json - -{ - "text_id": "doc1", - "instance_handle": "my-openai", // Use instance_handle (not llm_service_handle) - "embedding": [0.1, 0.2, 0.3, ...], - "metadata": { - "title": "Document 1", - "author": "Alice" - } -} -``` - -### Environment Setup for New Installations - -**Before:** No encryption key needed -```bash -# .env -DATABASE_URL=postgresql://... -ADMIN_KEY=your-admin-key -``` - -**After:** Add encryption key -```bash -# .env -DATABASE_URL=postgresql://... 
-ADMIN_KEY=your-admin-key -ENCRYPTION_KEY=your-secure-32-char-minimum-key # NEW -``` - -**Action Required:** -- ⚠️ Add `ENCRYPTION_KEY` to your environment variables -- ✅ Use a strong, random string (32+ characters recommended) -- ✅ Keep this key secure - losing it means losing access to encrypted API keys - -### Migration Checklist for Existing Clients - -Use this checklist to ensure your client is fully migrated: - -- [ ] **Stop reading API keys from GET responses** - - Update code to store API keys locally instead - -- [ ] **Update embedding field names** - - Change `llm_service_handle` → `instance_handle` in upload code - - Update parsing code to read `instance_handle` from responses - -- [ ] **Update project creation workflow** - - Create LLM service instance first - - Include `instance_id` in project creation - - Get instance_id from instance creation/list response - -- [ ] **Update environment configuration** - - Add `ENCRYPTION_KEY` to environment variables (for new installations) - - Restart services to pick up new configuration - -- [ ] **Test end-to-end workflow** - - Create instance → Create project → Upload embeddings - - Verify all steps work correctly - -### Troubleshooting - -**Problem:** "Project must have instance_id" error - -**Solution:** Create an LLM service instance first, then use its ID when creating the project. - ---- - -**Problem:** Embeddings upload fails with "unknown field llm_service_handle" - -**Solution:** Update your JSON to use `instance_handle` instead of `llm_service_handle`. - ---- - -**Problem:** Can't see API key after creating instance - -**Solution:** This is expected behavior (security improvement). Store the API key on the client side when you create the instance. - ---- - -**Problem:** Old embeddings have `llm_service_handle` in their data - -**Solution:** Existing embeddings stored before the migration will continue to have the old field name in their metadata. This is preserved for backward compatibility. 
When retrieving these embeddings, your client should be able to handle both `llm_service_handle` (old data) and `instance_handle` (new data). However, all NEW embeddings uploaded after migration must use `instance_handle`. - ---- - -**Problem:** Missing ENCRYPTION_KEY environment variable - -**Solution:** Add `ENCRYPTION_KEY=your-secure-key` to your environment variables. This is only required for new installations or if you want to start encrypting API keys. - -### Testing Your Migration - -Here's a test sequence to verify your client works correctly: - -```bash -# 1. Create instance -curl -X PUT "http://localhost:8000/v1/llm-services/testuser/test-instance" \ - -H "Authorization: Bearer your-api-key" \ - -H "Content-Type: application/json" \ - -d '{ - "endpoint": "https://api.openai.com/v1/embeddings", - "api_standard": "openai", - "model": "text-embedding-3-large", - "dimensions": 3072, - "api_key_encrypted": "test-key" - }' - -# 2. List instances (verify api_key_encrypted is NOT returned) -curl -X GET "http://localhost:8000/v1/llm-services/testuser" \ - -H "Authorization: Bearer your-api-key" - -# 3. Create project with instance_id -curl -X PUT "http://localhost:8000/v1/projects/testuser/test-project" \ - -H "Authorization: Bearer your-api-key" \ - -H "Content-Type: application/json" \ - -d '{ - "description": "Test project", - "instance_id": 123 - }' - -# 4.
Upload embeddings with instance_handle -curl -X POST "http://localhost:8000/v1/embeddings/testuser/test-project" \ - -H "Authorization: Bearer your-api-key" \ - -H "Content-Type: application/json" \ - -d '{ - "text_id": "test1", - "instance_handle": "test-instance", - "embedding": [0.1, 0.2, 0.3], - "metadata": {} - }' -``` - -### Summary of Required Client Changes - -| Area | Old Behavior | New Behavior | Action | -|------|-------------|--------------|--------| -| **API Endpoints** | Same paths | Same paths | ✅ No change | -| **API Keys in GET** | Returned in response | NOT returned | ⚠️ Stop reading, store locally | -| **Embeddings field** | `llm_service_handle` | `instance_handle` | ⚠️ Update field name | -| **Project creation** | Optional instance | Required `instance_id` | ⚠️ Create instance first | -| **Project-instance** | Many-to-many | 1:1 relationship | ⚠️ One instance per project | -| **Environment vars** | No encryption key | `ENCRYPTION_KEY` needed | ⚠️ Add to .env (new installs) | - -**Legend:** ✅ No action needed | ⚠️ Action required | ℹ️ Optional/informational - -## Testing - -### Test Status - -**✅ All Tests Passing (100% success rate):** -- TestLLMServicesFunc: 16/16 subtests -- TestEmbeddingsFunc: All subtests -- TestValidationFunc: All subtests (updated to use InstanceHandle) -- TestUserFunc: All subtests -- TestPublicAccess: Pass -- TestSimilarsFunc: Pass - -### Test Fixes Applied - -1. **Query Bug Fixed:** `GetAllAccessibleInstances` had user_handle filter in JOIN ON clause, preventing owned instances from being returned -2. **Test Expectations Updated:** Removed API key from expected responses (security) - -### Test Coverage - -**Current Coverage:** -- ✅ Basic Instance CRUD operations -- ✅ Authentication/authorization -- ✅ Invalid JSON handling -- ✅ Non-existent resource handling -- ✅ API key hiding in responses -- ✅ Field name updates (InstanceHandle, etc.) 
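For clients that still hold stored payloads using the old field name, the rename from the migration checklist can be automated. A minimal stdlib-only sketch; the helper name and its behavior are illustrative, not part of the API:

```python
import json

def migrate_embedding(payload: dict) -> dict:
    """Rename the legacy llm_service_handle field to instance_handle.

    Payloads that already use the new field are returned unchanged.
    """
    migrated = dict(payload)
    if "llm_service_handle" in migrated and "instance_handle" not in migrated:
        migrated["instance_handle"] = migrated.pop("llm_service_handle")
    return migrated

old = {"text_id": "doc1", "llm_service_handle": "my-openai", "embedding": [0.1, 0.2]}
print(json.dumps(migrate_embedding(old), sort_keys=True))
```

Running each stored payload through such a helper before re-upload satisfies the "update field name" item in the checklist without touching payloads that have already been migrated.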
- -## Remaining Optional Work - -### Potential Enhancements (~7 hours total) - -#### 1. Split Test Files (1 hour) -- Create `definitions_test.go` for Definition tests -- Create `instances_test.go` for Instance tests -- Better organization and clarity - -#### 2. Add Definition Tests (2 hours) -- Creating definitions as _system user (admin only) -- Preventing non-admin users from creating _system definitions -- User-owned definitions -- Invalid input handling -- Deletion behavior - -#### 3. Add Instance Sharing Tests (2 hours) -- Sharing instances with other users -- Listing shared instances -- Access control verification -- API key protection for shared instances -- Revoking access - -#### 4. Add Encryption Tests (1 hour) -- API key encryption/decryption roundtrip -- Handling missing ENCRYPTION_KEY -- Key update scenarios - -#### 5. Documentation (1 hour) -- API documentation for new endpoints -- Examples of instance creation from definitions -- Security best practices - -### New Endpoints (Not Implemented) - -Consider adding these endpoints in the future: -- `GET /v1/llm-service-definitions` - List all available definitions -- `GET /v1/llm-service-definitions/_system` - List system definitions -- `POST /v1/llm-service-definitions/{user}` - Create user definition -- `POST /v1/llm-instances/{user}/from-definition/{handle}` - Create from definition -- `POST /v1/llm-instances/{user}/{instance}/share/{target}` - Share instance -- `DELETE /v1/llm-instances/{user}/{instance}/share/{target}` - Revoke sharing - -### API Key Migration Tool (Not Implemented) - -Create a CLI tool or admin endpoint to: -- List all instances with plaintext API keys -- Re-encrypt them using the current ENCRYPTION_KEY -- Verify successful encryption -- Remove plaintext keys - -## Design Decisions - -1. **Encryption:** Application-level encryption (not PostgreSQL's pgcrypto) for portability -2. **Key Storage:** Environment variable (not file-based) for security and container-friendliness -3. 
**Backward Compatibility:** Keep existing endpoints, map to new backend -4. **Default Instances:** Projects MUST specify an instance (no auto-creation) -5. **Sharing Model:** Read-only sharing (only owner can modify) -6. **System Definitions:** Owned by `_system` user, created in migration -7. **Ownership Tracking:** Via `owner` column (removed redundant join table) - -## References - -- Encryption implementation: `internal/crypto/encryption.go` -- Migration: `internal/database/migrations/004_refactor_llm_services_architecture.sql` -- Queries: `internal/database/queries/queries.sql` -- Performance notes: See `docs/PERFORMANCE_OPTIMIZATION.md` -- Test data: `testdata/valid_embeddings*.json` (updated to use instance_handle) - -## Support - -For questions or issues: -1. Review this documentation -2. Check the migration file for schema details -3. Review test files for usage examples -4. See PERFORMANCE_OPTIMIZATION.md for performance tuning diff --git a/docs/METADATA_SCHEMA_EXAMPLES.md b/docs/METADATA_SCHEMA_EXAMPLES.md deleted file mode 100644 index f760faa..0000000 --- a/docs/METADATA_SCHEMA_EXAMPLES.md +++ /dev/null @@ -1,230 +0,0 @@ -# Metadata Schema Validation Examples - -This document provides practical examples of using metadata schema validation in the dhamps-vdb API. 
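Before trying the API examples below, schemas can also be sanity-checked locally. A simplified, stdlib-only sketch that covers only the `required` and `type` keywords (the server performs full JSON Schema validation; this is not the server's implementation):

```python
def check_metadata(schema: dict, metadata: dict) -> list[str]:
    """Tiny subset of JSON Schema validation: 'required' and 'type' only."""
    type_map = {
        "string": str, "integer": int, "number": (int, float),
        "boolean": bool, "object": dict, "array": list,
    }
    errors = [f"{field} is required"
              for field in schema.get("required", []) if field not in metadata]
    for name, spec in schema.get("properties", {}).items():
        expected = type_map.get(spec.get("type"))
        if name in metadata and expected and not isinstance(metadata[name], expected):
            errors.append(f"{name} must be of type {spec['type']}")
    return errors

schema = {"type": "object",
          "properties": {"author": {"type": "string"}, "year": {"type": "integer"}},
          "required": ["author", "year"]}
print(check_metadata(schema, {"author": "Immanuel Kant", "year": 1781}))  # []
print(check_metadata(schema, {"author": "John Doe"}))
```

The second call reports the missing `year` field, mirroring the server-side error shown in Example 3 below in spirit, though not in exact wording.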
- -## Example 1: Creating a Project with a Metadata Schema - -```bash -# Create a project with a metadata schema -curl -X POST http://localhost:8080/v1/projects/alice \ - -H "Authorization: Bearer YOUR_API_KEY" \ - -H "Content-Type: application/json" \ - -d '{ - "project_handle": "literary-texts", - "description": "Literary texts with structured metadata", - "metadataScheme": "{\"type\":\"object\",\"properties\":{\"author\":{\"type\":\"string\"},\"year\":{\"type\":\"integer\"},\"genre\":{\"type\":\"string\",\"enum\":[\"poetry\",\"prose\",\"drama\"]}},\"required\":[\"author\",\"year\"]}" - }' -``` - -## Example 2: Uploading Embeddings with Valid Metadata - -```bash -# Upload embeddings that conform to the schema -curl -X POST http://localhost:8080/v1/embeddings/alice/literary-texts \ - -H "Authorization: Bearer YOUR_API_KEY" \ - -H "Content-Type: application/json" \ - -d '{ - "embeddings": [{ - "text_id": "kant-critique-pure-reason", - "llm_service_handle": "openai-large", - "text": "Critique of Pure Reason excerpt", - "vector": [0.1, 0.2, 0.3, 0.4, 0.5], - "vector_dim": 5, - "metadata": { - "author": "Immanuel Kant", - "year": 1781, - "genre": "prose" - } - }] - }' -``` - -## Example 3: Validation Error - Missing Required Field - -```bash -# This will fail because "year" is required but missing -curl -X POST http://localhost:8080/v1/embeddings/alice/literary-texts \ - -H "Authorization: Bearer YOUR_API_KEY" \ - -H "Content-Type: application/json" \ - -d '{ - "embeddings": [{ - "text_id": "some-text", - "llm_service_handle": "openai-large", - "vector": [0.1, 0.2, 0.3, 0.4, 0.5], - "vector_dim": 5, - "metadata": { - "author": "John Doe" - } - }] - }' -``` - -Expected error response: -```json -{ - "$schema": "http://localhost:8080/schemas/ErrorModel.json", - "title": "Bad Request", - "status": 400, - "detail": "metadata validation failed for text_id 'some-text': metadata validation failed:\n - year is required" -} -``` - -## Example 4: Validation Error - Wrong Type - 
-```bash -# This will fail because "year" should be an integer, not a string -curl -X POST http://localhost:8080/v1/embeddings/alice/literary-texts \ - -H "Authorization: Bearer YOUR_API_KEY" \ - -H "Content-Type: application/json" \ - -d '{ - "embeddings": [{ - "text_id": "some-text", - "llm_service_handle": "openai-large", - "vector": [0.1, 0.2, 0.3, 0.4, 0.5], - "vector_dim": 5, - "metadata": { - "author": "John Doe", - "year": "1781" - } - }] - }' -``` - -## Example 5: Dimension Validation Error - -```bash -# This will fail because vector has 3 elements but declares 5 dimensions -curl -X POST http://localhost:8080/v1/embeddings/alice/literary-texts \ - -H "Authorization: Bearer YOUR_API_KEY" \ - -H "Content-Type: application/json" \ - -d '{ - "embeddings": [{ - "text_id": "some-text", - "llm_service_handle": "openai-large", - "vector": [0.1, 0.2, 0.3], - "vector_dim": 5, - "metadata": { - "author": "John Doe", - "year": 1781 - } - }] - }' -``` - -Expected error response: -```json -{ - "$schema": "http://localhost:8080/schemas/ErrorModel.json", - "title": "Bad Request", - "status": 400, - "detail": "dimension validation failed: vector length mismatch for text_id 'some-text': actual vector has 3 elements but vector_dim declares 5" -} -``` - -## Common Metadata Schema Patterns - -### Simple Required Fields -```json -{ - "type": "object", - "properties": { - "author": {"type": "string"}, - "year": {"type": "integer"} - }, - "required": ["author"] -} -``` - -### With Enums -```json -{ - "type": "object", - "properties": { - "genre": { - "type": "string", - "enum": ["poetry", "prose", "drama", "essay"] - }, - "language": { - "type": "string", - "enum": ["en", "de", "fr", "es", "la"] - } - }, - "required": ["genre"] -} -``` - -### Nested Objects -```json -{ - "type": "object", - "properties": { - "author": { - "type": "object", - "properties": { - "name": {"type": "string"}, - "birth_year": {"type": "integer"}, - "nationality": {"type": "string"} - }, - "required": 
["name"] - } - }, - "required": ["author"] -} -``` - -### Arrays -```json -{ - "type": "object", - "properties": { - "keywords": { - "type": "array", - "items": {"type": "string"}, - "minItems": 1, - "maxItems": 10 - }, - "categories": { - "type": "array", - "items": { - "type": "string", - "enum": ["philosophy", "literature", "science", "history"] - } - } - } -} -``` - -### With Constraints -```json -{ - "type": "object", - "properties": { - "title": { - "type": "string", - "minLength": 1, - "maxLength": 200 - }, - "page_count": { - "type": "integer", - "minimum": 1 - }, - "rating": { - "type": "number", - "minimum": 0, - "maximum": 5 - } - } -} -``` - -## Tips - -1. **Escape JSON for command line**: When passing JSON schemas in curl commands, make sure to properly escape quotes or use single quotes for the outer JSON. - -2. **Use schema validators**: Before setting up your schema in the API, test it with online JSON Schema validators like [jsonschemavalidator.net](https://www.jsonschemavalidator.net/). - -3. **Start simple**: Begin with a simple schema and add more constraints as needed. You can always update the schema using a PATCH or PUT request to the project endpoint. - -4. **Optional metadata**: If you don't provide a `metadataScheme` when creating a project, metadata validation is skipped, and you can upload any JSON metadata. - -5. **Schema updates**: When you update a project's metadata schema, existing embeddings are not revalidated. The schema only applies to new or updated embeddings. diff --git a/docs/PERFORMANCE_OPTIMIZATION.md b/docs/PERFORMANCE_OPTIMIZATION.md deleted file mode 100644 index 564e1b3..0000000 --- a/docs/PERFORMANCE_OPTIMIZATION.md +++ /dev/null @@ -1,132 +0,0 @@ -# Performance Optimization Notes - -## Query Optimization Opportunities - -### GetAllAccessibleInstances Query - -**Current Implementation:** -```sql -SELECT instances.*, ... 
-FROM instances -LEFT JOIN instances_shared_with - ON instances."instance_id" = instances_shared_with."instance_id" -WHERE instances."owner" = $1 - OR instances_shared_with."user_handle" = $1 -ORDER BY instances."owner" ASC, instances."instance_handle" ASC -LIMIT $2 OFFSET $3; -``` - -**Issue:** -The LEFT JOIN combined with OR conditions in WHERE clause may result in inefficient query execution. The query planner might struggle to use indexes effectively. - -**Recommended Optimization:** -Use UNION ALL to separate owned instances from shared instances: - -```sql --- Get owned instances -SELECT instances.*, 'owner' as "role", true as "is_owner" -FROM instances -WHERE instances."owner" = $1 - -UNION ALL - --- Get shared instances -SELECT instances.*, - instances_shared_with."role", - false as "is_owner" -FROM instances -INNER JOIN instances_shared_with - ON instances."instance_id" = instances_shared_with."instance_id" -WHERE instances_shared_with."user_handle" = $1 - AND instances."owner" != $1 -- Avoid duplicates - -ORDER BY "owner" ASC, "instance_handle" ASC -LIMIT $2 OFFSET $3; -``` - -**Benefits:** -1. Query planner can use separate index scans for each UNION branch -2. Owned instances can use index on (owner) -3. Shared instances can use index on (user_handle) -4. Clearer query execution plan -5. 
Better performance with large datasets - -**Tradeoff:** -- Slightly more complex SQL -- Need to deduplicate if user somehow has instance both owned and shared (unlikely scenario) - -**Recommendation:** -- Current implementation is correct and works well for small-medium datasets -- Consider optimization if performance becomes an issue with large numbers of instances -- Profile with EXPLAIN ANALYZE before and after optimization - -## Other Optimization Opportunities - -### Index Suggestions - -Current indexes (from migration 004): -- `definitions(definition_handle)` -- `definitions(owner, definition_handle)` (composite) -- `instances(instance_handle)` -- `instances_shared_with(instance_id, user_handle)` (implicit from PK) - -**Additional indexes to consider:** -1. `instances(owner)` - for owned instance lookups -2. `instances_shared_with(user_handle)` - for shared instance lookups -3. `instances(owner, instance_handle)` - composite for unique constraint - -### Caching Opportunities - -1. **System Definitions**: Cache _system definitions since they rarely change -2. **User Instances**: Cache user's instance list with short TTL -3. **API Standards**: Cache list of API standards (nearly static) - -### Query Analysis Tools - -```bash -# Analyze query performance -EXPLAIN ANALYZE SELECT ...; - -# Check table statistics -ANALYZE instances; -ANALYZE instances_shared_with; - -# View current indexes -\di llm_service_* -``` - -## Performance Testing - -### Recommended Tests - -1. **Load Test**: 1000 users, 10 instances each -2. **Sharing Test**: 100 users sharing instances with 50 others each -3. 
**Query Test**: Measure GetAllAccessibleInstances with varying instance counts - -### Metrics to Track - -- Query execution time (p50, p95, p99) -- Database connection pool usage -- Index hit rates -- Cache hit rates (if implemented) - -### Performance Targets - -Based on typical usage: -- Single instance lookup: < 10ms -- List all accessible instances: < 50ms (for < 100 instances) -- Create/update instance: < 100ms (including encryption) - -## Implementation Priority - -1. **High**: Profile current performance with realistic data -2. **Medium**: Implement UNION ALL optimization if query time > 100ms -3. **Low**: Add caching layer for frequently accessed data -4. **Low**: Add indexes based on actual query patterns - -## Notes - -- Current implementation prioritizes correctness over optimization -- All tests pass with current query structure -- Performance optimization should be data-driven (measure first) -- Don't optimize prematurely - wait for actual performance issues diff --git a/docs/PUBLIC_ACCESS.md b/docs/PUBLIC_ACCESS.md deleted file mode 100644 index 56ce84a..0000000 --- a/docs/PUBLIC_ACCESS.md +++ /dev/null @@ -1,123 +0,0 @@ -# Public Access to Embeddings - -## Overview - -Projects can be configured to allow unauthenticated (public) read access to embeddings and similar documents by setting the `public_read` field to `true` when creating or updating a project. - -**Note**: The `shared_with` field is used for sharing projects with specific users. For public access, use the `public_read` boolean field. See the main [README.md](../README.md#project-sharing) for details on sharing with specific users. 
- -## Usage - -### Creating a Public Project - -When creating or updating a project, set `public_read` to `true`: - -```json -{ - "project_handle": "my-public-project", - "description": "A publicly accessible project", - "public_read": true -} -``` - -You can also combine public access with user-specific sharing: - -```json -{ - "project_handle": "my-project", - "description": "A public project with additional editors", - "public_read": true, - "shared_with": [ - { - "user_handle": "bob", - "role": "editor" - } - ] -} -``` - -### Endpoints with Public Access - -When a project has public read access enabled, the following endpoints can be accessed without authentication: - -- `GET /v1/projects/{user}/{project}` - Retrieve project metadata (including owner and shared_with) -- `GET /v1/embeddings/{user}/{project}` - Retrieve all embeddings for the project -- `GET /v1/embeddings/{user}/{project}/{text_id}` - Retrieve a specific embedding -- `GET /v1/similars/{user}/{project}/{text_id}` - Find similar documents - -### Endpoints Requiring Authentication - -Even for public projects, the following operations still require authentication: - -- `POST /v1/embeddings/{user}/{project}` - Create new embeddings -- `DELETE /v1/embeddings/{user}/{project}` - Delete all embeddings -- `DELETE /v1/embeddings/{user}/{project}/{text_id}` - Delete a specific embedding - -## Implementation Details - -### Database Schema - -A `public_read` boolean flag is stored in the `projects` table to indicate whether a project allows public access. - -### Authentication Flow - -1. When a request is made to a reader-protected endpoint, the middleware checks if authentication is required -2. If the project has `public_read` set to true, the request is allowed without an Authorization header -3. Unauthenticated requests are logged with the user set to "public" -4. 
If `public_read` is false or not set, normal authentication rules apply - -### Backwards Compatibility - -The `public_read` flag defaults to `false`, so existing projects continue to require authentication for read operations unless explicitly updated. - -### Project Metadata Display - -When a project has `public_read` enabled: -- The project is accessible without authentication for read operations -- The `public_read` flag will be set to `true` in the project metadata -- Anonymous users can view project metadata including owner and description - -## Security Considerations - -- Public access only applies to read operations (GET requests) -- Write operations (POST, PUT, DELETE) always require authentication -- Project metadata and ownership information is publicly visible for public projects -- The admin and owner authentication mechanisms are unaffected - -## Examples - -### Accessing a Public Project Without Authentication - -```bash -# Get project metadata without authentication -curl http://localhost:8080/v1/projects/alice/public-project -# Returns: {"project_handle": "public-project", "owner": "alice", "public_read": true, ...} - -# Get all embeddings without authentication -curl http://localhost:8080/v1/embeddings/alice/public-project - -# Get a specific embedding without authentication -curl http://localhost:8080/v1/embeddings/alice/public-project/text123 - -# Find similar documents without authentication -curl http://localhost:8080/v1/similars/alice/public-project/text123 -``` - -### Creating Embeddings Still Requires Authentication - -```bash -# This will fail with 401 Unauthorized -curl -X POST http://localhost:8080/v1/embeddings/alice/public-project \ - -H "Content-Type: application/json" \ - -d '{"embeddings": [...]}' - -# This will succeed with a valid API key -curl -X POST http://localhost:8080/v1/embeddings/alice/public-project \ - -H "Authorization: Bearer YOUR_API_KEY" \ - -H "Content-Type: application/json" \ - -d '{"embeddings": [...]}' -``` - 
-## Migration - -Existing projects are not affected. The `public_read` flag defaults to `false`, so all existing projects continue to require authentication for read operations unless explicitly updated to set `public_read: true`. diff --git a/docs/archetypes/default.md b/docs/archetypes/default.md new file mode 100644 index 0000000..25b6752 --- /dev/null +++ b/docs/archetypes/default.md @@ -0,0 +1,5 @@ ++++ +date = '{{ .Date }}' +draft = true +title = '{{ replace .File.ContentBaseName "-" " " | title }}' ++++ diff --git a/docs/config.toml b/docs/config.toml new file mode 100644 index 0000000..6b5d75c --- /dev/null +++ b/docs/config.toml @@ -0,0 +1,9 @@ +baseURL = 'https://mpilhlt.github.io/dhamps-vdb/' +languageCode = 'en-us' +title = 'dhamps-vdb Documentation' +theme = 'book' + +[params] + BookRepo = 'https://github.com/mpilhlt/dhamps-vdb' + BookEditPath = 'edit/main' + BookSearch = true diff --git a/docs/content/_index.md b/docs/content/_index.md new file mode 100644 index 0000000..62e336a --- /dev/null +++ b/docs/content/_index.md @@ -0,0 +1,54 @@ +--- +title: dhamps-vdb Documentation +type: docs +--- + +# dhamps-vdb Documentation + +Welcome to the documentation for **dhamps-vdb**, a vector database designed for Digital Humanities applications within the Max Planck Society's Digital Humanities initiative. + +## What is dhamps-vdb? + +dhamps-vdb is a PostgreSQL-backed vector database with pgvector support, providing a RESTful API for managing embeddings in Retrieval Augmented Generation (RAG) workflows. It offers multi-user support, project management, and flexible embedding configurations.
+ +## Key Features + +- **Multi-user Support**: Role-based access control (admin, owner, reader, editor) +- **Project Management**: Organize embeddings into projects with sharing capabilities +- **LLM Service Management**: Flexible service definitions and instances with encrypted API keys +- **Metadata Support**: JSON Schema validation and filtering in similarity search +- **PostgreSQL Backend**: Reliable storage with pgvector extension +- **RESTful API**: OpenAPI-documented endpoints +- **Docker Ready**: Easy deployment with Docker Compose + +## Quick Links + +- [Getting Started](/getting-started/) - Installation and first steps +- [Concepts](/concepts/) - Understand how dhamps-vdb works +- [API Reference](/api/) - Complete API documentation +- [Guides](/guides/) - How-to guides for common tasks + +## Getting Help + +- 📖 Browse this documentation +- 🐛 [Report issues](https://github.com/mpilhlt/dhamps-vdb/issues) +- 💬 [GitHub Discussions](https://github.com/mpilhlt/dhamps-vdb/discussions) + +## Quick Example + +```bash +# Start the service with Docker +./docker-setup.sh +docker-compose up -d + +# Create a user +curl -X POST http://localhost:8880/v1/users \ + -H "Authorization: Bearer YOUR_ADMIN_KEY" \ + -H "Content-Type: application/json" \ + -d '{"user_handle": "alice", "name": "Alice Smith"}' + +# Create a project and start working with embeddings +# See the Getting Started guide for a complete walkthrough +``` + +Ready to get started? Head over to the [Installation Guide](/getting-started/installation/). diff --git a/docs/content/api/_index.md b/docs/content/api/_index.md new file mode 100644 index 0000000..5930d11 --- /dev/null +++ b/docs/content/api/_index.md @@ -0,0 +1,39 @@ +--- +title: "API Reference" +weight: 4 +--- + +# API Reference + +Complete reference for the dhamps-vdb REST API. + +## API Version + +Current version: **v1** + +All endpoints are prefixed with `/v1/` (e.g., `POST /v1/embeddings/{user}/{project}`). 
+ +## API Documentation + +The complete, always up-to-date API specification is available at: + +- **OpenAPI YAML**: `/openapi.yaml` +- **Interactive Documentation**: `/docs` + +## Reference Sections + +- [Authentication](authentication/) - API key authentication +- [Endpoints](endpoints/) - All available API endpoints +- [Query Parameters](query-parameters/) - Filtering and pagination +- [PATCH Updates](patch-updates/) - Partial resource updates +- [Error Handling](error-handling/) - Error responses and codes + +## Quick Example + +```bash +# Authenticate with API key +curl -X GET "https://api.example.com/v1/projects/alice" \ + -H "Authorization: Bearer your_api_key_here" +``` + +All API requests require authentication except for public project read operations. diff --git a/docs/content/api/authentication.md b/docs/content/api/authentication.md new file mode 100644 index 0000000..074a9fa --- /dev/null +++ b/docs/content/api/authentication.md @@ -0,0 +1,194 @@ +--- +title: "Authentication" +weight: 1 +--- + +# API Authentication + +All API requests (except public project read operations) require authentication using API keys passed in the `Authorization` header with a `Bearer` prefix. 
+ +## Authentication Method + +Include your API key in the `Authorization` header of every request: + +``` +Authorization: Bearer your_api_key_here +``` + +## API Key Types + +### Admin API Key + +- Full access to all API endpoints and resources +- Can create and manage users +- Can access all projects and resources across all users +- Required for administrative operations +- Set via `ADMIN_KEY` environment variable + +**Admin-only operations:** +- `POST /v1/users` - Create new users +- `GET /v1/users` - List all users +- `GET /v1/admin/*` - Admin endpoints +- Create/modify `_system` LLM service definitions +- Access any user's resources + +### User API Key + +- Access limited to user's own resources and shared projects +- Cannot create other users +- Cannot access admin endpoints +- Returned when creating a new user (store securely) + +**User capabilities:** +- Manage own projects and LLM services +- Upload and query embeddings in owned projects +- Access projects shared with them (read or edit access) +- Delete own account + +## Getting an API Key + +### Admin Key + +Set the admin key via environment variable when launching the service: + +```bash +export ADMIN_KEY="your-secure-admin-key-here" +./dhamps-vdb +``` + +### User Key + +Create a new user using the admin API key: + +```bash +curl -X POST "https://api.example.com/v1/users" \ + -H "Authorization: Bearer admin_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "user_handle": "alice", + "name": "Alice Doe", + "email": "alice@example.com" + }' +``` + +**Response:** + +```json +{ + "user_handle": "alice", + "name": "Alice Doe", + "email": "alice@example.com", + "api_key": "024v2013621509245f2e24" +} +``` + +**⚠️ Important:** The API key is only returned once during user creation. Store it securely - it cannot be recovered. 
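Because the key appears only in the creation response, a client should persist it immediately. A minimal sketch; file-based storage is an assumption here, and a proper secrets manager is preferable in production:

```python
import json
import os
import stat

def store_api_key(creation_response: str, path: str) -> None:
    """Persist the one-time api_key from the user-creation response body."""
    key = json.loads(creation_response)["api_key"]
    with open(path, "w") as fh:
        fh.write(key + "\n")
    # Restrict the file to owner read/write only (0600).
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)

response = '{"user_handle": "alice", "api_key": "024v2013621509245f2e24"}'
store_api_key(response, "alice.key")
```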
+ +## Authorization Header Format + +### Correct Format + +```bash +curl -X GET "https://api.example.com/v1/projects/alice" \ + -H "Authorization: Bearer 024v2013621509245f2e24" +``` + +### Common Mistakes + +❌ Missing "Bearer" prefix: +``` +Authorization: 024v2013621509245f2e24 +``` + +❌ Wrong prefix: +``` +Authorization: Token 024v2013621509245f2e24 +``` + +❌ Extra quotes: +``` +Authorization: Bearer "024v2013621509245f2e24" +``` + +## Authentication Errors + +### 401 Unauthorized + +Returned when authentication credentials are missing or invalid. + +**Causes:** +- Missing `Authorization` header +- Invalid API key +- Malformed header format + +**Example response:** + +```json +{ + "title": "Unauthorized", + "status": 401, + "detail": "Invalid or missing authorization credentials" +} +``` + +### 403 Forbidden + +Returned when authentication succeeds but authorization fails. + +**Causes:** +- Attempting to access another user's resources +- User lacking required permissions +- Non-admin accessing admin endpoints + +**Example response:** + +```json +{ + "title": "Forbidden", + "status": 403, + "detail": "You don't have permission to access this resource" +} +``` + +## Public Access + +Projects can be made publicly accessible by setting `public_read: true` when creating or updating the project. Public projects allow unauthenticated read access to: + +- Project metadata +- Embeddings +- Similarity search + +See [Public Access Documentation](/docs/PUBLIC_ACCESS.md) for details. + +## Security Best Practices + +1. **Never commit API keys** to version control +2. **Use environment variables** to store API keys in your applications +3. **Rotate keys regularly** by creating new users and deleting old ones +4. **Use HTTPS** in production to prevent key interception +5. **Store admin keys securely** using secrets management systems +6. 
**Limit admin key distribution** to essential personnel only + +## Example Authenticated Request + +```bash +# List user's projects +curl -X GET "https://api.example.com/v1/projects/alice" \ + -H "Authorization: Bearer 024v2013621509245f2e24" + +# Create a new project +curl -X POST "https://api.example.com/v1/projects/alice" \ + -H "Authorization: Bearer 024v2013621509245f2e24" \ + -H "Content-Type: application/json" \ + -d '{ + "project_handle": "my-project", + "description": "My research project", + "instance_owner": "alice", + "instance_handle": "my-openai" + }' +``` + +## Related Documentation + +- [Error Handling](error-handling/) - Complete error response reference +- [Users Endpoint](endpoints/users/) - User management API diff --git a/docs/content/api/endpoints/_index.md b/docs/content/api/endpoints/_index.md new file mode 100644 index 0000000..f7a9d92 --- /dev/null +++ b/docs/content/api/endpoints/_index.md @@ -0,0 +1,41 @@ +--- +title: "Endpoints" +weight: 1 +--- + +# API Endpoints + +Complete reference for all dhamps-vdb API endpoints. + +## Endpoint Categories + +- [Users](users/) - User management +- [Projects](projects/) - Project operations +- [LLM Services](llm-services/) - LLM service instances +- [API Standards](api-standards/) - API standard definitions +- [Embeddings](embeddings/) - Embedding storage and retrieval +- [Similars](similars/) - Similarity search + +## Endpoint Format + +All endpoints follow the pattern: + +``` +{METHOD} /v1/{resource}/{user}/{identifier} +``` + +Where: +- `METHOD`: HTTP method (GET, POST, PUT, DELETE, PATCH) +- `resource`: Resource type (users, projects, embeddings, etc.) +- `user`: User handle (owner of the resource) +- `identifier`: Specific resource identifier + +## Authentication + +Most endpoints require authentication via the `Authorization` header: + +``` +Authorization: Bearer your_api_key_here +``` + +Public projects allow unauthenticated access to read operations. 
diff --git a/docs/content/api/endpoints/api-standards.md b/docs/content/api/endpoints/api-standards.md new file mode 100644 index 0000000..e45c079 --- /dev/null +++ b/docs/content/api/endpoints/api-standards.md @@ -0,0 +1,457 @@ +--- +title: "API Standards" +weight: 4 +--- + +# API Standards Endpoint + +Manage API standard definitions that specify how to authenticate with different LLM service providers. API standards define the authentication mechanism (Bearer token, API key header, etc.) used by LLM service instances. + +## Overview + +API standards are referenced by LLM service instances to determine how to authenticate API requests. Examples include: + +- **OpenAI**: Bearer token in `Authorization` header +- **Cohere**: API key in `Authorization` header with `Bearer` prefix +- **Google Gemini**: API key as query parameter +- **Ollama**: No authentication required + +Pre-seeded standards are available for common providers. See [testdata/valid_api_standard_*.json](https://github.com/mpilhlt/dhamps-vdb/tree/main/testdata) for examples. + +--- + +## Endpoints + +### List All API Standards + +Get all defined API standards. This endpoint is publicly accessible (no authentication required). 
+ +**Endpoint:** `GET /v1/api-standards` + +**Authentication:** Public (no authentication required) + +**Example:** + +```bash +curl -X GET "https://api.example.com/v1/api-standards" +``` + +**Response:** + +```json +{ + "standards": [ + { + "api_standard_handle": "openai", + "description": "OpenAI Embeddings API, Version 1, as documented in https://platform.openai.com/docs/api-reference/embeddings", + "key_method": "auth_bearer", + "key_field": "Authorization" + }, + { + "api_standard_handle": "cohere", + "description": "Cohere Embed API, Version 2, as documented in https://docs.cohere.com/reference/embed", + "key_method": "auth_bearer", + "key_field": "Authorization" + }, + { + "api_standard_handle": "gemini", + "description": "Google Gemini Embedding API as documented in https://ai.google.dev/gemini-api/docs/embeddings", + "key_method": "query_param", + "key_field": "key" + } + ] +} +``` + +--- + +### Get API Standard + +Retrieve information about a specific API standard. This endpoint is publicly accessible. + +**Endpoint:** `GET /v1/api-standards/{standardname}` + +**Authentication:** Public (no authentication required) + +**Example:** + +```bash +curl -X GET "https://api.example.com/v1/api-standards/openai" +``` + +**Response:** + +```json +{ + "api_standard_handle": "openai", + "description": "OpenAI Embeddings API, Version 1, as documented in https://platform.openai.com/docs/api-reference/embeddings", + "key_method": "auth_bearer", + "key_field": "Authorization" +} +``` + +--- + +### Create API Standard + +Register a new API standard definition. Admin-only operation. 
+ +**Endpoint:** `POST /v1/api-standards` + +**Authentication:** Admin only + +**Request Body:** + +```json +{ + "api_standard_handle": "custom-provider", + "description": "Custom provider embedding API", + "key_method": "auth_bearer", + "key_field": "Authorization" +} +``` + +**Parameters:** + +- `api_standard_handle` (string, required): Unique identifier for the standard +- `description` (string, required): Description including API documentation URL +- `key_method` (string, required): Authentication method + - `auth_bearer`: Bearer token in header + - `auth_apikey`: API key in header + - `query_param`: API key as query parameter + - `none`: No authentication +- `key_field` (string, required): Field name for the API key + - For headers: typically `"Authorization"` or `"X-API-Key"` + - For query params: parameter name like `"key"` or `"api_key"` + +**Example:** + +```bash +curl -X POST "https://api.example.com/v1/api-standards" \ + -H "Authorization: Bearer admin_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "api_standard_handle": "ollama", + "description": "Ollama local embedding API, no authentication required", + "key_method": "none", + "key_field": "" + }' +``` + +**Response:** + +```json +{ + "api_standard_handle": "ollama", + "description": "Ollama local embedding API, no authentication required", + "key_method": "none", + "key_field": "" +} +``` + +--- + +### Update API Standard (PUT) + +Create or update an API standard with a specific handle. Admin-only operation. 
+ +**Endpoint:** `PUT /v1/api-standards/{standardname}` + +**Authentication:** Admin only + +**Request Body:** Same as POST endpoint + +**Example:** + +```bash +curl -X PUT "https://api.example.com/v1/api-standards/custom-provider" \ + -H "Authorization: Bearer admin_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "api_standard_handle": "custom-provider", + "description": "Updated description for custom provider", + "key_method": "auth_bearer", + "key_field": "Authorization" + }' +``` + +--- + +### Delete API Standard + +Delete an API standard definition. Admin-only operation. + +**Endpoint:** `DELETE /v1/api-standards/{standardname}` + +**Authentication:** Admin only + +**Example:** + +```bash +curl -X DELETE "https://api.example.com/v1/api-standards/custom-provider" \ + -H "Authorization: Bearer admin_api_key" +``` + +**Response:** + +```json +{ + "message": "API standard 'custom-provider' deleted successfully" +} +``` + +**⚠️ Warning:** Cannot delete API standards that are referenced by LLM service instances. + +--- + +### Partial Update (PATCH) + +Update specific API standard fields without providing all data. Admin-only operation. + +**Endpoint:** `PATCH /v1/api-standards/{standardname}` + +**Authentication:** Admin only + +**Example - Update description:** + +```bash +curl -X PATCH "https://api.example.com/v1/api-standards/openai" \ + -H "Authorization: Bearer admin_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "description": "OpenAI Embeddings API v1 - Updated documentation link" + }' +``` + +See [PATCH Updates](../patch-updates/) for more details. 
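How a client consumes these definitions can be sketched as follows: given a standard's `key_method` and `key_field`, the key is placed either in a header or in the query string of the outgoing request. This is a hypothetical helper for illustration, not part of dhamps-vdb:

```python
from urllib.parse import urlencode

def apply_standard(url: str, headers: dict, key: str,
                   key_method: str, key_field: str):
    """Place an API key on a request according to an API standard definition."""
    headers = dict(headers)  # do not mutate the caller's dict
    if key_method == "auth_bearer":
        headers[key_field] = f"Bearer {key}"      # e.g. Authorization: Bearer sk-...
    elif key_method == "auth_apikey":
        headers[key_field] = key                  # e.g. X-API-Key: abc123
    elif key_method == "query_param":
        sep = "&" if "?" in url else "?"
        url = f"{url}{sep}{urlencode({key_field: key})}"  # e.g. ?key=abc123
    elif key_method != "none":
        raise ValueError(f"unknown key_method: {key_method!r}")
    return url, headers
```

With `key_method: "none"`, the request is left untouched, which matches the Ollama-style standards described below.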
+ +--- + +## API Standard Properties + +| Field | Type | Required | Description | +|-------|------|----------|-------------| +| `api_standard_handle` | string | Yes | Unique identifier (e.g., "openai", "cohere") | +| `description` | string | Yes | Description with API documentation URL | +| `key_method` | string | Yes | Authentication method | +| `key_field` | string | Yes | Field name for API key/token | + +--- + +## Authentication Methods + +### auth_bearer + +Bearer token authentication in the `Authorization` header. + +**Example:** +```json +{ + "key_method": "auth_bearer", + "key_field": "Authorization" +} +``` + +**HTTP Request:** +``` +Authorization: Bearer sk-proj-abc123... +``` + +**Used by:** OpenAI, Cohere, Anthropic + +--- + +### auth_apikey + +API key in a custom header field. + +**Example:** +```json +{ + "key_method": "auth_apikey", + "key_field": "X-API-Key" +} +``` + +**HTTP Request:** +``` +X-API-Key: abc123... +``` + +**Used by:** Some custom API providers + +--- + +### query_param + +API key passed as a URL query parameter. + +**Example:** +```json +{ + "key_method": "query_param", + "key_field": "key" +} +``` + +**HTTP Request:** +``` +GET https://api.example.com/embed?key=abc123... +``` + +**Used by:** Google Gemini, some older APIs + +--- + +### none + +No authentication required. 
+ +**Example:** +```json +{ + "key_method": "none", + "key_field": "" +} +``` + +**Used by:** Ollama (local), self-hosted models without auth + +--- + +## Pre-seeded API Standards + +The following API standards are created during database migration: + +### openai + +```json +{ + "api_standard_handle": "openai", + "description": "OpenAI Embeddings API, Version 1, as documented in https://platform.openai.com/docs/api-reference/embeddings", + "key_method": "auth_bearer", + "key_field": "Authorization" +} +``` + +### cohere + +```json +{ + "api_standard_handle": "cohere", + "description": "Cohere Embed API, Version 2, as documented in https://docs.cohere.com/reference/embed", + "key_method": "auth_bearer", + "key_field": "Authorization" +} +``` + +### gemini + +```json +{ + "api_standard_handle": "gemini", + "description": "Google Gemini Embedding API as documented in https://ai.google.dev/gemini-api/docs/embeddings", + "key_method": "query_param", + "key_field": "key" +} +``` + +--- + +## Use Cases + +### Creating a Custom API Standard + +For self-hosted or custom LLM services: + +```bash +curl -X POST "https://api.example.com/v1/api-standards" \ + -H "Authorization: Bearer admin_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "api_standard_handle": "vllm-local", + "description": "vLLM local deployment with custom auth", + "key_method": "auth_apikey", + "key_field": "X-API-Key" + }' +``` + +### Referencing in LLM Service Instance + +Once created, reference the API standard in your LLM service instance: + +```bash +curl -X PUT "https://api.example.com/v1/llm-services/alice/my-vllm" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "instance_handle": "my-vllm", + "endpoint": "https://vllm.local/v1/embeddings", + "api_standard": "vllm-local", + "model": "custom-embed", + "dimensions": 768, + "api_key_encrypted": "my-secret-key" + }' +``` + +--- + +## Common Errors + +### 400 Bad Request + +```json +{ + "title": 
"Bad Request", + "status": 400, + "detail": "Invalid key_method: must be one of auth_bearer, auth_apikey, query_param, none" +} +``` + +### 401 Unauthorized (Admin Operations) + +```json +{ + "title": "Unauthorized", + "status": 401, + "detail": "Invalid or missing authorization credentials" +} +``` + +### 403 Forbidden (Admin Operations) + +```json +{ + "title": "Forbidden", + "status": 403, + "detail": "Only admin users can create/modify/delete API standards" +} +``` + +### 404 Not Found + +```json +{ + "title": "Not Found", + "status": 404, + "detail": "API standard 'custom-provider' not found" +} +``` + +### 409 Conflict + +```json +{ + "title": "Conflict", + "status": 409, + "detail": "Cannot delete API standard: referenced by 5 LLM service instances" +} +``` + +--- + +## Related Documentation + +- [LLM Services](llm-services/) - LLM service instances reference API standards +- [Authentication](../authentication/) - User authentication methods +- [Testdata Examples](https://github.com/mpilhlt/dhamps-vdb/tree/main/testdata) - Example API standard definitions diff --git a/docs/content/api/endpoints/embeddings.md b/docs/content/api/endpoints/embeddings.md new file mode 100644 index 0000000..886ab25 --- /dev/null +++ b/docs/content/api/endpoints/embeddings.md @@ -0,0 +1,532 @@ +--- +title: "Embeddings" +weight: 5 +--- + +# Embeddings Endpoint + +Store and retrieve vector embeddings with associated text identifiers and metadata. Embeddings are organized within projects and validated against the project's LLM service instance dimensions. + +## Endpoints + +### List Embeddings + +Get all embeddings for a project with pagination support. 
+ +**Endpoint:** `GET /v1/embeddings/{username}/{projectname}` + +**Authentication:** Admin, owner, authorized readers, or public if `public_read` is enabled + +**Query Parameters:** + +- `limit` (integer, default: 10, max: 200): Maximum number of results to return +- `offset` (integer, default: 0): Pagination offset + +**Example:** + +```bash +curl -X GET "https://api.example.com/v1/embeddings/alice/research-docs?limit=20&offset=0" \ + -H "Authorization: Bearer alice_api_key" +``` + +**Response:** + +```json +{ + "embeddings": [ + { + "text_id": "doc123", + "user_handle": "alice", + "project_handle": "research-docs", + "instance_handle": "openai-large", + "text": "This is a research document about machine learning.", + "vector": [-0.020850, 0.018522, 0.053270, ...], + "vector_dim": 3072, + "metadata": { + "author": "Alice Doe", + "year": 2024 + } + } + ], + "total_count": 150, + "limit": 20, + "offset": 0 +} +``` + +--- + +### Upload Embeddings (Batch) + +Upload one or more embeddings to a project. Supports batch upload for efficiency. 
+ +**Endpoint:** `POST /v1/embeddings/{username}/{projectname}` + +**Authentication:** Admin, owner, or authorized editors + +**Request Body:** + +```json +{ + "embeddings": [ + { + "text_id": "doc123", + "instance_handle": "openai-large", + "text": "This is a research document.", + "vector": [-0.020850, 0.018522, 0.053270, ...], + "vector_dim": 3072, + "metadata": { + "author": "Alice Doe", + "year": 2024 + } + }, + { + "text_id": "doc124", + "instance_handle": "openai-large", + "text": "Another research document.", + "vector": [0.012345, -0.054321, 0.098765, ...], + "vector_dim": 3072, + "metadata": { + "author": "Bob Smith", + "year": 2024 + } + } + ] +} +``` + +**Parameters:** + +- `embeddings` (array, required): Array of embedding objects + - `text_id` (string, required): Unique identifier for the text (URL-encoded recommended) + - `instance_handle` (string, required): Handle of the LLM service instance used + - `vector` (array of floats, required): Embedding vector + - `vector_dim` (integer, required): Number of dimensions in the vector + - `text` (string, optional): The original text + - `metadata` (object, optional): Additional metadata (validated against project schema if defined) + +**Example:** + +```bash +curl -X POST "https://api.example.com/v1/embeddings/alice/research-docs" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "embeddings": [ + { + "text_id": "doc123", + "instance_handle": "openai-large", + "text": "Machine learning research document.", + "vector": [-0.020850, 0.018522, 0.053270], + "vector_dim": 3, + "metadata": { + "author": "Alice Doe", + "year": 2024 + } + } + ] + }' +``` + +**Response:** + +```json +{ + "message": "1 embedding(s) uploaded successfully", + "uploaded": ["doc123"] +} +``` + +--- + +### Get Single Embedding + +Retrieve information about a specific embedding by its text identifier. 
+ +**Endpoint:** `GET /v1/embeddings/{username}/{projectname}/{identifier}` + +**Authentication:** Admin, owner, authorized readers, or public if `public_read` is enabled + +**Example:** + +```bash +curl -X GET "https://api.example.com/v1/embeddings/alice/research-docs/doc123" \ + -H "Authorization: Bearer alice_api_key" +``` + +**Response:** + +```json +{ + "text_id": "doc123", + "user_handle": "alice", + "project_handle": "research-docs", + "project_id": 1, + "instance_handle": "openai-large", + "text": "Machine learning research document.", + "vector": [-0.020850, 0.018522, 0.053270, ...], + "vector_dim": 3072, + "metadata": { + "author": "Alice Doe", + "year": 2024 + } +} +``` + +**Note:** URL-encode the identifier if it contains special characters (e.g., URLs). + +**Example with URL identifier:** + +```bash +# URL-encoded identifier +curl -X GET "https://api.example.com/v1/embeddings/alice/research-docs/https%3A%2F%2Fexample.com%2Fdoc123" \ + -H "Authorization: Bearer alice_api_key" +``` + +--- + +### Delete Single Embedding + +Delete a specific embedding by its text identifier. + +**Endpoint:** `DELETE /v1/embeddings/{username}/{projectname}/{identifier}` + +**Authentication:** Admin, owner, or authorized editors + +**Example:** + +```bash +curl -X DELETE "https://api.example.com/v1/embeddings/alice/research-docs/doc123" \ + -H "Authorization: Bearer alice_api_key" +``` + +**Response:** + +```json +{ + "message": "Embedding 'doc123' deleted successfully" +} +``` + +--- + +### Delete All Embeddings + +Delete all embeddings in a project. 
+ +**Endpoint:** `DELETE /v1/embeddings/{username}/{projectname}` + +**Authentication:** Admin or owner + +**Example:** + +```bash +curl -X DELETE "https://api.example.com/v1/embeddings/alice/research-docs" \ + -H "Authorization: Bearer alice_api_key" +``` + +**Response:** + +```json +{ + "message": "All embeddings deleted from project alice/research-docs", + "deleted_count": 150 +} +``` + +**⚠️ Warning:** This operation is irreversible. + +--- + +## Request Format Details + +### Text Identifiers + +Text identifiers (`text_id`) should be: +- **Unique** within a project +- **URL-encoded** if they contain special characters +- **Descriptive** for easier retrieval + +**Examples:** +- Simple: `"doc123"`, `"article-456"` +- URL-based: `"https://example.com/docs/123"` (URL-encode in requests) +- Path-based: `"books/chapter1/section2"` + +### Vector Format + +Vectors must be: +- **Arrays of floats** (float32 or float64) +- **Matching dimensions** specified in `vector_dim` +- **Consistent dimensions** with the project's LLM service instance + +**Example:** + +```json +{ + "vector": [-0.020850, 0.018522, 0.053270, 0.071384, 0.020003], + "vector_dim": 5 +} +``` + +### Metadata Format + +Metadata is a flexible JSON object that can contain any valid JSON data: + +**Simple metadata:** + +```json +{ + "metadata": { + "author": "Alice Doe", + "year": 2024, + "category": "research" + } +} +``` + +**Nested metadata:** + +```json +{ + "metadata": { + "author": { + "name": "Alice Doe", + "id": "author-123" + }, + "publication": { + "year": 2024, + "title": "Machine Learning Research" + }, + "tags": ["AI", "ML", "embeddings"] + } +} +``` + +--- + +## Validation + +### Dimension Validation + +The API automatically validates that: + +1. **Vector dimension consistency**: The `vector_dim` field must match the dimensions configured in the LLM service instance +2. 
**Vector length verification**: The actual number of elements in the `vector` array must match the declared `vector_dim` + +**Error example:** + +```json +{ + "title": "Bad Request", + "status": 400, + "detail": "dimension validation failed: vector dimension mismatch: embedding declares 3072 dimensions but LLM service 'openai-large' expects 1536 dimensions" +} +``` + +### Metadata Schema Validation + +If the project has a `metadataScheme` defined, all uploaded embeddings' metadata will be validated against it. + +**Example project schema:** + +```json +{ + "type": "object", + "properties": { + "author": {"type": "string"}, + "year": {"type": "integer"} + }, + "required": ["author"] +} +``` + +**Valid metadata:** + +```json +{ + "author": "Alice Doe", + "year": 2024 +} +``` + +**Invalid metadata (missing required field):** + +```json +{ + "year": 2024 +} +``` + +**Error response:** + +```json +{ + "title": "Bad Request", + "status": 400, + "detail": "metadata validation failed for text_id 'doc123': metadata validation failed:\n - author: author is required" +} +``` + +--- + +## Response Format + +### List Response + +```json +{ + "embeddings": [...], + "total_count": 150, + "limit": 20, + "offset": 0 +} +``` + +### Single Embedding Response + +```json +{ + "text_id": "doc123", + "user_handle": "alice", + "project_handle": "research-docs", + "project_id": 1, + "instance_handle": "openai-large", + "text": "...", + "vector": [...], + "vector_dim": 3072, + "metadata": {...} +} +``` + +### Upload Response + +```json +{ + "message": "3 embedding(s) uploaded successfully", + "uploaded": ["doc123", "doc124", "doc125"] +} +``` + +--- + +## Common Errors + +### 400 Bad Request - Dimension Mismatch + +```json +{ + "title": "Bad Request", + "status": 400, + "detail": "dimension validation failed: vector dimension mismatch: embedding declares 3072 dimensions but LLM service 'openai-large' expects 1536 dimensions" +} +``` + +### 400 Bad Request - Invalid Vector + +```json +{ + 
"title": "Bad Request", + "status": 400, + "detail": "vector length (1536) does not match declared vector_dim (3072)" +} +``` + +### 400 Bad Request - Metadata Schema Violation + +```json +{ + "title": "Bad Request", + "status": 400, + "detail": "metadata validation failed for text_id 'doc123': metadata validation failed:\n - author: author is required" +} +``` + +### 403 Forbidden + +```json +{ + "title": "Forbidden", + "status": 403, + "detail": "You don't have permission to upload embeddings to this project" +} +``` + +### 404 Not Found - Project + +```json +{ + "title": "Not Found", + "status": 404, + "detail": "Project 'alice/research-docs' not found" +} +``` + +### 404 Not Found - Embedding + +```json +{ + "title": "Not Found", + "status": 404, + "detail": "Embedding 'doc123' not found in project 'alice/research-docs'" +} +``` + +### 409 Conflict + +```json +{ + "title": "Conflict", + "status": 409, + "detail": "Embedding 'doc123' already exists in project 'alice/research-docs'" +} +``` + +--- + +## Best Practices + +### Batch Uploads + +Upload multiple embeddings in a single request for better performance: + +```bash +curl -X POST "https://api.example.com/v1/embeddings/alice/research-docs" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "embeddings": [ + {...}, + {...}, + {...} + ] + }' +``` + +### URL-Encoded Identifiers + +Use URL encoding for identifiers with special characters: + +```python +import urllib.parse + +text_id = "https://example.com/docs/123" +encoded_id = urllib.parse.quote(text_id, safe='') +# Result: "https%3A%2F%2Fexample.com%2Fdocs%2F123" +``` + +### Metadata Design + +Design metadata schemas that: +- Include commonly queried fields +- Use consistent data types +- Validate required fields +- Support your similarity search filtering needs + +--- + +## Related Documentation + +- [Projects](projects/) - Project management and configuration +- [Similars](similars/) - Similarity search using 
embeddings +- [Query Parameters](../query-parameters/) - Pagination and filtering options +- [Error Handling](../error-handling/) - Complete error reference diff --git a/docs/content/api/endpoints/llm-services.md b/docs/content/api/endpoints/llm-services.md new file mode 100644 index 0000000..0c71efb --- /dev/null +++ b/docs/content/api/endpoints/llm-services.md @@ -0,0 +1,518 @@ +--- +title: "LLM Services" +weight: 3 +--- + +# LLM Services Endpoint + +Manage LLM service instances and definitions. The system supports two types of LLM service resources: + +1. **Definitions**: Reusable templates owned by `_system` or individual users +2. **Instances**: User-specific configurations with API keys that reference definitions or stand alone + +## Architecture Overview + +``` +definitions (templates) + └── owned by _system or users + └── contain: endpoint, model, dimensions, api_standard + └── no API keys + +instances (user-specific) + └── owned by individual users + └── reference a definition (optional) + └── contain encrypted API keys + └── can be shared with other users +``` + +For complete details, see [LLM Service Refactoring Documentation](/docs/LLM_SERVICE_REFACTORING.md). + +--- + +## Instance Endpoints + +### List User's Instances + +Get all LLM service instances owned by or shared with a user. 
+ +**Endpoint:** `GET /v1/llm-services/{username}` + +**Authentication:** Admin or the user themselves + +**Example:** + +```bash +curl -X GET "https://api.example.com/v1/llm-services/alice" \ + -H "Authorization: Bearer alice_api_key" +``` + +**Response:** + +```json +{ + "instances": [ + { + "instance_id": 1, + "instance_handle": "openai-large", + "owner": "alice", + "endpoint": "https://api.openai.com/v1/embeddings", + "description": "OpenAI large embeddings", + "api_standard": "openai", + "model": "text-embedding-3-large", + "dimensions": 3072, + "definition_id": 1 + }, + { + "instance_id": 2, + "instance_handle": "custom-model", + "owner": "alice", + "endpoint": "https://custom.api.example.com/embed", + "api_standard": "openai", + "model": "custom-embed-v1", + "dimensions": 1536 + } + ] +} +``` + +**Note:** API keys are never returned in GET responses for security. + +--- + +### Create or Update Instance (PUT) + +Create a new LLM service instance or update an existing one. + +**Endpoint:** `PUT /v1/llm-services/{username}/{instance_handle}` + +**Authentication:** Admin or the user themselves + +**Request Body (Standalone Instance):** + +```json +{ + "instance_handle": "openai-large", + "endpoint": "https://api.openai.com/v1/embeddings", + "description": "OpenAI large embeddings", + "api_standard": "openai", + "model": "text-embedding-3-large", + "dimensions": 3072, + "api_key_encrypted": "sk-proj-..." +} +``` + +**Request Body (From Definition):** + +```json +{ + "instance_handle": "my-openai", + "definition_owner": "_system", + "definition_handle": "openai-large", + "api_key_encrypted": "sk-proj-..." 
+} +``` + +**Parameters:** + +- `instance_handle` (string, required): Unique identifier within user's namespace +- `endpoint` (string, required for standalone): API endpoint URL +- `description` (string, optional): Instance description +- `api_standard` (string, required for standalone): Reference to API standard (e.g., "openai", "cohere") +- `model` (string, required for standalone): Model name +- `dimensions` (integer, required for standalone): Vector dimensions +- `api_key_encrypted` (string, optional): API key (encrypted if ENCRYPTION_KEY is set) +- `definition_owner` (string, optional): Owner of the definition template +- `definition_handle` (string, optional): Handle of the definition template + +**Example - Standalone:** + +```bash +curl -X PUT "https://api.example.com/v1/llm-services/alice/openai-large" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "instance_handle": "openai-large", + "endpoint": "https://api.openai.com/v1/embeddings", + "api_standard": "openai", + "model": "text-embedding-3-large", + "dimensions": 3072, + "api_key_encrypted": "sk-proj-..." + }' +``` + +**Example - From _system Definition:** + +```bash +curl -X PUT "https://api.example.com/v1/llm-services/alice/my-openai" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "instance_handle": "my-openai", + "definition_owner": "_system", + "definition_handle": "openai-large", + "api_key_encrypted": "sk-proj-..." + }' +``` + +**Response:** + +```json +{ + "instance_id": 1, + "instance_handle": "openai-large", + "owner": "alice", + "endpoint": "https://api.openai.com/v1/embeddings", + "api_standard": "openai", + "model": "text-embedding-3-large", + "dimensions": 3072 +} +``` + +**Security Note:** API keys are encrypted using AES-256-GCM if the `ENCRYPTION_KEY` environment variable is set. Keys are never returned in responses. 
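The standalone-versus-definition rule for the request body can be pre-checked on the client before issuing the PUT. A sketch of that check, mirroring the rule as documented here (the exact server-side validation may differ):

```python
# Fields a standalone instance must provide itself; an instance created from a
# definition inherits them instead.
STANDALONE_FIELDS = ("endpoint", "api_standard", "model", "dimensions")

def classify_instance_payload(payload: dict) -> str:
    """Return 'from_definition' or 'standalone', or raise if the payload is
    neither a complete standalone configuration nor a definition reference."""
    if "definition_owner" in payload and "definition_handle" in payload:
        return "from_definition"
    if all(field in payload for field in STANDALONE_FIELDS):
        return "standalone"
    raise ValueError("Instance must specify either all configuration fields "
                     "or reference a definition")
```

A payload failing this check is the kind of request the server rejects with the 400 "must specify either all configuration fields or reference a definition" error listed under Common Errors.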
+ +--- + +### Get Instance Information + +Retrieve information about a specific LLM service instance. + +**Endpoint:** `GET /v1/llm-services/{username}/{instance_handle}` + +**Authentication:** Admin, owner, or users with shared access + +**Example:** + +```bash +curl -X GET "https://api.example.com/v1/llm-services/alice/openai-large" \ + -H "Authorization: Bearer alice_api_key" +``` + +**Response:** + +```json +{ + "instance_id": 1, + "instance_handle": "openai-large", + "owner": "alice", + "endpoint": "https://api.openai.com/v1/embeddings", + "description": "OpenAI large embeddings", + "api_standard": "openai", + "model": "text-embedding-3-large", + "dimensions": 3072, + "definition_id": 1 +} +``` + +--- + +### Delete Instance + +Delete an LLM service instance. + +**Endpoint:** `DELETE /v1/llm-services/{username}/{instance_handle}` + +**Authentication:** Admin or the owner + +**Example:** + +```bash +curl -X DELETE "https://api.example.com/v1/llm-services/alice/openai-large" \ + -H "Authorization: Bearer alice_api_key" +``` + +**Response:** + +```json +{ + "message": "LLM service instance alice/openai-large deleted successfully" +} +``` + +**⚠️ Warning:** Cannot delete instances that are in use by projects. + +--- + +### Partial Update (PATCH) + +Update specific instance fields without providing all data. + +**Endpoint:** `PATCH /v1/llm-services/{username}/{instance_handle}` + +**Authentication:** Admin or the owner + +**Example - Update description:** + +```bash +curl -X PATCH "https://api.example.com/v1/llm-services/alice/openai-large" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "description": "Updated description" + }' +``` + +See [PATCH Updates](../patch-updates/) for more details. + +--- + +## Instance Sharing + +### Share Instance + +Share an LLM service instance with another user. 
+ +**Endpoint:** `POST /v1/llm-services/{owner}/{instance}/share` + +**Authentication:** Admin or the instance owner + +**Request Body:** + +```json +{ + "share_with_handle": "bob", + "role": "reader" +} +``` + +**Roles:** +- `reader`: Can use the instance but cannot see API keys +- `editor`: Can use the instance (owner can still modify) +- `owner`: Full control (cannot be granted via sharing) + +**Example:** + +```bash +curl -X POST "https://api.example.com/v1/llm-services/alice/openai-large/share" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "share_with_handle": "bob", + "role": "reader" + }' +``` + +**Important:** Shared users can USE the instance but cannot see the API key. + +--- + +### Unshare Instance + +Remove a user's access to a shared instance. + +**Endpoint:** `DELETE /v1/llm-services/{owner}/{instance}/share/{user_handle}` + +**Authentication:** Admin or the instance owner + +**Example:** + +```bash +curl -X DELETE "https://api.example.com/v1/llm-services/alice/openai-large/share/bob" \ + -H "Authorization: Bearer alice_api_key" +``` + +--- + +### List Shared Users + +Get a list of users the instance is shared with. + +**Endpoint:** `GET /v1/llm-services/{owner}/{instance}/shared-with` + +**Authentication:** Admin or the instance owner + +**Example:** + +```bash +curl -X GET "https://api.example.com/v1/llm-services/alice/openai-large/shared-with" \ + -H "Authorization: Bearer alice_api_key" +``` + +**Response:** + +```json +{ + "instance": "alice/openai-large", + "shared_with": [ + { + "user_handle": "bob", + "role": "reader" + } + ] +} +``` + +--- + +## Definition Endpoints + +### List System Definitions + +Get all LLM service definitions owned by `_system`. 
+ +**Endpoint:** `GET /v1/llm-service-definitions/_system` + +**Authentication:** Any authenticated user + +**Example:** + +```bash +curl -X GET "https://api.example.com/v1/llm-service-definitions/_system" \ + -H "Authorization: Bearer alice_api_key" +``` + +**Response:** + +```json +{ + "definitions": [ + { + "definition_id": 1, + "definition_handle": "openai-large", + "owner": "_system", + "endpoint": "https://api.openai.com/v1/embeddings", + "description": "OpenAI text-embedding-3-large", + "api_standard": "openai", + "model": "text-embedding-3-large", + "dimensions": 3072 + }, + { + "definition_id": 2, + "definition_handle": "openai-small", + "owner": "_system", + "endpoint": "https://api.openai.com/v1/embeddings", + "description": "OpenAI text-embedding-3-small", + "api_standard": "openai", + "model": "text-embedding-3-small", + "dimensions": 1536 + } + ] +} +``` + +**Pre-seeded System Definitions:** +- `openai-large`: OpenAI text-embedding-3-large (3072 dimensions) +- `openai-small`: OpenAI text-embedding-3-small (1536 dimensions) +- `cohere-v4`: Cohere embed-english-v4.0 (1536 dimensions) +- `gemini-embedding-001`: Google Gemini embedding-001 (768 dimensions) + +--- + +### Create User Definition + +Create a reusable LLM service definition template. + +**Endpoint:** `POST /v1/llm-service-definitions/{username}` + +**Authentication:** Admin or the user themselves + +**Request Body:** + +```json +{ + "definition_handle": "custom-model", + "endpoint": "https://custom.api.example.com/embed", + "description": "Custom embedding model", + "api_standard": "openai", + "model": "custom-embed-v1", + "dimensions": 1024 +} +``` + +**Note:** Only admin users can create `_system` definitions. 
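Conceptually, an instance created from a definition inherits the definition's configuration fields and layers its own fields (such as the API key) on top. A sketch of that resolution, for illustration only rather than the server's actual implementation:

```python
from typing import Optional

def resolve_instance(instance: dict, definition: Optional[dict] = None) -> dict:
    """Effective configuration: definition fields act as defaults, and the
    instance's own fields (including the API key) take precedence."""
    resolved = dict(definition or {})
    resolved.update({k: v for k, v in instance.items() if v is not None})
    return resolved
```

For example, an instance referencing the `_system` `openai-large` definition supplies only its handle and key, and picks up `model` and `dimensions` from the template.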
+ +--- + +## Instance Properties + +| Field | Type | Required | Description | +|-------|------|----------|-------------| +| `instance_handle` | string | Yes | Unique identifier within user's namespace | +| `owner` | string | Read-only | Instance owner's user handle | +| `endpoint` | string | Yes* | API endpoint URL | +| `description` | string | No | Instance description | +| `api_standard` | string | Yes* | Reference to API standard | +| `model` | string | Yes* | Model name | +| `dimensions` | integer | Yes* | Vector dimensions | +| `api_key_encrypted` | string | Write-only | API key (never returned) | +| `definition_id` | integer | Read-only | Reference to definition template | + +\* Required for standalone instances; inherited from definition if using template + +--- + +## Security Features + +### API Key Encryption + +- **Algorithm:** AES-256-GCM +- **Key Source:** `ENCRYPTION_KEY` environment variable +- **Storage:** Encrypted in `api_key_encrypted` column +- **Retrieval:** Never returned in API responses + +### API Key Protection + +API keys are write-only: +- Provided during instance creation/update +- Encrypted before storage +- Never returned in GET/list responses +- Shared users cannot see API keys + +### Shared Instance Access + +When an instance is shared: +- Shared users can USE the instance for projects +- Shared users CANNOT see the API key +- Shared users CANNOT modify the instance +- Only owner can manage sharing + +--- + +## Common Errors + +### 400 Bad Request + +```json +{ + "title": "Bad Request", + "status": 400, + "detail": "Instance must specify either all configuration fields or reference a definition" +} +``` + +### 403 Forbidden + +```json +{ + "title": "Forbidden", + "status": 403, + "detail": "Only admin users can create _system definitions" +} +``` + +### 404 Not Found + +```json +{ + "title": "Not Found", + "status": 404, + "detail": "LLM service instance 'alice/openai-large' not found" +} +``` + +### 409 Conflict + +```json +{ + 
"title": "Conflict", + "status": 409, + "detail": "Cannot delete instance: in use by 3 projects" +} +``` + +--- + +## Related Documentation + +- [API Standards](api-standards/) - Managing API standard definitions +- [Projects](projects/) - Projects require an LLM service instance +- [LLM Service Refactoring](/docs/LLM_SERVICE_REFACTORING.md) - Complete architecture documentation diff --git a/docs/content/api/endpoints/projects.md b/docs/content/api/endpoints/projects.md new file mode 100644 index 0000000..f6147f0 --- /dev/null +++ b/docs/content/api/endpoints/projects.md @@ -0,0 +1,504 @@ +--- +title: "Projects" +weight: 2 +--- + +# Projects Endpoint + +Manage vector database projects. Each project contains embeddings and must be associated with an LLM service instance. + +## Endpoints + +### List User's Projects + +Get all projects owned by a user. + +**Endpoint:** `GET /v1/projects/{username}` + +**Authentication:** Admin, the user themselves, or users with shared access + +**Example:** + +```bash +curl -X GET "https://api.example.com/v1/projects/alice" \ + -H "Authorization: Bearer alice_api_key" +``` + +**Response:** + +```json +{ + "projects": [ + { + "project_id": 1, + "project_handle": "research-docs", + "owner": "alice", + "description": "Research document embeddings", + "instance_id": 5, + "instance_owner": "alice", + "instance_handle": "openai-large", + "public_read": false, + "created_at": "2024-01-15T10:30:00Z" + } + ] +} +``` + +--- + +### Create Project + +Register a new project for a user. 
+ +**Endpoint:** `POST /v1/projects/{username}` + +**Authentication:** Admin or the user themselves + +**Request Body:** + +```json +{ + "project_handle": "research-docs", + "description": "Research document embeddings", + "instance_owner": "alice", + "instance_handle": "openai-large", + "public_read": false, + "metadataScheme": "{\"type\":\"object\",\"properties\":{\"author\":{\"type\":\"string\"}}}", + "shared_with": [ + { + "user_handle": "bob", + "role": "reader" + } + ] +} +``` + +**Parameters:** + +- `project_handle` (string, required): Unique project identifier within the user's namespace +- `description` (string, optional): Project description +- `instance_owner` (string, required): Owner of the LLM service instance +- `instance_handle` (string, required): Handle of the LLM service instance to use +- `public_read` (boolean, optional): Allow unauthenticated read access (default: false) +- `metadataScheme` (string, optional): JSON Schema for validating embedding metadata +- `shared_with` (array, optional): List of users to share the project with + +**Example:** + +```bash +curl -X POST "https://api.example.com/v1/projects/alice" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "project_handle": "research-docs", + "description": "Research document embeddings", + "instance_owner": "alice", + "instance_handle": "openai-large" + }' +``` + +**Response:** + +```json +{ + "project_id": 1, + "project_handle": "research-docs", + "owner": "alice", + "description": "Research document embeddings", + "instance_id": 5, + "instance_owner": "alice", + "instance_handle": "openai-large", + "public_read": false +} +``` + +--- + +### Get Project Information + +Retrieve information about a specific project. 
+ +**Endpoint:** `GET /v1/projects/{username}/{projectname}` + +**Authentication:** Admin, owner, authorized readers, or public if `public_read` is enabled + +**Example:** + +```bash +curl -X GET "https://api.example.com/v1/projects/alice/research-docs" \ + -H "Authorization: Bearer alice_api_key" +``` + +**Response:** + +```json +{ + "project_id": 1, + "project_handle": "research-docs", + "owner": "alice", + "description": "Research document embeddings", + "instance_id": 5, + "instance_owner": "alice", + "instance_handle": "openai-large", + "public_read": false, + "metadataScheme": "{\"type\":\"object\",\"properties\":{\"author\":{\"type\":\"string\"}}}", + "created_at": "2024-01-15T10:30:00Z" +} +``` + +--- + +### Update Project (PUT) + +Create or update a project with the specified handle. + +**Endpoint:** `PUT /v1/projects/{username}/{projectname}` + +**Authentication:** Admin or the owner + +**Request Body:** Same as POST endpoint + +**Example:** + +```bash +curl -X PUT "https://api.example.com/v1/projects/alice/research-docs" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "project_handle": "research-docs", + "description": "Updated description", + "instance_owner": "alice", + "instance_handle": "openai-large" + }' +``` + +--- + +### Delete Project + +Delete a project and all its embeddings. + +**Endpoint:** `DELETE /v1/projects/{username}/{projectname}` + +**Authentication:** Admin or the owner + +**Example:** + +```bash +curl -X DELETE "https://api.example.com/v1/projects/alice/research-docs" \ + -H "Authorization: Bearer alice_api_key" +``` + +**Response:** + +```json +{ + "message": "Project alice/research-docs deleted successfully" +} +``` + +**⚠️ Warning:** This operation is irreversible and will delete all embeddings in the project. + +--- + +### Partial Update (PATCH) + +Update specific project fields without providing all data. 
+ +**Endpoint:** `PATCH /v1/projects/{username}/{projectname}` + +**Authentication:** Admin or the owner + +**Example - Enable public read access:** + +```bash +curl -X PATCH "https://api.example.com/v1/projects/alice/research-docs" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "public_read": true + }' +``` + +**Example - Update description:** + +```bash +curl -X PATCH "https://api.example.com/v1/projects/alice/research-docs" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "description": "Updated project description" + }' +``` + +See [PATCH Updates](../patch-updates/) for more details. + +--- + +## Project Sharing + +### Share Project + +Share a project with another user, granting them read or edit access. + +**Endpoint:** `POST /v1/projects/{owner}/{project}/share` + +**Authentication:** Admin or the project owner + +**Request Body:** + +```json +{ + "share_with_handle": "bob", + "role": "reader" +} +``` + +**Roles:** +- `reader`: Read-only access to embeddings and similarity search +- `editor`: Read and write access to embeddings + +**Example:** + +```bash +curl -X POST "https://api.example.com/v1/projects/alice/research-docs/share" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "share_with_handle": "bob", + "role": "reader" + }' +``` + +**Response:** + +```json +{ + "message": "Project shared successfully", + "project": "alice/research-docs", + "shared_with": "bob", + "role": "reader" +} +``` + +--- + +### Unshare Project + +Remove a user's access to a shared project. 
+ +**Endpoint:** `DELETE /v1/projects/{owner}/{project}/share/{user_handle}` + +**Authentication:** Admin or the project owner + +**Example:** + +```bash +curl -X DELETE "https://api.example.com/v1/projects/alice/research-docs/share/bob" \ + -H "Authorization: Bearer alice_api_key" +``` + +**Response:** + +```json +{ + "message": "Project unshared successfully", + "project": "alice/research-docs", + "removed_user": "bob" +} +``` + +--- + +### List Shared Users + +Get a list of users the project is shared with. + +**Endpoint:** `GET /v1/projects/{owner}/{project}/shared-with` + +**Authentication:** Admin or the project owner + +**Example:** + +```bash +curl -X GET "https://api.example.com/v1/projects/alice/research-docs/shared-with" \ + -H "Authorization: Bearer alice_api_key" +``` + +**Response:** + +```json +{ + "project": "alice/research-docs", + "shared_with": [ + { + "user_handle": "bob", + "role": "reader" + }, + { + "user_handle": "charlie", + "role": "editor" + } + ] +} +``` + +**Note:** Only the project owner can view this list. Users with shared access cannot see who else has access. + +--- + +## Project Ownership Transfer + +### Transfer Ownership + +Transfer ownership of a project to another user. 
+ +**Endpoint:** `POST /v1/projects/{owner}/{project}/transfer-ownership` + +**Authentication:** Admin or the project owner + +**Request Body:** + +```json +{ + "new_owner_handle": "bob" +} +``` + +**Example:** + +```bash +curl -X POST "https://api.example.com/v1/projects/alice/research-docs/transfer-ownership" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "new_owner_handle": "bob" + }' +``` + +**Response:** + +```json +{ + "message": "Project ownership transferred successfully", + "old_owner": "alice", + "new_owner": "bob", + "project_handle": "research-docs", + "new_path": "/v1/projects/bob/research-docs" +} +``` + +**Important Notes:** +- Only the current owner can transfer ownership +- The new owner must be an existing user +- The new owner cannot already have a project with the same handle +- After transfer, the old owner loses all access to the project +- All embeddings and project data remain intact + +--- + +## Query Parameters + +### List Projects + +When listing projects, the following query parameters are available: + +- `limit` (integer, default: 10, max: 200): Maximum number of results to return +- `offset` (integer, default: 0): Pagination offset + +**Example:** + +```bash +curl -X GET "https://api.example.com/v1/projects/alice?limit=20&offset=40" \ + -H "Authorization: Bearer alice_api_key" +``` + +--- + +## Project Properties + +| Field | Type | Required | Description | +|-------|------|----------|-------------| +| `project_handle` | string | Yes | Unique identifier within user's namespace | +| `owner` | string | Read-only | Project owner's user handle | +| `description` | string | No | Project description | +| `instance_owner` | string | Yes | Owner of the LLM service instance | +| `instance_handle` | string | Yes | Handle of the LLM service instance | +| `instance_id` | integer | Read-only | Internal ID of the LLM service instance | +| `public_read` | boolean | No | Allow unauthenticated read 
access | +| `metadataScheme` | string | No | JSON Schema for metadata validation | +| `created_at` | timestamp | Read-only | Project creation timestamp | + +--- + +## Metadata Schema Validation + +Projects can define a JSON Schema to validate metadata attached to embeddings. See the main README for examples. + +**Example Schema:** + +```json +{ + "type": "object", + "properties": { + "author": {"type": "string"}, + "year": {"type": "integer"} + }, + "required": ["author"] +} +``` + +When uploading embeddings, metadata will be validated against this schema. Invalid metadata will result in a `400 Bad Request` error. + +--- + +## Common Errors + +### 400 Bad Request + +```json +{ + "title": "Bad Request", + "status": 400, + "detail": "Project must have instance_id" +} +``` + +### 403 Forbidden + +```json +{ + "title": "Forbidden", + "status": 403, + "detail": "You don't have permission to modify this project" +} +``` + +### 404 Not Found + +```json +{ + "title": "Not Found", + "status": 404, + "detail": "Project 'alice/research-docs' not found" +} +``` + +### 409 Conflict + +```json +{ + "title": "Conflict", + "status": 409, + "detail": "Project 'alice/research-docs' already exists" +} +``` + +--- + +## Related Documentation + +- [LLM Services](llm-services/) - Managing LLM service instances +- [Embeddings](embeddings/) - Adding and retrieving embeddings +- [Similars](similars/) - Similarity search +- [Public Access](/docs/PUBLIC_ACCESS.md) - Public project configuration diff --git a/docs/content/api/endpoints/similars.md b/docs/content/api/endpoints/similars.md new file mode 100644 index 0000000..6c92524 --- /dev/null +++ b/docs/content/api/endpoints/similars.md @@ -0,0 +1,502 @@ +--- +title: "Similars" +weight: 6 +--- + +# Similarity Search Endpoints + +Find similar documents using vector similarity search. The API provides two methods: searching from stored embeddings or searching with raw vectors without storing them. 
+ +## Endpoints + +### GET Similar Documents (from stored embeddings) + +Find documents similar to an already-stored document by its text identifier. + +**Endpoint:** `GET /v1/similars/{username}/{projectname}/{identifier}` + +**Authentication:** Admin, owner, authorized readers, or public if `public_read` is enabled + +**Query Parameters:** + +- `count` (integer, optional, default: 10, max: 200): Number of similar documents to return +- `threshold` (float, optional, default: 0.5, range: 0-1): Minimum similarity score threshold +- `limit` (integer, optional, default: 10, max: 200): Maximum number of results to return (alias for `count`) +- `offset` (integer, optional, default: 0): Pagination offset +- `metadata_path` (string, optional): Metadata field path for filtering (must be used with `metadata_value`) +- `metadata_value` (string, optional): Metadata value to exclude from results (must be used with `metadata_path`) + +**Example - Basic search:** + +```bash +curl -X GET "https://api.example.com/v1/similars/alice/research-docs/doc123?count=5&threshold=0.7" \ + -H "Authorization: Bearer alice_api_key" +``` + +**Example - With metadata filtering:** + +```bash +# Exclude documents with author="John Doe" +curl -X GET "https://api.example.com/v1/similars/alice/research-docs/doc123?count=10&metadata_path=author&metadata_value=John%20Doe" \ + -H "Authorization: Bearer alice_api_key" +``` + +**Response:** + +```json +{ + "$schema": "http://localhost:8080/schemas/SimilarResponseBody.json", + "user_handle": "alice", + "project_handle": "research-docs", + "results": [ + { + "id": "doc456", + "similarity": 0.95 + }, + { + "id": "doc789", + "similarity": 0.87 + }, + { + "id": "doc321", + "similarity": 0.82 + } + ] +} +``` + +--- + +### POST Similar Documents (from raw embeddings) + +Find similar documents by submitting a raw embedding vector without storing it in the database. Useful for one-time queries or testing. 
+ +**Endpoint:** `POST /v1/similars/{username}/{projectname}` + +**Authentication:** Admin, owner, authorized readers, or public if `public_read` is enabled + +**Query Parameters:** Same as GET endpoint above + +**Request Body:** + +```json +{ + "vector": [-0.020850, 0.018522, 0.053270, 0.071384, 0.020003, ...] +} +``` + +The vector must be an array of float values with dimensions matching the project's LLM service instance configuration. + +**Example - Basic search:** + +```bash +curl -X POST "https://api.example.com/v1/similars/alice/research-docs?count=10&threshold=0.8" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "vector": [-0.020850, 0.018522, 0.053270, 0.071384, 0.020003] + }' +``` + +**Example - With metadata filtering:** + +```bash +# Exclude documents from the same category +curl -X POST "https://api.example.com/v1/similars/alice/research-docs?count=5&metadata_path=category&metadata_value=biology" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "vector": [-0.020850, 0.018522, 0.053270, ...] + }' +``` + +**Response:** Same format as GET endpoint + +--- + +## Query Parameters Reference + +### count / limit + +Maximum number of similar documents to return. + +- **Type:** Integer +- **Default:** 10 +- **Max:** 200 +- **Note:** `count` and `limit` are aliases; use either one + +**Example:** + +```bash +curl -X GET "https://api.example.com/v1/similars/alice/research-docs/doc123?count=20" +``` + +--- + +### threshold + +Minimum similarity score threshold. Only documents with similarity scores >= threshold are returned. + +- **Type:** Float +- **Default:** 0.5 +- **Range:** 0.0 to 1.0 (where 1.0 is most similar) + +**Example:** + +```bash +# Only return very similar documents (>= 0.8) +curl -X GET "https://api.example.com/v1/similars/alice/research-docs/doc123?threshold=0.8" +``` + +--- + +### offset + +Pagination offset for large result sets. 
+ +- **Type:** Integer +- **Default:** 0 +- **Use:** Skip the first N results + +**Example:** + +```bash +# Get results 21-40 +curl -X GET "https://api.example.com/v1/similars/alice/research-docs/doc123?count=20&offset=20" +``` + +--- + +### metadata_path + +Metadata field path for filtering results. Must be used together with `metadata_value`. + +- **Type:** String +- **Format:** JSON path notation (e.g., `"author"`, `"author.name"`, `"publication.year"`) +- **Use:** Exclude documents where metadata field matches a specific value + +**Examples:** + +```bash +# Simple field +metadata_path=author + +# Nested field +metadata_path=author.name + +# Deeply nested field +metadata_path=publication.journal.name +``` + +--- + +### metadata_value + +Metadata value to exclude from results. Must be used together with `metadata_path`. + +- **Type:** String +- **Use:** Excludes documents where the metadata field at `metadata_path` equals this value + +**Example - Exclude same author:** + +```bash +curl -X GET "https://api.example.com/v1/similars/alice/research-docs/doc123?metadata_path=author&metadata_value=Alice%20Doe" +``` + +**Example - Exclude same category:** + +```bash +curl -X GET "https://api.example.com/v1/similars/alice/research-docs/doc123?metadata_path=category&metadata_value=research" +``` + +**Example - Nested field:** + +```bash +# Exclude documents from same author ID +curl -X GET "https://api.example.com/v1/similars/alice/research-docs/doc123?metadata_path=author.id&metadata_value=A0083" +``` + +--- + +## Response Format + +Both GET and POST endpoints return the same response format: + +```json +{ + "$schema": "http://localhost:8080/schemas/SimilarResponseBody.json", + "user_handle": "alice", + "project_handle": "research-docs", + "results": [ + { + "id": "doc456", + "similarity": 0.95 + }, + { + "id": "doc789", + "similarity": 0.87 + }, + { + "id": "doc321", + "similarity": 0.82 + } + ] +} +``` + +**Response Fields:** + +- `$schema` (string): JSON schema 
reference
- `user_handle` (string): Project owner's username
- `project_handle` (string): Project identifier
- `results` (array): Array of similar documents, ordered by similarity (highest first)
  - `id` (string): Document text identifier
  - `similarity` (float): Cosine similarity score (0-1, where 1 is most similar)

---

## Similarity Calculation

### Cosine Similarity

The API scores results by **cosine similarity**, the complement of cosine distance (`similarity = 1 - distance`):

- **Range:** 0 to 1
- **1.0:** Identical vectors
- **0.0:** Orthogonal vectors (completely dissimilar)
- **Higher values:** More similar documents

### Dimension Filtering

The system automatically restricts which embeddings are compared:

- Only embeddings with matching `vector_dim` are compared
- Only embeddings from the same project are considered
- Invalid comparisons are prevented

---

## Dimension Validation (POST only)

When using the POST endpoint with raw embeddings, the API validates that:

1. The project has an associated LLM service instance
2. The submitted vector dimensions match the instance's configured dimensions

If the dimensions don't match, a `400 Bad Request` error is returned.

**Error example:**

```json
{
  "title": "Bad Request",
  "status": 400,
  "detail": "vector dimension mismatch: expected 1536 dimensions, got 768"
}
```

---

## Metadata Filtering

Both endpoints support metadata filtering to exclude documents based on metadata field values. This uses **negative matching** (excludes documents where the field matches the value).
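The negative-matching semantics can be sketched in a few lines; this is an illustrative client-side model (the server applies the filter in SQL, and these helper names are hypothetical):

```python
def lookup_path(metadata, path):
    """Resolve a dot-notation path such as 'author.id' inside a metadata dict."""
    value = metadata
    for key in path.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


def exclude_matches(results, metadata_path, metadata_value):
    """Negative matching: keep only results whose field does NOT equal the value."""
    return [
        r for r in results
        if str(lookup_path(r.get("metadata", {}), metadata_path)) != metadata_value
    ]
```

Documents that lack the field entirely are kept, since their value cannot match.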
+ +### Use Cases + +**Exclude documents from the same source:** + +```bash +# When finding similar documents to doc123, exclude others from the same author +curl -X GET ".../similars/alice/research-docs/doc123?metadata_path=author_id&metadata_value=A0083" +``` + +**Exclude documents from the same category:** + +```bash +# Find similar documents in other categories +curl -X GET ".../similars/alice/research-docs/doc123?metadata_path=category&metadata_value=biology" +``` + +**Exclude documents with the same tag:** + +```bash +# Find documents with similar content but different tags +curl -X POST ".../similars/alice/research-docs?metadata_path=primary_tag&metadata_value=machine-learning" \ + -d '{"vector": [...]}' +``` + +### Nested Field Access + +Use dot notation for nested metadata fields: + +```bash +# Exclude documents from the same author (nested field) +metadata_path=author.id&metadata_value=author-123 + +# Exclude documents from the same publication year +metadata_path=publication.year&metadata_value=2024 + +# Deeply nested field +metadata_path=source.journal.publisher&metadata_value=Springer +``` + +--- + +## Examples + +### Basic Similarity Search + +Find 5 most similar documents with at least 0.7 similarity: + +```bash +curl -X GET "https://api.example.com/v1/similars/alice/research-docs/doc123?count=5&threshold=0.7" \ + -H "Authorization: Bearer alice_api_key" +``` + +--- + +### Search with Raw Vector + +Submit a vector without storing it: + +```bash +curl -X POST "https://api.example.com/v1/similars/alice/research-docs?count=10" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "vector": [-0.020850, 0.018522, 0.053270, 0.071384, 0.020003] + }' +``` + +--- + +### Search with Metadata Filtering + +Find similar documents but exclude those from the same author: + +```bash +curl -X GET "https://api.example.com/v1/similars/alice/research-docs/doc123?count=10&metadata_path=author&metadata_value=John%20Doe" \ + -H 
"Authorization: Bearer alice_api_key" +``` + +--- + +### Paginated Results + +Get the next page of results: + +```bash +# Page 1 +curl -X GET "https://api.example.com/v1/similars/alice/research-docs/doc123?count=20&offset=0" + +# Page 2 +curl -X GET "https://api.example.com/v1/similars/alice/research-docs/doc123?count=20&offset=20" + +# Page 3 +curl -X GET "https://api.example.com/v1/similars/alice/research-docs/doc123?count=20&offset=40" +``` + +--- + +### Complex Query + +High threshold, metadata filtering, and pagination: + +```bash +curl -X GET "https://api.example.com/v1/similars/alice/research-docs/doc123?count=50&threshold=0.9&offset=0&metadata_path=category&metadata_value=biology" \ + -H "Authorization: Bearer alice_api_key" +``` + +--- + +## Common Errors + +### 400 Bad Request - Dimension Mismatch (POST only) + +```json +{ + "title": "Bad Request", + "status": 400, + "detail": "vector dimension mismatch: expected 1536 dimensions, got 768" +} +``` + +### 400 Bad Request - Missing metadata_value + +```json +{ + "title": "Bad Request", + "status": 400, + "detail": "metadata_path requires metadata_value to be specified" +} +``` + +### 400 Bad Request - Invalid threshold + +```json +{ + "title": "Bad Request", + "status": 400, + "detail": "threshold must be between 0.0 and 1.0" +} +``` + +### 403 Forbidden + +```json +{ + "title": "Forbidden", + "status": 403, + "detail": "You don't have permission to search this project" +} +``` + +### 404 Not Found - Project + +```json +{ + "title": "Not Found", + "status": 404, + "detail": "Project 'alice/research-docs' not found" +} +``` + +### 404 Not Found - Embedding (GET only) + +```json +{ + "title": "Not Found", + "status": 404, + "detail": "Embedding 'doc123' not found in project 'alice/research-docs'" +} +``` + +--- + +## Performance Considerations + +### Indexing + +The database uses vector indexes for efficient similarity search. See the database migrations for index configuration. 
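The score that both the indexes and the `threshold` parameter operate on is plain cosine similarity. A minimal reference implementation, purely illustrative since the server computes this inside the database:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors.

    1.0 means identical direction, 0.0 means orthogonal (dissimilar).
    """
    if len(a) != len(b):
        raise ValueError(
            f"vector dimension mismatch: expected {len(a)} dimensions, got {len(b)}"
        )
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```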
+ +### Result Limits + +- Default limit: 10 results +- Maximum limit: 200 results +- Use pagination for large result sets + +### Threshold Optimization + +Higher thresholds reduce result set size and improve performance: + +- **0.5-0.7:** Broad similarity (default) +- **0.7-0.85:** Moderate similarity +- **0.85-0.95:** High similarity +- **0.95-1.0:** Near-identical documents + +--- + +## Related Documentation + +- [Embeddings](embeddings/) - Storing embeddings for similarity search +- [Projects](projects/) - Project configuration +- [Query Parameters](../query-parameters/) - Complete parameter reference +- [Error Handling](../error-handling/) - Error response format diff --git a/docs/content/api/endpoints/users.md b/docs/content/api/endpoints/users.md new file mode 100644 index 0000000..f63e2e1 --- /dev/null +++ b/docs/content/api/endpoints/users.md @@ -0,0 +1,304 @@ +--- +title: "Users" +weight: 1 +--- + +# Users Endpoint + +Manage user accounts and API keys. User creation is admin-only, but users can manage their own account information. + +## Endpoints + +### List All Users + +Get a list of all registered user handles. + +**Endpoint:** `GET /v1/users` + +**Authentication:** Admin only + +**Example:** + +```bash +curl -X GET "https://api.example.com/v1/users" \ + -H "Authorization: Bearer admin_api_key" +``` + +**Response:** + +```json +{ + "users": ["alice", "bob", "charlie"] +} +``` + +--- + +### Create User + +Register a new user and generate their API key. 
+ +**Endpoint:** `POST /v1/users` + +**Authentication:** Admin only + +**Request Body:** + +```json +{ + "user_handle": "alice", + "name": "Alice Doe", + "email": "alice@example.com" +} +``` + +**Parameters:** + +- `user_handle` (string, required): Unique identifier for the user (alphanumeric, hyphens, underscores) +- `name` (string, optional): User's full name +- `email` (string, optional): User's email address + +**Example:** + +```bash +curl -X POST "https://api.example.com/v1/users" \ + -H "Authorization: Bearer admin_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "user_handle": "alice", + "name": "Alice Doe", + "email": "alice@example.com" + }' +``` + +**Response:** + +```json +{ + "user_handle": "alice", + "name": "Alice Doe", + "email": "alice@example.com", + "api_key": "024v2013621509245f2e24" +} +``` + +**⚠️ Important:** The `api_key` is only returned once during user creation. Store it securely - it cannot be recovered later. + +**Error Responses:** + +- `400 Bad Request`: Invalid user handle or missing required fields +- `409 Conflict`: User handle already exists + +--- + +### Get User Information + +Retrieve information about a specific user. + +**Endpoint:** `GET /v1/users/{username}` + +**Authentication:** Admin or the user themselves + +**Example:** + +```bash +curl -X GET "https://api.example.com/v1/users/alice" \ + -H "Authorization: Bearer alice_or_admin_api_key" +``` + +**Response:** + +```json +{ + "user_handle": "alice", + "name": "Alice Doe", + "email": "alice@example.com" +} +``` + +**Note:** API key is never returned in GET responses for security reasons. + +--- + +### Create or Update User (PUT) + +Register a new user with a specific handle or update existing user information. 
+ +**Endpoint:** `PUT /v1/users/{username}` + +**Authentication:** Admin only + +**Request Body:** + +```json +{ + "user_handle": "alice", + "name": "Alice Doe", + "email": "alice@example.com" +} +``` + +**Example:** + +```bash +curl -X PUT "https://api.example.com/v1/users/alice" \ + -H "Authorization: Bearer admin_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "user_handle": "alice", + "name": "Alice Smith", + "email": "alice.smith@example.com" + }' +``` + +**Response:** + +```json +{ + "user_handle": "alice", + "name": "Alice Smith", + "email": "alice.smith@example.com", + "api_key": "024v2013621509245f2e24" +} +``` + +--- + +### Delete User + +Delete a user and all their associated resources (projects, LLM services, embeddings). + +**Endpoint:** `DELETE /v1/users/{username}` + +**Authentication:** Admin or the user themselves + +**Example:** + +```bash +curl -X DELETE "https://api.example.com/v1/users/alice" \ + -H "Authorization: Bearer alice_or_admin_api_key" +``` + +**Response:** + +```json +{ + "message": "User alice deleted successfully" +} +``` + +**⚠️ Warning:** This operation is irreversible and will delete: +- All projects owned by the user +- All LLM service instances owned by the user +- All embeddings in the user's projects +- All sharing relationships + +--- + +### Partial Update (PATCH) + +Update specific user fields without providing all user data. + +**Endpoint:** `PATCH /v1/users/{username}` + +**Authentication:** Admin or the user themselves + +**Request Body:** + +```json +{ + "name": "Alice Smith" +} +``` + +**Example:** + +```bash +curl -X PATCH "https://api.example.com/v1/users/alice" \ + -H "Authorization: Bearer alice_or_admin_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "email": "newemail@example.com" + }' +``` + +See [PATCH Updates](../patch-updates/) for more details. 
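---

Conceptually, PATCH applies merge semantics: fields present in the request body replace the stored values, and everything else is untouched. A sketch of that behavior (the set of read-only fields shown here is an assumption for illustration):

```python
READ_ONLY_FIELDS = {"user_handle", "api_key"}  # assumed immutable via PATCH

def apply_patch(resource, patch):
    """Merge-style partial update: only fields present in the patch change."""
    illegal = READ_ONLY_FIELDS & patch.keys()
    if illegal:
        raise ValueError(f"read-only fields cannot be patched: {sorted(illegal)}")
    return {**resource, **patch}
```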
+ +--- + +## User Properties + +| Field | Type | Required | Description | +|-------|------|----------|-------------| +| `user_handle` | string | Yes | Unique identifier (alphanumeric, hyphens, underscores) | +| `name` | string | No | User's full name | +| `email` | string | No | User's email address | +| `api_key` | string | Read-only | Generated API key (only returned on creation) | + +## Special User: _system + +The `_system` user is a special internal user that: +- Owns global LLM service definitions +- Cannot be used for authentication +- Cannot be deleted +- Is created automatically during database migrations + +Users can create LLM service instances from `_system` definitions. + +## Common Errors + +### 400 Bad Request + +```json +{ + "title": "Bad Request", + "status": 400, + "detail": "Invalid user_handle: must be alphanumeric with hyphens or underscores" +} +``` + +### 401 Unauthorized + +```json +{ + "title": "Unauthorized", + "status": 401, + "detail": "Invalid or missing authorization credentials" +} +``` + +### 403 Forbidden + +```json +{ + "title": "Forbidden", + "status": 403, + "detail": "Only admin users can create new users" +} +``` + +### 404 Not Found + +```json +{ + "title": "Not Found", + "status": 404, + "detail": "User 'alice' not found" +} +``` + +### 409 Conflict + +```json +{ + "title": "Conflict", + "status": 409, + "detail": "User 'alice' already exists" +} +``` + +## Related Documentation + +- [Authentication](../authentication/) - API key authentication details +- [Projects](projects/) - Project management +- [LLM Services](llm-services/) - LLM service instances diff --git a/docs/content/api/error-handling.md b/docs/content/api/error-handling.md new file mode 100644 index 0000000..d97dcc6 --- /dev/null +++ b/docs/content/api/error-handling.md @@ -0,0 +1,768 @@ +--- +title: "Error Handling" +weight: 9 +--- + +# Error Handling + +The API uses standard HTTP status codes and returns structured error responses in JSON format for all error 
conditions. + +## Error Response Format + +All error responses follow this structure: + +```json +{ + "title": "Error Title", + "status": 400, + "detail": "Detailed error message explaining what went wrong" +} +``` + +**Fields:** + +- `title` (string): Human-readable error title +- `status` (integer): HTTP status code +- `detail` (string): Detailed description of the error + +## HTTP Status Codes + +### 2xx Success + +#### 200 OK + +Request succeeded. Response body contains requested data. + +**Example:** + +```bash +curl -X GET "https://api.example.com/v1/projects/alice" \ + -H "Authorization: Bearer alice_api_key" +``` + +**Response:** + +```json +{ + "projects": [...] +} +``` + +--- + +#### 201 Created + +Resource created successfully. Response body contains the new resource. + +**Example:** + +```bash +curl -X POST "https://api.example.com/v1/users" \ + -H "Authorization: Bearer admin_api_key" \ + -d '{"user_handle": "bob"}' +``` + +**Response:** + +```json +{ + "user_handle": "bob", + "api_key": "..." +} +``` + +--- + +### 4xx Client Errors + +#### 400 Bad Request + +The request is invalid or contains malformed data. 

**Common causes:**
- Invalid JSON syntax
- Missing required fields
- Invalid field values or types
- Dimension mismatches in embeddings
- Metadata schema violations
- Invalid query parameter values

**Examples:**

**Invalid JSON:**

```json
{
  "title": "Bad Request",
  "status": 400,
  "detail": "Invalid JSON: unexpected end of input"
}
```

**Missing required field:**

```json
{
  "title": "Bad Request",
  "status": 400,
  "detail": "Missing required field: project_handle"
}
```

**Dimension mismatch:**

```json
{
  "title": "Bad Request",
  "status": 400,
  "detail": "dimension validation failed: vector dimension mismatch: embedding declares 3072 dimensions but LLM service 'openai-large' expects 1536 dimensions"
}
```

**Metadata validation:**

```json
{
  "title": "Bad Request",
  "status": 400,
  "detail": "metadata validation failed for text_id 'doc123': metadata validation failed:\n - author: author is required"
}
```

**Invalid query parameter:**

```json
{
  "title": "Bad Request",
  "status": 400,
  "detail": "limit must be between 1 and 200"
}
```

---

#### 401 Unauthorized

Authentication failed or credentials are missing.

**Common causes:**
- Missing `Authorization` header
- Invalid API key
- Expired API key
- Malformed `Authorization` header

**Example:**

```json
{
  "title": "Unauthorized",
  "status": 401,
  "detail": "Invalid or missing authorization credentials"
}
```

**Troubleshooting:**

- Verify the `Authorization` header is present
- Check the API key is correct
- Ensure the header format is `Authorization: Bearer <api_key>`
- Verify the user still exists

---

#### 403 Forbidden

Authentication succeeded but authorization failed. The authenticated user lacks permission for the requested operation.
+ +**Common causes:** +- User attempting to access another user's private resources +- Non-admin attempting admin-only operations +- User attempting to modify shared resources they don't own +- Attempting to access resources with insufficient role (reader vs editor) + +**Examples:** + +**Admin-only operation:** + +```json +{ + "title": "Forbidden", + "status": 403, + "detail": "Only admin users can create new users" +} +``` + +**Accessing another user's resource:** + +```json +{ + "title": "Forbidden", + "status": 403, + "detail": "You don't have permission to access this project" +} +``` + +**Insufficient role:** + +```json +{ + "title": "Forbidden", + "status": 403, + "detail": "You don't have permission to modify embeddings in this project. Editor role required." +} +``` + +--- + +#### 404 Not Found + +The requested resource does not exist. + +**Common causes:** +- Resource was deleted +- Incorrect resource identifier +- Typo in URL path +- Resource never existed + +**Examples:** + +**User not found:** + +```json +{ + "title": "Not Found", + "status": 404, + "detail": "User 'alice' not found" +} +``` + +**Project not found:** + +```json +{ + "title": "Not Found", + "status": 404, + "detail": "Project 'alice/research-docs' not found" +} +``` + +**Embedding not found:** + +```json +{ + "title": "Not Found", + "status": 404, + "detail": "Embedding 'doc123' not found in project 'alice/research-docs'" +} +``` + +**LLM service not found:** + +```json +{ + "title": "Not Found", + "status": 404, + "detail": "LLM service instance 'alice/openai-large' not found" +} +``` + +--- + +#### 409 Conflict + +The request conflicts with the current state of the resource. 
+ +**Common causes:** +- Creating a resource that already exists +- Deleting a resource that is in use +- Concurrent modification conflicts + +**Examples:** + +**Resource already exists:** + +```json +{ + "title": "Conflict", + "status": 409, + "detail": "User 'alice' already exists" +} +``` + +**Resource in use:** + +```json +{ + "title": "Conflict", + "status": 409, + "detail": "Cannot delete LLM service instance: in use by 3 projects" +} +``` + +**Ownership transfer conflict:** + +```json +{ + "title": "Conflict", + "status": 409, + "detail": "New owner already has a project with handle 'research-docs'" +} +``` + +--- + +#### 422 Unprocessable Entity + +The request is syntactically correct but semantically invalid. + +**Common causes:** +- Invalid JSON Schema +- Logical validation failures +- Constraint violations + +**Example:** + +```json +{ + "title": "Unprocessable Entity", + "status": 422, + "detail": "metadataScheme is not valid JSON Schema: invalid schema structure" +} +``` + +--- + +### 5xx Server Errors + +#### 500 Internal Server Error + +An unexpected error occurred on the server. + +**Common causes:** +- Database connection failures +- Unexpected exceptions +- Configuration errors + +**Example:** + +```json +{ + "title": "Internal Server Error", + "status": 500, + "detail": "An internal error occurred. Please try again later." +} +``` + +**Action:** Contact support if the error persists. + +--- + +#### 503 Service Unavailable + +The service is temporarily unavailable. + +**Common causes:** +- Database maintenance +- Service overload +- Network issues + +**Example:** + +```json +{ + "title": "Service Unavailable", + "status": 503, + "detail": "Service temporarily unavailable. Please try again later." +} +``` + +**Action:** Retry the request after a delay. 
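
Because every error response shares the same three-field envelope, a client can translate it into typed exceptions once and branch on the exception class rather than on raw status codes. A minimal sketch in Python (the class names and helper function are illustrative, not part of the API):

```python
class APIError(Exception):
    """Carries the documented envelope fields: title, status, detail."""

    def __init__(self, title: str, status: int, detail: str):
        super().__init__(f"{status} {title}: {detail}")
        self.title = title
        self.status = status
        self.detail = detail


class ClientError(APIError):
    """4xx: the request itself is wrong; fix it before retrying."""


class ServerError(APIError):
    """5xx: transient server-side failure; safe to retry with backoff."""


def raise_for_error(body: dict) -> None:
    """Raise a typed exception for a 4xx/5xx response body; no-op otherwise."""
    status = body.get("status", 0)
    if status >= 500:
        raise ServerError(body.get("title", ""), status, body.get("detail", ""))
    if status >= 400:
        raise ClientError(body.get("title", ""), status, body.get("detail", ""))
```

Callers can then retry only on `ServerError` and surface `ClientError.detail` directly to the user.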
+ +--- + +## Common Error Scenarios + +### Authentication Errors + +#### Missing Authorization Header + +**Request:** + +```bash +curl -X GET "https://api.example.com/v1/projects/alice" +``` + +**Response:** + +```json +{ + "title": "Unauthorized", + "status": 401, + "detail": "Invalid or missing authorization credentials" +} +``` + +**Solution:** Include the `Authorization` header: + +```bash +curl -X GET "https://api.example.com/v1/projects/alice" \ + -H "Authorization: Bearer alice_api_key" +``` + +--- + +#### Invalid API Key + +**Request:** + +```bash +curl -X GET "https://api.example.com/v1/projects/alice" \ + -H "Authorization: Bearer invalid_key" +``` + +**Response:** + +```json +{ + "title": "Unauthorized", + "status": 401, + "detail": "Invalid or missing authorization credentials" +} +``` + +**Solution:** Verify your API key is correct. + +--- + +### Permission Errors + +#### Non-Admin Creating Users + +**Request:** + +```bash +curl -X POST "https://api.example.com/v1/users" \ + -H "Authorization: Bearer alice_api_key" \ + -d '{"user_handle": "bob"}' +``` + +**Response:** + +```json +{ + "title": "Forbidden", + "status": 403, + "detail": "Only admin users can create new users" +} +``` + +**Solution:** Use an admin API key. + +--- + +#### Accessing Another User's Project + +**Request:** + +```bash +curl -X GET "https://api.example.com/v1/projects/bob/private-project" \ + -H "Authorization: Bearer alice_api_key" +``` + +**Response:** + +```json +{ + "title": "Forbidden", + "status": 403, + "detail": "You don't have permission to access this project" +} +``` + +**Solution:** Request the project owner to share the project with you. 
+ +--- + +### Validation Errors + +#### Invalid JSON + +**Request:** + +```bash +curl -X POST "https://api.example.com/v1/users" \ + -H "Authorization: Bearer admin_api_key" \ + -H "Content-Type: application/json" \ + -d '{"user_handle": "bob"' +``` + +**Response:** + +```json +{ + "title": "Bad Request", + "status": 400, + "detail": "Invalid JSON: unexpected end of input" +} +``` + +**Solution:** Fix the JSON syntax. + +--- + +#### Dimension Mismatch + +**Request:** + +```bash +curl -X POST "https://api.example.com/v1/embeddings/alice/research-docs" \ + -d '{ + "embeddings": [{ + "text_id": "doc123", + "instance_handle": "openai-large", + "vector": [0.1, 0.2, 0.3], + "vector_dim": 3 + }] + }' +``` + +**Response:** + +```json +{ + "title": "Bad Request", + "status": 400, + "detail": "dimension validation failed: vector dimension mismatch: embedding declares 3 dimensions but LLM service 'openai-large' expects 3072 dimensions" +} +``` + +**Solution:** Ensure vector dimensions match the LLM service configuration. + +--- + +#### Metadata Schema Violation + +**Request:** + +```bash +curl -X POST "https://api.example.com/v1/embeddings/alice/research-docs" \ + -d '{ + "embeddings": [{ + "text_id": "doc123", + "instance_handle": "openai-large", + "vector": [...], + "vector_dim": 3072, + "metadata": { + "year": 2024 + } + }] + }' +``` + +**Response (if project schema requires "author"):** + +```json +{ + "title": "Bad Request", + "status": 400, + "detail": "metadata validation failed for text_id 'doc123': metadata validation failed:\n - author: author is required" +} +``` + +**Solution:** Include all required metadata fields. 
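
Many of these 400s can be caught before the request leaves the client by checking each embedding's metadata against the project's `metadataScheme` locally. A minimal sketch of such a pre-flight check, using the example schema from above; it covers only the `required` and `type` keywords, so the server still enforces the full JSON Schema:

```python
import json

# The metadataScheme from the examples above, as stored on the project.
SCHEME = json.loads(
    '{"type":"object","properties":{"author":{"type":"string"},'
    '"year":{"type":"integer"}},"required":["author"]}'
)

_TYPES = {"string": str, "integer": int, "object": dict}


def metadata_errors(metadata: dict, scheme: dict = SCHEME) -> list:
    """Return human-readable problems; an empty list means the check passed."""
    errors = []
    for field in scheme.get("required", []):
        if field not in metadata:
            errors.append(f"{field}: {field} is required")
    for field, rule in scheme.get("properties", {}).items():
        expected = _TYPES.get(rule.get("type"))
        if field in metadata and expected and not isinstance(metadata[field], expected):
            errors.append(f"{field}: expected {rule['type']}")
    return errors
```

A production client would use a real JSON Schema validator (for example the `jsonschema` package) instead of this simplified stand-in.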
+ +--- + +### Resource Conflicts + +#### Creating Duplicate Resource + +**Request:** + +```bash +curl -X POST "https://api.example.com/v1/users" \ + -H "Authorization: Bearer admin_api_key" \ + -d '{"user_handle": "alice"}' +``` + +**Response:** + +```json +{ + "title": "Conflict", + "status": 409, + "detail": "User 'alice' already exists" +} +``` + +**Solution:** Use a different user handle or update the existing user. + +--- + +#### Deleting Resource In Use + +**Request:** + +```bash +curl -X DELETE "https://api.example.com/v1/llm-services/alice/openai-large" \ + -H "Authorization: Bearer alice_api_key" +``` + +**Response:** + +```json +{ + "title": "Conflict", + "status": 409, + "detail": "Cannot delete LLM service instance: in use by 3 projects" +} +``` + +**Solution:** Delete or update the projects using this instance first. + +--- + +## Error Handling Best Practices + +### Client-Side Error Handling + +**Python example:** + +```python +import requests + +response = requests.get( + "https://api.example.com/v1/projects/alice", + headers={"Authorization": f"Bearer {api_key}"} +) + +if response.status_code == 200: + projects = response.json()["projects"] + print(f"Found {len(projects)} projects") + +elif response.status_code == 401: + print("Authentication failed - check API key") + +elif response.status_code == 403: + print("Permission denied") + +elif response.status_code == 404: + print("User not found") + +elif response.status_code >= 500: + print("Server error - retry later") + +else: + error = response.json() + print(f"Error: {error['detail']}") +``` + +--- + +### Retry Logic + +For transient errors (5xx), implement exponential backoff: + +```python +import time +import requests + +def api_request_with_retry(url, max_retries=3): + for attempt in range(max_retries): + response = requests.get(url, headers={...}) + + if response.status_code < 500: + return response + + if attempt < max_retries - 1: + wait_time = 2 ** attempt # Exponential backoff + 
print(f"Retrying in {wait_time}s...") + time.sleep(wait_time) + + return response +``` + +--- + +### Validation Before Request + +Validate data locally before sending to reduce 400 errors: + +```python +def create_embedding(text_id, vector, metadata): + # Validate locally + if not text_id: + raise ValueError("text_id is required") + + if not isinstance(vector, list): + raise ValueError("vector must be a list") + + if len(vector) != 3072: + raise ValueError("vector must have 3072 dimensions") + + # Send request + response = requests.post( + "https://api.example.com/v1/embeddings/alice/research-docs", + json={ + "embeddings": [{ + "text_id": text_id, + "instance_handle": "openai-large", + "vector": vector, + "vector_dim": len(vector), + "metadata": metadata + }] + }, + headers={"Authorization": f"Bearer {api_key}"} + ) + + return response.json() +``` + +--- + +## Troubleshooting + +### Check API Documentation + +Always refer to the live API documentation: + +- **OpenAPI YAML:** `/openapi.yaml` +- **Interactive Docs:** `/docs` + +--- + +### Verify Request Format + +Use tools like `curl -v` to inspect the full request: + +```bash +curl -v -X POST "https://api.example.com/v1/users" \ + -H "Authorization: Bearer admin_api_key" \ + -H "Content-Type: application/json" \ + -d '{"user_handle": "bob"}' +``` + +--- + +### Check Status and Logs + +For persistent 5xx errors, check service status and logs (if you have access). + +--- + +### Contact Support + +If errors persist or the cause is unclear: + +1. Note the error message and status code +2. Record the request details (method, URL, headers, body) +3. Check if the issue is reproducible +4. 
Contact support with this information + +--- + +## Related Documentation + +- [Authentication](authentication/) - API authentication guide +- [Query Parameters](query-parameters/) - Valid parameter values +- [Endpoints](endpoints/) - Complete endpoint reference diff --git a/docs/content/api/patch-updates.md b/docs/content/api/patch-updates.md new file mode 100644 index 0000000..00c1ad9 --- /dev/null +++ b/docs/content/api/patch-updates.md @@ -0,0 +1,547 @@ +--- +title: "PATCH Updates" +weight: 8 +--- + +# PATCH Method for Partial Updates + +The API supports PATCH requests for partial updates of resources. Instead of providing all resource fields (as required by PUT), you only need to include the fields you want to change. + +## Overview + +PATCH is automatically available for resources that support both GET and PUT operations. The PATCH endpoint: + +1. Retrieves the current resource state via GET +2. Merges your changes with the existing data +3. Applies the update via PUT + +This approach simplifies updates by eliminating the need to fetch, modify, and submit complete resource objects. + +## Supported Resources + +PATCH is available for: + +- **Users:** `/v1/users/{username}` +- **Projects:** `/v1/projects/{username}/{projectname}` +- **LLM Services:** `/v1/llm-services/{username}/{llm_servicename}` +- **API Standards:** `/v1/api-standards/{standardname}` + +## Request Format + +**Endpoint:** `PATCH {resource_url}` + +**Content-Type:** `application/json` + +**Body:** JSON object containing only the fields to update + +## Examples + +### Update Project Description + +Change only the project description without affecting other fields. 
+ +**Request:** + +```bash +curl -X PATCH "https://api.example.com/v1/projects/alice/research-docs" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "description": "Updated project description" + }' +``` + +**Response:** + +```json +{ + "project_id": 1, + "project_handle": "research-docs", + "owner": "alice", + "description": "Updated project description", + "instance_id": 5, + "instance_owner": "alice", + "instance_handle": "openai-large", + "public_read": false +} +``` + +--- + +### Enable Public Read Access + +Make a project publicly accessible without changing other settings. + +**Request:** + +```bash +curl -X PATCH "https://api.example.com/v1/projects/alice/research-docs" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "public_read": true + }' +``` + +**Response:** + +```json +{ + "project_id": 1, + "project_handle": "research-docs", + "owner": "alice", + "description": "Research document embeddings", + "instance_id": 5, + "instance_owner": "alice", + "instance_handle": "openai-large", + "public_read": true +} +``` + +--- + +### Update User Email + +Change a user's email address without affecting other user data. + +**Request:** + +```bash +curl -X PATCH "https://api.example.com/v1/users/alice" \ + -H "Authorization: Bearer alice_or_admin_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "email": "alice.new@example.com" + }' +``` + +**Response:** + +```json +{ + "user_handle": "alice", + "name": "Alice Doe", + "email": "alice.new@example.com" +} +``` + +--- + +### Update LLM Service Description + +Change the description of an LLM service instance. 
+ +**Request:** + +```bash +curl -X PATCH "https://api.example.com/v1/llm-services/alice/openai-large" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "description": "Production OpenAI embeddings service" + }' +``` + +**Response:** + +```json +{ + "instance_id": 1, + "instance_handle": "openai-large", + "owner": "alice", + "endpoint": "https://api.openai.com/v1/embeddings", + "description": "Production OpenAI embeddings service", + "api_standard": "openai", + "model": "text-embedding-3-large", + "dimensions": 3072 +} +``` + +--- + +### Update API Standard Documentation + +Update the description of an API standard (admin only). + +**Request:** + +```bash +curl -X PATCH "https://api.example.com/v1/api-standards/openai" \ + -H "Authorization: Bearer admin_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "description": "OpenAI Embeddings API, Version 1 - Updated 2024" + }' +``` + +--- + +### Update Multiple Fields + +You can update multiple fields in a single PATCH request. + +**Request:** + +```bash +curl -X PATCH "https://api.example.com/v1/projects/alice/research-docs" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "description": "Updated description", + "public_read": true + }' +``` + +--- + +### Add Project Metadata Schema + +Add or update a project's metadata validation schema. 
+ +**Request:** + +```bash +curl -X PATCH "https://api.example.com/v1/projects/alice/research-docs" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "metadataScheme": "{\"type\":\"object\",\"properties\":{\"author\":{\"type\":\"string\"},\"year\":{\"type\":\"integer\"}},\"required\":[\"author\"]}" + }' +``` + +--- + +## Use Cases + +### Configuration Changes + +Update configuration settings without rebuilding entire resource objects: + +```bash +# Enable/disable public access +curl -X PATCH ".../projects/alice/research-docs" \ + -d '{"public_read": true}' + +# Update instance dimensions +curl -X PATCH ".../llm-services/alice/custom-model" \ + -d '{"dimensions": 1024}' +``` + +--- + +### Metadata Management + +Update descriptive metadata: + +```bash +# Update project description +curl -X PATCH ".../projects/alice/research-docs" \ + -d '{"description": "New description"}' + +# Update user name +curl -X PATCH ".../users/alice" \ + -d '{"name": "Alice Smith"}' +``` + +--- + +### Schema Evolution + +Add or update validation schemas: + +```bash +curl -X PATCH ".../projects/alice/research-docs" \ + -d '{ + "metadataScheme": "{\"type\":\"object\",\"properties\":{\"category\":{\"type\":\"string\"}}}" + }' +``` + +--- + +## Authentication + +PATCH requests require the same authentication as PUT requests for the resource: + +| Resource | Who Can PATCH | +|----------|---------------| +| Users | Admin or the user themselves | +| Projects | Admin or project owner | +| LLM Services | Admin or instance owner | +| API Standards | Admin only | + +--- + +## Behavior Details + +### Merge Strategy + +PATCH uses a **shallow merge** strategy: + +- Top-level fields you specify **replace** the existing values +- Nested objects are replaced entirely (not deep-merged) +- Fields you don't specify remain unchanged + +**Example:** + +Existing project: +```json +{ + "description": "Old description", + "public_read": false, + "metadataScheme": 
"{...old schema...}" +} +``` + +PATCH request: +```json +{ + "description": "New description" +} +``` + +Result: +```json +{ + "description": "New description", + "public_read": false, + "metadataScheme": "{...old schema...}" +} +``` + +--- + +### Validation + +All field values are validated according to the resource's schema: + +- Field types must be correct +- Required fields (if specified) must be valid +- Constraints (e.g., string length, enum values) are enforced + +--- + +### Atomic Operations + +PATCH operations are atomic: +- Either all changes succeed, or none are applied +- If validation fails, the resource remains unchanged + +--- + +## Comparison: PATCH vs PUT + +### PUT (Complete Replacement) + +**Requires:** All fields (except read-only ones) + +```bash +curl -X PUT ".../projects/alice/research-docs" \ + -d '{ + "project_handle": "research-docs", + "description": "Updated description", + "instance_owner": "alice", + "instance_handle": "openai-large", + "public_read": false + }' +``` + +**Use when:** Creating or completely replacing a resource + +--- + +### PATCH (Partial Update) + +**Requires:** Only fields to change + +```bash +curl -X PATCH ".../projects/alice/research-docs" \ + -d '{ + "description": "Updated description" + }' +``` + +**Use when:** Modifying one or a few fields of an existing resource + +--- + +## Error Handling + +### 400 Bad Request + +Invalid field values or types: + +```json +{ + "title": "Bad Request", + "status": 400, + "detail": "Invalid value for field 'public_read': expected boolean, got string" +} +``` + +--- + +### 401 Unauthorized + +Missing or invalid authentication: + +```json +{ + "title": "Unauthorized", + "status": 401, + "detail": "Invalid or missing authorization credentials" +} +``` + +--- + +### 403 Forbidden + +Insufficient permissions: + +```json +{ + "title": "Forbidden", + "status": 403, + "detail": "You don't have permission to modify this resource" +} +``` + +--- + +### 404 Not Found + +Resource doesn't 
exist: + +```json +{ + "title": "Not Found", + "status": 404, + "detail": "Project 'alice/research-docs' not found" +} +``` + +--- + +### 422 Unprocessable Entity + +Validation failed: + +```json +{ + "title": "Unprocessable Entity", + "status": 422, + "detail": "metadataScheme is not valid JSON Schema" +} +``` + +--- + +## Best Practices + +### Use PATCH for Single-Field Updates + +**Do:** +```bash +curl -X PATCH ".../projects/alice/research-docs" \ + -d '{"public_read": true}' +``` + +**Don't:** +```bash +# Unnecessarily complex +curl -X PUT ".../projects/alice/research-docs" \ + -d '{ + "project_handle": "research-docs", + "description": "Research docs", + "instance_owner": "alice", + "instance_handle": "openai-large", + "public_read": true + }' +``` + +--- + +### Group Related Changes + +Update multiple related fields in one request: + +```bash +curl -X PATCH ".../projects/alice/research-docs" \ + -d '{ + "description": "Updated project", + "public_read": true, + "metadataScheme": "{...new schema...}" + }' +``` + +--- + +### Validate Before Patching + +When possible, validate changes locally before submitting: + +```python +# Python example +def update_project_description(project_path, new_description): + if not new_description or len(new_description) > 500: + raise ValueError("Invalid description") + + response = requests.patch( + f"{API_BASE}/projects/{project_path}", + json={"description": new_description}, + headers={"Authorization": f"Bearer {API_KEY}"} + ) + return response.json() +``` + +--- + +### Handle Errors Gracefully + +```python +response = requests.patch( + f"{API_BASE}/projects/alice/research-docs", + json={"public_read": True}, + headers={"Authorization": f"Bearer {API_KEY}"} +) + +if response.status_code == 200: + print("Updated successfully") +elif response.status_code == 403: + print("Permission denied") +elif response.status_code == 404: + print("Project not found") +else: + print(f"Error: {response.json()['detail']}") +``` + +--- + +## 
Limitations + +### Not Available For + +PATCH is **not** available for: +- Endpoints that don't support GET and PUT +- List endpoints (e.g., `GET /v1/projects/alice`) +- Action endpoints (e.g., `/share`, `/transfer-ownership`) + +### Cannot Change Identifiers + +You cannot use PATCH to change resource identifiers: +- `user_handle` +- `project_handle` +- `instance_handle` +- `api_standard_handle` + +To rename a resource, you must create a new resource and delete the old one. + +--- + +## Related Documentation + +- [Users](endpoints/users/) - User management endpoints +- [Projects](endpoints/projects/) - Project management endpoints +- [LLM Services](endpoints/llm-services/) - LLM service instance endpoints +- [API Standards](endpoints/api-standards/) - API standard endpoints +- [Error Handling](error-handling/) - Error response reference diff --git a/docs/content/api/query-parameters.md b/docs/content/api/query-parameters.md new file mode 100644 index 0000000..514f24f --- /dev/null +++ b/docs/content/api/query-parameters.md @@ -0,0 +1,465 @@ +--- +title: "Query Parameters" +weight: 7 +--- + +# Query Parameters Reference + +Comprehensive reference for query parameters used across API endpoints for pagination, filtering, and search configuration. + +## Pagination Parameters + +### limit + +Maximum number of results to return in a single response. 
+ +**Type:** Integer +**Default:** 10 +**Maximum:** 200 +**Minimum:** 1 + +**Used by:** +- `GET /v1/embeddings/{user}/{project}` - List embeddings +- `GET /v1/similars/{user}/{project}/{id}` - Similarity search +- `POST /v1/similars/{user}/{project}` - Raw vector search +- `GET /v1/projects/{user}` - List projects + +**Example:** + +```bash +# Get 50 embeddings +curl -X GET "https://api.example.com/v1/embeddings/alice/research-docs?limit=50" \ + -H "Authorization: Bearer alice_api_key" +``` + +**Aliases:** +- `count` - Used in similarity search endpoints (same behavior as `limit`) + +--- + +### offset + +Number of results to skip before returning data. Used for pagination. + +**Type:** Integer +**Default:** 0 +**Minimum:** 0 + +**Used by:** +- `GET /v1/embeddings/{user}/{project}` - List embeddings +- `GET /v1/similars/{user}/{project}/{id}` - Similarity search +- `POST /v1/similars/{user}/{project}` - Raw vector search +- `GET /v1/projects/{user}` - List projects + +**Example:** + +```bash +# Get results 21-40 (page 2) +curl -X GET "https://api.example.com/v1/embeddings/alice/research-docs?limit=20&offset=20" \ + -H "Authorization: Bearer alice_api_key" + +# Get results 41-60 (page 3) +curl -X GET "https://api.example.com/v1/embeddings/alice/research-docs?limit=20&offset=40" \ + -H "Authorization: Bearer alice_api_key" +``` + +**Pagination Formula:** +``` +Page N: offset = (N - 1) * limit +``` + +--- + +## Similarity Search Parameters + +### count + +Number of similar documents to return. Alias for `limit` in similarity search endpoints. 
+ +**Type:** Integer +**Default:** 10 +**Maximum:** 200 +**Minimum:** 1 + +**Used by:** +- `GET /v1/similars/{user}/{project}/{id}` - Similarity search +- `POST /v1/similars/{user}/{project}` - Raw vector search + +**Example:** + +```bash +# Find 5 most similar documents +curl -X GET "https://api.example.com/v1/similars/alice/research-docs/doc123?count=5" \ + -H "Authorization: Bearer alice_api_key" +``` + +**Note:** `count` and `limit` can be used interchangeably in similarity endpoints. + +--- + +### threshold + +Minimum similarity score threshold. Only results with similarity >= threshold are returned. + +**Type:** Float +**Default:** 0.5 +**Range:** 0.0 to 1.0 +**Note:** 1.0 = most similar, 0.0 = least similar + +**Used by:** +- `GET /v1/similars/{user}/{project}/{id}` - Similarity search +- `POST /v1/similars/{user}/{project}` - Raw vector search + +**Example:** + +```bash +# Only return highly similar documents (>= 0.8) +curl -X GET "https://api.example.com/v1/similars/alice/research-docs/doc123?threshold=0.8" \ + -H "Authorization: Bearer alice_api_key" + +# Return moderately similar documents (>= 0.6) +curl -X GET "https://api.example.com/v1/similars/alice/research-docs/doc123?threshold=0.6" \ + -H "Authorization: Bearer alice_api_key" +``` + +**Threshold Guidelines:** + +| Threshold | Interpretation | Use Case | +|-----------|----------------|----------| +| 0.95-1.0 | Near-identical | Duplicate detection | +| 0.85-0.95 | Very similar | Finding closely related documents | +| 0.7-0.85 | Moderately similar | Broad similarity search | +| 0.5-0.7 | Loosely similar | Exploratory search | +| 0.0-0.5 | Weakly similar | Generally not useful | + +--- + +## Metadata Filtering Parameters + +### metadata_path + +JSON path to a metadata field for filtering results. Must be used together with `metadata_value`. 
+ +**Type:** String +**Format:** JSON path notation (e.g., `"author"`, `"author.name"`, `"publication.year"`) + +**Used by:** +- `GET /v1/similars/{user}/{project}/{id}` - Similarity search +- `POST /v1/similars/{user}/{project}` - Raw vector search + +**Path Notation:** +- **Simple field:** `author` +- **Nested field:** `author.name` +- **Deeply nested:** `publication.journal.name` + +**Example:** + +```bash +# Simple field path +curl -X GET "https://api.example.com/v1/similars/alice/research-docs/doc123?metadata_path=author&metadata_value=John%20Doe" + +# Nested field path +curl -X GET "https://api.example.com/v1/similars/alice/research-docs/doc123?metadata_path=author.id&metadata_value=A0083" +``` + +--- + +### metadata_value + +Metadata value to exclude from similarity search results. Must be used together with `metadata_path`. + +**Type:** String +**Behavior:** Negative matching (excludes documents where field equals this value) + +**Used by:** +- `GET /v1/similars/{user}/{project}/{id}` - Similarity search +- `POST /v1/similars/{user}/{project}` - Raw vector search + +**Example:** + +```bash +# Exclude documents from the same author +curl -X GET "https://api.example.com/v1/similars/alice/research-docs/doc123?metadata_path=author&metadata_value=Alice%20Doe" \ + -H "Authorization: Bearer alice_api_key" + +# Exclude documents from the same category +curl -X GET "https://api.example.com/v1/similars/alice/research-docs/doc123?metadata_path=category&metadata_value=biology" \ + -H "Authorization: Bearer alice_api_key" +``` + +**URL Encoding:** + +Always URL-encode metadata values that contain special characters: + +```bash +# Value: "John Doe" → "John%20Doe" +# Value: "2024-01-15" → "2024-01-15" (no special chars) +# Value: "author@example.com" → "author%40example.com" +``` + +--- + +## Parameter Combinations + +### Pagination with Filtering + +Combine limit, offset, and threshold: + +```bash +curl -X GET 
"https://api.example.com/v1/similars/alice/research-docs/doc123?count=20&offset=40&threshold=0.7" \ + -H "Authorization: Bearer alice_api_key" +``` + +--- + +### Similarity Search with Metadata Filtering + +Combine threshold and metadata filtering: + +```bash +curl -X GET "https://api.example.com/v1/similars/alice/research-docs/doc123?count=10&threshold=0.8&metadata_path=author&metadata_value=John%20Doe" \ + -H "Authorization: Bearer alice_api_key" +``` + +--- + +### Complete Query with All Parameters + +```bash +curl -X GET "https://api.example.com/v1/similars/alice/research-docs/doc123?count=50&offset=0&threshold=0.75&metadata_path=category&metadata_value=biology" \ + -H "Authorization: Bearer alice_api_key" +``` + +--- + +## Endpoint-Specific Parameters + +### List Embeddings + +**Endpoint:** `GET /v1/embeddings/{user}/{project}` + +**Supported Parameters:** +- `limit` (default: 10, max: 200) +- `offset` (default: 0) + +**Example:** + +```bash +curl -X GET "https://api.example.com/v1/embeddings/alice/research-docs?limit=100&offset=200" \ + -H "Authorization: Bearer alice_api_key" +``` + +--- + +### List Projects + +**Endpoint:** `GET /v1/projects/{user}` + +**Supported Parameters:** +- `limit` (default: 10, max: 200) +- `offset` (default: 0) + +**Example:** + +```bash +curl -X GET "https://api.example.com/v1/projects/alice?limit=50&offset=0" \ + -H "Authorization: Bearer alice_api_key" +``` + +--- + +### Similarity Search (GET) + +**Endpoint:** `GET /v1/similars/{user}/{project}/{id}` + +**Supported Parameters:** +- `count` / `limit` (default: 10, max: 200) +- `offset` (default: 0) +- `threshold` (default: 0.5, range: 0.0-1.0) +- `metadata_path` (optional, requires `metadata_value`) +- `metadata_value` (optional, requires `metadata_path`) + +**Example:** + +```bash +curl -X GET "https://api.example.com/v1/similars/alice/research-docs/doc123?count=20&threshold=0.7&metadata_path=author&metadata_value=John%20Doe" \ + -H "Authorization: Bearer alice_api_key" +``` + 
+--- + +### Similarity Search (POST) + +**Endpoint:** `POST /v1/similars/{user}/{project}` + +**Supported Parameters:** Same as GET similarity search + +**Example:** + +```bash +curl -X POST "https://api.example.com/v1/similars/alice/research-docs?count=10&threshold=0.8" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "vector": [-0.020850, 0.018522, 0.053270, ...] + }' +``` + +--- + +## Parameter Validation + +### Invalid Parameter Values + +**Error Response:** + +```json +{ + "title": "Bad Request", + "status": 400, + "detail": "limit must be between 1 and 200" +} +``` + +**Common Validation Errors:** + +- `limit` exceeds maximum (200) +- `limit` less than minimum (1) +- `offset` is negative +- `threshold` outside range 0.0-1.0 +- `metadata_path` without `metadata_value` +- `metadata_value` without `metadata_path` + +--- + +## Best Practices + +### Pagination + +**Do:** +- Use consistent `limit` values across pages +- Start with `offset=0` for the first page +- Increment `offset` by `limit` for each subsequent page + +**Example pagination logic:** + +```python +limit = 20 +page = 1 + +# Page 1 +offset = (page - 1) * limit # 0 +url = f"/v1/embeddings/alice/research?limit={limit}&offset={offset}" + +# Page 2 +page = 2 +offset = (page - 1) * limit # 20 +url = f"/v1/embeddings/alice/research?limit={limit}&offset={offset}" + +# Page 3 +page = 3 +offset = (page - 1) * limit # 40 +url = f"/v1/embeddings/alice/research?limit={limit}&offset={offset}" +``` + +--- + +### Similarity Search + +**Do:** +- Use higher thresholds (0.7-0.9) for focused searches +- Use lower thresholds (0.5-0.7) for exploratory searches +- Combine with `count` to limit result size +- Use metadata filtering to exclude unwanted results + +**Don't:** +- Request more results than needed (affects performance) +- Use very low thresholds (<0.5) unless necessary + +--- + +### Metadata Filtering + +**Do:** +- URL-encode metadata values with special characters 
+- Use specific field paths for nested metadata +- Test metadata paths with sample queries first + +**Example:** + +```bash +# Good: URL-encoded, specific path +metadata_path=author.id&metadata_value=A0083 + +# Good: Simple field +metadata_path=category&metadata_value=biology + +# Bad: Not URL-encoded +metadata_path=author name&metadata_value=John Doe + +# Good: URL-encoded +metadata_path=author%20name&metadata_value=John%20Doe +``` + +--- + +## Response Formats + +### Paginated Response + +Endpoints that support pagination typically return: + +```json +{ + "items": [...], + "total_count": 500, + "limit": 20, + "offset": 40, + "has_more": true +} +``` + +**Fields:** +- `items`: Array of results +- `total_count`: Total number of items available +- `limit`: Number of items requested per page +- `offset`: Current offset +- `has_more`: Boolean indicating if more results exist + +--- + +### Similarity Response + +Similarity endpoints return: + +```json +{ + "user_handle": "alice", + "project_handle": "research-docs", + "results": [ + { + "id": "doc456", + "similarity": 0.95 + }, + { + "id": "doc789", + "similarity": 0.87 + } + ] +} +``` + +**Notes:** +- Results are ordered by similarity (highest first) +- Only results >= `threshold` are included +- Maximum of `count`/`limit` results returned +- Filtered by `metadata_path`/`metadata_value` if specified + +--- + +## Related Documentation + +- [Embeddings](endpoints/embeddings/) - Embedding storage and retrieval +- [Similars](endpoints/similars/) - Similarity search details +- [Projects](endpoints/projects/) - Project management +- [Error Handling](error-handling/) - Error responses diff --git a/docs/content/concepts/_index.md b/docs/content/concepts/_index.md new file mode 100644 index 0000000..9d49ae0 --- /dev/null +++ b/docs/content/concepts/_index.md @@ -0,0 +1,24 @@ +--- +title: "Concepts" +weight: 2 +--- + +# Core Concepts + +Understanding the key concepts behind dhamps-vdb helps you make the most of its features. 
This section explains the fundamental building blocks and how they work together. + +## Overview + +dhamps-vdb is a vector database designed for Retrieval Augmented Generation (RAG) workflows. It stores embeddings with metadata and provides fast similarity search capabilities. + +## Key Components + +- **Users** - Individual accounts with authentication +- **Projects** - Containers for embeddings with access control +- **Embeddings** - Vector representations of text with metadata +- **LLM Services** - Configurations for embedding models +- **Similarity Search** - Find similar documents using vector distance + +## Architecture + +dhamps-vdb uses PostgreSQL with the pgvector extension for vector operations. It provides a RESTful API with token-based authentication and supports multi-user environments with project sharing. diff --git a/docs/content/concepts/architecture.md b/docs/content/concepts/architecture.md new file mode 100644 index 0000000..1151663 --- /dev/null +++ b/docs/content/concepts/architecture.md @@ -0,0 +1,341 @@ +--- +title: "Architecture" +weight: 1 +--- + +# Architecture + +dhamps-vdb is a vector database API designed for RAG (Retrieval Augmented Generation) workflows in Digital Humanities research. 
+

## System Overview

```
┌─────────────┐
│   Client    │
│ Application │
└──────┬──────┘
       │ HTTP/REST
       │
┌──────▼──────────────────────────┐
│     dhamps-vdb API Server       │
│  ┌──────────────────────────┐   │
│  │   Authentication Layer   │   │
│  └────────┬─────────────────┘   │
│  ┌────────▼─────────────────┐   │
│  │   Request Handlers       │   │
│  │  (Users, Projects, etc)  │   │
│  └────────┬─────────────────┘   │
│  ┌────────▼─────────────────┐   │
│  │   Validation Layer       │   │
│  │  (Dimensions, Metadata)  │   │
│  └────────┬─────────────────┘   │
│  ┌────────▼─────────────────┐   │
│  │   SQLC Queries           │   │
│  │   (Type-safe SQL)        │   │
│  └────────┬─────────────────┘   │
└───────────┼─────────────────────┘
            │
    ┌───────▼──────────────┐
    │   PostgreSQL 16      │
    │   with pgvector 0.7  │
    │                      │
    │  ┌────────────────┐  │
    │  │ Vector Index   │  │
    │  │ (HNSW/IVFFlat) │  │
    │  └────────────────┘  │
    └──────────────────────┘
```

## Core Components

### API Layer

Built with [Huma](https://huma.rocks/) framework on top of Go's `http.ServeMux`:

- OpenAPI documentation generation
- Automatic request/response validation
- JSON schema support
- REST endpoint routing

### Authentication

Token-based authentication using API keys:

- **Admin key**: For administrative operations (user creation, system management)
- **User keys**: SHA-256 hashed, unique per user
- **Bearer token**: Transmitted in `Authorization` header

### Data Storage

PostgreSQL with pgvector extension:

- **Vector storage**: Native pgvector support for embeddings
- **Vector search**: Cosine similarity using `<=>` operator
- **ACID compliance**: Transactional consistency
- **Relational integrity**: Foreign keys and constraints

### Code Generation

Uses [sqlc](https://sqlc.dev/) for type-safe database queries:

- SQL queries → Go functions
- Compile-time type checking
- No ORM overhead
- Direct PostgreSQL integration

## Data Model

### Core Entities

```
users
  ├── projects (1:many)
  │     ├── embeddings 
(1:many) + │ └── instance (1:1) + │ + └── instances (1:many) + └── definition (many:1, optional) + +_system (special user) + └── definitions (1:many) +``` + +### Key Relationships + +**Users → Projects** +- One user owns many projects +- Projects can be shared with other users (reader/editor roles) +- Projects can be public (unauthenticated read access) + +**Projects → Instances** +- Each project references exactly one LLM service instance +- Instance defines embedding dimensions and configuration + +**Projects → Embeddings** +- One project contains many embeddings +- Each embedding has a unique text_id within the project +- Embeddings store vector, metadata, and optional text + +**Users → Instances** +- Users own their instances +- Instances can be shared with other users +- Instances store encrypted API keys + +**Instances → Definitions** +- Instances can optionally reference a definition (template) +- System definitions (`_system` owner) provide defaults +- User definitions allow custom templates + +## Request Flow + +### 1. Create Embedding + +``` +Client Request + ↓ +Authentication Middleware + ↓ +Authorization Check (owner/editor?) + ↓ +Dimension Validation (vector_dim matches instance?) + ↓ +Metadata Validation (matches project schema?) + ↓ +Database Insert (with transaction) + ↓ +Response +``` + +### 2. Similarity Search + +``` +Client Request (text_id or vector) + ↓ +Authentication Middleware (or public check) + ↓ +Authorization Check (owner/reader/public?) + ↓ +Dimension Validation (if raw vector) + ↓ +Vector Similarity Query + ├── Cosine distance calculation + ├── Threshold filtering + ├── Metadata filtering (exclude matches) + └── Limit/offset pagination + ↓ +Results (sorted by similarity) + ↓ +Response +``` + +## Storage Architecture + +### Vector Index + +pgvector supports multiple index types: + +- **IVFFlat**: Faster build, approximate search +- **HNSW**: Slower build, better recall + +Current implementation uses HNSW for better accuracy. 
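For intuition, the cosine distance that pgvector's `<=>` operator returns can be sketched in plain Python. This is only an illustration of the metric, not the database implementation:

```python
import math

def cosine_distance(a, b):
    # pgvector's `<=>` returns 1 - cosine similarity:
    # identical directions give 0.0, orthogonal vectors give 1.0
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)

print(cosine_distance([1.0, 0.0], [1.0, 0.0]))  # 0.0
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))  # 1.0
```

The `similarity` values reported by the search endpoints correspond to `1 - distance`, which is why results closer to `1.0` are better matches.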
+ +### Vector Storage Format + +```sql +CREATE TABLE embeddings ( + embedding_id SERIAL PRIMARY KEY, + text_id TEXT NOT NULL, + project_id INT REFERENCES projects, + vector vector(3072), -- Dimension varies + vector_dim INT NOT NULL, + metadata JSONB, + text TEXT, + ... +) +``` + +### Index Strategy + +```sql +CREATE INDEX embedding_vector_idx +ON embeddings +USING hnsw (vector vector_cosine_ops); +``` + +Optimized for cosine similarity searches. + +## Security Architecture + +### API Key Encryption + +- **Algorithm**: AES-256-GCM +- **Key Source**: `ENCRYPTION_KEY` environment variable +- **Key Derivation**: SHA-256 hash to ensure 32-byte key +- **Storage**: Binary (BYTEA) in database + +### Access Control + +**Three-tier access model:** + +1. **Owner**: Full control (read, write, delete, share, transfer) +2. **Editor**: Read and write embeddings +3. **Reader**: Read-only access to embeddings and search + +**Special access:** +- **Admin**: System-wide operations (user management, sanity checks) +- **Public**: Unauthenticated read access (if `public_read=true`) + +### Data Isolation + +- Users can only access their own resources or shared resources +- Cross-user queries are prevented at the database level +- Project ownership enforced via foreign keys + +## Migration System + +Uses [tern](https://github.com/jackc/tern) for database migrations: + +``` +migrations/ + ├── 001_create_initial_scheme.sql + ├── 002_create_emb_index.sql + ├── 003_add_public_read_flag.sql + └── 004_refactor_llm_services_architecture.sql +``` + +Migrations run automatically on startup with rollback support. 
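The key-derivation step described under Security Architecture above can be sketched in Python. This illustrates the scheme (SHA-256 over the configured secret yields exactly the 32 bytes AES-256-GCM needs); the server's actual implementation is in Go:

```python
import hashlib

def derive_key(encryption_key: str) -> bytes:
    # SHA-256 always yields 32 bytes, the key size required by
    # AES-256-GCM, regardless of the length of ENCRYPTION_KEY itself
    return hashlib.sha256(encryption_key.encode("utf-8")).digest()

key = derive_key("value-of-ENCRYPTION_KEY")
print(len(key))  # 32
```

Because the derivation is deterministic, the same `ENCRYPTION_KEY` always produces the same AES key — which is also why losing the environment variable means losing access to all stored API keys.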
+ +## Performance Characteristics + +### Vector Search Performance + +- **Small datasets** (<10K embeddings): <10ms per query +- **Medium datasets** (10K-100K): 10-50ms per query +- **Large datasets** (>100K): 50-200ms per query + +Performance depends on: +- Vector dimensions +- Index type and parameters +- Hardware (CPU, RAM, disk) +- Number of results requested + +### Scaling Considerations + +**Vertical Scaling:** +- More RAM = faster searches (more vectors in memory) +- Faster CPUs = faster vector comparisons +- SSD storage = faster index scans + +**Horizontal Scaling:** +- Read replicas for search queries +- Separate write/read workloads +- Connection pooling for concurrent requests + +## Technology Stack + +### Core Technologies + +- **Language**: Go 1.21+ +- **Web Framework**: Huma 2.x +- **Database**: PostgreSQL 16+ +- **Vector Extension**: pgvector 0.7.4 +- **Query Generator**: sqlc 1.x +- **Migration Tool**: tern 2.x + +### Development Tools + +- **Testing**: Go standard library + testcontainers +- **Documentation**: OpenAPI 3.0 (auto-generated) +- **Building**: Docker multi-stage builds +- **Deployment**: Docker Compose + +## Design Principles + +### 1. Type Safety + +- sqlc generates type-safe Go code from SQL +- Strong typing prevents SQL injection +- Compile-time validation of queries + +### 2. Simplicity + +- REST API (not GraphQL) +- Straightforward URL patterns +- Standard HTTP methods + +### 3. Security + +- API key encryption at rest +- No API keys in responses +- Role-based access control + +### 4. Validation + +- Automatic dimension validation +- Optional metadata schema validation +- Request/response validation via OpenAPI + +### 5. 
Extensibility + +- User-defined metadata schemas +- Custom LLM service configurations +- Flexible sharing model + +## Limitations + +### Current Constraints + +- **No multi-tenancy**: Each installation is single-tenant +- **No replication**: Manual setup required for HA +- **No caching**: All queries hit database +- **Synchronous API**: No async/batch upload endpoints + +### Future Enhancements + +See [Roadmap](../../reference/roadmap/) for planned improvements. + +## Next Steps + +- [Learn about users and authentication](users-and-auth/) +- [Understand projects](projects/) +- [Explore LLM services](llm-services/) diff --git a/docs/content/concepts/embeddings.md b/docs/content/concepts/embeddings.md new file mode 100644 index 0000000..a2e057d --- /dev/null +++ b/docs/content/concepts/embeddings.md @@ -0,0 +1,466 @@ +--- +title: "Embeddings" +weight: 4 +--- + +# Embeddings + +Embeddings are vector representations of text stored in dhamps-vdb for similarity search and retrieval. + +## What are Embeddings? 
+ +Embeddings are numerical representations (vectors) of text that capture semantic meaning: + +- **Vector**: Array of floating-point numbers (e.g., 1536 or 3072 dimensions) +- **Dimensions**: Fixed length determined by LLM model +- **Similarity**: Vectors of similar text are close in vector space +- **Purpose**: Enable semantic search and retrieval + +## Embedding Structure + +### Required Fields + +- **text_id**: Unique identifier for the document (max 300 characters) +- **instance_handle**: LLM service instance that generated the embedding +- **vector**: Array of float32 values (embedding vector) +- **vector_dim**: Declared dimension count (must match vector length) + +### Optional Fields + +- **text**: Original text content (for reference) +- **metadata**: Structured JSON data about the document + +### Example + +```json +{ + "text_id": "doc-123", + "instance_handle": "my-openai", + "text": "Introduction to machine learning concepts", + "vector": [0.023, -0.015, 0.087, ..., 0.042], + "vector_dim": 3072, + "metadata": { + "title": "ML Introduction", + "author": "Alice", + "year": 2024, + "category": "tutorial" + } +} +``` + +## Creating Embeddings + +### Single Embedding + +```bash +POST /v1/embeddings/alice/research-docs + +{ + "embeddings": [ + { + "text_id": "doc1", + "instance_handle": "my-openai", + "vector": [0.1, 0.2, ..., 0.3], + "vector_dim": 3072, + "metadata": {"author": "Alice"} + } + ] +} +``` + +### Batch Upload + +```bash +POST /v1/embeddings/alice/research-docs + +{ + "embeddings": [ + { + "text_id": "doc1", + "instance_handle": "my-openai", + "vector": [...], + "vector_dim": 3072 + }, + { + "text_id": "doc2", + "instance_handle": "my-openai", + "vector": [...], + "vector_dim": 3072 + }, + ... 
+ ] +} +``` + +**Batch upload tips:** +- Upload 100-1000 embeddings per request +- Use consistent instance_handle +- Ensure all vectors have same dimensions +- Include metadata for searchability + +## Text Identifiers + +### Format + +Text IDs can be any string up to 300 characters: + +**Common patterns:** +- **URLs**: `https://id.example.com/doc/123` +- **URNs**: `urn:example:doc:123` +- **Paths**: `/corpus/section1/doc123` +- **IDs**: `doc-abc-123-xyz` + +### URL Encoding + +URL-encode text IDs when using them in API paths: + +```bash +# Original ID +text_id="https://id.example.com/texts/W0017:1.3.1" + +# URL-encoded for API +encoded="https%3A%2F%2Fid.example.com%2Ftexts%2FW0017%3A1.3.1" + +# Use in API call +GET /v1/embeddings/alice/project/$encoded +``` + +### Uniqueness + +Text IDs must be unique within a project: +- Same ID in different projects: ✅ Allowed +- Same ID twice in one project: ❌ Conflict error + +## Validation + +### Dimension Validation + +The system automatically validates vector dimensions: + +**Checks performed:** +1. `vector_dim` matches declared instance dimensions +2. Actual `vector` array length matches `vector_dim` +3. All embeddings in project have consistent dimensions + +**Example error:** + +```json +{ + "title": "Bad Request", + "status": 400, + "detail": "dimension validation failed: vector dimension mismatch: embedding declares 3072 dimensions but LLM service 'my-openai' expects 1536 dimensions" +} +``` + +### Metadata Validation + +If project has a metadata schema, all embeddings are validated: + +**Example error:** + +```json +{ + "title": "Bad Request", + "status": 400, + "detail": "metadata validation failed for text_id 'doc1': metadata validation failed:\n - author is required\n - year must be integer" +} +``` + +See [Metadata Validation Guide](../guides/metadata-validation/) for details. 
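The URL encoding shown above can be produced with Python's standard library; passing `safe=""` ensures that `/` and `:` are percent-encoded as well:

```python
from urllib.parse import quote

text_id = "https://id.example.com/texts/W0017:1.3.1"
encoded = quote(text_id, safe="")  # encode every reserved character
print(encoded)
# https%3A%2F%2Fid.example.com%2Ftexts%2FW0017%3A1.3.1
```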
+ +## Retrieving Embeddings + +### List All Embeddings + +```bash +GET /v1/embeddings/alice/research-docs?limit=100&offset=0 +``` + +Returns paginated list of embeddings with: +- text_id +- metadata +- vector_dim +- created_at + +Vectors are included by default (can be large). + +### Get Single Embedding + +```bash +GET /v1/embeddings/alice/research-docs/doc1 +``` + +Returns complete embedding including vector. + +### Pagination + +Use `limit` and `offset` for large projects: + +```bash +# First page (0-99) +GET /v1/embeddings/alice/research-docs?limit=100&offset=0 + +# Second page (100-199) +GET /v1/embeddings/alice/research-docs?limit=100&offset=100 + +# Third page (200-299) +GET /v1/embeddings/alice/research-docs?limit=100&offset=200 +``` + +## Updating Embeddings + +Currently, embeddings cannot be updated directly. To modify: + +1. **Delete** existing embedding +2. **Upload** new version with same text_id + +```bash +# Delete old version +DELETE /v1/embeddings/alice/research-docs/doc1 + +# Upload new version +POST /v1/embeddings/alice/research-docs +{ + "embeddings": [{ + "text_id": "doc1", + "instance_handle": "my-openai", + "vector": [...new vector...], + "vector_dim": 3072, + "metadata": {...updated metadata...} + }] +} +``` + +## Deleting Embeddings + +### Delete Single Embedding + +```bash +DELETE /v1/embeddings/alice/research-docs/doc1 +``` + +### Delete All Embeddings + +```bash +DELETE /v1/embeddings/alice/research-docs +``` + +**Warning:** This deletes all embeddings in the project permanently. 
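The pagination pattern above can be wrapped in a small client-side helper. This is a sketch: `fetch_page` is a hypothetical callback standing in for whatever HTTP call your client makes to `GET /v1/embeddings/{user}/{project}`:

```python
def iter_embeddings(fetch_page, limit=100):
    # fetch_page(limit, offset) -> list of embeddings (empty when exhausted)
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        if not page:
            return
        yield from page
        offset += limit  # advance offset by the page size

# Demonstration with a fake backend of 250 items:
fake = list(range(250))
print(sum(1 for _ in iter_embeddings(lambda limit, offset: fake[offset:offset + limit])))  # 250
```

The generator stops as soon as a page comes back empty, so it works without knowing `total_count` in advance.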
+ +## Metadata + +### Purpose + +Metadata provides structured information about documents: + +- **Filtering**: Exclude documents in similarity searches +- **Organization**: Categorize and group documents +- **Context**: Store additional document information +- **Validation**: Ensure consistent structure (with schema) + +### Structure + +Metadata is stored as JSONB in PostgreSQL: + +```json +{ + "author": "William Shakespeare", + "title": "Hamlet", + "year": 1603, + "act": 1, + "scene": 1, + "genre": "drama", + "language": "English" +} +``` + +### Nested Metadata + +Complex structures are supported: + +```json +{ + "author": { + "name": "William Shakespeare", + "birth_year": 1564, + "nationality": "English" + }, + "publication": { + "year": 1603, + "publisher": "First Folio", + "edition": 1 + }, + "tags": ["tragedy", "revenge", "madness"] +} +``` + +### Filtering by Metadata + +Use metadata to exclude documents from similarity searches: + +```bash +# Exclude documents from same author +GET /v1/similars/alice/project/doc1?metadata_path=author&metadata_value=Shakespeare +``` + +See [Metadata Filtering Guide](../guides/metadata-filtering/) for details. + +## Storage Considerations + +### Vector Storage + +Vectors are stored using pgvector extension: + +- **Type**: `vector(N)` where N is dimension count +- **Size**: 4 bytes per dimension + overhead +- **Example**: 3072-dimension vector ≈ 12KB + +### Storage Calculation + +Estimate storage per embedding: + +``` +Vector: 4 bytes × dimensions +Text ID: length in bytes (avg ~50 bytes) +Text: length in bytes (optional) +Metadata: JSON size (varies, avg ~500 bytes) +Overhead: ~100 bytes (indexes, etc.) 
+ +Example (3072-dim with metadata): +4 × 3072 + 50 + 500 + 100 ≈ 13KB per embedding +``` + +### Large Projects + +For projects with millions of embeddings: + +- Use pagination when listing +- Consider partial indexes for metadata +- Monitor database size +- Plan backup strategy + +## Performance + +### Upload Performance + +- **Small batches** (1-10): ~100ms per request +- **Medium batches** (100-500): ~500ms-2s per request +- **Large batches** (1000+): ~2-10s per request + +### Retrieval Performance + +- **Single embedding**: <10ms +- **Paginated list** (100 items): ~50ms +- **Large project scan**: Use pagination + +### Optimization Tips + +- Batch uploads when possible +- Use appropriate page sizes +- Include only needed fields +- Monitor query performance + +## Common Patterns + +### Document Chunking + +Split long documents into chunks: + +```json +{ + "embeddings": [ + { + "text_id": "doc1:chunk1", + "text": "First part of document...", + "vector": [...], + "metadata": {"doc_id": "doc1", "chunk": 1} + }, + { + "text_id": "doc1:chunk2", + "text": "Second part of document...", + "vector": [...], + "metadata": {"doc_id": "doc1", "chunk": 2} + } + ] +} +``` + +### Versioned Documents + +Track document versions: + +```json +{ + "text_id": "doc1:v2", + "vector": [...], + "metadata": { + "doc_id": "doc1", + "version": 2, + "updated_at": "2024-01-15T10:30:00Z" + } +} +``` + +### Multi-Language Documents + +Store embeddings for different languages: + +```json +{ + "embeddings": [ + { + "text_id": "doc1:en", + "text": "English version...", + "vector": [...], + "metadata": {"doc_id": "doc1", "language": "en"} + }, + { + "text_id": "doc1:de", + "text": "Deutsche Version...", + "vector": [...], + "metadata": {"doc_id": "doc1", "language": "de"} + } + ] +} +``` + +## Troubleshooting + +### Dimension Mismatch + +**Error**: "vector dimension mismatch" + +**Cause**: Vector dimensions don't match instance configuration + +**Solution**: +- Check instance dimensions: `GET 
/v1/llm-services/owner/instance` +- Regenerate embeddings with correct model +- Ensure `vector_dim` matches actual vector length + +### Metadata Validation Failed + +**Error**: "metadata validation failed" + +**Cause**: Metadata doesn't match project schema + +**Solution**: +- Check project schema: `GET /v1/projects/owner/project` +- Update metadata to match schema +- Or update schema to accept metadata + +### Text ID Conflict + +**Error**: "embedding with text_id already exists" + +**Cause**: Attempting to upload duplicate text_id + +**Solution**: +- Use different text_id +- Delete existing embedding first +- Check for unintended duplicates + +## Next Steps + +- [Learn about similarity search](similarity-search/) +- [Explore metadata filtering](../guides/metadata-filtering/) +- [Understand LLM services](llm-services/) diff --git a/docs/content/concepts/llm-services.md b/docs/content/concepts/llm-services.md new file mode 100644 index 0000000..d1c74b9 --- /dev/null +++ b/docs/content/concepts/llm-services.md @@ -0,0 +1,428 @@ +--- +title: "LLM Services" +weight: 5 +--- + +# LLM Services + +LLM Services configure embedding generation, defining models, dimensions, and API access. 
+ +## Architecture + +dhamps-vdb separates LLM services into two concepts: + +### LLM Service Definitions + +Reusable configuration templates owned by `_system` or users: + +- **Purpose**: Provide standard configurations +- **Ownership**: `_system` (global) or individual users +- **Contents**: Endpoint, model, dimensions, API standard +- **API Keys**: Not stored (templates only) +- **Usage**: Templates for creating instances + +### LLM Service Instances + +User-specific configurations with encrypted API keys: + +- **Purpose**: Actual service configurations users employ +- **Ownership**: Individual users +- **Contents**: Endpoint, model, dimensions, API key (encrypted) +- **Sharing**: Can be shared with other users +- **Projects**: Each project references exactly one instance + +## System Definitions + +### Available Definitions + +The `_system` user provides default definitions: + +| Handle | Model | Dimensions | API Standard | +|--------|-------|------------|--------------| +| openai-large | text-embedding-3-large | 3072 | openai | +| openai-small | text-embedding-3-small | 1536 | openai | +| cohere-v4 | embed-multilingual-v4.0 | 1536 | cohere | +| gemini-embedding-001 | text-embedding-004 | 3072 | gemini | + +### Viewing System Definitions + +```bash +GET /v1/llm-services/_system +Authorization: Bearer user_vdb_key +``` + +Returns list of available system definitions. + +## Creating Instances + +### From System Definition + +Use a predefined system configuration: + +```bash +PUT /v1/llm-services/alice/my-openai + +{ + "definition_owner": "_system", + "definition_handle": "openai-large", + "description": "My OpenAI embeddings", + "api_key_encrypted": "sk-proj-your-openai-key" +} +``` + +Inherits endpoint, model, dimensions from system definition. 
+ +### From User Definition + +Reference a user-created definition: + +```bash +PUT /v1/llm-services/alice/custom-instance + +{ + "definition_owner": "alice", + "definition_handle": "my-custom-config", + "api_key_encrypted": "your-api-key" +} +``` + +### Standalone Instance + +Create without a definition: + +```bash +PUT /v1/llm-services/alice/standalone + +{ + "endpoint": "https://api.openai.com/v1/embeddings", + "api_standard": "openai", + "model": "text-embedding-3-large", + "dimensions": 3072, + "description": "Standalone OpenAI instance", + "api_key_encrypted": "sk-proj-your-key" +} +``` + +All fields must be specified. + +## Instance Properties + +### Core Fields + +- **instance_handle**: Unique identifier (3-20 characters) +- **owner**: User who owns the instance +- **endpoint**: API endpoint URL +- **api_standard**: Authentication mechanism (openai, cohere, gemini) +- **model**: Model identifier +- **dimensions**: Vector dimensionality +- **description**: Human-readable description (optional) +- **definition_id**: Reference to definition (optional) + +### API Key Storage + +- **Write-only**: Provided on create/update +- **Encrypted**: AES-256-GCM encryption +- **Never returned**: Not included in GET responses +- **Secure**: Cannot be retrieved after creation + +## API Standards + +### Supported Standards + +| Standard | Key Method | Documentation | +|----------|------------|---------------| +| openai | Authorization: Bearer | [OpenAI Docs](https://platform.openai.com/docs/api-reference/embeddings) | +| cohere | Authorization: Bearer | [Cohere Docs](https://docs.cohere.com/reference/embed) | +| gemini | x-goog-api-key header | [Gemini Docs](https://ai.google.dev/gemini-api/docs/embeddings) | + +### Creating API Standards + +Admins can add new standards: + +```bash +POST /v1/api-standards +Authorization: Bearer ADMIN_KEY + +{ + "api_standard_handle": "custom", + "description": "Custom LLM API", + "key_method": "auth_bearer", + "key_field": null +} +``` + +## 
Instance Management + +### List Instances + +List all accessible instances (owned + shared): + +```bash +GET /v1/llm-services/alice +Authorization: Bearer alice_vdb_key +``` + +Returns instances where alice is owner or has been granted access. + +### Get Instance Details + +```bash +GET /v1/llm-services/alice/my-openai +Authorization: Bearer alice_vdb_key +``` + +Returns instance configuration (API key not included). + +### Update Instance + +```bash +PATCH /v1/llm-services/alice/my-openai + +{ + "description": "Updated description", + "api_key_encrypted": "new-api-key" +} +``` + +Only owner can update instances. + +### Delete Instance + +```bash +DELETE /v1/llm-services/alice/my-openai +Authorization: Bearer alice_vdb_key +``` + +**Constraints:** +- Cannot delete instance used by projects +- Delete projects first, then instance + +## Instance Sharing + +### Share with User + +```bash +POST /v1/llm-services/alice/my-openai/share + +{ + "share_with_handle": "bob", + "role": "reader" +} +``` + +**Shared users can:** +- Use instance in their projects +- View instance configuration +- **Cannot:** + - See API key + - Modify instance + - Delete instance + +### Unshare from User + +```bash +DELETE /v1/llm-services/alice/my-openai/share/bob +``` + +### List Shared Users + +```bash +GET /v1/llm-services/alice/my-openai/shared-with +``` + +Only owner can view shared users. + +## Instance References + +### In Projects + +Projects reference instances by owner and handle: + +```json +{ + "project_handle": "my-project", + "instance_owner": "alice", + "instance_handle": "my-openai" +} +``` + +### In Embeddings + +Embeddings reference instances by handle: + +```json +{ + "text_id": "doc1", + "instance_handle": "my-openai", + "vector": [...] +} +``` + +The instance owner is inferred from the project. 
+ +### Shared Instance Format + +When using shared instances: + +**Own instance**: `"instance_handle": "my-openai"` +**Shared instance**: Reference via project configuration + +## Encryption + +### API Key Encryption + +API keys are encrypted using AES-256-GCM: + +**Encryption process:** +1. User provides plaintext API key +2. Server encrypts using `ENCRYPTION_KEY` from environment +3. Encrypted bytes stored in database +4. Key derivation: SHA-256 hash ensures 32-byte key + +**Decryption:** +- Only occurs internally for LLM API calls +- Never exposed via API responses +- Requires same `ENCRYPTION_KEY` + +### Security Notes + +- **Encryption key**: Set via `ENCRYPTION_KEY` environment variable +- **Key loss**: Losing encryption key means losing access to API keys +- **Key rotation**: Not currently supported +- **Backup**: Back up encryption key securely + +## LLM Processing + +### Current Status + +LLM processing (generating embeddings) is **not yet implemented**. + +### Future Implementation + +Planned features: +- Process text to generate embeddings +- Call external LLM APIs +- Store generated embeddings +- Batch processing support + +### Current Workflow + +Users must generate embeddings externally: + +1. Generate embeddings using LLM API (OpenAI, Cohere, etc.) +2. Upload pre-generated embeddings to dhamps-vdb +3. 
Use dhamps-vdb for storage and similarity search + +## Common Patterns + +### Per-Environment Instances + +```bash +# Development instance +PUT /v1/llm-services/alice/dev-embeddings +{ + "definition_owner": "_system", + "definition_handle": "openai-small", + "api_key_encrypted": "dev-api-key" +} + +# Production instance +PUT /v1/llm-services/alice/prod-embeddings +{ + "definition_owner": "_system", + "definition_handle": "openai-large", + "api_key_encrypted": "prod-api-key" +} +``` + +### Team Shared Instance + +```bash +# Owner creates instance +PUT /v1/llm-services/team-lead/team-embeddings +{ + "definition_owner": "_system", + "definition_handle": "openai-large", + "api_key_encrypted": "team-api-key" +} + +# Share with team members +POST /v1/llm-services/team-lead/team-embeddings/share +{"share_with_handle": "member1", "role": "reader"} + +POST /v1/llm-services/team-lead/team-embeddings/share +{"share_with_handle": "member2", "role": "reader"} + +# Members use in their projects +POST /v1/projects/member1 +{ + "project_handle": "my-project", + "instance_owner": "team-lead", + "instance_handle": "team-embeddings" +} +``` + +### Multi-Model Setup + +```bash +# Large model for important documents +PUT /v1/llm-services/alice/high-quality +{ + "definition_owner": "_system", + "definition_handle": "openai-large", + "api_key_encrypted": "api-key" +} + +# Small model for drafts +PUT /v1/llm-services/alice/fast-processing +{ + "definition_owner": "_system", + "definition_handle": "openai-small", + "api_key_encrypted": "api-key" +} +``` + +## Troubleshooting + +### Cannot Create Instance + +**Possible causes:** +- Instance handle already exists +- Referenced definition doesn't exist +- Missing required fields +- Invalid API standard + +**Solutions:** +- Choose different handle +- Verify definition: `GET /v1/llm-services/_system` +- Include all required fields +- Use valid API standard: `GET /v1/api-standards` + +### Cannot Use Instance in Project + +**Possible causes:** +- 
Instance doesn't exist +- Instance not owned or shared with user +- Incorrect owner/handle reference + +**Solutions:** +- Verify instance exists +- Check instance is accessible +- Confirm spelling of owner and handle + +### Dimension Mismatch + +**Error**: "dimension validation failed" + +**Cause**: Embedding dimensions don't match instance + +**Solutions:** +- Check instance dimensions +- Regenerate embeddings with correct model +- Create instance with correct dimensions + +## Next Steps + +- [Understand projects](projects/) +- [Learn about embeddings](embeddings/) +- [Explore instance management](../guides/instance-management/) diff --git a/docs/content/concepts/metadata.md b/docs/content/concepts/metadata.md new file mode 100644 index 0000000..36ddc48 --- /dev/null +++ b/docs/content/concepts/metadata.md @@ -0,0 +1,552 @@ +--- +title: "Metadata" +weight: 7 +--- + +# Metadata + +Structured JSON data attached to embeddings for organization, validation, and filtering. + +## Overview + +Metadata provides context and structure for your embeddings: + +- **Organization**: Categorize and group documents +- **Filtering**: Exclude documents in similarity searches +- **Validation**: Ensure consistent structure (optional) +- **Context**: Store additional document information + +## Metadata Structure + +### Format + +Metadata is JSON stored as JSONB in PostgreSQL: + +```json +{ + "author": "William Shakespeare", + "title": "Hamlet", + "year": 1603, + "genre": "drama" +} +``` + +### Types + +Supported JSON types: +- **String**: `"author": "Shakespeare"` +- **Number**: `"year": 1603` +- **Boolean**: `"published": true` +- **Array**: `"tags": ["tragedy", "revenge"]` +- **Object**: `"author": {"name": "...", "id": "..."}` +- **Null**: `"notes": null` + +### Nested Structure + +Complex hierarchies are supported: + +```json +{ + "document": { + "id": "W0017", + "type": "manuscript" + }, + "author": { + "name": "John Milton", + "birth_year": 1608, + "nationality": "English" + }, + 
"publication": { + "year": 1667, + "publisher": "First Edition", + "location": "London" + }, + "tags": ["poetry", "epic", "religious"] +} +``` + +## Metadata Schemas + +### Purpose + +JSON Schema validation ensures consistent metadata across all project embeddings. + +### Defining a Schema + +Include `metadataScheme` when creating/updating project: + +```bash +POST /v1/projects/alice + +{ + "project_handle": "research", + "metadataScheme": "{\"type\":\"object\",\"properties\":{\"author\":{\"type\":\"string\"},\"year\":{\"type\":\"integer\"}},\"required\":[\"author\"]}" +} +``` + +### Schema Format + +Use [JSON Schema](https://json-schema.org/) (draft-07+): + +```json +{ + "type": "object", + "properties": { + "author": { + "type": "string", + "minLength": 1 + }, + "year": { + "type": "integer", + "minimum": 1000, + "maximum": 2100 + }, + "genre": { + "type": "string", + "enum": ["poetry", "prose", "drama"] + } + }, + "required": ["author", "year"] +} +``` + +### Validation Behavior + +**With schema defined:** +- All embeddings validated on upload +- Invalid metadata rejected with detailed error +- Schema enforced consistently + +**Without schema:** +- Any JSON metadata accepted +- No validation performed +- Maximum flexibility + +### Common Patterns + +See [Metadata Validation Guide](../guides/metadata-validation/) for examples. + +## Using Metadata + +### On Upload + +Include metadata with each embedding: + +```bash +POST /v1/embeddings/alice/research + +{ + "embeddings": [ + { + "text_id": "doc1", + "instance_handle": "my-embeddings", + "vector": [...], + "metadata": { + "author": "Shakespeare", + "title": "Hamlet", + "year": 1603 + } + } + ] +} +``` + +### In Responses + +Metadata returned when retrieving embeddings: + +```bash +GET /v1/embeddings/alice/research/doc1 +``` + +```json +{ + "text_id": "doc1", + "metadata": { + "author": "Shakespeare", + "title": "Hamlet", + "year": 1603 + }, + "vector": [...], + ... 
+} +``` + +## Metadata Filtering + +### Exclusion Filter + +Exclude documents where metadata matches value: + +```bash +GET /v1/similars/alice/research/doc1?metadata_path=author&metadata_value=Shakespeare +``` + +**Result**: Returns similar documents **excluding** those with `metadata.author == "Shakespeare"`. + +### Path Syntax + +Use JSON path notation: + +**Simple field:** +``` +metadata_path=author +``` + +**Nested field:** +``` +metadata_path=author.name +``` + +**Array element** (not currently supported): +``` +metadata_path=tags[0] +``` + +### URL Encoding + +Encode special characters: + +```bash +# Space +metadata_value=John%20Doe + +# Quotes (if needed) +metadata_value=%22quoted%20value%22 +``` + +### Use Cases + +**Exclude same work:** +```bash +?metadata_path=title&metadata_value=Hamlet +``` + +**Exclude same author:** +```bash +?metadata_path=author&metadata_value=Shakespeare +``` + +**Exclude same source:** +```bash +?metadata_path=source_id&metadata_value=corpus-a +``` + +**Exclude same category:** +```bash +?metadata_path=category&metadata_value=draft +``` + +See [Metadata Filtering Guide](../guides/metadata-filtering/) for detailed examples. 
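
Because `metadata_path` and `metadata_value` travel in the query string, building the URL programmatically avoids encoding mistakes. A minimal Python sketch (the `similars_url` helper and the base URL are illustrative, not part of the API):

```python
from urllib.parse import quote, urlencode

def similars_url(base, owner, project, text_id,
                 metadata_path=None, metadata_value=None, **params):
    """Build a similarity-search URL with an optional metadata exclusion filter."""
    url = f"{base}/v1/similars/{owner}/{project}/{text_id}"
    query = dict(params)
    if metadata_path is not None and metadata_value is not None:
        query["metadata_path"] = metadata_path
        query["metadata_value"] = metadata_value
    if query:
        # quote_via=quote percent-encodes spaces as %20 rather than '+'
        url += "?" + urlencode(query, quote_via=quote)
    return url

print(similars_url("http://localhost:8880", "alice", "research", "doc1",
                   count=10, metadata_path="author.name",
                   metadata_value="John Doe"))
```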
+ +## Validation Examples + +### Simple Schema + +```json +{ + "type": "object", + "properties": { + "author": {"type": "string"}, + "year": {"type": "integer"} + }, + "required": ["author"] +} +``` + +**Valid metadata:** +```json +{"author": "Shakespeare", "year": 1603} +{"author": "Milton"} +``` + +**Invalid metadata:** +```json +{"year": 1603} // Missing required 'author' +{"author": 123} // Wrong type (should be string) +``` + +### Schema with Constraints + +```json +{ + "type": "object", + "properties": { + "title": { + "type": "string", + "minLength": 1, + "maxLength": 200 + }, + "rating": { + "type": "number", + "minimum": 0, + "maximum": 5 + }, + "tags": { + "type": "array", + "items": {"type": "string"}, + "minItems": 1, + "maxItems": 10 + } + } +} +``` + +### Schema with Enums + +```json +{ + "type": "object", + "properties": { + "language": { + "type": "string", + "enum": ["en", "de", "fr", "es", "la"] + }, + "status": { + "type": "string", + "enum": ["draft", "review", "published"] + } + } +} +``` + +## Storage and Performance + +### Storage + +Metadata stored as JSONB in PostgreSQL: + +- **Efficient**: Binary storage format +- **Indexable**: Can create indexes on fields +- **Queryable**: Use PostgreSQL JSON operators + +### Size Considerations + +Typical metadata sizes: + +- **Simple**: 50-200 bytes +- **Moderate**: 200-1000 bytes +- **Complex**: 1-5KB +- **Very large**: >5KB (consider storing elsewhere) + +### Performance + +**Metadata filtering:** +- JSONB queries are efficient +- Add indexes for frequently filtered fields +- Keep metadata reasonably sized + +**Example index (if needed):** +```sql +CREATE INDEX idx_embeddings_author +ON embeddings ((metadata->>'author')); +``` + +## Common Patterns + +### Document Provenance + +Track document source and history: + +```json +{ + "source": { + "corpus": "Shakespeare Works", + "collection": "Tragedies", + "document_id": "hamlet", + "version": 2 + }, + "imported_at": "2024-01-15T10:30:00Z", + 
"imported_by": "researcher1" +} +``` + +### Hierarchical Documents + +Structure for nested documents: + +```json +{ + "work": "Paradise Lost", + "book": 1, + "line": 1, + "chapter": null, + "section": "Invocation" +} +``` + +### Multi-Language Content + +Track language and translation info: + +```json +{ + "language": "en", + "original_language": "la", + "translated_by": "John Smith", + "translation_year": 1850 +} +``` + +### Research Metadata + +Academic paper metadata: + +```json +{ + "doi": "10.1234/example.2024.001", + "authors": ["Alice Smith", "Bob Jones"], + "journal": "Digital Humanities Review", + "year": 2024, + "keywords": ["NLP", "embeddings", "RAG"] +} +``` + +## Updating Metadata + +### Current Limitation + +Metadata cannot be updated directly. To change: + +1. Delete embedding +2. Re-upload with updated metadata + +```bash +# Delete +DELETE /v1/embeddings/alice/project/doc1 + +# Re-upload with new metadata +POST /v1/embeddings/alice/project +{ + "embeddings": [{ + "text_id": "doc1", + "metadata": {...updated...}, + ... + }] +} +``` + +## Schema Updates + +### Updating Project Schema + +Use PATCH to update schema: + +```bash +PATCH /v1/projects/alice/research + +{ + "metadataScheme": "{...new schema...}" +} +``` + +### Effect on Existing Embeddings + +- **Existing embeddings**: Not revalidated +- **New embeddings**: Validated against new schema +- **Updates**: Validated against current schema + +### Migration Strategy + +When updating schema: + +1. Update project schema +2. Verify new embeddings work +3. 
Optionally re-upload existing embeddings + +## Validation Errors + +### Common Errors + +**Missing required field:** +```json +{ + "status": 400, + "detail": "metadata validation failed: author is required" +} +``` + +**Wrong type:** +```json +{ + "status": 400, + "detail": "metadata validation failed: year must be integer" +} +``` + +**Enum violation:** +```json +{ + "status": 400, + "detail": "metadata validation failed: genre must be one of [poetry, prose, drama]" +} +``` + +### Debugging + +To debug validation errors: + +1. Check project schema: `GET /v1/projects/owner/project` +2. Validate metadata with online tool: [jsonschemavalidator.net](https://www.jsonschemavalidator.net/) +3. Review error message for specific field +4. Update metadata or schema as needed + +## Best Practices + +### Schema Design + +- Start simple, add complexity as needed +- Use required fields for critical data +- Use enums for controlled vocabularies +- Document your schema + +### Metadata Content + +- Keep metadata focused and relevant +- Avoid redundant data +- Use consistent field names +- Consider future queries and filters + +### Performance + +- Keep metadata reasonably sized (<5KB) +- Index frequently queried fields +- Avoid deeply nested structures when possible + +## Troubleshooting + +### Validation Fails + +**Problem**: Metadata doesn't validate + +**Solutions:** +- Check project schema +- Verify metadata structure +- Test with JSON Schema validator +- Review error message details + +### Filtering Not Working + +**Problem**: Metadata filter doesn't exclude documents + +**Solutions:** +- Verify field path is correct +- Check value matches exactly (case-sensitive) +- URL-encode special characters +- Confirm metadata field exists + +### Schema Too Restrictive + +**Problem**: Cannot upload valid documents + +**Solutions:** +- Make fields optional (remove from `required`) +- Broaden type constraints +- Use `oneOf` for multiple valid formats +- Remove unnecessary validations + 
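
The "<5KB" guideline above can be checked before upload by measuring the compact JSON encoding of the metadata object. A small sketch (the helper name and soft limit constant are illustrative, not server-enforced):

```python
import json

METADATA_SOFT_LIMIT = 5 * 1024  # guideline from this page, not a hard server limit

def metadata_size_bytes(metadata):
    """Size of the compact JSON serialization in UTF-8 bytes."""
    return len(json.dumps(metadata, separators=(",", ":"),
                          ensure_ascii=False).encode("utf-8"))

meta = {"author": "Shakespeare", "title": "Hamlet", "year": 1603}
assert metadata_size_bytes(meta) < METADATA_SOFT_LIMIT
```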
+## Next Steps + +- [Learn about metadata validation](../guides/metadata-validation/) +- [Explore metadata filtering](../guides/metadata-filtering/) +- [Understand similarity search](similarity-search/) diff --git a/docs/content/concepts/projects.md b/docs/content/concepts/projects.md new file mode 100644 index 0000000..18fecb2 --- /dev/null +++ b/docs/content/concepts/projects.md @@ -0,0 +1,439 @@ +--- +title: "Projects" +weight: 3 +--- + +# Projects + +Projects organize embeddings and define their configuration, including LLM service instances and optional metadata validation. + +## What is a Project? + +A project is a collection of document embeddings that share: + +- A single LLM service instance (embedding configuration) +- Optional metadata schema for validation +- Access control (ownership and sharing) +- Consistent vector dimensions + +## Project Properties + +### Core Fields + +- **project_handle**: Unique identifier within owner's namespace (3-20 characters) +- **owner**: User who owns the project +- **description**: Human-readable project description +- **instance_id**: Reference to LLM service instance (required, 1:1 relationship) +- **metadataScheme**: Optional JSON Schema for metadata validation +- **public_read**: Boolean flag for public read access +- **created_at**: Creation timestamp +- **updated_at**: Last modification timestamp + +### Unique Constraints + +Projects are uniquely identified by `(owner, project_handle)`: +- User "alice" can have project "research" +- User "bob" can also have project "research" +- Same user cannot have two projects with same handle + +## Creating Projects + +### Basic Project + +```bash +POST /v1/projects/alice + +{ + "project_handle": "literature-study", + "description": "Literary text analysis", + "instance_owner": "alice", + "instance_handle": "my-openai" +} +``` + +### Project with Metadata Schema + +```bash +POST /v1/projects/alice + +{ + "project_handle": "research-papers", + "description": "Academic papers 
with structured metadata", + "instance_owner": "alice", + "instance_handle": "my-embeddings", + "metadataScheme": "{\"type\":\"object\",\"properties\":{\"author\":{\"type\":\"string\"},\"year\":{\"type\":\"integer\"},\"doi\":{\"type\":\"string\"}},\"required\":[\"author\",\"year\"]}" +} +``` + +### Public Project + +```bash +POST /v1/projects/alice + +{ + "project_handle": "open-dataset", + "description": "Publicly accessible research data", + "instance_owner": "alice", + "instance_handle": "my-embeddings", + "public_read": true +} +``` + +### Shared Project + +```bash +POST /v1/projects/alice + +{ + "project_handle": "collaborative", + "description": "Team collaboration project", + "instance_owner": "alice", + "instance_handle": "my-embeddings", + "shared_with": [ + { + "user_handle": "bob", + "role": "editor" + }, + { + "user_handle": "charlie", + "role": "reader" + } + ] +} +``` + +## Project-Instance Relationship + +### One-to-One Constraint + +Each project references exactly one LLM service instance: + +``` +Project → Instance (1:1) + ├── Defines vector dimensions + ├── Specifies embedding model + └── Contains API configuration +``` + +**Why 1:1?** +- Ensures consistent dimensions across all embeddings +- Prevents dimension mismatches in similarity searches +- Simplifies validation and error handling + +### Specifying Instance + +Use owner and handle to reference an instance: + +```json +{ + "instance_owner": "alice", + "instance_handle": "my-openai" +} +``` + +The instance can be: +- Owned by project owner: `"instance_owner": "alice"` +- Shared with project owner: `"instance_owner": "bob"` (if bob shared with alice) + +## Metadata Schemas + +### Purpose + +Metadata schemas ensure consistent, structured metadata across all embeddings in a project. 
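
The server validates uploads against the full JSON Schema specification; the sketch below is only a rough client-side pre-check covering `required` and per-property `type`, which can catch obvious mistakes before an upload round-trip. The `precheck` helper is hypothetical, not part of any client library:

```python
def precheck(metadata, schema):
    """Rough pre-check of 'required' and 'type' constraints.
    Not a full JSON Schema validator -- the server remains authoritative.
    (Caveat: bool is a subclass of int in Python, so True passes 'integer'.)"""
    type_map = {"string": str, "integer": int, "number": (int, float),
                "boolean": bool, "array": list, "object": dict}
    errors = []
    for field in schema.get("required", []):
        if field not in metadata:
            errors.append(f"{field} is required")
    for field, rule in schema.get("properties", {}).items():
        if field in metadata and rule.get("type") in type_map:
            if not isinstance(metadata[field], type_map[rule["type"]]):
                errors.append(f"{field} must be {rule['type']}")
    return errors
```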
+ +### Schema Format + +Use JSON Schema (draft-07 or later): + +```json +{ + "type": "object", + "properties": { + "author": {"type": "string"}, + "title": {"type": "string"}, + "year": { + "type": "integer", + "minimum": 1000, + "maximum": 2100 + }, + "genre": { + "type": "string", + "enum": ["fiction", "non-fiction", "poetry"] + } + }, + "required": ["author", "title", "year"] +} +``` + +### Validation Behavior + +- **With schema**: All embeddings validated on upload +- **Without schema**: Any JSON metadata accepted +- **Validation failure**: Upload rejected with detailed error +- **Schema updates**: Only apply to new/updated embeddings + +### Example Schemas + +See [Metadata Validation Guide](../guides/metadata-validation/) for detailed examples. + +## Access Control + +### Ownership + +- **Owner**: User who created the project +- **Full control**: Read, write, delete, share, transfer +- **Cannot be removed**: Owner always has access + +### Sharing + +Projects can be shared with specific users: + +**Reader Role** +- View embeddings +- Search for similar documents +- View project metadata +- Cannot modify anything + +**Editor Role** +- All reader permissions +- Add embeddings +- Modify embeddings +- Delete embeddings +- Cannot delete project or change settings + +**Managing Sharing:** + +```bash +# Share with user +POST /v1/projects/alice/my-project/share +{ + "share_with_handle": "bob", + "role": "reader" +} + +# Unshare from user +DELETE /v1/projects/alice/my-project/share/bob + +# List shared users +GET /v1/projects/alice/my-project/shared-with +``` + +### Public Access + +Projects can allow unauthenticated read access: + +```bash +PATCH /v1/projects/alice/my-project +{ + "public_read": true +} +``` + +With `public_read: true`: +- Anyone can view embeddings (no authentication) +- Anyone can search for similar documents +- Write operations still require authentication + +See [Public Projects Guide](../guides/public-projects/) for details. 
+ +## Project Operations + +### List Projects + +List all projects owned by a user: + +```bash +GET /v1/projects/alice +Authorization: Bearer alice_vdb_key +``` + +Returns array of project objects. + +### Get Project Details + +```bash +GET /v1/projects/alice/research +Authorization: Bearer alice_vdb_key +``` + +Returns full project object including: +- Configuration +- Instance reference +- Metadata schema +- Sharing information (owner only) + +### Update Project + +Use PATCH for partial updates: + +```bash +PATCH /v1/projects/alice/research +{ + "description": "Updated description" +} +``` + +Use PUT for full replacement: + +```bash +PUT /v1/projects/alice/research +{ + "project_handle": "research", + "description": "Complete project configuration", + "instance_owner": "alice", + "instance_handle": "my-embeddings", + "metadataScheme": "{...}" +} +``` + +### Delete Project + +```bash +DELETE /v1/projects/alice/research +Authorization: Bearer alice_vdb_key +``` + +**Cascading deletion:** +- All embeddings in project deleted +- All sharing grants removed +- Project metadata removed + +## Ownership Transfer + +Transfer project to another user: + +```bash +POST /v1/projects/alice/research/transfer-ownership +{ + "new_owner_handle": "bob" +} +``` + +**Effects:** +- Project owner changes to bob +- Project URL changes: `/v1/projects/bob/research` +- Alice loses all access (unless re-shared) +- All embeddings transferred +- Bob cannot already have project with same handle + +See [Ownership Transfer Guide](../guides/ownership-transfer/) for details. + +## Project Limits + +### Current Implementation + +No enforced limits on: +- Number of embeddings per project +- Project storage size +- Number of shared users + +### Recommended Practices + +For large projects: +- Use pagination when listing embeddings +- Batch upload embeddings +- Monitor database size +- Consider archiving old projects + +## Common Patterns + +### Research Project Workflow + +```bash +# 1. 
Create project +POST /v1/projects/alice +{ + "project_handle": "study-2024", + "description": "2024 Research Study", + "instance_owner": "alice", + "instance_handle": "my-embeddings" +} + +# 2. Upload data +POST /v1/embeddings/alice/study-2024 +{ ... embeddings ... } + +# 3. Share with team +POST /v1/projects/alice/study-2024/share +{"share_with_handle": "bob", "role": "reader"} + +# 4. Make public when published +PATCH /v1/projects/alice/study-2024 +{"public_read": true} +``` + +### Multi-Project Organization + +```bash +# Development project +POST /v1/projects/alice +{ + "project_handle": "dev-experiments", + "instance_owner": "alice", + "instance_handle": "dev-embeddings" +} + +# Production project +POST /v1/projects/alice +{ + "project_handle": "prod-dataset", + "instance_owner": "alice", + "instance_handle": "prod-embeddings", + "metadataScheme": "{...}" +} + +# Archive project +POST /v1/projects/alice +{ + "project_handle": "archive-2023", + "instance_owner": "alice", + "instance_handle": "archive-embeddings", + "public_read": true +} +``` + +## Troubleshooting + +### Cannot Create Project + +**Possible causes:** +- Project handle already exists for this user +- Invalid project handle format +- Instance doesn't exist or not accessible +- Missing required fields + +**Solutions:** +- Choose different project handle +- Verify instance exists: `GET /v1/llm-services/owner` +- Check instance is owned or shared with you +- Include all required fields (instance_owner, instance_handle) + +### Metadata Validation Fails + +**Possible causes:** +- Metadata doesn't match schema +- Invalid JSON Schema format +- Schema too restrictive + +**Solutions:** +- Test schema with online validator +- Verify embedding metadata matches schema +- Update schema or metadata as needed + +### Cannot Share Project + +**Possible causes:** +- Not project owner +- Target user doesn't exist +- Invalid role specified + +**Solutions:** +- Only owner can share projects +- Verify user exists: `GET 
/v1/users/target` +- Use valid role: "reader" or "editor" + +## Next Steps + +- [Learn about embeddings](embeddings/) +- [Explore metadata validation](../guides/metadata-validation/) +- [Understand project sharing](../guides/project-sharing/) diff --git a/docs/content/concepts/similarity-search.md b/docs/content/concepts/similarity-search.md new file mode 100644 index 0000000..6e13155 --- /dev/null +++ b/docs/content/concepts/similarity-search.md @@ -0,0 +1,445 @@ +--- +title: "Similarity Search" +weight: 6 +--- + +# Similarity Search + +Find documents with similar semantic meaning using vector similarity. + +## How it Works + +Similarity search compares embedding vectors using cosine distance: + +1. **Query vector**: Either from stored embedding or raw vector +2. **Comparison**: Calculate cosine similarity with all project embeddings +3. **Filtering**: Apply threshold and metadata filters +4. **Ranking**: Sort by similarity score (highest first) +5. **Return**: Top N most similar documents + +## Search Methods + +### Stored Document Search (GET) + +Find documents similar to an already-stored embedding: + +```bash +GET /v1/similars/alice/research/doc1?count=10&threshold=0.7 +``` + +**Use cases:** +- Find related documents +- Discover similar passages +- Identify duplicates + +### Raw Vector Search (POST) + +Search using a new embedding without storing it: + +```bash +POST /v1/similars/alice/research?count=10&threshold=0.7 + +{ + "vector": [0.023, -0.015, ..., 0.042] +} +``` + +**Use cases:** +- Query without saving +- Test embeddings +- Real-time search + +## Query Parameters + +### count + +Number of similar documents to return. + +- **Type**: Integer +- **Range**: 1-200 +- **Default**: 10 + +```bash +GET /v1/similars/alice/project/doc1?count=5 +``` + +### threshold + +Minimum similarity score (0-1). 
+ +- **Type**: Float +- **Range**: 0.0-1.0 +- **Default**: 0.5 +- **Meaning**: 1.0 = identical, 0.0 = unrelated + +```bash +GET /v1/similars/alice/project/doc1?threshold=0.8 +``` + +### limit + +Maximum number of results (same as count). + +- **Type**: Integer +- **Range**: 1-200 +- **Default**: 10 + +### offset + +Skip first N results (pagination). + +- **Type**: Integer +- **Minimum**: 0 +- **Default**: 0 + +```bash +# First page +GET /v1/similars/alice/project/doc1?limit=10&offset=0 + +# Second page +GET /v1/similars/alice/project/doc1?limit=10&offset=10 +``` + +### metadata_path + +JSON path to metadata field for filtering. + +- **Type**: String +- **Purpose**: Specify metadata field to filter +- **Must be used with**: metadata_value + +```bash +?metadata_path=author +``` + +### metadata_value + +Value to **exclude** from results. + +- **Type**: String +- **Purpose**: Exclude documents matching this value +- **Must be used with**: metadata_path + +```bash +?metadata_path=author&metadata_value=Shakespeare +``` + +**Important**: Excludes matches, doesn't include them. + +## Similarity Scores + +### Cosine Similarity + +dhamps-vdb uses cosine similarity: + +``` +similarity = 1 - cosine_distance +``` + +**Score ranges:** +- **1.0**: Identical vectors +- **0.9-1.0**: Very similar +- **0.7-0.9**: Similar +- **0.5-0.7**: Somewhat similar +- **<0.5**: Not similar + +### Interpreting Scores + +Typical thresholds: + +- **0.9+**: Duplicates or near-duplicates +- **0.8+**: Strong semantic similarity +- **0.7+**: Related topics +- **0.5-0.7**: Weak relation +- **<0.5**: Unrelated + +Optimal threshold depends on your use case and model. 
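
The score the API returns corresponds to cosine similarity, which can be reproduced for small vectors in a few lines (a sketch of the metric itself, not the server's pgvector implementation):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 = same direction, 0.0 = orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Scaling a vector does not change its direction, so similarity stays ~1.0
assert abs(cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]) - 1.0) < 1e-9
```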
+ +## Metadata Filtering + +### Exclude by Field + +Exclude documents where metadata field matches value: + +```bash +# Exclude documents from same author +GET /v1/similars/alice/lit-study/hamlet-act1?metadata_path=author&metadata_value=Shakespeare +``` + +**Result**: Returns similar documents, excluding those with `metadata.author == "Shakespeare"`. + +### Nested Fields + +Use dot notation for nested metadata: + +```bash +# Exclude documents from same author.name +GET /v1/similars/alice/project/doc1?metadata_path=author.name&metadata_value=John%20Doe +``` + +### Common Patterns + +**Exclude same work:** +```bash +?metadata_path=title&metadata_value=Hamlet +``` + +**Exclude same source:** +```bash +?metadata_path=source_id&metadata_value=corpus-A +``` + +**Exclude same category:** +```bash +?metadata_path=category&metadata_value=tutorial +``` + +See [Metadata Filtering Guide](../guides/metadata-filtering/) for details. + +## Response Format + +```json +{ + "user_handle": "alice", + "project_handle": "research", + "results": [ + { + "id": "doc2", + "similarity": 0.95 + }, + { + "id": "doc5", + "similarity": 0.87 + }, + { + "id": "doc8", + "similarity": 0.82 + } + ] +} +``` + +**Fields:** +- **user_handle**: Project owner +- **project_handle**: Project identifier +- **results**: Array of similar documents + - **id**: Document text_id + - **similarity**: Similarity score (0-1) + +Results are sorted by similarity (highest first). 
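
Since `results` arrives sorted by score, clients can post-process with a simple filter. A sketch using the example response above:

```python
import json

raw = """
{"user_handle": "alice", "project_handle": "research",
 "results": [{"id": "doc2", "similarity": 0.95},
             {"id": "doc5", "similarity": 0.87},
             {"id": "doc8", "similarity": 0.82}]}
"""
response = json.loads(raw)

# Tighten the cutoff client-side, e.g. keep only near-duplicate candidates
near_duplicates = [r["id"] for r in response["results"] if r["similarity"] >= 0.9]
print(near_duplicates)  # → ['doc2']
```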
+ +## Performance + +### Query Speed + +Typical performance: +- **<10K embeddings**: <10ms +- **10K-100K embeddings**: 10-50ms +- **100K-1M embeddings**: 50-200ms +- **>1M embeddings**: 200-1000ms + +### Optimization + +**HNSW Index:** +- Faster queries than IVFFlat +- Better recall +- Larger index size + +**Query optimization:** +- Use appropriate threshold (higher = fewer results) +- Limit result count (lower = faster) +- Consider dimension reduction for large projects + +### Scaling + +For large datasets: +- Monitor query performance +- Consider read replicas +- Use connection pooling +- Cache frequent queries (application level) + +## Common Use Cases + +### RAG Workflow + +Retrieval Augmented Generation: + +```bash +# 1. User query +query="What is machine learning?" + +# 2. Generate query embedding (external) +query_vector=[...] + +# 3. Find similar documents +POST /v1/similars/alice/knowledge-base?count=5&threshold=0.7 +{"vector": $query_vector} + +# 4. Retrieve full text for top results +for each result: + GET /v1/embeddings/alice/knowledge-base/$result_id + +# 5. Send context to LLM for generation +``` + +### Duplicate Detection + +Find near-duplicate documents: + +```bash +# High threshold for duplicates +GET /v1/similars/alice/corpus/doc1?count=10&threshold=0.95 +``` + +Documents with similarity > 0.95 are likely duplicates. + +### Content Discovery + +Find related content: + +```bash +# Moderate threshold for recommendations +GET /v1/similars/alice/articles/article1?count=10&threshold=0.7&metadata_path=article_id&metadata_value=article1 +``` + +Excludes the source article itself. + +### Topic Clustering + +Find documents on similar topics: + +```bash +# For each document, find similar ones +for doc in documents: + GET /v1/similars/alice/corpus/$doc?count=20&threshold=0.8 +``` + +Group documents by similarity for clustering. 
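
The clustering loop above leaves the grouping step to the client. One way to group documents once the per-document similarity lists have been fetched is union-find over all pairs above the threshold; a sketch (the input shape is an assumption for illustration, not an API response format):

```python
def greedy_clusters(pairs, threshold=0.8):
    """pairs: {doc_id: [(other_id, score), ...]} collected from similars queries.
    Groups documents transitively connected by similarity >= threshold."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    for doc, sims in pairs.items():
        find(doc)  # register isolated documents too
        for other, score in sims:
            if score >= threshold:
                union(doc, other)

    clusters = {}
    for doc in parent:
        clusters.setdefault(find(doc), set()).add(doc)
    return list(clusters.values())
```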
+ +## Dimension Consistency + +### Automatic Filtering + +Similarity queries only compare embeddings with matching dimensions: + +``` +Project embeddings: + - doc1: 3072 dimensions + - doc2: 3072 dimensions + - doc3: 1536 dimensions (different model) + +Query for doc1 similars: + → Only compares with doc2 + → Ignores doc3 (dimension mismatch) +``` + +### Multiple Instances + +Projects can have embeddings from multiple instances (if dimensions match): + +```json +{ + "text_id": "doc1", + "instance_handle": "openai-large", + "vector_dim": 3072 +} + +{ + "text_id": "doc2", + "instance_handle": "custom-model", + "vector_dim": 3072 +} +``` + +Both searchable together (same dimensions). + +## Access Control + +### Authentication + +Similarity search respects project access control: + +**Owner**: Full access +**Editor**: Can search (read permission) +**Reader**: Can search (read permission) +**Public** (if public_read=true): Can search (no auth required) + +### Public Projects + +Public projects allow unauthenticated similarity search: + +```bash +# No Authorization header needed +GET /v1/similars/alice/public-project/doc1?count=10 +``` + +See [Public Projects Guide](../guides/public-projects/). 
+ +## Limitations + +### Current Constraints + +- **No cross-project search**: Similarity search is per-project only +- **No filtering by multiple metadata fields**: One field at a time +- **No custom distance metrics**: Cosine similarity only +- **No approximate search tuning**: Uses default HNSW parameters + +### Workarounds + +**Cross-project search:** +- Query each project separately +- Merge results in application + +**Multiple metadata filters:** +- Filter by one field in query +- Apply additional filters in application + +## Troubleshooting + +### No Results Returned + +**Possible causes:** +- Threshold too high +- No embeddings in project +- Dimension mismatch +- All results filtered by metadata + +**Solutions:** +- Lower threshold (try 0.5) +- Verify embeddings exist +- Check dimensions match +- Remove metadata filter + +### Unexpected Results + +**Possible causes:** +- Threshold too low +- Poor quality embeddings +- Incorrect model used +- Metadata filter excluding desired results + +**Solutions:** +- Increase threshold +- Regenerate embeddings +- Verify correct model/dimensions +- Adjust metadata filter + +### Slow Queries + +**Possible causes:** +- Large dataset (>100K embeddings) +- No vector index +- High result count +- Complex metadata filtering + +**Solutions:** +- Reduce result count +- Check index exists +- Optimize database +- Use read replicas + +## Next Steps + +- [Learn about metadata filtering](../guides/metadata-filtering/) +- [Understand RAG workflows](../guides/rag-workflow/) +- [Explore embeddings](embeddings/) diff --git a/docs/content/concepts/users-and-auth.md b/docs/content/concepts/users-and-auth.md new file mode 100644 index 0000000..c1c7c1f --- /dev/null +++ b/docs/content/concepts/users-and-auth.md @@ -0,0 +1,368 @@ +--- +title: "Users and Authentication" +weight: 2 +--- + +# Users and Authentication + +dhamps-vdb uses token-based authentication with API keys for all operations. 
+ +## User Model + +### User Properties + +- **user_handle**: Unique identifier (3-20 characters, alphanumeric + underscore) +- **name**: Full name (optional) +- **email**: Email address (unique, required) +- **vdb_key**: API key (SHA-256 hash, 64 characters) +- **created_at**: Timestamp of creation +- **updated_at**: Timestamp of last update + +### Special Users + +**`_system` User** + +- Created automatically during database migration +- Owns system-wide LLM service definitions +- Cannot be used for authentication +- Provides default configurations for all users + +## Authentication Flow + +### API Key Authentication + +All requests (except public endpoints) require authentication: + +```http +GET /v1/projects/alice/my-project +Authorization: Bearer 024v2013621509245f2e24... +``` + +### Authentication Process + +1. Client sends API key in `Authorization` header with `Bearer` prefix +2. Server extracts key and looks up user in database +3. If user found, request proceeds with user context +4. If not found or missing, returns `401 Unauthorized` + +### Admin Authentication + +Administrative operations require the admin API key: + +```bash +curl -X POST http://localhost:8880/v1/users \ + -H "Authorization: Bearer YOUR_ADMIN_KEY" \ + -H "Content-Type: application/json" \ + -d '{"user_handle":"alice","email":"alice@example.com"}' +``` + +Admin key is set via `SERVICE_ADMINKEY` environment variable. + +## User Creation + +### By Admin + +Only admin users can create new users: + +```bash +POST /v1/users +Authorization: Bearer ADMIN_KEY + +{ + "user_handle": "researcher1", + "name": "Research User", + "email": "researcher@example.com" +} +``` + +**Response:** + +```json +{ + "user_handle": "researcher1", + "name": "Research User", + "email": "researcher@example.com", + "vdb_key": "024v2013621509245f2e24abcdef...", + "created_at": "2024-01-15T10:30:00Z" +} +``` + +**Important:** Save the `vdb_key` immediately - it cannot be recovered later. 
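
The header format above can be wrapped once and reused. A Python sketch with the standard library (`authed_request` is an illustrative helper; it only builds the request, which would then be sent with `urllib.request.urlopen`):

```python
import json
import urllib.request

def authed_request(url, api_key, method="GET", body=None):
    """Request with the API key as a Bearer token in the Authorization header."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(url, data=data, method=method)
    req.add_header("Authorization", f"Bearer {api_key}")
    if data is not None:
        req.add_header("Content-Type", "application/json")
    return req

req = authed_request("http://localhost:8880/v1/users/alice", "example-vdb-key")
# urllib.request.urlopen(req) would perform the call
```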
+ +### User Handle Restrictions + +- Must be 3-20 characters +- Alphanumeric characters and underscores only +- Must be unique +- Cannot be `_system` + +## User Management + +### Retrieve User Information + +Users can view their own information: + +```bash +GET /v1/users/alice +Authorization: Bearer alice_vdb_key +``` + +Admins can view any user: + +```bash +GET /v1/users/alice +Authorization: Bearer ADMIN_KEY +``` + +### List All Users + +Only admins can list all users: + +```bash +GET /v1/users +Authorization: Bearer ADMIN_KEY +``` + +Returns array of user handles (not full user objects). + +### Update User + +Users can update their own information: + +```bash +PATCH /v1/users/alice +Authorization: Bearer alice_vdb_key + +{ + "name": "Alice Smith-Jones", + "email": "alice.smith@example.com" +} +``` + +### Delete User + +Users can delete their own account: + +```bash +DELETE /v1/users/alice +Authorization: Bearer alice_vdb_key +``` + +Admins can delete any user: + +```bash +DELETE /v1/users/alice +Authorization: Bearer ADMIN_KEY +``` + +**Cascading Deletion:** +- All user's projects are deleted +- All user's LLM service instances are deleted +- All embeddings in user's projects are deleted +- Sharing grants from this user to others are removed + +## Authorization Model + +### Resource Ownership + +Users own three types of resources: + +1. **Projects**: Collections of embeddings +2. **LLM Service Instances**: Embedding configurations with API keys +3. 
**LLM Service Definitions**: Reusable configuration templates (optional) + +### Access Levels + +**Owner** +- Full control over resource +- Can read, write, delete +- Can share with others +- Can transfer ownership (projects only) + +**Editor** (via sharing) +- Read and write access +- Cannot delete resource +- Cannot modify sharing +- Cannot change project settings + +**Reader** (via sharing) +- Read-only access +- Can view embeddings +- Can search for similar documents +- Cannot modify anything + +**Public** (if project.public_read = true) +- Unauthenticated read access +- Can view embeddings +- Can search for similar documents +- Cannot write or modify + +## Security Best Practices + +### API Key Management + +**Storage** +- Store API keys securely (e.g., environment variables, secret managers) +- Never commit API keys to version control +- Use different keys for development and production + +**Rotation** +- Currently, API keys cannot be rotated +- To change a key, delete and recreate the user +- Plan key rotation strategy before production deployment + +**Transmission** +- Always use HTTPS in production +- API keys are transmitted in Authorization header +- Never pass API keys in URL query parameters + +### User Key vs. LLM API Keys + +dhamps-vdb handles two types of keys: + +1. **User API keys** (`vdb_key`): Authenticate users to dhamps-vdb + - Stored as SHA-256 hash in database + - Never encrypted (one-way hash) + - Returned only once on user creation + +2. **LLM API keys** (`api_key_encrypted`): Authenticate to LLM services + - Stored encrypted (AES-256-GCM) in database + - Never returned in API responses + - Used internally for LLM processing + +## Multi-User Workflows + +### Collaboration Pattern + +1. **Admin** creates user accounts for team members +2. **Project Owner** creates project with embeddings +3. **Owner** shares project with collaborators +4. **Readers** can search and view embeddings +5. 
**Editors** can add/modify embeddings + +### Organization Pattern + +1. **Admin** creates organizational users +2. **Each user** creates LLM service instances with their own API keys +3. **Users** create projects using their instances +4. **Projects** shared within organization as needed + +### Public Access Pattern + +1. **User** creates project with research data +2. **User** sets `public_read: true` +3. **Anyone** can access embeddings and search without authentication +4. **Only owner** can modify project + +## User Limits + +### Current Implementation + +No enforced limits on: +- Number of projects per user +- Number of embeddings per project +- Number of LLM service instances per user +- Storage size per user + +### Recommended Limits + +For production deployments, consider implementing: +- Rate limiting per API key +- Storage quotas per user +- Maximum project count per user + +## Example Workflows + +### Create and Use User Account + +```bash +# Admin creates user +curl -X POST http://localhost:8880/v1/users \ + -H "Authorization: Bearer $ADMIN_KEY" \ + -H "Content-Type: application/json" \ + -d '{ + "user_handle": "researcher1", + "email": "researcher@example.com", + "name": "Research User" + }' + +# Save returned vdb_key +export USER_KEY="returned-vdb-key" + +# User verifies access +curl -X GET http://localhost:8880/v1/users/researcher1 \ + -H "Authorization: Bearer $USER_KEY" + +# User creates project +curl -X POST http://localhost:8880/v1/projects/researcher1 \ + -H "Authorization: Bearer $USER_KEY" \ + -H "Content-Type: application/json" \ + -d '{ + "project_handle": "my-project", + "description": "My research project" + }' +``` + +### Share Resources + +```bash +# User shares project with colleague +curl -X POST http://localhost:8880/v1/projects/researcher1/my-project/share \ + -H "Authorization: Bearer $USER_KEY" \ + -H "Content-Type: application/json" \ + -d '{ + "share_with_handle": "colleague1", + "role": "reader" + }' + +# Colleague accesses 
shared project +curl -X GET http://localhost:8880/v1/projects/researcher1/my-project \ + -H "Authorization: Bearer $COLLEAGUE_KEY" +``` + +## Troubleshooting + +### 401 Unauthorized + +**Possible causes:** +- Missing Authorization header +- Incorrect API key +- Expired or invalid key +- Using user key instead of admin key (or vice versa) + +**Solution:** +- Verify API key is correct +- Check Authorization header format: `Bearer KEY` +- Ensure operation matches key type (admin vs. user) + +### 403 Forbidden + +**Possible causes:** +- User doesn't own resource +- User not granted access to shared resource +- Insufficient permissions (reader trying to edit) + +**Solution:** +- Verify resource ownership +- Check sharing grants +- Ensure user has required role (editor for writes) + +### User Creation Failed + +**Possible causes:** +- User handle already exists +- Email already registered +- Invalid user handle format +- Not using admin key + +**Solution:** +- Choose different user handle +- Use unique email address +- Check user handle format (3-20 chars, alphanumeric + underscore) +- Verify using admin API key + +## Next Steps + +- [Learn about projects](projects/) +- [Understand LLM services](llm-services/) +- [Explore project sharing](../guides/project-sharing/) diff --git a/docs/content/deployment/_index.md b/docs/content/deployment/_index.md new file mode 100644 index 0000000..f423702 --- /dev/null +++ b/docs/content/deployment/_index.md @@ -0,0 +1,35 @@ +--- +title: "Deployment" +weight: 5 +--- + +# Deployment Guide + +Deploy dhamps-vdb to production environments. 
+ +## Deployment Options + +dhamps-vdb can be deployed in several ways: + +- **Docker Compose** - Simplest option, includes PostgreSQL +- **Docker with External Database** - Production-ready setup +- **Standalone Binary** - For custom environments +- **Kubernetes** - For orchestrated deployments + +## Production Considerations + +When deploying to production: + +- Use strong, randomly generated keys +- Enable HTTPS/TLS for all API endpoints +- Configure database backups +- Set up monitoring and logging +- Restrict network access to database +- Use environment variables for sensitive configuration + +## Guides + +- [Docker Deployment](docker/) - Complete Docker guide +- [Database Setup](database/) - PostgreSQL configuration +- [Environment Variables](environment-variables/) - All configuration options +- [Security](security/) - Security best practices diff --git a/docs/content/deployment/database.md b/docs/content/deployment/database.md new file mode 100644 index 0000000..ba9be71 --- /dev/null +++ b/docs/content/deployment/database.md @@ -0,0 +1,494 @@ +--- +title: "Database Setup" +weight: 2 +--- + +# Database Setup + +dhamps-vdb requires PostgreSQL 11 or later with the pgvector extension. 
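As a rough sizing sketch before provisioning: raw vector data is about 4 bytes per float32 dimension per embedding (indexes, metadata, and table overhead come on top). The dimension and count below are illustrative values, not defaults:

```bash
# Estimate raw vector storage: 4 bytes per float32 dimension per embedding.
# Example figures (hypothetical): 1536-dimension vectors, 1,000,000 embeddings.
DIMENSIONS=1536
COUNT=1000000
BYTES=$((4 * DIMENSIONS * COUNT))
echo "~$((BYTES / 1024 / 1024)) MiB raw vector data (excluding indexes and metadata)"
```

Budget extra headroom for HNSW/IVFFlat indexes, which can add a substantial fraction of the raw vector size.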
+ +## Requirements + +- **PostgreSQL**: Version 11 or higher +- **pgvector**: Extension for vector similarity search +- **Storage**: Depends on your embeddings volume (estimate: 4 bytes × dimensions × embeddings count) + +## Installing PostgreSQL with pgvector + +### Using Docker (Recommended) + +The easiest way to get PostgreSQL with pgvector: + +```bash +docker run -d \ + --name postgres-pgvector \ + -p 5432:5432 \ + -e POSTGRES_PASSWORD=secure_password \ + -v postgres_data:/var/lib/postgresql/data \ + pgvector/pgvector:0.7.4-pg16 +``` + +### On Ubuntu/Debian + +```bash +# Add PostgreSQL APT repository +sudo apt install curl ca-certificates +sudo install -d /usr/share/postgresql-common/pgdg +sudo curl -o /usr/share/postgresql-common/pgdg/apt.postgresql.org.asc \ + https://www.postgresql.org/media/keys/ACCC4CF8.asc +echo "deb [signed-by=/usr/share/postgresql-common/pgdg/apt.postgresql.org.asc] \ + https://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" | \ + sudo tee /etc/apt/sources.list.d/pgdg.list + +# Install PostgreSQL +sudo apt update +sudo apt install postgresql-16 postgresql-contrib-16 + +# Install pgvector +sudo apt install postgresql-16-pgvector +``` + +### On macOS + +```bash +# Using Homebrew +brew install postgresql@16 +brew install pgvector + +# Start PostgreSQL +brew services start postgresql@16 +``` + +### On RHEL/CentOS/Fedora + +```bash +# Install PostgreSQL +sudo dnf install postgresql16-server postgresql16-contrib + +# Install pgvector (build from source) +sudo dnf install postgresql16-devel git gcc make +git clone https://github.com/pgvector/pgvector.git +cd pgvector +make +sudo make install PG_CONFIG=/usr/pgsql-16/bin/pg_config +``` + +## Database Configuration + +### Step 1: Create Database + +Connect to PostgreSQL as superuser: + +```bash +# Local connection +sudo -u postgres psql + +# Remote connection +psql -h db.example.com -U postgres +``` + +Create the database: + +```sql +-- Create database +CREATE DATABASE dhamps_vdb; 
+``` + +### Step 2: Create User + +Create a dedicated user for dhamps-vdb: + +```sql +-- Create user with password +CREATE USER dhamps_user WITH PASSWORD 'secure_password_here'; +``` + +**Best practices:** +- Use strong, randomly generated password (e.g., `openssl rand -base64 32`) +- Store password securely (password manager, secrets management) +- Rotate passwords regularly + +### Step 3: Grant Privileges + +Grant necessary permissions: + +```sql +-- Grant database privileges +GRANT ALL PRIVILEGES ON DATABASE dhamps_vdb TO dhamps_user; + +-- Connect to the database +\c dhamps_vdb + +-- Grant schema privileges +GRANT ALL ON SCHEMA public TO dhamps_user; + +-- For PostgreSQL 15+, also grant: +GRANT CREATE ON DATABASE dhamps_vdb TO dhamps_user; +``` + +### Step 4: Enable pgvector Extension + +Still connected to the `dhamps_vdb` database: + +```sql +-- Enable pgvector extension +CREATE EXTENSION IF NOT EXISTS vector; + +-- Verify extension is installed +\dx +``` + +Expected output should include: + +``` +List of installed extensions + Name | Version | Schema | Description +---------+---------+------------+------------------------------ + vector | 0.7.4 | public | vector data type and ivfflat and hnsw access methods +``` + +### Step 5: Verify Setup + +Test the setup: + +```sql +-- Test vector creation +SELECT '[1,2,3]'::vector; + +-- Should return: [1,2,3] + +-- Test vector distance +SELECT '[1,2,3]'::vector <=> '[4,5,6]'::vector AS distance; + +-- Should return a float (distance value) +``` + +Exit psql: + +```sql +\q +``` + +## Connection String Format + +dhamps-vdb connects using these environment variables: + +```bash +SERVICE_DBHOST=localhost # Database hostname +SERVICE_DBPORT=5432 # Database port +SERVICE_DBUSER=dhamps_user # Database username +SERVICE_DBPASSWORD=password # Database password +SERVICE_DBNAME=dhamps_vdb # Database name +``` + +The connection string format used internally: + +``` +postgresql://username:password@host:port/database?sslmode=disable 
+```
+
+## Production Configuration
+
+### PostgreSQL Tuning
+
+Edit `postgresql.conf` for better vector search performance:
+
+```ini
+# Memory settings
+shared_buffers = 4GB              # 25% of RAM
+effective_cache_size = 12GB       # 75% of RAM
+maintenance_work_mem = 1GB
+work_mem = 50MB
+
+# Parallel query settings
+max_parallel_workers_per_gather = 4
+max_parallel_workers = 8
+
+# Connection settings
+max_connections = 100
+
+# Write-ahead log
+wal_level = replica               # For replication
+max_wal_senders = 3               # For replicas
+```
+
+Note that pgvector does not need to be added to `shared_preload_libraries`; it is enabled per database with `CREATE EXTENSION vector`.
+
+Restart PostgreSQL after changes:
+
+```bash
+# On systemd systems
+sudo systemctl restart postgresql
+
+# On Docker
+docker restart postgres-container
+```
+
+### Connection Pooling
+
+For high-load scenarios, use connection pooling with PgBouncer:
+
+```bash
+# Install PgBouncer
+sudo apt install pgbouncer
+
+# Configure /etc/pgbouncer/pgbouncer.ini
+[databases]
+dhamps_vdb = host=localhost port=5432 dbname=dhamps_vdb
+
+[pgbouncer]
+listen_addr = 127.0.0.1
+listen_port = 6432
+auth_type = md5
+auth_file = /etc/pgbouncer/userlist.txt
+pool_mode = transaction
+max_client_conn = 1000
+default_pool_size = 20
+```
+
+Connect dhamps-vdb to PgBouncer instead of PostgreSQL directly. 
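A minimal sketch of the corresponding dhamps-vdb settings, assuming PgBouncer runs on the same host with the `listen_port` shown above (PostgreSQL itself keeps listening on 5432 behind it):

```bash
# Point dhamps-vdb at PgBouncer rather than at PostgreSQL directly
SERVICE_DBHOST=127.0.0.1   # PgBouncer listen_addr
SERVICE_DBPORT=6432        # PgBouncer listen_port
echo "connecting via pgbouncer at ${SERVICE_DBHOST}:${SERVICE_DBPORT}"
```

With `pool_mode = transaction`, verify that the application's use of session-level features (such as prepared statements) is compatible with transaction pooling.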
+
+### SSL/TLS Encryption
+
+Enable SSL in `postgresql.conf`:
+
+```ini
+ssl = on
+ssl_cert_file = '/etc/postgresql/server.crt'
+ssl_key_file = '/etc/postgresql/server.key'
+ssl_ca_file = '/etc/postgresql/ca.crt'
+```
+
+Update the client connection accordingly:
+
+```bash
+# Update dhamps-vdb configuration
+SERVICE_DBHOST=db.example.com
+# The internal connection string uses sslmode=disable; connecting over TLS
+# requires changing that parameter (e.g. to sslmode=require)
+```
+
+### Network Security
+
+Restrict access in `pg_hba.conf`:
+
+```
+# TYPE  DATABASE    USER         ADDRESS          METHOD
+
+# Allow dhamps_user from application server only
+host    dhamps_vdb  dhamps_user  10.0.1.0/24      md5
+
+# Allow localhost connections
+local   all         all                           peer
+host    all         all          127.0.0.1/32     md5
+host    all         all          ::1/128          md5
+```
+
+Reload PostgreSQL:
+
+```bash
+sudo systemctl reload postgresql
+```
+
+## Backup and Recovery
+
+### Automated Backups
+
+Daily backup script:
+
+```bash
+#!/bin/bash
+# /usr/local/bin/backup-dhamps-vdb.sh
+set -o pipefail
+
+BACKUP_DIR="/backups/dhamps-vdb"
+DATE=$(date +%Y%m%d_%H%M%S)
+DB_NAME="dhamps_vdb"
+
+# Create backup and verify it immediately (checking $? after the cleanup
+# step below would report the result of find, not pg_dump)
+if pg_dump -U dhamps_user -h localhost "$DB_NAME" | gzip > "$BACKUP_DIR/dhamps-vdb-$DATE.sql.gz"; then
+  echo "Backup successful: $DATE"
+else
+  echo "Backup failed: $DATE" >&2
+  exit 1
+fi
+
+# Keep last 30 days
+find "$BACKUP_DIR" -name "dhamps-vdb-*.sql.gz" -mtime +30 -delete
+```
+
+Set up cron job:
+
+```bash
+# Run daily at 2 AM
+0 2 * * * /usr/local/bin/backup-dhamps-vdb.sh
+```
+
+### Restore from Backup
+
+```bash
+# Decompress and restore
+gunzip -c /backups/dhamps-vdb/dhamps-vdb-20240208.sql.gz | \
+  psql -U dhamps_user -h localhost dhamps_vdb
+```
+
+### Point-in-Time Recovery (PITR)
+
+Enable WAL archiving in `postgresql.conf`:
+
+```ini
+wal_level = replica
+archive_mode = on
+archive_command = 'test ! -f /archive/%f && cp %p /archive/%f'
+```
+
+## Monitoring
+
+### Check Database Size
+
+```sql
+-- Database size
+SELECT pg_size_pretty(pg_database_size('dhamps_vdb'));
+
+-- Table sizes
+SELECT
+  schemaname,
+  tablename,
+  pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) AS size
+FROM pg_tables
+WHERE schemaname = 'public'
+ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC;
+```
+
+### Check Connection Status
+
+```sql
+-- Active connections
+SELECT count(*) FROM pg_stat_activity WHERE datname = 'dhamps_vdb';
+
+-- Connection details
+SELECT
+  pid,
+  usename,
+  application_name,
+  client_addr,
+  state,
+  query
+FROM pg_stat_activity
+WHERE datname = 'dhamps_vdb';
+```
+
+### Monitor Performance
+
+```sql
+-- Enable pg_stat_statements (also requires shared_preload_libraries =
+-- 'pg_stat_statements' in postgresql.conf and a server restart)
+CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
+
+-- View slow queries (column names for PostgreSQL 13+; older versions
+-- use total_time, mean_time, and max_time)
+SELECT
+  query,
+  calls,
+  total_exec_time,
+  mean_exec_time,
+  max_exec_time
+FROM pg_stat_statements
+WHERE query NOT LIKE '%pg_stat_statements%'
+ORDER BY mean_exec_time DESC
+LIMIT 10;
+```
+
+## Troubleshooting
+
+### Cannot Connect to Database
+
+```bash
+# Check if PostgreSQL is running
+sudo systemctl status postgresql
+
+# Check PostgreSQL logs
+sudo tail -f /var/log/postgresql/postgresql-16-main.log
+
+# Test connection
+psql -h localhost -U dhamps_user -d dhamps_vdb
+```
+
+### pgvector Extension Not Found
+
+```sql
+-- Check available extensions
+SELECT * FROM pg_available_extensions WHERE name = 'vector';
+
+-- If not listed, install pgvector package
+-- See installation section above
+```
+
+### Permission Denied
+
+```sql
+-- Reconnect as superuser
+\c dhamps_vdb postgres
+
+-- Re-grant privileges
+GRANT ALL PRIVILEGES ON DATABASE dhamps_vdb TO dhamps_user;
+GRANT ALL ON SCHEMA public TO dhamps_user;
+GRANT ALL ON ALL TABLES IN SCHEMA public TO dhamps_user;
+GRANT ALL ON ALL SEQUENCES IN SCHEMA public TO dhamps_user;
+
+-- For future objects
+ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON TABLES TO dhamps_user;
+ALTER 
DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON SEQUENCES TO dhamps_user; +``` + +### Out of Disk Space + +```bash +# Check disk usage +df -h + +# Check PostgreSQL data directory +du -sh /var/lib/postgresql/16/main/ + +# Clean up old WAL files (if safe) +# Check archive status first +``` + +### Slow Queries + +```sql +-- Check missing indexes +SELECT + schemaname, + tablename, + attname, + n_distinct, + correlation +FROM pg_stats +WHERE schemaname = 'public' +ORDER BY n_distinct DESC; + +-- Analyze tables +ANALYZE; + +-- Vacuum tables +VACUUM ANALYZE; +``` + +## Migration from Other Databases + +dhamps-vdb is designed specifically for PostgreSQL with pgvector. Migration from other databases requires: + +1. Export data from source database +2. Set up PostgreSQL with pgvector +3. Transform data to match dhamps-vdb schema +4. Import using dhamps-vdb API or direct SQL + +## Further Reading + +- [PostgreSQL Documentation](https://www.postgresql.org/docs/) +- [pgvector GitHub](https://github.com/pgvector/pgvector) +- [PostgreSQL Tuning Guide](https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server) +- [Environment Variables](../environment-variables/) +- [Docker Deployment](../docker/) +- [Security Guide](../security/) diff --git a/docs/content/deployment/docker.md b/docs/content/deployment/docker.md new file mode 100644 index 0000000..7c7ea6b --- /dev/null +++ b/docs/content/deployment/docker.md @@ -0,0 +1,297 @@ +--- +title: "Docker Deployment" +weight: 1 +--- + +# Docker Deployment + +This guide covers production-focused Docker deployment for dhamps-vdb. + +## Overview + +For detailed Docker setup instructions, see [Getting Started with Docker](../../getting-started/docker/). This page focuses on production deployment considerations. 
+ +## Production Deployment + +### Prerequisites + +- Docker Engine 20.10+ +- Docker Compose 2.0+ (or docker-compose 1.29+) +- PostgreSQL 11+ with pgvector extension (included in compose setup) + +### Quick Production Setup + +```bash +# Clone repository +git clone https://github.com/mpilhlt/dhamps-vdb.git +cd dhamps-vdb + +# Generate secure keys +./docker-setup.sh + +# Review and customize .env +nano .env + +# Deploy +docker-compose up -d +``` + +## Production Considerations + +### Use Reverse Proxy + +Always run behind a reverse proxy (nginx, Traefik, Caddy) for: + +- **HTTPS/TLS termination** +- **Request filtering** +- **Rate limiting** +- **Load balancing** + +Example nginx configuration: + +```nginx +upstream dhamps-vdb { + server localhost:8880; +} + +server { + listen 443 ssl http2; + server_name api.example.com; + + ssl_certificate /path/to/cert.pem; + ssl_certificate_key /path/to/key.pem; + + location / { + proxy_pass http://dhamps-vdb; + proxy_set_header Host $host; + proxy_set_header X-Real-IP $remote_addr; + proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; + proxy_set_header X-Forwarded-Proto $scheme; + } +} +``` + +### Container Resource Limits + +Set resource limits in `docker-compose.yml`: + +```yaml +services: + dhamps-vdb: + deploy: + resources: + limits: + cpus: '2' + memory: 2G + reservations: + cpus: '0.5' + memory: 512M + + postgres: + deploy: + resources: + limits: + cpus: '2' + memory: 4G + reservations: + cpus: '1' + memory: 2G +``` + +### Use Specific Image Tags + +Avoid `latest` in production: + +```yaml +services: + postgres: + image: pgvector/pgvector:0.7.4-pg16 # Specific version + + dhamps-vdb: + image: dhamps-vdb:v0.1.0 # Tag your builds +``` + +### Health Monitoring + +The image includes health checks. 
Monitor with: + +```bash +# Check health status +docker inspect --format='{{.State.Health.Status}}' dhamps-vdb + +# View health check logs +docker inspect --format='{{range .State.Health.Log}}{{.Output}}{{end}}' dhamps-vdb + +# Integrate with monitoring (Prometheus, etc.) +``` + +### Logging + +Configure logging drivers in `docker-compose.yml`: + +```yaml +services: + dhamps-vdb: + logging: + driver: "json-file" + options: + max-size: "10m" + max-file: "3" +``` + +Or use centralized logging: + +```yaml +services: + dhamps-vdb: + logging: + driver: "syslog" + options: + syslog-address: "tcp://logserver:514" +``` + +## External Database Deployment + +For production, consider using a managed PostgreSQL service or separate database server. + +### Requirements + +- PostgreSQL 11+ with pgvector extension +- Network connectivity from container to database +- Database user with appropriate privileges + +See [Database Setup](../database/) for detailed instructions. + +### Configuration + +Update `.env`: + +```bash +SERVICE_DBHOST=db.example.com +SERVICE_DBPORT=5432 +SERVICE_DBUSER=dhamps_user +SERVICE_DBPASSWORD=secure_password +SERVICE_DBNAME=dhamps_vdb +``` + +Modify `docker-compose.yml` to remove postgres service: + +```yaml +services: + dhamps-vdb: + build: . + ports: + - "8880:8880" + env_file: .env + restart: unless-stopped +``` + +## Scaling and High Availability + +### Horizontal Scaling + +Run multiple instances behind a load balancer: + +```yaml +services: + dhamps-vdb: + build: . + deploy: + replicas: 3 + restart_policy: + condition: on-failure +``` + +**Note:** All instances must connect to the same database. 
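One practical wrinkle when scaling: a fixed host-port mapping such as `8880:8880` can only be bound by one replica. A sketch of an override (hypothetical file name `docker-compose.override.yml`) that publishes only the container port, letting Docker assign a free host port per replica while a reverse proxy or load balancer handles discovery:

```yaml
services:
  dhamps-vdb:
    ports:
      - "8880"   # container port only; Docker picks a host port per replica
```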
+
+### Database Replication
+
+For high availability:
+
+- Use PostgreSQL replication (streaming or logical)
+- Consider read replicas for read-heavy workloads
+- Point write operations to primary, reads to replicas
+
+## Backup Strategy
+
+### Database Backups
+
+```bash
+# Automated daily backups (crontab entry)
+0 2 * * * docker-compose exec -T postgres pg_dump -U postgres dhamps_vdb | gzip > /backups/dhamps-vdb-$(date +\%Y\%m\%d).sql.gz
+
+# Keep last 30 days
+find /backups -name "dhamps-vdb-*.sql.gz" -mtime +30 -delete
+```
+
+### Volume Backups
+
+```bash
+# Backup Docker volume
+docker run --rm -v dhamps-vdb_postgres_data:/data -v /backups:/backup alpine tar czf /backup/postgres-data-$(date +\%Y\%m\%d).tar.gz /data
+```
+
+### Environment Configuration Backups
+
+```bash
+# Backup .env (securely!)
+gpg --encrypt --recipient admin@example.com .env > .env.gpg
+```
+
+## Troubleshooting
+
+### Container Performance Issues
+
+```bash
+# Check resource usage
+docker stats
+
+# Check container logs
+docker-compose logs -f --tail=100 dhamps-vdb
+
+# Check slow queries (if database is slow; the column is named total_time
+# on PostgreSQL 12 and older)
+docker-compose exec postgres psql -U postgres -d dhamps_vdb -c "SELECT * FROM pg_stat_statements ORDER BY total_exec_time DESC LIMIT 10;"
+```
+
+### Network Connectivity Issues
+
+```bash
+# Test database connection from container
+docker-compose exec dhamps-vdb nc -zv postgres 5432
+
+# Check DNS resolution
+docker-compose exec dhamps-vdb nslookup postgres
+
+# Test API from inside container
+docker-compose exec dhamps-vdb wget -O- http://localhost:8880/docs
+```
+
+### Update and Rollback
+
+```bash
+# Update to new version
+docker-compose pull
+docker-compose up -d
+
+# Rollback if needed: pin the previous known-good image tag in
+# docker-compose.yml, then recreate the containers
+docker-compose down
+docker-compose up -d --force-recreate
+```
+
+## Security Best Practices
+
+- See [Security Guide](../security/) for comprehensive security recommendations
+- Use [Environment Variables Guide](../environment-variables/) for proper configuration
+- Never expose database port publicly
+- Use 
strong, randomly generated keys (see `./docker-setup.sh`) +- Keep Docker and images updated +- Run containers as non-root (already configured) + +## Further Reading + +- [Docker Official Documentation](https://docs.docker.com/) +- [PostgreSQL Docker Hub](https://hub.docker.com/_/postgres) +- [pgvector Extension](https://github.com/pgvector/pgvector) +- [Database Setup Guide](../database/) +- [Environment Variables Reference](../environment-variables/) +- [Security Best Practices](../security/) diff --git a/docs/content/deployment/environment-variables.md b/docs/content/deployment/environment-variables.md new file mode 100644 index 0000000..6e143f0 --- /dev/null +++ b/docs/content/deployment/environment-variables.md @@ -0,0 +1,493 @@ +--- +title: "Environment Variables" +weight: 3 +--- + +# Environment Variables + +Complete reference for all environment variables used by dhamps-vdb. + +## Overview + +dhamps-vdb is configured entirely through environment variables. These can be set: + +1. **In a `.env` file** (recommended for Docker) +2. **As system environment variables** +3. **Via command-line flags** (some variables only) + +## Required Variables + +These variables **must** be set for dhamps-vdb to function: + +### SERVICE_ADMINKEY + +Admin API key for administrative operations. + +- **Type:** String +- **Required:** Yes +- **Default:** None +- **Environment Variable:** `SERVICE_ADMINKEY` +- **Command-line Flag:** `--admin-key` + +**Description:** Master API key with full administrative privileges. Used to create users, manage global resources, and perform administrative operations. + +**Example:** + +```bash +SERVICE_ADMINKEY=Ch4ngeM3SecureAdminKey! 
+``` + +**Security:** +- Generate with: `openssl rand -base64 32` +- Never commit to version control +- Rotate regularly +- Store securely (password manager, secrets vault) + +**Usage:** + +```bash +curl -X POST http://localhost:8880/v1/users \ + -H "Authorization: Bearer $SERVICE_ADMINKEY" \ + -H "Content-Type: application/json" \ + -d '{"user_handle": "alice", "full_name": "Alice Smith"}' +``` + +### ENCRYPTION_KEY + +Encryption key for protecting user API keys in the database. + +- **Type:** String (32+ characters) +- **Required:** Yes +- **Default:** None +- **Environment Variable:** `ENCRYPTION_KEY` + +**Description:** AES-256 encryption key used to encrypt user API keys before storing them in the database. Must be at least 32 characters. The key is hashed with SHA-256 to ensure exactly 32 bytes for AES-256 encryption. + +**Example:** + +```bash +ENCRYPTION_KEY=a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6q7r8s9t0u1v2w3x4y5z6 +``` + +**Security:** +- Generate with: `openssl rand -hex 32` +- Minimum 32 characters required +- Never commit to version control +- **CRITICAL:** If lost, all stored API keys become unrecoverable +- Backup securely and separately from database +- Never change in production (will invalidate all existing API keys) + +**Technical Details:** +- Uses AES-256-GCM for encryption +- SHA-256 hash ensures correct key length +- Each encryption uses unique nonce +- Encrypted keys stored as base64 in database + +## Optional Variables + +### SERVICE_DEBUG + +Enable debug logging. + +- **Type:** Boolean +- **Required:** No +- **Default:** `true` +- **Environment Variable:** `SERVICE_DEBUG` +- **Command-line Flag:** `-d`, `--debug` + +**Description:** Enables verbose debug logging including request details, SQL queries, and internal operations. 
+ +**Example:** + +```bash +SERVICE_DEBUG=false # Production (less verbose) +SERVICE_DEBUG=true # Development (verbose) +``` + +**Impact:** +- `true`: Detailed logs, useful for debugging +- `false`: Minimal logs, recommended for production + +### SERVICE_HOST + +Hostname or IP address to bind the service to. + +- **Type:** String +- **Required:** No +- **Default:** `localhost` +- **Environment Variable:** `SERVICE_HOST` +- **Command-line Flag:** `--host` + +**Description:** Network interface to listen on. Use `0.0.0.0` to listen on all interfaces (required for Docker). + +**Examples:** + +```bash +SERVICE_HOST=localhost # Local development only +SERVICE_HOST=0.0.0.0 # Listen on all interfaces (Docker) +SERVICE_HOST=10.0.1.5 # Specific interface +``` + +**Security:** +- Use `localhost` for development +- Use `0.0.0.0` for Docker/production with firewall +- Never expose directly to internet without reverse proxy + +### SERVICE_PORT + +Port number for the API service. + +- **Type:** Integer +- **Required:** No +- **Default:** `8880` +- **Environment Variable:** `SERVICE_PORT` +- **Command-line Flag:** `-p`, `--port` + +**Description:** TCP port the service listens on. + +**Example:** + +```bash +SERVICE_PORT=8880 # Default +SERVICE_PORT=8080 # Alternative +SERVICE_PORT=3000 # Custom +``` + +**Notes:** +- Ports below 1024 require root/admin privileges +- Ensure port is not already in use +- Update firewall rules accordingly + +## Database Variables + +### SERVICE_DBHOST + +Database hostname or IP address. + +- **Type:** String +- **Required:** No +- **Default:** `localhost` +- **Environment Variable:** `SERVICE_DBHOST` +- **Command-line Flag:** `--db-host` + +**Description:** PostgreSQL server hostname. 
+ +**Examples:** + +```bash +SERVICE_DBHOST=localhost # Local PostgreSQL +SERVICE_DBHOST=postgres # Docker Compose service name +SERVICE_DBHOST=db.example.com # Remote database +SERVICE_DBHOST=10.0.1.100 # Database IP address +``` + +### SERVICE_DBPORT + +Database port number. + +- **Type:** Integer +- **Required:** No +- **Default:** `5432` +- **Environment Variable:** `SERVICE_DBPORT` +- **Command-line Flag:** `--db-port` + +**Description:** PostgreSQL server port. + +**Example:** + +```bash +SERVICE_DBPORT=5432 # Default PostgreSQL port +SERVICE_DBPORT=5433 # Alternative port +``` + +### SERVICE_DBUSER + +Database username. + +- **Type:** String +- **Required:** No +- **Default:** `postgres` +- **Environment Variable:** `SERVICE_DBUSER` +- **Command-line Flag:** `--db-user` + +**Description:** PostgreSQL user for database connections. + +**Example:** + +```bash +SERVICE_DBUSER=postgres # Default superuser +SERVICE_DBUSER=dhamps_user # Dedicated user (recommended) +``` + +**Security:** +- Create dedicated user (not superuser) for production +- Use principle of least privilege +- See [Database Setup](../database/) for user creation + +### SERVICE_DBPASSWORD + +Database password. + +- **Type:** String +- **Required:** No +- **Default:** `password` +- **Environment Variable:** `SERVICE_DBPASSWORD` +- **Command-line Flag:** `--db-password` + +**Description:** Password for database authentication. + +**Example:** + +```bash +SERVICE_DBPASSWORD=secure_database_password_here +``` + +**Security:** +- Use strong, randomly generated password +- Never use default password in production +- Never commit to version control +- Store in secrets management system +- Rotate regularly + +### SERVICE_DBNAME + +Database name. + +- **Type:** String +- **Required:** No +- **Default:** `postgres` +- **Environment Variable:** `SERVICE_DBNAME` +- **Command-line Flag:** `--db-name` + +**Description:** Name of the PostgreSQL database. 
+ +**Example:** + +```bash +SERVICE_DBNAME=dhamps_vdb # Production database +SERVICE_DBNAME=dhamps_test # Testing database +``` + +## Configuration Examples + +### Development Setup + +```bash +# .env file for development +SERVICE_DEBUG=true +SERVICE_HOST=localhost +SERVICE_PORT=8880 +SERVICE_DBHOST=localhost +SERVICE_DBPORT=5432 +SERVICE_DBUSER=postgres +SERVICE_DBPASSWORD=postgres +SERVICE_DBNAME=dhamps_vdb +SERVICE_ADMINKEY=dev-admin-key-not-for-production +ENCRYPTION_KEY=dev-encryption-key-min-32-chars-long +``` + +### Docker Compose Setup + +```bash +# .env file for Docker Compose +SERVICE_DEBUG=false +SERVICE_HOST=0.0.0.0 +SERVICE_PORT=8880 +SERVICE_DBHOST=postgres +SERVICE_DBPORT=5432 +SERVICE_DBUSER=postgres +SERVICE_DBPASSWORD=secure_db_password_here +SERVICE_DBNAME=dhamps_vdb +SERVICE_ADMINKEY=generated_admin_key_from_setup_script +ENCRYPTION_KEY=generated_encryption_key_from_setup_script +``` + +### Production Setup + +```bash +# .env file for production (or use secrets management) +SERVICE_DEBUG=false +SERVICE_HOST=0.0.0.0 +SERVICE_PORT=8880 +SERVICE_DBHOST=db.internal.example.com +SERVICE_DBPORT=5432 +SERVICE_DBUSER=dhamps_prod_user +SERVICE_DBPASSWORD= +SERVICE_DBNAME=dhamps_vdb_prod +SERVICE_ADMINKEY= +ENCRYPTION_KEY= +``` + +## Setting Environment Variables + +### Using .env File (Recommended) + +Create a `.env` file in the project root: + +```bash +# Copy template +cp template.env .env + +# Edit with your values +nano .env +``` + +The application automatically loads `.env` on startup. 
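If you instead want your shell to provide the variables (for example from a wrapper script), a self-contained sketch of exporting a file's contents is shown below. The `/tmp/demo.env` file is a throwaway stand-in created for the demonstration; in practice you would source your real `.env`:

```bash
# Create a throwaway env file for demonstration purposes only
cat > /tmp/demo.env <<'EOF'
SERVICE_PORT=8880
SERVICE_DEBUG=false
EOF

# 'set -a' marks every variable assigned while sourcing for export
set -a
. /tmp/demo.env
set +a

echo "SERVICE_PORT is ${SERVICE_PORT}"
```

This assumes simple `KEY=value` lines; values containing spaces or shell metacharacters would need quoting.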
+ +### Using System Environment Variables + +```bash +# Linux/macOS +export SERVICE_ADMINKEY="your-admin-key" +export ENCRYPTION_KEY="your-encryption-key" + +# Windows PowerShell +$env:SERVICE_ADMINKEY = "your-admin-key" +$env:ENCRYPTION_KEY = "your-encryption-key" + +# Windows Command Prompt +set SERVICE_ADMINKEY=your-admin-key +set ENCRYPTION_KEY=your-encryption-key +``` + +### Using Docker + +With docker run: + +```bash +docker run -d \ + -e SERVICE_ADMINKEY=admin-key \ + -e ENCRYPTION_KEY=encryption-key \ + -e SERVICE_DBHOST=db-host \ + dhamps-vdb:latest +``` + +With docker-compose.yml: + +```yaml +services: + dhamps-vdb: + image: dhamps-vdb:latest + environment: + SERVICE_DEBUG: "false" + SERVICE_HOST: "0.0.0.0" + SERVICE_PORT: "8880" + SERVICE_DBHOST: "postgres" + env_file: + - .env # Load additional variables from file +``` + +### Using Command-Line Flags + +Some variables support command-line flags: + +```bash +./dhamps-vdb \ + --debug \ + --host 0.0.0.0 \ + --port 8880 \ + --admin-key your-admin-key \ + --db-host localhost \ + --db-port 5432 \ + --db-user postgres \ + --db-password password \ + --db-name dhamps_vdb +``` + +**Note:** ENCRYPTION_KEY must be set as environment variable, not flag. + +## Validation and Troubleshooting + +### Missing Required Variables + +If required variables are not set, the service will fail to start: + +``` +Error: SERVICE_ADMINKEY environment variable is not set +``` + +**Solution:** Set the missing variable. 
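A small preflight sketch that mirrors these startup requirements before launching the binary, so misconfiguration is caught early. The demo values are placeholders, not real keys; generate real ones with `openssl rand` as described above:

```bash
# Demo values (placeholders only)
export SERVICE_ADMINKEY="demo-admin-key"
export ENCRYPTION_KEY="0123456789abcdef0123456789abcdef"  # 32 chars minimum

ok=1
if [ -z "${SERVICE_ADMINKEY:-}" ]; then
  echo "SERVICE_ADMINKEY is not set" >&2; ok=0
fi
if [ "${#ENCRYPTION_KEY}" -lt 32 ]; then
  echo "ENCRYPTION_KEY is shorter than 32 characters" >&2; ok=0
fi
[ "$ok" -eq 1 ] && echo "preflight OK"
```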
+ +### Invalid ENCRYPTION_KEY + +If ENCRYPTION_KEY is too short or invalid: + +``` +Error: ENCRYPTION_KEY environment variable is not set +``` + +**Solution:** Ensure key is at least 32 characters: + +```bash +openssl rand -hex 32 +``` + +### Database Connection Failure + +``` +Error: failed to connect to database +``` + +**Check:** +- `SERVICE_DBHOST` is correct and reachable +- `SERVICE_DBPORT` is correct +- `SERVICE_DBUSER` and `SERVICE_DBPASSWORD` are valid +- `SERVICE_DBNAME` exists +- PostgreSQL is running +- Firewall allows connection + +### Testing Configuration + +```bash +# Start service with debug logging +SERVICE_DEBUG=true ./dhamps-vdb + +# Check if service starts successfully +curl http://localhost:8880/docs + +# Test admin authentication +curl -X GET http://localhost:8880/v1/users \ + -H "Authorization: Bearer $SERVICE_ADMINKEY" +``` + +## Security Best Practices + +1. **Never commit `.env` files** to version control + - Already in `.gitignore` + - Use `.env.example` templates + +2. **Generate secure random keys** + ```bash + openssl rand -base64 32 # Admin key + openssl rand -hex 32 # Encryption key + ``` + +3. **Use secrets management** in production + - HashiCorp Vault + - AWS Secrets Manager + - Azure Key Vault + - Docker Secrets (Swarm) + - Kubernetes Secrets + +4. **Rotate keys regularly** + - Admin key: Every 90 days + - Database password: Every 90 days + - **Encryption key:** Cannot be rotated without re-encrypting all API keys + +5. **Principle of least privilege** + - Use dedicated database user (not superuser) + - Grant only necessary permissions + - Restrict network access + +6. 
**Monitor and audit** + - Log admin key usage + - Monitor failed authentication attempts + - Review access patterns + +## Further Reading + +- [Docker Deployment Guide](../docker/) +- [Database Setup](../database/) +- [Security Best Practices](../security/) +- [Configuration Guide](../../getting-started/configuration/) +- [Quick Start](../../getting-started/quick-start/) diff --git a/docs/content/deployment/security.md b/docs/content/deployment/security.md new file mode 100644 index 0000000..7de673a --- /dev/null +++ b/docs/content/deployment/security.md @@ -0,0 +1,741 @@ +--- +title: "Security Best Practices" +weight: 4 +--- + +# Security Best Practices + +Comprehensive guide for securing your dhamps-vdb production deployment. + +## Security Overview + +dhamps-vdb handles sensitive data including embeddings, metadata, and API credentials. This guide covers essential security measures for production deployments. + +## HTTPS/TLS Configuration + +### Why HTTPS is Required + +- **Encryption in transit**: Protects API keys and data from interception +- **Authentication**: Verifies server identity +- **Compliance**: Required by most security standards + +### Using a Reverse Proxy + +**Never expose dhamps-vdb directly to the internet.** Always use a reverse proxy with TLS termination. 
+ +#### Nginx with Let's Encrypt + +Install Certbot: + +```bash +sudo apt install nginx certbot python3-certbot-nginx +``` + +Configure nginx (`/etc/nginx/sites-available/dhamps-vdb`): + +```nginx +upstream dhamps_backend { + server 127.0.0.1:8880; + keepalive 32; +} + +server { + listen 80; + server_name api.example.com; + + # Redirect HTTP to HTTPS + return 301 https://$server_name$request_uri; +} + +server { + listen 443 ssl http2; + server_name api.example.com; + + # SSL certificates (managed by Certbot) + ssl_certificate /etc/letsencrypt/live/api.example.com/fullchain.pem; + ssl_certificate_key /etc/letsencrypt/live/api.example.com/privkey.pem; + + # Modern SSL configuration + ssl_protocols TLSv1.2 TLSv1.3; + ssl_ciphers 'ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384'; + ssl_prefer_server_ciphers off; + ssl_session_cache shared:SSL:10m; + ssl_session_timeout 10m; + + # HSTS (HTTP Strict Transport Security) + add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always; + + # Security headers + add_header X-Frame-Options "DENY" always; + add_header X-Content-Type-Options "nosniff" always; + add_header X-XSS-Protection "1; mode=block" always; + add_header Referrer-Policy "strict-origin-when-cross-origin" always; + + # Proxy settings + location / { + proxy_pass http://dhamps_backend; + proxy_http_version 1.1; + + proxy_set_header Host $host; + proxy_set_header X-Real-IP $remote_addr; + proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; + proxy_set_header X-Forwarded-Proto $scheme; + proxy_set_header Connection ""; + + # Timeouts + proxy_connect_timeout 60s; + proxy_send_timeout 60s; + proxy_read_timeout 60s; + + # Buffer settings + proxy_buffering on; + proxy_buffer_size 4k; + proxy_buffers 8 4k; + } + + # Rate limiting (optional, see below) + limit_req zone=api_limit burst=20 nodelay; + limit_req_status 429; +} +``` + +Enable site and get certificate: + +```bash 
+sudo ln -s /etc/nginx/sites-available/dhamps-vdb /etc/nginx/sites-enabled/ +sudo certbot --nginx -d api.example.com +sudo systemctl reload nginx +``` + +Auto-renewal: + +```bash +# Certbot creates a systemd timer automatically +sudo systemctl status certbot.timer + +# Test renewal +sudo certbot renew --dry-run +``` + +#### Traefik (Docker) + +`docker-compose.yml`: + +```yaml +version: '3.8' + +services: + traefik: + image: traefik:v2.10 + command: + - "--providers.docker=true" + - "--entrypoints.web.address=:80" + - "--entrypoints.websecure.address=:443" + - "--certificatesresolvers.letsencrypt.acme.email=admin@example.com" + - "--certificatesresolvers.letsencrypt.acme.storage=/letsencrypt/acme.json" + - "--certificatesresolvers.letsencrypt.acme.httpchallenge.entrypoint=web" + ports: + - "80:80" + - "443:443" + volumes: + - /var/run/docker.sock:/var/run/docker.sock:ro + - ./letsencrypt:/letsencrypt + + dhamps-vdb: + build: . + labels: + - "traefik.enable=true" + - "traefik.http.routers.dhamps-vdb.rule=Host(`api.example.com`)" + - "traefik.http.routers.dhamps-vdb.entrypoints=websecure" + - "traefik.http.routers.dhamps-vdb.tls.certresolver=letsencrypt" + # Redirect HTTP to HTTPS + - "traefik.http.middlewares.redirect-to-https.redirectscheme.scheme=https" + - "traefik.http.routers.dhamps-vdb-http.rule=Host(`api.example.com`)" + - "traefik.http.routers.dhamps-vdb-http.entrypoints=web" + - "traefik.http.routers.dhamps-vdb-http.middlewares=redirect-to-https" +``` + +#### Caddy (Automatic HTTPS) + +`Caddyfile`: + +``` +api.example.com { + reverse_proxy localhost:8880 + + # Automatic HTTPS via Let's Encrypt + tls admin@example.com + + # Security headers + header { + Strict-Transport-Security "max-age=31536000; includeSubDomains" + X-Frame-Options "DENY" + X-Content-Type-Options "nosniff" + X-XSS-Protection "1; mode=block" + } +} +``` + +## API Key Management + +### Admin Key Security + +The `SERVICE_ADMINKEY` has full control over the system. + +**Best practices:** + +1. 
**Generate securely:** + ```bash + openssl rand -base64 32 + ``` + +2. **Store securely:** + - Password manager (1Password, Bitwarden) + - Secrets manager (Vault, AWS Secrets Manager) + - Never in version control + - Never in logs or error messages + +3. **Rotate regularly:** + - Every 90 days minimum + - After team member departure + - After suspected compromise + +4. **Audit usage:** + - Log all admin operations + - Monitor for unusual activity + - Review regularly + +### User API Keys + +User API keys are automatically generated and encrypted before database storage. + +**Security features:** + +- **One-time display:** Keys shown only at creation +- **Encrypted storage:** AES-256-GCM encryption +- **Non-recoverable:** Lost keys cannot be retrieved + +**For users:** + +1. **Secure storage:** + - Use environment variables + - Never hardcode in applications + - Never commit to repositories + +2. **Use HTTPS:** + - Always transmit over TLS + - Validate server certificates + +3. **Rotate if compromised:** + - Delete old user and create new one + - Update all client applications + +### Bearer Token Authentication + +All API requests require authentication: + +```bash +curl -X GET https://api.example.com/v1/projects/alice \ + -H "Authorization: Bearer your-api-key-here" +``` + +**Security:** + +- Use `Authorization` header (not query parameters) +- Never log full API keys +- Validate on every request +- Use HTTPS to prevent interception + +## Encryption Key Security + +The `ENCRYPTION_KEY` protects all user API keys in the database. + +### Critical Security Points + +⚠️ **CRITICAL:** If the encryption key is lost or changed, **all user API keys become permanently unrecoverable.** + +**Best practices:** + +1. **Generate securely:** + ```bash + openssl rand -hex 32 + ``` + +2. **Backup separately:** + - Store in multiple secure locations + - Separate from database backups + - Document recovery procedure + - Test recovery process + +3. 
**Never rotate in production:** + - Cannot be changed without re-encrypting all keys + - Requires database migration + - Risk of data loss + +4. **Protect at rest:** + - Encrypt .env files: `gpg --encrypt .env` + - Use secrets management (Vault, AWS Secrets Manager) + - Restrict file permissions: `chmod 600 .env` + +5. **Protect in transit:** + - Never send over unencrypted channels + - Use secure channels for team sharing + - Avoid email/chat + +### Encryption Details + +- **Algorithm:** AES-256-GCM +- **Key derivation:** SHA-256 hash of input key +- **Nonce:** Unique per encryption +- **Authentication:** GCM provides authentication +- **Storage format:** Base64-encoded ciphertext + +### Disaster Recovery + +Document and test recovery procedure: + +```markdown +## Encryption Key Recovery Procedure + +1. Retrieve backup encryption key from [location] +2. Verify key integrity: [checksum/hash] +3. Update deployment configuration +4. Restart services +5. Verify user authentication works +6. Document incident +``` + +## Database Security + +### Network Security + +1. **Restrict access:** + ```sql + -- In pg_hba.conf + # Allow only from application server + host dhamps_vdb dhamps_user 10.0.1.0/24 md5 + ``` + +2. **Firewall rules:** + ```bash + # UFW example + sudo ufw allow from 10.0.1.0/24 to any port 5432 + sudo ufw deny 5432 + ``` + +3. **Use VPC/private network:** + - Keep database on private network + - No public internet exposure + - VPN for remote administration + +### Database Authentication + +1. **Strong passwords:** + ```bash + # Generate secure password + openssl rand -base64 32 + ``` + +2. **Dedicated user:** + ```sql + -- Not superuser + CREATE USER dhamps_user WITH PASSWORD 'secure_password'; + GRANT ALL PRIVILEGES ON DATABASE dhamps_vdb TO dhamps_user; + ``` + +3. **SSL/TLS connections:** + ```ini + # postgresql.conf + ssl = on + ssl_cert_file = 'server.crt' + ssl_key_file = 'server.key' + ssl_ca_file = 'ca.crt' + ``` + +### Database Encryption + +1. 
**Encryption at rest:** + - Use encrypted filesystems (LUKS, dm-crypt) + - Cloud provider encryption (AWS RDS encryption) + - Transparent Data Encryption (TDE) + +2. **Backup encryption:** + ```bash + # Encrypt backup + pg_dump -U postgres dhamps_vdb | gzip | \ + gpg --encrypt --recipient admin@example.com > backup.sql.gz.gpg + ``` + +### Database Auditing + +Enable audit logging: + +```sql +-- Install pgaudit +CREATE EXTENSION pgaudit; + +-- Configure logging +ALTER SYSTEM SET pgaudit.log = 'write, ddl'; +ALTER SYSTEM SET pgaudit.log_catalog = off; +ALTER SYSTEM SET pgaudit.log_parameter = on; + +-- Reload config +SELECT pg_reload_conf(); +``` + +## Network Security + +### Firewall Configuration + +```bash +# UFW example +sudo ufw default deny incoming +sudo ufw default allow outgoing + +# Allow SSH +sudo ufw allow 22/tcp + +# Allow HTTP/HTTPS (reverse proxy) +sudo ufw allow 80/tcp +sudo ufw allow 443/tcp + +# Application port (if not behind proxy) +# sudo ufw allow from trusted_network to any port 8880 + +# Database (from app server only) +sudo ufw allow from 10.0.1.0/24 to any port 5432 + +sudo ufw enable +``` + +### Network Segmentation + +Deploy in isolated networks: + +``` +Internet + ↓ +[Reverse Proxy] ← (DMZ/Public subnet) + ↓ +[dhamps-vdb] ← (Application subnet) + ↓ +[PostgreSQL] ← (Database subnet) +``` + +### Rate Limiting + +Protect against abuse and DoS attacks. + +**Nginx:** + +```nginx +# Define rate limit zone +http { + limit_req_zone $binary_remote_addr zone=api_limit:10m rate=10r/s; + limit_req_zone $binary_remote_addr zone=api_burst:10m rate=100r/s; +} + +server { + location /v1/ { + limit_req zone=api_limit burst=20 nodelay; + limit_req_status 429; + proxy_pass http://dhamps_backend; + } +} +``` + +**Application-level** (future enhancement): +- Implement token bucket or leaky bucket +- Per-user rate limits +- Different limits for different endpoints + +## Backup Strategies + +### Database Backups + +1. 
**Automated daily backups:**
   ```bash
   #!/bin/bash
   # /usr/local/bin/backup-dhamps.sh
   set -euo pipefail

   DATE=$(date +%Y%m%d_%H%M%S)
   BACKUP_DIR="/backups/dhamps-vdb"

   # Create backup
   pg_dump -U dhamps_user dhamps_vdb | gzip > \
     "$BACKUP_DIR/db-$DATE.sql.gz"

   # Verify the dump is a valid gzip archive before encrypting.
   # (Decrypt-and-restore tests need the private key, which should not
   # live on this server — run them on a host that holds the key.)
   gunzip -t "$BACKUP_DIR/db-$DATE.sql.gz"

   # Encrypt backup (uses the recipient's public key only)
   gpg --encrypt --recipient admin@example.com \
     "$BACKUP_DIR/db-$DATE.sql.gz"

   # Remove unencrypted backup
   rm "$BACKUP_DIR/db-$DATE.sql.gz"

   # Upload to offsite storage
   aws s3 cp "$BACKUP_DIR/db-$DATE.sql.gz.gpg" \
     s3://backups/dhamps-vdb/ --storage-class GLACIER

   # Cleanup old local backups (keep 7 days)
   find "$BACKUP_DIR" -name "db-*.sql.gz.gpg" -mtime +7 -delete
   ```

2. **Cron schedule:**
   ```cron
   0 2 * * * /usr/local/bin/backup-dhamps.sh
   ```

3. **Test restores regularly:**
   ```cron
   # Monthly restore test (on a host that holds the decryption key)
   0 3 1 * * /usr/local/bin/test-restore.sh
   ```

### Configuration Backups

```bash
# Backup environment configuration
cp .env .env.backup
gpg --encrypt --recipient admin@example.com .env.backup

# Backup with timestamp
tar czf config-$(date +%Y%m%d).tar.gz .env docker-compose.yml
gpg --encrypt --recipient admin@example.com config-*.tar.gz
```

### Backup Retention Policy

- **Daily backups:** Keep 7 days locally
- **Weekly backups:** Keep 4 weeks offsite
- **Monthly backups:** Keep 12 months in cold storage
- **Yearly backups:** Keep 7 years (compliance dependent)

### Offsite Backups

Always maintain offsite backups:

- **Cloud storage:** AWS S3, Azure Blob Storage, GCP Cloud Storage
- **Different geographic region**
- **Encrypted before upload**
- **Test restoration procedures**

## Monitoring and Alerting

### What to Monitor

1. **Authentication failures:**
   ```bash
   # Count 401 status codes (field-anchored to avoid false matches)
   grep ' 401 ' /var/log/nginx/access.log | wc -l
   ```

2. 
**Unusual API usage:** + - Spike in requests + - Requests to non-existent endpoints + - Large data transfers + +3. **Database health:** + - Connection count + - Query performance + - Disk usage + - Replication lag (if applicable) + +4. **System resources:** + - CPU usage + - Memory usage + - Disk I/O + - Network throughput + +### Log Management + +1. **Centralized logging:** + - ELK Stack (Elasticsearch, Logstash, Kibana) + - Splunk + - Graylog + - CloudWatch Logs (AWS) + +2. **Log retention:** + - Application logs: 30 days minimum + - Access logs: 90 days minimum + - Audit logs: 1 year minimum (compliance dependent) + +3. **Log security:** + - Never log API keys in full (mask: `api_key=abc...xyz`) + - Never log passwords + - Encrypt archived logs + - Restrict access to logs + +### Alerting + +Set up alerts for: + +- Failed authentication attempts (>10/minute) +- Database connection failures +- Disk space >80% full +- Service downtime +- Unusual network traffic +- SSL certificate expiration (30 days before) + +## Security Checklist + +Use this checklist for production deployments: + +### Pre-Deployment + +- [ ] Strong, random `SERVICE_ADMINKEY` generated +- [ ] Strong, random `ENCRYPTION_KEY` generated (32+ chars) +- [ ] Strong database password set +- [ ] `.env` file encrypted or in secrets manager +- [ ] `.env` not in version control +- [ ] Encryption key backed up separately from database + +### Network & Access + +- [ ] HTTPS/TLS configured with valid certificate +- [ ] Reverse proxy deployed (nginx/Traefik/Caddy) +- [ ] Firewall configured and enabled +- [ ] Database on private network only +- [ ] Rate limiting configured +- [ ] HSTS header enabled +- [ ] Security headers configured + +### Database + +- [ ] PostgreSQL 11+ with pgvector installed +- [ ] Dedicated database user (not superuser) +- [ ] Strong database password +- [ ] SSL/TLS for database connections +- [ ] Database firewall rules configured +- [ ] pg_hba.conf restricts access by IP +- [ ] Regular 
backups configured +- [ ] Backup encryption enabled +- [ ] Restore procedure tested + +### Application + +- [ ] Service runs as non-root user +- [ ] Debug logging disabled (`SERVICE_DEBUG=false`) +- [ ] Resource limits configured +- [ ] Health checks enabled +- [ ] Logging configured +- [ ] Monitoring configured +- [ ] Alerting configured + +### Operations + +- [ ] Backup procedure documented +- [ ] Restore procedure documented and tested +- [ ] Disaster recovery plan created +- [ ] Security incident response plan created +- [ ] Key rotation schedule defined +- [ ] Access control documented (who has admin key) +- [ ] Regular security updates scheduled + +### Compliance + +- [ ] Data retention policy defined +- [ ] Privacy policy reviewed +- [ ] Terms of service reviewed +- [ ] GDPR compliance (if applicable) +- [ ] Data processing agreements in place +- [ ] Audit logging enabled + +## Incident Response + +### Suspected API Key Compromise + +1. **Immediate actions:** + - Identify compromised user + - Delete user (invalidates API key) + - Create new user with new API key + - Review access logs for unauthorized access + +2. **Investigation:** + - Determine scope of compromise + - Check for data exfiltration + - Document incident + +3. **Notification:** + - Notify affected parties (if required) + - Document lessons learned + - Update security procedures + +### Database Breach + +1. **Immediate actions:** + - Isolate database server + - Revoke network access + - Change all database credentials + - Activate incident response team + +2. **Assessment:** + - Determine data accessed + - Check encryption effectiveness + - Review audit logs + +3. **Recovery:** + - Restore from clean backup + - Apply security patches + - Update firewall rules + - Notify affected parties (if required) + - Document and learn + +### Encryption Key Loss + +⚠️ **CRITICAL SITUATION:** If encryption key is lost, all user API keys are unrecoverable. + +1. 
**Recovery attempt:** + - Check all backup locations + - Review documentation + - Contact all team members + +2. **If unrecoverable:** + - All users must be deleted and recreated + - New API keys issued to all users + - All client applications must be updated + - Communicate timeline to users + +3. **Prevention:** + - Review backup procedures + - Add redundant backup locations + - Document key locations + - Test recovery process + +## Security Updates + +### Regular Updates + +- **Weekly:** Review security advisories +- **Monthly:** Apply security patches +- **Quarterly:** Security audit and penetration testing +- **Yearly:** Full security review and compliance audit + +### Update Procedure + +1. **Test in staging environment** +2. **Backup production database and configuration** +3. **Schedule maintenance window** +4. **Apply updates** +5. **Verify functionality** +6. **Monitor for issues** + +### Subscribe to Security Advisories + +- PostgreSQL security announcements +- pgvector security updates +- Docker security advisories +- Go security advisories +- Operating system security updates + +## Further Reading + +- [OWASP Top 10](https://owasp.org/www-project-top-ten/) +- [PostgreSQL Security](https://www.postgresql.org/docs/current/security.html) +- [Docker Security Best Practices](https://docs.docker.com/engine/security/) +- [Let's Encrypt Documentation](https://letsencrypt.org/docs/) +- [Environment Variables Reference](../environment-variables/) +- [Database Setup Guide](../database/) +- [Docker Deployment](../docker/) diff --git a/docs/content/development/_index.md b/docs/content/development/_index.md new file mode 100644 index 0000000..367e09a --- /dev/null +++ b/docs/content/development/_index.md @@ -0,0 +1,47 @@ +--- +title: "Development" +weight: 6 +--- + +# Development Guide + +Information for developers contributing to dhamps-vdb. 
+ +## Getting Started with Development + +This section covers: + +- Setting up a development environment +- Running tests +- Understanding the codebase architecture +- Contributing guidelines +- Performance optimization + +## Project Structure + +``` +dhamps-vdb/ +├── main.go # Application entry point +├── internal/ +│ ├── auth/ # Authentication logic +│ ├── database/ # Database layer (sqlc) +│ ├── handlers/ # HTTP handlers +│ └── models/ # Data models +├── testdata/ # Test fixtures +└── docs/ # Documentation +``` + +## Development Workflow + +1. Make changes to code +2. Generate sqlc code if database queries changed: `sqlc generate` +3. Run tests: `go test -v ./...` +4. Build: `go build -o dhamps-vdb main.go` +5. Submit pull request + +## Resources + +- [Testing](testing/) - How to run tests +- [Contributing](contributing/) - Contribution guidelines +- [Architecture](architecture/) - Technical deep-dive +- [Performance](performance/) - Optimization notes diff --git a/docs/content/development/architecture.md b/docs/content/development/architecture.md new file mode 100644 index 0000000..f4ed2bb --- /dev/null +++ b/docs/content/development/architecture.md @@ -0,0 +1,814 @@ +--- +title: "Architecture" +weight: 3 +--- + +# Technical Architecture + +This document provides a technical deep-dive into dhamps-vdb's architecture for developers who want to understand or modify the codebase. 
+ +## Project Structure + +``` +dhamps-vdb/ +├── main.go # Application entry point +├── go.mod # Go module definition +├── go.sum # Dependency checksums +├── sqlc.yaml # sqlc configuration +├── template.env # Environment template +├── .env # Local config (gitignored) +│ +├── api/ +│ └── openapi.yml # OpenAPI spec (not actively maintained) +│ +├── internal/ # Internal packages (non-importable) +│ ├── auth/ # Authentication logic +│ │ └── authenticate.go # Bearer token validation +│ │ +│ ├── crypto/ # Encryption utilities +│ │ └── crypto.go # AES-256-GCM encryption +│ │ +│ ├── database/ # Database layer +│ │ ├── database.go # Connection pool management +│ │ ├── migrations.go # Migration runner +│ │ ├── db.go # Generated by sqlc +│ │ ├── models.go # Generated by sqlc +│ │ ├── queries.sql.go # Generated by sqlc +│ │ │ +│ │ ├── migrations/ # SQL migrations +│ │ │ ├── 001_create_initial_scheme.sql +│ │ │ ├── 002_create_emb_index.sql +│ │ │ ├── 003_add_public_read_flag.sql +│ │ │ ├── 004_refactor_llm_services_architecture.sql +│ │ │ ├── tern.conf.tpl # Template for tern +│ │ │ └── tern.conf # Generated (gitignored) +│ │ │ +│ │ └── queries/ # SQL queries for sqlc +│ │ └── queries.sql # All database queries +│ │ +│ ├── handlers/ # HTTP request handlers +│ │ ├── handlers.go # Common handler utilities +│ │ ├── users.go # User endpoints +│ │ ├── projects.go # Project endpoints +│ │ ├── llm_services.go # LLM service endpoints +│ │ ├── api_standards.go # API standard endpoints +│ │ ├── embeddings.go # Embedding endpoints +│ │ ├── similars.go # Similarity search endpoints +│ │ ├── admin.go # Admin endpoints +│ │ │ +│ │ └── *_test.go # Test files +│ │ ├── users_test.go +│ │ ├── projects_test.go +│ │ ├── projects_sharing_test.go +│ │ ├── embeddings_test.go +│ │ ├── llm_services_test.go +│ │ ├── editor_permissions_test.go +│ │ └── handlers_test.go +│ │ +│ └── models/ # Data models and options +│ ├── options.go # CLI/environment options +│ ├── users.go # User models +│ ├── 
projects.go # Project models +│ ├── instances.go # LLM instance models (new) +│ ├── api_standards.go # API standard models +│ ├── embeddings.go # Embedding models +│ ├── similars.go # Similarity search models +│ └── admin.go # Admin operation models +│ +├── testdata/ # Test fixtures +│ ├── postgres/ # PostgreSQL test data +│ │ ├── enable-vector.sql +│ │ └── users.yml +│ │ +│ └── *.json # JSON test fixtures +│ ├── valid_user.json +│ ├── valid_embeddings.json +│ ├── valid_api_standard_*.json +│ └── valid_llm_service_*.json +│ +└── docs/ # Documentation + ├── content/ # Hugo content + └── *.md # Additional docs +``` + +## Code Organization + +### 1. Entry Point (`main.go`) + +The application entry point handles: + +```go +func main() { + // 1. Parse CLI options and environment variables + cli := huma.NewCLI(func(hooks huma.Hooks, opts *models.Options) { + + // 2. Initialize database connection pool + pool := database.InitDB(opts) + defer pool.Close() + + // 3. Run database migrations + database.RunMigrations(pool, opts) + + // 4. Create HTTP router and Huma API + router := http.NewServeMux() + api := humago.New(router, huma.DefaultConfig("dhamps-vdb", "0.1.0")) + + // 5. Register all routes + handlers.AddRoutes(pool, keyGen, api) + + // 6. Start HTTP server + server := &http.Server{ + Addr: fmt.Sprintf("%s:%d", opts.Host, opts.Port), + Handler: router, + } + server.ListenAndServe() + }) + + cli.Run() +} +``` + +### 2. 
Handlers (`internal/handlers/`) + +Handlers process HTTP requests using the Huma framework pattern: + +```go +// Example: Create User Handler +func RegisterUsersRoutes(pool *pgxpool.Pool, keyGen RandomKeyGenerator, api huma.API) { + + // Define request/response types + type CreateUserInput struct { + Body models.CreateUserRequest + } + + type CreateUserOutput struct { + Body models.User + } + + // Register operation with Huma + huma.Register(api, huma.Operation{ + OperationID: "create-user", + Method: http.MethodPost, + Path: "/v1/users", + Summary: "Create a new user", + Description: "Admin only. Creates user and returns API key.", + Security: []map[string][]string{{"bearer": {}}}, + }, func(ctx context.Context, input *CreateUserInput) (*CreateUserOutput, error) { + + // 1. Get authenticated user from context + authUser := auth.GetAuthenticatedUser(ctx) + + // 2. Check authorization (admin only) + if !authUser.IsAdmin { + return nil, huma.Error403Forbidden("admin access required") + } + + // 3. Validate input + if err := input.Body.Validate(); err != nil { + return nil, huma.Error400BadRequest("invalid input", err) + } + + // 4. Business logic + user, apiKey, err := createUserWithKey(ctx, pool, keyGen, &input.Body) + if err != nil { + return nil, handleDatabaseError(err) + } + + // 5. Return response + return &CreateUserOutput{Body: *user}, nil + }) +} +``` + +**Handler file organization:** + +- `handlers.go` - Common utilities (context keys, error handling) +- `users.go` - User CRUD operations +- `projects.go` - Project CRUD and sharing +- `llm_services.go` - LLM service/instance management +- `embeddings.go` - Embedding CRUD operations +- `similars.go` - Similarity search +- `admin.go` - Administrative operations + +### 3. 
Models (`internal/models/`) + +Models define request/response structures: + +```go +// User model +type User struct { + UserHandle string `json:"user_handle" example:"alice"` + Name string `json:"name" example:"Alice Smith"` + Email string `json:"email" example:"alice@example.com"` + CreatedAt time.Time `json:"created_at"` + UpdatedAt time.Time `json:"updated_at"` +} + +// Request model with validation +type CreateUserRequest struct { + UserHandle string `json:"user_handle" minLength:"1" maxLength:"50" pattern:"^[a-z0-9_-]+$"` + Name string `json:"name" minLength:"1" maxLength:"100"` + Email string `json:"email" format:"email"` +} + +func (r *CreateUserRequest) Validate() error { + if r.UserHandle == "" { + return fmt.Errorf("user_handle is required") + } + if r.UserHandle == "_system" { + return fmt.Errorf("_system is a reserved handle") + } + return nil +} +``` + +**Model conventions:** + +- Request models: `Create*Request`, `Update*Request` +- Response models: `*Response`, `*Output` +- Database models: Match database schema (generated by sqlc) + +### 4. 
Database Layer (`internal/database/`)

#### Connection Management (`database.go`)

```go
func InitDB(opts *models.Options) *pgxpool.Pool {
    connString := fmt.Sprintf(
        "postgres://%s:%s@%s:%d/%s",
        opts.DBUser, opts.DBPassword,
        opts.DBHost, opts.DBPort, opts.DBName,
    )

    config, err := pgxpool.ParseConfig(connString)
    if err != nil {
        log.Fatal(err)
    }

    // Configure connection pool
    config.MaxConns = 20
    config.MinConns = 5
    config.MaxConnIdleTime = time.Minute * 5

    pool, err := pgxpool.NewWithConfig(context.Background(), config)
    if err != nil {
        log.Fatal(err)
    }

    return pool
}
```

#### Migrations (`migrations.go`)

Uses [tern](https://github.com/jackc/tern) for versioned migrations:

```go
func RunMigrations(pool *pgxpool.Pool, opts *models.Options) error {
    ctx := context.Background()

    // tern's migrator operates on a single *pgx.Conn,
    // so acquire one from the pool
    conn, err := pool.Acquire(ctx)
    if err != nil {
        return err
    }
    defer conn.Release()

    // Initialize migrator (tern.conf loading omitted here)
    migrator, err := migrate.NewMigrator(ctx, conn.Conn(), "schema_version")
    if err != nil {
        return err
    }

    // Run pending migrations
    if err := migrator.Migrate(ctx); err != nil {
        return fmt.Errorf("migration failed: %w", err)
    }

    return nil
}
```

Migration files are numbered sequentially:

```sql
-- 001_create_initial_scheme.sql
CREATE TABLE users (
    user_id SERIAL PRIMARY KEY,
    user_handle TEXT UNIQUE NOT NULL,
    vdb_key TEXT UNIQUE NOT NULL,
    ...
);

CREATE TABLE projects (
    project_id SERIAL PRIMARY KEY,
    project_handle TEXT NOT NULL,
    owner TEXT NOT NULL REFERENCES users(user_handle),
    ...
+); + +-- 002_create_emb_index.sql +CREATE INDEX embedding_vector_idx +ON embeddings +USING hnsw (vector vector_cosine_ops); +``` + +#### SQLC Queries (`queries/queries.sql`) + +Write SQL, generate type-safe Go code: + +```sql +-- name: UpsertUser :one +INSERT INTO users ( + user_handle, name, email, vdb_key, created_at, updated_at +) VALUES ( + $1, $2, $3, $4, NOW(), NOW() +) +ON CONFLICT (user_handle) DO UPDATE SET + name = EXCLUDED.name, + email = EXCLUDED.email, + updated_at = NOW() +RETURNING *; + +-- name: GetUserByHandle :one +SELECT * FROM users WHERE user_handle = $1; + +-- name: GetAllUsers :many +SELECT user_handle, name, email, created_at, updated_at +FROM users +ORDER BY user_handle ASC; + +-- name: DeleteUser :exec +DELETE FROM users WHERE user_handle = $1; +``` + +Generated Go code (`queries.sql.go`): + +```go +// Generated by sqlc +func (q *Queries) UpsertUser(ctx context.Context, arg UpsertUserParams) (User, error) { + row := q.db.QueryRow(ctx, upsertUser, + arg.UserHandle, + arg.Name, + arg.Email, + arg.VdbKey, + ) + var i User + err := row.Scan( + &i.UserID, + &i.UserHandle, + &i.Name, + &i.Email, + &i.VdbKey, + &i.CreatedAt, + &i.UpdatedAt, + ) + return i, err +} +``` + +**sqlc configuration (`sqlc.yaml`):** + +```yaml +version: "2" +sql: + - engine: "postgresql" + queries: "internal/database/queries/queries.sql" + schema: "internal/database/migrations/" + gen: + go: + package: "database" + out: "internal/database" + emit_json_tags: true + emit_db_tags: true + emit_prepared_queries: false + emit_interface: false +``` + +**Regenerate code after query changes:** + +```bash +sqlc generate --no-remote +``` + +### 5. 
Authentication (`internal/auth/`) + +Token-based authentication using Bearer tokens: + +```go +// Middleware checks Authorization header +func AuthMiddleware(pool *pgxpool.Pool, adminKey string) func(http.Handler) http.Handler { + return func(next http.Handler) http.Handler { + return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + + // Extract Bearer token + authHeader := r.Header.Get("Authorization") + token := strings.TrimPrefix(authHeader, "Bearer ") + + if token == "" { + // Allow public access for certain endpoints + ctx := context.WithValue(r.Context(), "user", nil) + next.ServeHTTP(w, r.WithContext(ctx)) + return + } + + // Check admin key + if token == adminKey { + user := &AuthUser{Handle: "_admin", IsAdmin: true} + ctx := context.WithValue(r.Context(), "user", user) + next.ServeHTTP(w, r.WithContext(ctx)) + return + } + + // Look up user by API key hash + hash := hashAPIKey(token) + user, err := db.GetUserByKeyHash(r.Context(), pool, hash) + if err != nil { + http.Error(w, "Unauthorized", http.StatusUnauthorized) + return + } + + authUser := &AuthUser{Handle: user.UserHandle, IsAdmin: false} + ctx := context.WithValue(r.Context(), "user", authUser) + next.ServeHTTP(w, r.WithContext(ctx)) + }) + } +} + +// Helper to get authenticated user from context +func GetAuthenticatedUser(ctx context.Context) *AuthUser { + user, _ := ctx.Value("user").(*AuthUser) + return user +} +``` + +### 6. 
Encryption (`internal/crypto/`)

AES-256-GCM encryption for sensitive data (API keys):

```go
// Encrypt data with AES-256-GCM
func Encrypt(plaintext []byte, key []byte) ([]byte, error) {
    // Derive a 32-byte key from the input
    keyHash := sha256.Sum256(key)

    // Create AES cipher
    block, err := aes.NewCipher(keyHash[:])
    if err != nil {
        return nil, err
    }

    // Create GCM
    gcm, err := cipher.NewGCM(block)
    if err != nil {
        return nil, err
    }

    // Generate random nonce
    nonce := make([]byte, gcm.NonceSize())
    if _, err := io.ReadFull(rand.Reader, nonce); err != nil {
        return nil, err
    }

    // Encrypt, prepending the nonce to the ciphertext
    ciphertext := gcm.Seal(nonce, nonce, plaintext, nil)
    return ciphertext, nil
}

func Decrypt(ciphertext []byte, key []byte) ([]byte, error) {
    // Derive key
    keyHash := sha256.Sum256(key)

    block, err := aes.NewCipher(keyHash[:])
    if err != nil {
        return nil, err
    }

    gcm, err := cipher.NewGCM(block)
    if err != nil {
        return nil, err
    }

    // Extract nonce (guard against truncated input)
    nonceSize := gcm.NonceSize()
    if len(ciphertext) < nonceSize {
        return nil, errors.New("ciphertext shorter than nonce")
    }
    nonce, ciphertext := ciphertext[:nonceSize], ciphertext[nonceSize:]

    // Decrypt
    plaintext, err := gcm.Open(nil, nonce, ciphertext, nil)
    if err != nil {
        return nil, err
    }

    return plaintext, nil
}
```

## Huma Framework Integration

dhamps-vdb uses [Huma](https://huma.rocks/) for API development:

### Benefits

1. **Automatic OpenAPI generation** - No manual spec maintenance
2. **Request/response validation** - Type-safe with JSON schema
3. **Error handling** - Standardized error responses
4. 
**Documentation** - Interactive docs at `/docs` + +### Operation Registration Pattern + +```go +huma.Register(api, huma.Operation{ + OperationID: "get-project", + Method: http.MethodGet, + Path: "/v1/projects/{owner}/{project}", + Summary: "Get project details", + Description: "Returns full project information including metadata schema", + Tags: []string{"Projects"}, + Security: []map[string][]string{{"bearer": {}}}, + MaxBodyBytes: 1024, // Limit request size + DefaultStatus: http.StatusOK, + Errors: []int{400, 401, 403, 404, 500}, +}, handlerFunction) +``` + +### Input/Output Patterns + +```go +// Path parameters, query parameters, and body +type GetSimilarInput struct { + Owner string `path:"owner" doc:"Project owner"` + Project string `path:"project" doc:"Project handle"` + TextID string `path:"text_id" doc:"Text identifier"` + Threshold float32 `query:"threshold" default:"0.5" doc:"Similarity threshold"` + Limit int `query:"limit" default:"10" maximum:"200" doc:"Max results"` +} + +// Response with status code +type GetSimilarOutput struct { + Status int + Body models.SimilarResponse +} +``` + +### Error Response Pattern + +```go +// Standard error responses +return nil, huma.Error400BadRequest("validation failed", err) +return nil, huma.Error401Unauthorized("invalid credentials") +return nil, huma.Error403Forbidden("insufficient permissions") +return nil, huma.Error404NotFound("project not found") +return nil, huma.Error500InternalServerError("database error", err) + +// Custom error with details +return nil, huma.NewError(400, "Dimension Mismatch", + fmt.Sprintf("expected %d dimensions, got %d", expected, actual)) +``` + +## Design Patterns + +### 1. Repository Pattern (via sqlc) + +Database access is centralized in generated queries: + +```go +// Don't write SQL in handlers +// Bad: +rows, err := pool.Query(ctx, "SELECT * FROM users") + +// Good: Use generated functions +users, err := db.GetAllUsers(ctx) +``` + +### 2. 
Dependency Injection + +Pass dependencies explicitly: + +```go +// Inject pool into handlers +func RegisterUsersRoutes(pool *pgxpool.Pool, keyGen RandomKeyGenerator, api huma.API) { + // Routes have access to pool +} + +// Store in context for handler access +ctx = context.WithValue(ctx, PoolKey, pool) +``` + +### 3. Interface-Based Testing + +Use interfaces for testability: + +```go +// Production: Real random key generator +type StandardKeyGen struct{} +func (s StandardKeyGen) RandomKey(len int) (string, error) { + b := make([]byte, len) + _, err := rand.Read(b) + return hex.EncodeToString(b), err +} + +// Testing: Deterministic key generator +type MockKeyGen struct { + Keys []string + idx int +} +func (m *MockKeyGen) RandomKey(len int) (string, error) { + key := m.Keys[m.idx] + m.idx++ + return key, nil +} +``` + +### 4. Validation at Multiple Layers + +```go +// 1. Huma validates request structure +type CreateEmbeddingInput struct { + Body models.Embedding `maxLength:"1000000"` +} + +// 2. Model validates business rules +func (e *Embedding) Validate() error { + if e.TextID == "" { + return fmt.Errorf("text_id required") + } + return nil +} + +// 3. Handler validates against database state +func CreateEmbedding(ctx context.Context, input *CreateEmbeddingInput) error { + // Check project exists + // Validate dimensions match LLM service + // Validate metadata against schema + // Then insert +} +``` + +### 5. 
Error Wrapping + +Provide context while preserving original error: + +```go +user, err := db.GetUserByHandle(ctx, handle) +if err != nil { + if errors.Is(err, pgx.ErrNoRows) { + return nil, huma.Error404NotFound("user not found") + } + return nil, fmt.Errorf("failed to retrieve user %s: %w", handle, err) +} +``` + +## Internal Packages + +Go's `internal/` directory enforces package privacy: + +```go +// This import works within dhamps-vdb +import "github.com/mpilhlt/dhamps-vdb/internal/database" + +// This import would FAIL from external projects +// Enforced by Go compiler +``` + +Benefits: +- Clear API boundaries +- Implementation details hidden +- Refactoring without breaking external users + +## Performance Considerations + +### Connection Pooling + +```go +config.MaxConns = 20 // Max concurrent connections +config.MinConns = 5 // Keep-alive connections +config.MaxConnIdleTime = 5 * time.Minute +``` + +### Query Optimization + +Use UNION ALL for better performance: + +```go +// Instead of LEFT JOIN with OR conditions +query := ` + SELECT * FROM projects WHERE owner = $1 + UNION ALL + SELECT p.* FROM projects p + INNER JOIN projects_shared_with ps USING (project_id) + WHERE ps.user_handle = $1 + ORDER BY owner, project_handle +` +``` + +### Index Strategy + +```sql +-- Dimension filtering (very important) +CREATE INDEX ON embeddings(project_id, vector_dim); + +-- Vector similarity (HNSW for accuracy) +CREATE INDEX ON embeddings USING hnsw (vector vector_cosine_ops); + +-- Access lookups +CREATE INDEX ON projects_shared_with(user_handle); +CREATE INDEX ON projects(owner, project_handle); +``` + +See [Performance Guide](../performance/) for detailed optimization strategies. 
+ +## Testing Architecture + +### Test Organization + +```go +// handlers_test.go - Setup and utilities +func setupTestDB(t *testing.T) *pgxpool.Pool { } +func createTestUser(t *testing.T, pool *pgxpool.Pool) *models.User { } + +// users_test.go - User-specific tests +func TestCreateUser(t *testing.T) { } +func TestGetUser(t *testing.T) { } + +// projects_sharing_test.go - Sharing feature tests +func TestShareProject(t *testing.T) { } +``` + +### Testcontainers Integration + +```go +func setupTestDB(t *testing.T) *pgxpool.Pool { + ctx := context.Background() + + // Start PostgreSQL with pgvector + req := testcontainers.ContainerRequest{ + Image: "pgvector/pgvector:0.7.4-pg16", + ExposedPorts: []string{"5432/tcp"}, + Env: map[string]string{ + "POSTGRES_PASSWORD": "password", + "POSTGRES_DB": "testdb", + }, + } + + container, err := testcontainers.GenericContainer(ctx, + testcontainers.GenericContainerRequest{ + ContainerRequest: req, + Started: true, + }) + require.NoError(t, err) + + // Connect and migrate + pool := connectAndMigrate(ctx, container) + return pool +} +``` + +See [Testing Guide](../testing/) for comprehensive testing documentation. + +## Build and Deployment + +### Building + +```bash +# Development build +go build -o dhamps-vdb main.go + +# Production build with optimizations +go build -ldflags="-s -w" -o dhamps-vdb main.go + +# Cross-compilation +GOOS=linux GOARCH=amd64 go build -o dhamps-vdb-linux main.go +``` + +### Docker Build + +Multi-stage build for minimal image: + +```dockerfile +# Build stage +FROM golang:1.21-alpine AS builder +WORKDIR /app +COPY go.mod go.sum ./ +RUN go mod download +COPY . . +RUN go build -ldflags="-s -w" -o dhamps-vdb main.go + +# Runtime stage +FROM alpine:latest +RUN apk --no-cache add ca-certificates +WORKDIR /root/ +COPY --from=builder /app/dhamps-vdb . +EXPOSE 8880 +CMD ["./dhamps-vdb"] +``` + +### Configuration + +Application reads configuration from: + +1. **Command line flags**: `--port 8080` +2. 
**Environment variables**: `SERVICE_PORT=8080` +3. **`.env` file**: `SERVICE_PORT=8080` + +Priority: CLI flags > Environment > .env file > Defaults + +## Next Steps + +- [Testing Guide](testing/) - Learn how to test changes +- [Contributing Guide](contributing/) - Start contributing +- [Performance Guide](performance/) - Optimize queries and indexes diff --git a/docs/content/development/contributing.md b/docs/content/development/contributing.md new file mode 100644 index 0000000..43b778a --- /dev/null +++ b/docs/content/development/contributing.md @@ -0,0 +1,683 @@ +--- +title: "Contributing" +weight: 2 +--- + +# Contributing Guide + +Thank you for your interest in contributing to dhamps-vdb! This guide will help you get started. + +## Development Setup + +### Prerequisites + +- **Go 1.21+**: [Download Go](https://go.dev/dl/) +- **PostgreSQL 16+**: With pgvector extension +- **sqlc**: For generating type-safe database code +- **Docker/Podman**: For running tests +- **Git**: For version control + +### Clone and Build + +```bash +# Clone repository +git clone https://github.com/mpilhlt/dhamps-vdb.git +cd dhamps-vdb + +# Install dependencies +go get ./... 
+ +# Generate sqlc code +sqlc generate --no-remote + +# Build application +go build -o build/dhamps-vdb main.go + +# Or run directly +go run main.go +``` + +### Environment Setup + +Create a `.env` file for local development: + +```bash +# Copy template +cp template.env .env + +# Edit with your settings +SERVICE_DEBUG=true +SERVICE_HOST=localhost +SERVICE_PORT=8880 +SERVICE_ADMINKEY=your-secure-admin-key-here + +# Database settings +SERVICE_DBHOST=localhost +SERVICE_DBPORT=5432 +SERVICE_DBUSER=postgres +SERVICE_DBPASSWORD=password +SERVICE_DBNAME=dhamps_vdb_dev + +# Encryption key (32+ characters) +ENCRYPTION_KEY=your-secure-encryption-key-min-32-chars +``` + +**Important**: Never commit `.env` files (already in `.gitignore`) + +### Database Setup + +For local development: + +```bash +# Start PostgreSQL with pgvector +podman run -p 5432:5432 \ + -e POSTGRES_PASSWORD=password \ + -e POSTGRES_DB=dhamps_vdb_dev \ + pgvector/pgvector:0.7.4-pg16 + +# Or use docker-compose +docker-compose up -d + +# Application auto-migrates on startup +go run main.go +``` + +### Verify Setup + +```bash +# Check application starts +go run main.go + +# In another terminal, test API +curl http://localhost:8880/docs + +# Run tests +go test -v ./... +``` + +## Code Style + +### Go Formatting + +dhamps-vdb follows standard Go conventions: + +```bash +# Format all code +go fmt ./... + +# Check for common issues +go vet ./... 
+ +# Run linter (if installed) +golangci-lint run +``` + +### Code Organization + +Follow existing patterns: + +```go +// Package comment at top of file +package handlers + +import ( + // Standard library first + "context" + "fmt" + + // External packages + "github.com/danielgtaylor/huma/v2" + "github.com/jackc/pgx/v5/pgxpool" + + // Internal packages + "github.com/mpilhlt/dhamps-vdb/internal/database" + "github.com/mpilhlt/dhamps-vdb/internal/models" +) + +// Exported function with doc comment +// GetUsers retrieves all users from the database +func GetUsers(ctx context.Context, pool *pgxpool.Pool) ([]models.User, error) { + // Implementation +} +``` + +### Naming Conventions + +**Files:** +- `handlers/users.go` - Implementation +- `handlers/users_test.go` - Tests +- `handlers/users_sharing_test.go` - Specific feature tests + +**Functions:** +- `GetUsers()` - List/retrieve multiple +- `GetUser()` - Retrieve single +- `CreateUser()` - Create new +- `UpdateUser()` - Update existing +- `DeleteUser()` - Delete +- `LinkUserToProject()` - Create association +- `IsProjectOwner()` - Boolean check + +**Database Queries (in `queries.sql`):** +- `-- name: GetAllUsers :many` +- `-- name: RetrieveUserByHandle :one` +- `-- name: UpsertUser :one` +- `-- name: DeleteUser :exec` + +### Error Handling + +```go +// Return errors, don't panic +func CreateUser(ctx context.Context, pool *pgxpool.Pool, user models.User) error { + if user.Handle == "" { + return fmt.Errorf("user handle is required") + } + + // Wrap errors with context + err := db.InsertUser(ctx, user) + if err != nil { + return fmt.Errorf("failed to insert user: %w", err) + } + + return nil +} + +// Use specific error responses in handlers +func handleCreateUser(ctx context.Context, input *CreateUserInput) (*CreateUserOutput, error) { + err := CreateUser(ctx, pool, input.Body) + if err != nil { + return nil, huma.Error400BadRequest("invalid user data", err) + } + + return &CreateUserOutput{Body: result}, nil +} +``` + 
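The handle-required rule above can be exercised in isolation. This sketch uses a reduced `User` stand-in (not the real `models.User`) plus one extra illustrative rule about spaces and slashes:

```go
package main

import (
	"fmt"
	"strings"
)

// User is a reduced stand-in for models.User, only for this sketch.
type User struct {
	Handle string
}

// validateUser follows the "return errors, don't panic" rule:
// every violation is reported as an error value.
func validateUser(u User) error {
	if u.Handle == "" {
		return fmt.Errorf("user handle is required")
	}
	if strings.ContainsAny(u.Handle, " /") {
		return fmt.Errorf("user handle %q must not contain spaces or slashes", u.Handle)
	}
	return nil
}

func main() {
	for _, u := range []User{{Handle: "alice"}, {Handle: ""}, {Handle: "bad handle"}} {
		fmt.Printf("%q -> %v\n", u.Handle, validateUser(u))
	}
}
```

Keeping validation in a plain function like this makes it trivial to cover with the table-driven tests described later in this guide.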
+### Comments + +Comment public APIs and complex logic: + +```go +// GetAccessibleProjects returns all projects the user can access. +// This includes projects owned by the user and projects shared with them. +// Results are paginated using limit and offset. +func GetAccessibleProjects(ctx context.Context, pool *pgxpool.Pool, + userHandle string, limit, offset int) ([]models.Project, error) { + + // Use UNION ALL for better query performance + // See docs/PERFORMANCE_OPTIMIZATION.md for details + query := ` + SELECT * FROM projects WHERE owner = $1 + UNION ALL + SELECT p.* FROM projects p + INNER JOIN projects_shared_with ps ON p.project_id = ps.project_id + WHERE ps.user_handle = $1 + ` + + // Implementation... +} +``` + +Don't over-comment obvious code: + +```go +// Bad - obvious +// i is set to 0 +i := 0 + +// Good - explains why +// Start from second element (first is header) +i := 1 +``` + +## Git Workflow + +### Branching Strategy + +```bash +# Create feature branch from main +git checkout main +git pull origin main +git checkout -b feature/your-feature-name + +# Create fix branch +git checkout -b fix/issue-description + +# Create docs branch +git checkout -b docs/update-contributing-guide +``` + +Branch naming: +- `feature/*` - New features +- `fix/*` - Bug fixes +- `docs/*` - Documentation updates +- `test/*` - Test improvements +- `refactor/*` - Code refactoring + +### Commit Messages + +Write clear, descriptive commit messages: + +```bash +# Good commit messages +git commit -m "Add project sharing functionality" +git commit -m "Fix dimension validation for embeddings" +git commit -m "Update contributing guide with git workflow" + +# Multi-line for complex changes +git commit -m "Refactor similarity search query + +- Use UNION ALL for better performance +- Add dimension filtering to subqueries +- Update tests to verify performance improvement + +Closes #123" +``` + +**Commit message format:** +- First line: Brief summary (50 chars or less) +- Blank line +- 
Detailed explanation if needed +- Reference issues: `Closes #123` or `Fixes #456` + +### Pull Request Process + +1. **Create feature branch** + ```bash + git checkout -b feature/my-feature + ``` + +2. **Make changes and commit** + ```bash + git add . + git commit -m "Add feature description" + ``` + +3. **Keep branch updated** + ```bash + git fetch origin + git rebase origin/main + ``` + +4. **Push to your fork** + ```bash + git push origin feature/my-feature + ``` + +5. **Create Pull Request** + - Go to GitHub repository + - Click "New Pull Request" + - Select your branch + - Fill in PR template + +6. **PR Review Process** + - Automated tests run + - Code review by maintainers + - Address feedback + - Merge when approved + +### PR Title Format + +``` +[Type] Brief description + +Examples: +[Feature] Add project ownership transfer +[Fix] Correct dimension validation in embeddings +[Docs] Update API documentation for similarity search +[Test] Add integration tests for sharing workflow +[Refactor] Simplify authentication middleware +``` + +### PR Description Template + +```markdown +## Description +Brief description of changes + +## Motivation +Why is this change needed? + +## Changes +- Change 1 +- Change 2 +- Change 3 + +## Testing +How was this tested? +- [ ] Unit tests pass +- [ ] Integration tests pass +- [ ] Manual testing performed + +## Related Issues +Closes #123 +Relates to #456 + +## Checklist +- [ ] Code follows style guidelines +- [ ] Tests added/updated +- [ ] Documentation updated +- [ ] No breaking changes (or clearly documented) +``` + +## Testing Requirements + +All contributions must include tests: + +### For New Features + +```go +// Add tests in same package +// File: handlers/projects.go +func CreateProject(...) { ... 
} + +// File: handlers/projects_test.go +func TestCreateProject(t *testing.T) { + // Test happy path + // Test error cases + // Test edge cases +} +``` + +### For Bug Fixes + +```go +// Add regression test +func TestBugFix_Issue123(t *testing.T) { + // Reproduce the bug + // Verify it's fixed +} +``` + +### Test Coverage + +Aim for reasonable coverage: + +```bash +# Check coverage +go test -cover ./... + +# Generate coverage report +go test -coverprofile=coverage.out ./... +go tool cover -html=coverage.out +``` + +Target coverage: **70%+** for new code + +### Running Tests + +```bash +# Before submitting PR +go test -v ./... + +# With race detection +go test -race ./... + +# Specific package +go test -v ./internal/handlers +``` + +See [Testing Guide](../testing/) for detailed information. + +## Documentation Updates + +### When to Update Docs + +Update documentation for: + +- **New features**: Add usage examples +- **API changes**: Update endpoint documentation +- **Breaking changes**: Clearly document migration path +- **Configuration**: New environment variables or options + +### Documentation Structure + +``` +docs/content/ +├── getting-started/ # Installation, quickstart +├── concepts/ # Core concepts +├── guides/ # How-to guides +├── api/ # API reference +└── development/ # Development docs (this section) +``` + +### Adding Documentation + +```bash +# Create new doc file +cd docs/content/guides +cat > new-guide.md <1000) + - Many shared relationships (>100 shares per user) + - Query time consistently >100ms + +**Always profile first:** + +```sql +EXPLAIN ANALYZE +SELECT instances.*, ... +FROM instances +LEFT JOIN instances_shared_with ON ... 
+WHERE instances.owner = 'alice' OR instances_shared_with.user_handle = 'alice'; +``` + +## Index Optimization + +### Current Indexes + +From migration `004_refactor_llm_services_architecture.sql`: + +```sql +-- API Standards / Definitions +CREATE INDEX idx_definitions_handle + ON definitions(definition_handle); + +CREATE INDEX idx_definitions_owner_handle + ON definitions(owner, definition_handle); + +-- Instances +CREATE INDEX idx_instances_handle + ON instances(instance_handle); + +-- Sharing (implicit from PRIMARY KEY) +-- instances_shared_with(instance_id, user_handle) +``` + +### Recommended Additional Indexes + +#### 1. Owner Lookups + +```sql +-- For queries: WHERE instances.owner = ? +CREATE INDEX idx_instances_owner + ON instances(owner); +``` + +**Benefit:** Fast retrieval of user's owned instances + +**Use case:** +```sql +SELECT * FROM instances WHERE owner = 'alice'; +``` + +#### 2. Shared Instance Lookups + +```sql +-- For queries: WHERE user_handle = ? +CREATE INDEX idx_instances_shared_user + ON instances_shared_with(user_handle); +``` + +**Benefit:** Fast retrieval of instances shared with user + +**Use case:** +```sql +SELECT i.* FROM instances i +INNER JOIN instances_shared_with isw ON i.instance_id = isw.instance_id +WHERE isw.user_handle = 'bob'; +``` + +#### 3. Composite Owner+Handle Index + +```sql +-- For unique constraint and lookups +CREATE UNIQUE INDEX idx_instances_owner_handle + ON instances(owner, instance_handle); +``` + +**Benefit:** Enforces uniqueness and enables index-only scans + +**Use case:** +```sql +SELECT * FROM instances +WHERE owner = 'alice' AND instance_handle = 'my-service'; +``` + +#### 4. 
Embedding Dimension Filtering + +```sql +-- Critical for similarity search performance +CREATE INDEX idx_embeddings_project_dim + ON embeddings(project_id, vector_dim); +``` + +**Benefit:** Filters embeddings by dimension before vector comparison + +**Use case:** +```sql +SELECT * FROM embeddings +WHERE project_id = 123 + AND vector_dim = 1536 +ORDER BY vector <=> $1::vector +LIMIT 10; +``` + +#### 5. Project Ownership + +```sql +CREATE INDEX idx_projects_owner + ON projects(owner); + +CREATE UNIQUE INDEX idx_projects_owner_handle + ON projects(owner, project_handle); +``` + +### Index Analysis + +Check if indexes are being used: + +```sql +-- Analyze query plan +EXPLAIN ANALYZE +SELECT * FROM instances WHERE owner = 'alice'; + +-- Check index usage statistics +SELECT + schemaname, + tablename, + indexname, + idx_scan, + idx_tup_read, + idx_tup_fetch +FROM pg_stat_user_indexes +WHERE schemaname = 'public' +ORDER BY idx_scan DESC; + +-- Find unused indexes +SELECT + schemaname, + tablename, + indexname +FROM pg_stat_user_indexes +WHERE idx_scan = 0 + AND schemaname = 'public'; +``` + +### Index Maintenance + +```sql +-- Update statistics for query planner +ANALYZE instances; +ANALYZE instances_shared_with; +ANALYZE embeddings; + +-- Rebuild index if fragmented +REINDEX INDEX idx_instances_owner; + +-- Check index size +SELECT + indexname, + pg_size_pretty(pg_relation_size(indexrelid)) AS size +FROM pg_stat_user_indexes +WHERE schemaname = 'public' +ORDER BY pg_relation_size(indexrelid) DESC; +``` + +## Vector Index Optimization + +### HNSW vs IVFFlat + +**Current implementation uses HNSW:** + +```sql +CREATE INDEX embedding_vector_idx +ON embeddings +USING hnsw (vector vector_cosine_ops); +``` + +#### HNSW (Hierarchical Navigable Small World) + +**Pros:** +- Better recall (finds more similar results) +- Better query performance +- No training required + +**Cons:** +- Slower index build time +- Higher memory usage +- Larger index size + +**Configuration options:** + 
+```sql
+-- Default: m=16, ef_construction=64
+CREATE INDEX embedding_vector_idx
+ON embeddings
+USING hnsw (vector vector_cosine_ops)
+WITH (m = 16, ef_construction = 64);
+
+-- Higher quality (slower build): m=32, ef_construction=128
+CREATE INDEX embedding_vector_idx_hq
+ON embeddings
+USING hnsw (vector vector_cosine_ops)
+WITH (m = 32, ef_construction = 128);
+```
+
+**Parameters:**
+- `m`: Number of connections per layer (default 16, range 2-100)
+- `ef_construction`: Size of candidate list during build (default 64, range 4-1000)
+- Higher values = better recall but slower build and more memory
+
+#### IVFFlat (Inverted File with Flat compression)
+
+**Alternative for very large datasets:**
+
+```sql
+CREATE INDEX embedding_vector_idx_ivf
+ON embeddings
+USING ivfflat (vector vector_cosine_ops)
+WITH (lists = 100);
+```
+
+**Pros:**
+- Faster index build
+- Lower memory usage
+- Smaller index size
+
+**Cons:**
+- Requires training data (cluster centers are computed from the rows present at `CREATE INDEX` time, so build the index after loading data)
+- Lower recall than HNSW
+- Need to tune `lists` parameter
+
+**When to use:**
+- Dataset >1M embeddings
+- Build time is critical
+- Memory constrained environment
+
+**Configuration:**
+
+```sql
+-- Rule of thumb (from pgvector): lists = rows / 1000 up to 1M rows, sqrt(rows) above
+-- For 100K embeddings: lists = 100
+-- For 1M embeddings: lists = 1000
+
+-- Create the index only after loading data; ANALYZE refreshes planner statistics
+ANALYZE embeddings;
+
+-- Create with appropriate lists
+CREATE INDEX embedding_vector_idx_ivf
+ON embeddings
+USING ivfflat (vector vector_cosine_ops)
+WITH (lists = 1000);
+
+-- Set probes at query time
+SET ivfflat.probes = 10; -- Default is 1, higher = better recall
+```
+
+### Query-Time Optimization
+
+For HNSW, set `hnsw.ef_search`:
+
+```sql
+-- Default: ef_search = 40
+-- Higher = better recall but slower queries
+SET hnsw.ef_search = 100;
+
+SELECT vector <=> $1::vector AS distance
+FROM embeddings
+WHERE project_id = 123
+ORDER BY distance
+LIMIT 10;
+```
+
+### Dimension-Specific Indexes
+
+For multi-dimensional projects:
+
+```sql
+-- Separate indexes per 
common dimension
+-- (an untyped vector column must be cast so the index knows its dimensions)
+CREATE INDEX idx_embeddings_768_vector
+ON embeddings USING hnsw ((vector::vector(768)) vector_cosine_ops)
+WHERE vector_dim = 768;
+
+CREATE INDEX idx_embeddings_1536_vector
+ON embeddings USING hnsw ((vector::vector(1536)) vector_cosine_ops)
+WHERE vector_dim = 1536;
+
+-- hnsw indexes at most 2,000 dimensions; larger vectors need a halfvec cast
+CREATE INDEX idx_embeddings_3072_vector
+ON embeddings USING hnsw ((vector::halfvec(3072)) halfvec_cosine_ops)
+WHERE vector_dim = 3072;
+```
+
+**Benefit:** Smaller indexes = faster queries for specific dimensions
+
+## Caching Strategies
+
+### 1. System Definitions Cache
+
+System definitions rarely change:
+
+```go
+var (
+	systemDefsCache    []models.Definition
+	systemDefsCacheMu  sync.RWMutex
+	systemDefsCacheTTL = 5 * time.Minute
+	systemDefsCacheExp time.Time
+)
+
+func GetSystemDefinitions(ctx context.Context, pool *pgxpool.Pool) ([]models.Definition, error) {
+	systemDefsCacheMu.RLock()
+	if time.Now().Before(systemDefsCacheExp) && systemDefsCache != nil {
+		defer systemDefsCacheMu.RUnlock()
+		return systemDefsCache, nil
+	}
+	systemDefsCacheMu.RUnlock()
+
+	// Fetch from database
+	defs, err := db.GetDefinitionsByOwner(ctx, "_system")
+	if err != nil {
+		return nil, err
+	}
+
+	// Update cache
+	systemDefsCacheMu.Lock()
+	systemDefsCache = defs
+	systemDefsCacheExp = time.Now().Add(systemDefsCacheTTL)
+	systemDefsCacheMu.Unlock()
+
+	return defs, nil
+}
+```
+
+### 2. 
User Instances Cache + +Cache user's instance list with short TTL: + +```go +type InstanceCache struct { + data map[string][]models.Instance + mu sync.RWMutex + ttl time.Duration + exp map[string]time.Time +} + +func (c *InstanceCache) Get(userHandle string) ([]models.Instance, bool) { + c.mu.RLock() + defer c.mu.RUnlock() + + if exp, ok := c.exp[userHandle]; ok && time.Now().Before(exp) { + return c.data[userHandle], true + } + return nil, false +} + +func (c *InstanceCache) Set(userHandle string, instances []models.Instance) { + c.mu.Lock() + defer c.mu.Unlock() + + c.data[userHandle] = instances + c.exp[userHandle] = time.Now().Add(c.ttl) +} + +func (c *InstanceCache) Invalidate(userHandle string) { + c.mu.Lock() + defer c.mu.Unlock() + + delete(c.data, userHandle) + delete(c.exp, userHandle) +} +``` + +**Usage:** + +```go +var instanceCache = &InstanceCache{ + data: make(map[string][]models.Instance), + exp: make(map[string]time.Time), + ttl: 30 * time.Second, +} + +func GetUserInstances(ctx context.Context, pool *pgxpool.Pool, userHandle string) ([]models.Instance, error) { + // Check cache + if instances, ok := instanceCache.Get(userHandle); ok { + return instances, nil + } + + // Query database + instances, err := db.GetAccessibleInstances(ctx, userHandle) + if err != nil { + return nil, err + } + + // Cache results + instanceCache.Set(userHandle, instances) + return instances, nil +} +``` + +### 3. Project Metadata Cache + +Cache project metadata including schema: + +```go +type ProjectCache struct { + projects map[string]*models.Project // key: "owner/handle" + mu sync.RWMutex + ttl time.Duration +} + +func (c *ProjectCache) Get(owner, handle string) (*models.Project, bool) { + key := fmt.Sprintf("%s/%s", owner, handle) + c.mu.RLock() + defer c.mu.RUnlock() + + project, ok := c.projects[key] + return project, ok +} +``` + +### 4. 
Redis-Based Caching + +For distributed deployments: + +```go +import "github.com/go-redis/redis/v8" + +type RedisCache struct { + client *redis.Client + ttl time.Duration +} + +func (c *RedisCache) GetProject(ctx context.Context, owner, handle string) (*models.Project, error) { + key := fmt.Sprintf("project:%s:%s", owner, handle) + + data, err := c.client.Get(ctx, key).Bytes() + if err == redis.Nil { + return nil, nil // Not in cache + } else if err != nil { + return nil, err + } + + var project models.Project + err = json.Unmarshal(data, &project) + return &project, err +} + +func (c *RedisCache) SetProject(ctx context.Context, project *models.Project) error { + key := fmt.Sprintf("project:%s:%s", project.Owner, project.Handle) + data, err := json.Marshal(project) + if err != nil { + return err + } + + return c.client.Set(ctx, key, data, c.ttl).Err() +} +``` + +### Cache Invalidation + +Always invalidate cache on updates: + +```go +func UpdateProject(ctx context.Context, pool *pgxpool.Pool, project *models.Project) error { + // Update database + err := db.UpdateProject(ctx, project) + if err != nil { + return err + } + + // Invalidate cache + projectCache.Invalidate(project.Owner, project.Handle) + + return nil +} +``` + +## Connection Pool Optimization + +### Pool Configuration + +```go +func InitDB(opts *models.Options) *pgxpool.Pool { + config, err := pgxpool.ParseConfig(connString) + if err != nil { + log.Fatal(err) + } + + // Connection pool settings + config.MaxConns = 20 // Max concurrent connections + config.MinConns = 5 // Keep-alive connections + config.MaxConnLifetime = time.Hour // Recycle connections + config.MaxConnIdleTime = 5 * time.Minute // Close idle connections + config.HealthCheckPeriod = time.Minute // Health check frequency + + // Statement cache + config.ConnConfig.StatementCacheCapacity = 100 + + pool, err := pgxpool.NewWithConfig(context.Background(), config) + return pool +} +``` + +### Pool Sizing + +**General rule:** + +``` +MaxConns = 
(available_cores * 2) + effective_spindle_count
+```
+
+**Example scenarios:**
+
+- **4-core CPU, SSD**: MaxConns = 10-20
+- **8-core CPU, SSD**: MaxConns = 20-40
+- **Under heavy load**: Start conservative, increase based on monitoring
+
+### Monitor Pool Usage
+
+```go
+func monitorPool(pool *pgxpool.Pool) {
+	ticker := time.NewTicker(30 * time.Second)
+
+	for range ticker.C {
+		stat := pool.Stat()
+		// MaxConns-TotalConns is remaining capacity, not queued acquires
+		log.Printf("Pool stats: total=%d, idle=%d, acquired=%d, available=%d",
+			stat.TotalConns(),
+			stat.IdleConns(),
+			stat.AcquiredConns(),
+			stat.MaxConns()-stat.TotalConns(),
+		)
+	}
+}
+```
+
+## Performance Testing
+
+### Load Testing Setup
+
+Use [vegeta](https://github.com/tsenart/vegeta) for HTTP load testing:
+
+```bash
+# Install vegeta
+go install github.com/tsenart/vegeta@latest
+
+# Create targets file
+cat > targets.txt < old.txt
+# Make changes
+go test -bench=. -benchmem ./... > new.txt
+benchcmp old.txt new.txt
+```
+
+### Database Performance Testing
+
+Test with realistic data:
+
+```sql
+-- Generate test data
+INSERT INTO embeddings (project_id, text_id, vector, vector_dim, metadata)
+SELECT
+  1,
+  'doc_' || generate_series,
+  array_fill(random()::real, ARRAY[1536])::vector,
+  1536,
+  '{"author": "test"}'::jsonb
+FROM generate_series(1, 100000);
+
+-- Test query performance
+EXPLAIN (ANALYZE, BUFFERS)
+SELECT text_id, vector <=> $1::vector AS distance
+FROM embeddings
+WHERE project_id = 1 AND vector_dim = 1536
+ORDER BY distance
+LIMIT 10;
+```
+
+## Metrics and Monitoring
+
+### Application Metrics
+
+Track key metrics:
+
+```go
+type Metrics struct {
+	QueryDuration    prometheus.Histogram
+	QueryCount       prometheus.Counter
+	CacheHits        prometheus.Counter
+	CacheMisses      prometheus.Counter
+	PoolWaitDuration prometheus.Histogram
+}
+
+func recordQueryMetrics(start time.Time, query string) {
+	duration := time.Since(start).Seconds()
+	metrics.QueryDuration.Observe(duration)
+	metrics.QueryCount.Inc()
+}
+```
+
+### PostgreSQL Metrics
+
+Monitor database 
performance:
+
+```sql
+-- Slow queries (requires the pg_stat_statements extension;
+-- on PostgreSQL 13+ the columns are named *_exec_time)
+SELECT
+  query,
+  calls,
+  total_exec_time,
+  mean_exec_time,
+  max_exec_time
+FROM pg_stat_statements
+ORDER BY mean_exec_time DESC
+LIMIT 10;
+
+-- Table statistics
+SELECT
+  schemaname,
+  tablename,
+  n_tup_ins,
+  n_tup_upd,
+  n_tup_del,
+  n_live_tup,
+  n_dead_tup
+FROM pg_stat_user_tables;
+
+-- Index usage
+SELECT
+  schemaname,
+  tablename,
+  indexname,
+  idx_scan,
+  idx_tup_read,
+  idx_tup_fetch
+FROM pg_stat_user_indexes
+ORDER BY idx_scan DESC;
+```
+
+### Performance Targets
+
+Based on typical usage:
+
+| Operation | Target | Acceptable | Action Required |
+|-----------|--------|------------|-----------------|
+| Single instance lookup | < 10ms | < 50ms | > 50ms |
+| List accessible instances (<100) | < 50ms | < 100ms | > 100ms |
+| Create/update instance | < 100ms | < 200ms | > 200ms |
+| Similarity search (10 results) | < 50ms | < 100ms | > 100ms |
+| Similarity search (100 results) | < 100ms | < 200ms | > 200ms |
+| Embedding insert (single) | < 50ms | < 100ms | > 100ms |
+| Embedding batch (100) | < 500ms | < 1000ms | > 1000ms |
+
+## Implementation Priority
+
+### High Priority
+
+1. **Profile current performance** with realistic data
+2. **Add dimension filtering index**: `idx_embeddings_project_dim`
+3. **Monitor slow queries** with pg_stat_statements
+
+### Medium Priority
+
+1. **Implement UNION ALL optimization** if GetAllAccessibleInstances > 100ms
+2. **Add caching for system definitions**
+3. **Optimize connection pool settings** based on load
+
+### Low Priority
+
+1. **Add Redis caching layer** for high-traffic deployments
+2. **Implement application metrics** with Prometheus
+3. **Add additional indexes** based on actual query patterns
+4. **Tune HNSW parameters** for specific use cases
+
+## General Best Practices
+
+### 1. Measure Before Optimizing
+
+```bash
+# Profile in production
+EXPLAIN ANALYZE your_query;
+
+# Load test
+vegeta attack -duration=60s -rate=100
+
+# Benchmark
+go test -bench=. 
-benchmem +``` + +### 2. Start Conservative + +- Don't optimize prematurely +- Use simple queries first +- Add complexity only when needed + +### 3. Monitor Continuously + +- Track query performance +- Monitor connection pool +- Watch index usage +- Alert on slow queries + +### 4. Document Optimizations + +- Note why optimization was needed +- Include before/after metrics +- Document trade-offs made + +## Further Reading + +- [PostgreSQL Performance Tuning](https://wiki.postgresql.org/wiki/Performance_Optimization) +- [pgvector Performance Guide](https://github.com/pgvector/pgvector#performance) +- [Go Database/SQL Tutorial](https://go.dev/doc/database/querying) +- [Connection Pool Best Practices](https://github.com/jackc/pgx/wiki/Connection-Pool-Best-Practices) diff --git a/docs/content/development/testing.md b/docs/content/development/testing.md new file mode 100644 index 0000000..b59372d --- /dev/null +++ b/docs/content/development/testing.md @@ -0,0 +1,510 @@ +--- +title: "Testing" +weight: 1 +--- + +# Testing Guide + +This guide covers how to run and write tests for dhamps-vdb. + +## Running Tests + +dhamps-vdb uses integration tests that spin up real PostgreSQL containers using [testcontainers](https://testcontainers.com/guides/getting-started-with-testcontainers-for-go/). This approach ensures tests run against actual database instances with pgvector support. + +### Prerequisites + +**Using Podman (Recommended for Linux):** + +```bash +# Start podman socket +systemctl --user start podman.socket + +# Export DOCKER_HOST for testcontainers +export DOCKER_HOST=unix://$XDG_RUNTIME_DIR/podman/podman.sock +``` + +**Using Docker:** + +Testcontainers works with Docker out of the box. Ensure Docker daemon is running. + +### Running All Tests + +```bash +# Run all tests with verbose output +go test -v ./... + +# Run tests without verbose output +go test ./... + +# Run tests with coverage +go test -cover ./... 
+ +# Generate coverage report +go test -coverprofile=coverage.out ./... +go tool cover -html=coverage.out +``` + +### Running Specific Tests + +```bash +# Run tests in a specific package +go test -v ./internal/handlers + +# Run a specific test function +go test -v ./internal/handlers -run TestCreateUser + +# Run tests matching a pattern +go test -v ./... -run ".*Sharing.*" +``` + +### Test Containers + +Tests automatically manage PostgreSQL containers with pgvector: + +- Container is started before tests run +- Database schema is migrated automatically +- Container is cleaned up after tests complete +- Each test suite gets a fresh database state + +## Test Structure + +### Package Organization + +Tests are organized alongside the code they test: + +``` +internal/ +├── handlers/ +│ ├── users.go # Implementation +│ ├── users_test.go # Unit/integration tests +│ ├── projects.go +│ ├── projects_test.go +│ ├── projects_sharing_test.go +│ ├── embeddings.go +│ └── embeddings_test.go +├── database/ +│ └── queries.sql # SQL queries for sqlc +└── models/ + ├── users.go + └── projects.go +``` + +### Test Files + +Test files follow Go conventions: + +- **Filename**: `*_test.go` +- **Package**: Same as code under test (e.g., `package handlers`) +- **Test functions**: `func TestFunctionName(t *testing.T)` + +### Test Fixtures + +Test data is stored in `testdata/`: + +``` +testdata/ +├── postgres/ +│ ├── enable-vector.sql # Database initialization +│ └── users.yml +├── valid_embeddings.json +├── valid_user.json +├── valid_api_standard_openai_v1.json +├── valid_llm_service_openai-large-full.json +└── invalid_embeddings.json +``` + +## Writing Tests + +### Basic Test Structure + +```go +package handlers + +import ( + "context" + "testing" + + "github.com/mpilhlt/dhamps-vdb/internal/database" + "github.com/stretchr/testify/assert" +) + +func TestCreateUser(t *testing.T) { + // Setup: Initialize database pool and test data + ctx := context.Background() + pool := setupTestDatabase(t) + 
defer pool.Close()
+
+    // Execute: Call function under test
+    result, err := CreateUser(ctx, pool, userData)
+
+    // Assert: Verify results
+    assert.NoError(t, err)
+    assert.Equal(t, "testuser", result.UserHandle)
+}
+```
+
+### Integration Test Example
+
+```go
+func TestProjectSharingWorkflow(t *testing.T) {
+    pool := setupTestDatabase(t)
+    defer pool.Close()
+
+    // Create test users
+    alice := createTestUser(t, pool, "alice")
+    bob := createTestUser(t, pool, "bob")
+
+    // Create project as Alice
+    project := createTestProject(t, pool, alice, "test-project")
+
+    // Share project with Bob
+    err := shareProject(t, pool, alice, project.ID, bob.Handle, "reader")
+    assert.NoError(t, err)
+
+    // Verify Bob can access project
+    projects := getAccessibleProjects(t, pool, bob)
+    assert.Contains(t, projects, project)
+}
+```
+
+### Testing with Testcontainers
+
+```go
+func setupTestDatabase(t *testing.T) *pgxpool.Pool {
+    ctx := context.Background()
+
+    // Create PostgreSQL container with pgvector
+    req := testcontainers.ContainerRequest{
+        Image:        "pgvector/pgvector:0.7.4-pg16",
+        ExposedPorts: []string{"5432/tcp"},
+        Env: map[string]string{
+            "POSTGRES_PASSWORD": "password",
+            "POSTGRES_DB":       "testdb",
+        },
+        WaitingFor: wait.ForLog("database system is ready to accept connections"),
+    }
+
+    container, err := testcontainers.GenericContainer(ctx, testcontainers.GenericContainerRequest{
+        ContainerRequest: req,
+        Started:          true,
+    })
+    require.NoError(t, err)
+
+    // Get connection details
+    host, _ := container.Host(ctx)
+    port, _ := container.MappedPort(ctx, "5432")
+
+    // Connect and migrate
+    connString := fmt.Sprintf("postgres://postgres:password@%s:%s/testdb", host, port.Port())
+    pool := connectAndMigrate(t, connString)
+
+    return pool
+}
+```
+
+### Validation Testing
+
+Test dimension and metadata validation:
+
+```go
+func TestEmbeddingDimensionValidation(t *testing.T) {
+    pool := setupTestDatabase(t)
+    defer pool.Close()
+
+    // Create LLM service with 1536
dimensions + llmService := createTestLLMService(t, pool, "openai", 1536) + + // Try to insert embedding with wrong dimensions + embedding := models.Embedding{ + TextID: "doc1", + Vector: make([]float32, 768), // Wrong size! + VectorDim: 768, + } + + err := insertEmbedding(t, pool, embedding) + assert.Error(t, err) + assert.Contains(t, err.Error(), "dimension mismatch") +} + +func TestMetadataSchemaValidation(t *testing.T) { + pool := setupTestDatabase(t) + defer pool.Close() + + // Create project with metadata schema + schema := `{"type":"object","properties":{"author":{"type":"string"}},"required":["author"]}` + project := createProjectWithSchema(t, pool, "alice", "test", schema) + + // Valid metadata should succeed + validMeta := `{"author":"John Doe"}` + err := insertEmbeddingWithMetadata(t, pool, project.ID, validMeta) + assert.NoError(t, err) + + // Invalid metadata should fail + invalidMeta := `{"year":2024}` // Missing required 'author' + err = insertEmbeddingWithMetadata(t, pool, project.ID, invalidMeta) + assert.Error(t, err) +} +``` + +### Table-Driven Tests + +For testing multiple scenarios: + +```go +func TestSimilaritySearch(t *testing.T) { + tests := []struct { + name string + threshold float32 + limit int + expected int + }{ + {"high threshold", 0.9, 10, 2}, + {"medium threshold", 0.7, 10, 5}, + {"low threshold", 0.5, 10, 8}, + {"with limit", 0.5, 3, 3}, + } + + pool := setupTestDatabase(t) + defer pool.Close() + + for _, tt := range tests { + t.Run(tt.name, func(t *testing.T) { + results := searchSimilar(t, pool, "doc1", tt.threshold, tt.limit) + assert.Len(t, results, tt.expected) + }) + } +} +``` + +### Cleanup Testing + +Verify database cleanup: + +```go +func TestDatabaseCleanup(t *testing.T) { + pool := setupTestDatabase(t) + defer pool.Close() + + // Create test data + user := createTestUser(t, pool, "alice") + project := createTestProject(t, pool, user, "test") + createTestEmbeddings(t, pool, project, 10) + + // Delete user (should cascade) + 
err := deleteUser(t, pool, user.Handle)
+    assert.NoError(t, err)
+
+    // Verify all related data is deleted
+    assertTableEmpty(t, pool, "users")
+    assertTableEmpty(t, pool, "projects")
+    assertTableEmpty(t, pool, "embeddings")
+}
+```
+
+## Test Helpers
+
+Create helper functions to reduce boilerplate:
+
+```go
+// setupTestDatabase initializes a test database with migrations
+func setupTestDatabase(t *testing.T) *pgxpool.Pool {
+    // Implementation...
+}
+
+// createTestUser creates a user for testing
+func createTestUser(t *testing.T, pool *pgxpool.Pool, handle string) *models.User {
+    // Implementation...
+}
+
+// createTestProject creates a project for testing
+func createTestProject(t *testing.T, pool *pgxpool.Pool, owner *models.User, handle string) *models.Project {
+    // Implementation...
+}
+
+// assertTableEmpty verifies a table has no rows
+func assertTableEmpty(t *testing.T, pool *pgxpool.Pool, tableName string) {
+    var count int
+    err := pool.QueryRow(context.Background(),
+        fmt.Sprintf("SELECT COUNT(*) FROM %s", tableName)).Scan(&count)
+    require.NoError(t, err)
+    assert.Equal(t, 0, count, "table %s should be empty", tableName)
+}
+```
+
+## CI/CD Integration
+
+### GitHub Actions
+
+Example GitHub Actions workflow:
+
+```yaml
+name: Tests
+
+on: [push, pull_request]
+
+jobs:
+  test:
+    runs-on: ubuntu-latest
+
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Set up Go
+        uses: actions/setup-go@v5
+        with:
+          go-version: '1.21'
+
+      - name: Run tests
+        run: |
+          go test -v -race -coverprofile=coverage.txt ./...
+
+      - name: Upload coverage
+        uses: codecov/codecov-action@v4
+        with:
+          files: ./coverage.txt
+```
+
+### Local CI Testing
+
+Test as CI would:
+
+```bash
+# Clean test with race detection
+go clean -testcache
+go test -v -race ./...
+
+# Test with coverage
+go test -coverprofile=coverage.out ./...
+
+# Check for test caching issues
+go test -count=1 ./...
+```
+
+## Best Practices
+
+### 1.
**Use testcontainers for Real Databases** +- Don't mock the database layer +- Test against actual PostgreSQL with pgvector +- Catch SQL-specific issues + +### 2. **Isolate Tests** +- Each test should be independent +- Use transactions or cleanup between tests +- Don't rely on test execution order + +### 3. **Test Validation Logic** +- Test dimension validation +- Test metadata schema validation +- Test authorization checks + +### 4. **Test Error Conditions** +- Invalid input +- Missing resources (404) +- Unauthorized access (403) +- Constraint violations + +### 5. **Keep Tests Fast** +- Use parallel tests where possible: `t.Parallel()` +- Reuse test database containers when appropriate +- Avoid unnecessary sleeps + +### 6. **Use Descriptive Names** +```go +// Good +func TestProjectSharingWithReaderRole(t *testing.T) + +// Less clear +func TestSharing(t *testing.T) +``` + +### 7. **Assert Meaningfully** +```go +// Good - specific assertion +assert.Equal(t, "alice", project.Owner) + +// Less helpful +assert.True(t, project.Owner == "alice") +``` + +## Debugging Tests + +### Verbose Output + +```bash +# See all test output +go test -v ./internal/handlers + +# See SQL queries (if logging enabled) +SERVICE_DEBUG=true go test -v ./... +``` + +### Run Single Test + +```bash +# Focus on one failing test +go test -v ./internal/handlers -run TestCreateProject +``` + +### Keep Test Database Running + +For manual inspection, prevent container cleanup: + +```go +func TestWithDebugContainer(t *testing.T) { + container := setupContainer(t) + // Comment out: defer container.Terminate(ctx) + + host, port := getContainerDetails(container) + t.Logf("Database running at %s:%s", host, port) + + // Run test... 
+
+    // Container stays running for manual inspection
+    time.Sleep(time.Hour)
+}
+```
+
+Then connect with psql, using the mapped port logged by the test:
+
+```bash
+psql -h localhost -p <mapped-port> -U postgres -d testdb
+```
+
+## Common Issues
+
+### Container Startup Failures
+
+```bash
+# Check Docker/Podman is running
+systemctl --user status podman.socket
+
+# Check for port conflicts
+netstat -tulpn | grep 5432
+
+# Clean up containers
+podman rm -f $(podman ps -aq)
+```
+
+### Test Timeouts
+
+Increase timeout for slow containers:
+
+```go
+WaitingFor: wait.ForLog("ready").WithStartupTimeout(2 * time.Minute)
+```
+
+### Permission Errors
+
+Ensure the test user has the required permissions:
+
+```sql
+GRANT ALL PRIVILEGES ON DATABASE testdb TO testuser;
+GRANT ALL ON SCHEMA public TO testuser;
+```
+
+## Further Reading
+
+- [Go Testing Documentation](https://pkg.go.dev/testing)
+- [Testcontainers for Go](https://golang.testcontainers.org/)
+- [Testify Assertions](https://github.com/stretchr/testify)
+- [Table-Driven Tests in Go](https://dave.cheney.net/2019/05/07/prefer-table-driven-tests)
diff --git a/docs/content/getting-started/_index.md b/docs/content/getting-started/_index.md
new file mode 100644
index 0000000..2b30682
--- /dev/null
+++ b/docs/content/getting-started/_index.md
@@ -0,0 +1,31 @@
+---
+title: "Getting Started"
+weight: 1
+---
+
+# Getting Started with dhamps-vdb
+
+This section helps you get dhamps-vdb up and running quickly. Whether you're using Docker or compiling from source, you'll find everything you need to start using the vector database API.
+ +## What You'll Learn + +- How to install and run dhamps-vdb +- How to configure the service for your environment +- Basic usage patterns and workflows +- Creating your first project and embeddings + +## Prerequisites + +Before you begin, ensure you have: + +- PostgreSQL 11+ with pgvector extension (or use the provided Docker setup) +- Go 1.21+ (if compiling from source) +- Docker and Docker Compose (for containerized deployment) + +## Quick Links + +- [Installation](installation/) - Compile and install dhamps-vdb +- [Docker Deployment](docker/) - Run with Docker (recommended) +- [Configuration](configuration/) - Environment variables and options +- [Quick Start](quick-start/) - Your first API requests +- [First Project](first-project/) - Complete walkthrough diff --git a/docs/content/getting-started/configuration.md b/docs/content/getting-started/configuration.md new file mode 100644 index 0000000..8997b24 --- /dev/null +++ b/docs/content/getting-started/configuration.md @@ -0,0 +1,156 @@ +--- +title: "Configuration" +weight: 2 +--- + +# Configuration + +Configure dhamps-vdb using environment variables or command-line options. + +## Environment Variables + +All configuration can be set via environment variables. Use a `.env` file to keep sensitive information secure. 
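Since the secrets must be long and random, one way to bootstrap a `.env` is to generate them directly into the file. This is a sketch: the variable names follow the tables below, and the database settings shown are placeholders you should adjust for your environment.

```shell
# Sketch: create a .env with freshly generated secrets.
# SERVICE_DB* values are placeholders -- adjust them for your database.
cat > .env <<EOF
SERVICE_ADMINKEY=$(openssl rand -base64 32)
ENCRYPTION_KEY=$(openssl rand -hex 32)
SERVICE_DBPASSWORD=$(openssl rand -base64 24)
EOF
chmod 600 .env   # restrict permissions so only the owner can read the secrets
```

The `openssl rand` invocations match the key-generation commands shown later on this page; `openssl rand -hex 32` yields a 64-character key, comfortably above the 32-character minimum.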
+ +### Service Configuration + +| Variable | Description | Default | Required | +|----------|-------------|---------|----------| +| `SERVICE_DEBUG` | Enable debug logging | `true` | No | +| `SERVICE_HOST` | Hostname to listen on | `localhost` | No | +| `SERVICE_PORT` | Port to listen on | `8880` | No | + +### Database Configuration + +| Variable | Description | Default | Required | +|----------|-------------|---------|----------| +| `SERVICE_DBHOST` | Database hostname | `localhost` | Yes | +| `SERVICE_DBPORT` | Database port | `5432` | No | +| `SERVICE_DBUSER` | Database username | `postgres` | Yes | +| `SERVICE_DBPASSWORD` | Database password | `password` | Yes | +| `SERVICE_DBNAME` | Database name | `postgres` | Yes | + +### Security Configuration + +| Variable | Description | Default | Required | +|----------|-------------|---------|----------| +| `SERVICE_ADMINKEY` | Admin API key for administrative operations | - | Yes | +| `ENCRYPTION_KEY` | Encryption key for API keys (32+ characters) | - | Yes | + +## Configuration File + +Create a `.env` file in the project root: + +```bash +# Service Configuration +SERVICE_DEBUG=false +SERVICE_HOST=0.0.0.0 +SERVICE_PORT=8880 + +# Database Configuration +SERVICE_DBHOST=localhost +SERVICE_DBPORT=5432 +SERVICE_DBUSER=dhamps_user +SERVICE_DBPASSWORD=secure_password +SERVICE_DBNAME=dhamps_vdb + +# Security +SERVICE_ADMINKEY=your-secure-admin-key-here +ENCRYPTION_KEY=your-32-character-encryption-key-minimum +``` + +## Command-Line Options + +You can also provide configuration via command-line flags: + +```bash +./dhamps-vdb \ + --debug \ + -p 8880 \ + --db-host localhost \ + --db-port 5432 \ + --db-user dhamps_user \ + --db-password secure_password \ + --db-name dhamps_vdb \ + --admin-key your-admin-key +``` + +## Generating Secure Keys + +### Admin Key + +Generate a secure admin key: + +```bash +openssl rand -base64 32 +``` + +### Encryption Key + +Generate a secure encryption key (minimum 32 characters): + +```bash +openssl 
rand -hex 32 +``` + +## Configuration Priority + +Configuration is loaded in the following order (later sources override earlier ones): + +1. Default values (from `options.go`) +2. Environment variables +3. `.env` file +4. Command-line flags + +## Security Best Practices + +- **Never commit `.env` files** to version control +- Use **strong, randomly generated keys** for production +- Ensure `.env` file permissions are restrictive (`chmod 600 .env`) +- **Store encryption key securely** - losing it means losing access to encrypted API keys +- Use different keys for development and production environments + +## Example Configuration + +### Development + +```bash +# .env (development) +SERVICE_DEBUG=true +SERVICE_HOST=localhost +SERVICE_PORT=8880 +SERVICE_DBHOST=localhost +SERVICE_DBPORT=5432 +SERVICE_DBUSER=postgres +SERVICE_DBPASSWORD=password +SERVICE_DBNAME=dhamps_vdb_dev +SERVICE_ADMINKEY=dev-admin-key-change-me +ENCRYPTION_KEY=dev-encryption-key-32-chars-min +``` + +### Production + +```bash +# .env (production) +SERVICE_DEBUG=false +SERVICE_HOST=0.0.0.0 +SERVICE_PORT=8880 +SERVICE_DBHOST=prod-db.example.com +SERVICE_DBPORT=5432 +SERVICE_DBUSER=dhamps_prod +SERVICE_DBPASSWORD=$(cat /run/secrets/db_password) +SERVICE_DBNAME=dhamps_vdb +SERVICE_ADMINKEY=$(cat /run/secrets/admin_key) +ENCRYPTION_KEY=$(cat /run/secrets/encryption_key) +``` + +## Validation + +The service validates configuration on startup and will exit with an error if required variables are missing. + +## Next Steps + +After configuration: + +1. [Run the Quick Start tutorial](quick-start/) +2. [Create your first project](first-project/) +3. 
Review [deployment options](../deployment/) diff --git a/docs/content/getting-started/docker.md b/docs/content/getting-started/docker.md new file mode 100644 index 0000000..82504d8 --- /dev/null +++ b/docs/content/getting-started/docker.md @@ -0,0 +1,537 @@ +--- +title: "Docker Deployment" +weight: 2 +--- + +# Docker Deployment + +Deploy dhamps-vdb using Docker containers. This is the recommended approach for most users. + +## Quick Start + +The fastest way to get dhamps-vdb running with Docker: + +```bash +# Clone the repository +git clone https://github.com/mpilhlt/dhamps-vdb.git +cd dhamps-vdb + +# Run automated setup (generates secure keys) +./docker-setup.sh + +# Start services with docker-compose +docker-compose up -d + +# Check logs +docker-compose logs -f dhamps-vdb + +# Access the API +curl http://localhost:8880/docs +``` + +## What's Included + +The Docker Compose setup includes: + +- **dhamps-vdb**: The vector database API service +- **PostgreSQL 16**: Database with pgvector extension +- **Persistent storage**: Named volume for database data + +## Configuration Files + +### .env File + +All configuration is managed through environment variables. Copy the template: + +```bash +cp .env.docker.template .env +``` + +Edit `.env` to set required values: + +```bash +# Admin API key for administrative operations +SERVICE_ADMINKEY=your-secure-admin-key-here + +# Encryption key for API keys (32+ characters) +ENCRYPTION_KEY=your-secure-encryption-key-min-32-chars + +# Database password +SERVICE_DBPASSWORD=secure-database-password + +# Optional: Debug mode +SERVICE_DEBUG=false + +# Optional: Change ports +API_PORT=8880 +POSTGRES_PORT=5432 +``` + +### docker-compose.yml + +The compose file defines two services: + +```yaml +services: + postgres: + image: pgvector/pgvector:0.7.4-pg16 + # PostgreSQL with pgvector support + + dhamps-vdb: + build: . 
+ # The API service + depends_on: + - postgres +``` + +## Deployment Options + +### Option 1: Docker Compose with Included Database (Recommended) + +Use the provided `docker-compose.yml`: + +```bash +docker-compose up -d +``` + +**Advantages:** +- Everything included +- Automatic networking +- Data persistence +- Easy to manage + +**Use when:** +- Getting started +- Development/testing +- Small to medium deployments + +### Option 2: Standalone Container with External Database + +Run only the dhamps-vdb container: + +```bash +# Build the image +docker build -t dhamps-vdb:latest . + +# Run the container +docker run -d \ + --name dhamps-vdb \ + -p 8880:8880 \ + -e SERVICE_DBHOST=your-db-host \ + -e SERVICE_DBPORT=5432 \ + -e SERVICE_DBUSER=dbuser \ + -e SERVICE_DBPASSWORD=dbpass \ + -e SERVICE_DBNAME=dhamps_vdb \ + -e SERVICE_ADMINKEY=admin-key \ + -e ENCRYPTION_KEY=encryption-key \ + dhamps-vdb:latest +``` + +**Use when:** +- You have an existing PostgreSQL server +- Production deployments +- Need database separation + +### Option 3: Docker Compose with External Database + +Modify `docker-compose.yml` to remove the postgres service: + +```yaml +services: + dhamps-vdb: + build: . + ports: + - "${API_PORT:-8880}:8880" + environment: + SERVICE_DBHOST: external-db.example.com + # ... other variables +``` + +## Building the Image + +### Standard Build + +```bash +docker build -t dhamps-vdb:latest . +``` + +### Custom Tag + +```bash +docker build -t dhamps-vdb:v0.1.0 . +``` + +### Clean Build (No Cache) + +```bash +docker build --no-cache -t dhamps-vdb:latest . +``` + +### Multi-Stage Build Details + +The Dockerfile uses multi-stage builds for efficiency: + +1. **Builder stage**: Compiles Go code with sqlc generation +2. 
**Runtime stage**: Minimal Alpine image with only the binary + +Result: Small, secure image (~20MB vs 800MB+) + +## Managing Services + +### Start Services + +```bash +# Start in background +docker-compose up -d + +# Start with logs visible +docker-compose up + +# Rebuild and start +docker-compose up -d --build +``` + +### View Logs + +```bash +# All logs +docker-compose logs + +# Follow logs in real-time +docker-compose logs -f + +# Specific service +docker-compose logs -f dhamps-vdb +docker-compose logs -f postgres +``` + +### Stop Services + +```bash +# Stop containers (keeps data) +docker-compose stop + +# Stop and remove containers (keeps data) +docker-compose down + +# Stop and remove everything including data +docker-compose down -v +``` + +### Restart Services + +```bash +# Restart all +docker-compose restart + +# Restart specific service +docker-compose restart dhamps-vdb +``` + +## Data Persistence + +### Docker Volumes + +The compose file creates a named volume: + +```yaml +volumes: + postgres_data: +``` + +This ensures database data persists across container restarts. + +### View Volumes + +```bash +docker volume ls +``` + +### Inspect Volume + +```bash +docker volume inspect dhamps-vdb_postgres_data +``` + +### Backup Database + +```bash +# Create backup +docker-compose exec postgres pg_dump -U postgres dhamps_vdb > backup.sql + +# Restore from backup +docker-compose exec -T postgres psql -U postgres dhamps_vdb < backup.sql +``` + +## Networking + +### Access from Host + +The API is accessible at: + +``` +http://localhost:8880 +``` + +### Access from Other Containers + +Use the service name as hostname: + +``` +http://dhamps-vdb:8880 +``` + +### Custom Network + +To use an existing Docker network: + +```yaml +networks: + default: + external: + name: your-network-name +``` + +## Security + +### Required Environment Variables + +Two critical environment variables must be set: + +1. **SERVICE_ADMINKEY**: Admin API key +2. 
**ENCRYPTION_KEY**: For encrypting user API keys (32+ chars) + +### Generating Secure Keys + +```bash +# Admin key +openssl rand -base64 32 + +# Encryption key +openssl rand -hex 32 +``` + +### Production Checklist + +- [ ] Use strong, randomly generated keys +- [ ] Never commit `.env` to version control +- [ ] Run behind reverse proxy (nginx, Traefik) +- [ ] Enable HTTPS/TLS +- [ ] Restrict database network access +- [ ] Set resource limits +- [ ] Enable logging and monitoring +- [ ] Use specific image tags (not `latest`) +- [ ] Regular security updates +- [ ] Backup database regularly + +## Verification + +### Check Service Status + +```bash +# Check if containers are running +docker-compose ps + +# Expected: both services "running" or "healthy" +``` + +### Test API Access + +```bash +# Get OpenAPI documentation +curl http://localhost:8880/docs + +# Should return HTML page +``` + +### Test Database Connection + +```bash +# Connect to PostgreSQL +docker-compose exec postgres psql -U postgres -d dhamps_vdb + +# Check pgvector extension +\dx + +# Should show vector extension +``` + +### Create Test User + +```bash +curl -X POST http://localhost:8880/v1/users \ + -H "Authorization: Bearer YOUR_ADMIN_KEY" \ + -H "Content-Type: application/json" \ + -d '{ + "user_handle": "testuser", + "full_name": "Test User" + }' +``` + +## Troubleshooting + +### Container Won't Start + +Check logs: +```bash +docker-compose logs dhamps-vdb +``` + +Common issues: +- Missing `SERVICE_ADMINKEY` or `ENCRYPTION_KEY` +- Database connection failure +- Port already in use + +### Database Connection Errors + +```bash +# Check postgres health +docker-compose ps + +# Check database logs +docker-compose logs postgres + +# Test connection +docker-compose exec postgres psql -U postgres -d dhamps_vdb -c "SELECT 1;" +``` + +### Can't Connect to API + +```bash +# Check if container is running +docker ps + +# Check port mapping +docker port dhamps-vdb + +# Test from inside container +docker-compose 
exec dhamps-vdb wget -O- http://localhost:8880/docs + +# Test from host +curl http://localhost:8880/docs +``` + +### Permission Issues + +The container runs as non-root user `appuser` (UID 1000). If you have permission errors: + +```bash +# Check volume permissions +docker volume inspect dhamps-vdb_postgres_data +``` + +### Reset Everything + +```bash +# Stop and remove everything +docker-compose down -v + +# Remove images +docker rmi dhamps-vdb:latest +docker rmi pgvector/pgvector:0.7.4-pg16 + +# Start fresh +docker-compose up -d --build +``` + +### Build Failures + +If Docker build fails with network errors: + +```bash +# Try with host network +docker build --network=host -t dhamps-vdb:latest . +``` + +## Advanced Configuration + +### Resource Limits + +Add to `docker-compose.yml`: + +```yaml +services: + dhamps-vdb: + deploy: + resources: + limits: + cpus: '2' + memory: 2G + reservations: + cpus: '0.5' + memory: 512M +``` + +### Health Checks + +The Dockerfile includes a health check: + +```dockerfile +HEALTHCHECK --interval=30s --timeout=3s \ + CMD wget --no-verbose --tries=1 --spider http://localhost:8880/ || exit 1 +``` + +View health status: + +```bash +docker inspect --format='{{.State.Health.Status}}' dhamps-vdb +``` + +### Custom Dockerfile Builds + +You can customize the build: + +```bash +docker build \ + --build-arg GO_VERSION=1.24 \ + -t dhamps-vdb:custom . 
+``` + +## External Database Setup + +If using an external PostgreSQL database: + +### Prepare Database + +```sql +-- Create database +CREATE DATABASE dhamps_vdb; + +-- Create user +CREATE USER dhamps_user WITH PASSWORD 'secure_password'; + +-- Grant privileges +GRANT ALL PRIVILEGES ON DATABASE dhamps_vdb TO dhamps_user; + +-- Connect to database +\c dhamps_vdb + +-- Grant schema permissions +GRANT ALL ON SCHEMA public TO dhamps_user; + +-- Enable pgvector +CREATE EXTENSION IF NOT EXISTS vector; +``` + +### Configure dhamps-vdb + +Update `.env`: + +```bash +SERVICE_DBHOST=external-db.example.com +SERVICE_DBPORT=5432 +SERVICE_DBUSER=dhamps_user +SERVICE_DBPASSWORD=secure_password +SERVICE_DBNAME=dhamps_vdb +``` + +Then run only the dhamps-vdb service or use a standalone container. + +## Next Steps + +After successful deployment: + +1. [Configure the service](../configuration/) +2. [Create your first user](../quick-start/) +3. [Set up a project](../first-project/) +4. Review [security best practices](../../deployment/security/) diff --git a/docs/content/getting-started/first-project.md b/docs/content/getting-started/first-project.md new file mode 100644 index 0000000..a29c571 --- /dev/null +++ b/docs/content/getting-started/first-project.md @@ -0,0 +1,415 @@ +--- +title: "First Project" +weight: 4 +--- + +# First Project + +Step-by-step guide to creating your first complete project in dhamps-vdb. + +## Overview + +This guide walks you through creating a complete RAG (Retrieval Augmented Generation) workflow: + +1. Set up authentication +2. Configure an LLM service +3. Create a project with metadata validation +4. Upload document embeddings +5. Search for similar documents +6. 
Share your project with collaborators + +## Step 1: Authentication Setup + +### Get Your API Key + +If you're an admin, create your first user: + +```bash +curl -X POST http://localhost:8880/v1/users \ + -H "Authorization: Bearer YOUR_ADMIN_KEY" \ + -H "Content-Type: application/json" \ + -d '{ + "user_handle": "researcher1", + "name": "Research User", + "email": "researcher@example.com" + }' +``` + +Save the returned `vdb_key` to a variable: + +```bash +export USER_KEY="your-returned-vdb-key" +``` + +### Verify Authentication + +Test your API key: + +```bash +curl -X GET http://localhost:8880/v1/users/researcher1 \ + -H "Authorization: Bearer $USER_KEY" +``` + +## Step 2: Configure LLM Service + +### Option A: Use System Definition + +List available system definitions: + +```bash +curl -X GET http://localhost:8880/v1/llm-services/_system \ + -H "Authorization: Bearer $USER_KEY" +``` + +Create an instance from a system definition: + +```bash +curl -X PUT http://localhost:8880/v1/llm-services/researcher1/my-embeddings \ + -H "Authorization: Bearer $USER_KEY" \ + -H "Content-Type: application/json" \ + -d '{ + "definition_owner": "_system", + "definition_handle": "openai-large", + "description": "My OpenAI embeddings instance", + "api_key_encrypted": "sk-proj-your-openai-api-key" + }' +``` + +### Option B: Create Custom Instance + +Create a standalone instance with custom configuration: + +```bash +curl -X PUT http://localhost:8880/v1/llm-services/researcher1/custom-embeddings \ + -H "Authorization: Bearer $USER_KEY" \ + -H "Content-Type: application/json" \ + -d '{ + "endpoint": "https://api.openai.com/v1/embeddings", + "api_standard": "openai", + "model": "text-embedding-3-small", + "dimensions": 1536, + "description": "Custom OpenAI small embeddings", + "api_key_encrypted": "sk-proj-your-api-key" + }' +``` + +## Step 3: Create Project with Metadata Schema + +Define a metadata schema to ensure consistent document metadata: + +```bash +curl -X POST 
http://localhost:8880/v1/projects/researcher1 \ + -H "Authorization: Bearer $USER_KEY" \ + -H "Content-Type: application/json" \ + -d '{ + "project_handle": "literature-analysis", + "description": "Literary texts for research analysis", + "instance_owner": "researcher1", + "instance_handle": "my-embeddings", + "metadataScheme": "{\"type\":\"object\",\"properties\":{\"author\":{\"type\":\"string\"},\"title\":{\"type\":\"string\"},\"year\":{\"type\":\"integer\"},\"genre\":{\"type\":\"string\",\"enum\":[\"poetry\",\"prose\",\"drama\"]},\"language\":{\"type\":\"string\"}},\"required\":[\"author\",\"title\",\"year\"]}" + }' +``` + +This schema requires `author`, `title`, and `year` fields, with optional `genre` and `language` fields. + +## Step 4: Upload Document Embeddings + +### Prepare Your Data + +Create a file `embeddings.json` with your document embeddings: + +```json +{ + "embeddings": [ + { + "text_id": "hamlet-act1-scene1", + "instance_handle": "my-embeddings", + "text": "Who's there? 
Nay, answer me: stand, and unfold yourself.", + "vector": [0.023, -0.015, 0.087, ...], + "vector_dim": 3072, + "metadata": { + "author": "William Shakespeare", + "title": "Hamlet", + "year": 1603, + "genre": "drama", + "language": "English" + } + }, + { + "text_id": "paradise-lost-book1-line1", + "instance_handle": "my-embeddings", + "text": "Of Man's first disobedience, and the fruit...", + "vector": [0.045, -0.032, 0.091, ...], + "vector_dim": 3072, + "metadata": { + "author": "John Milton", + "title": "Paradise Lost", + "year": 1667, + "genre": "poetry", + "language": "English" + } + } + ] +} +``` + +### Upload Embeddings + +```bash +curl -X POST http://localhost:8880/v1/embeddings/researcher1/literature-analysis \ + -H "Authorization: Bearer $USER_KEY" \ + -H "Content-Type: application/json" \ + -d @embeddings.json +``` + +### Verify Upload + +List all embeddings: + +```bash +curl -X GET "http://localhost:8880/v1/embeddings/researcher1/literature-analysis?limit=10" \ + -H "Authorization: Bearer $USER_KEY" +``` + +Get a specific embedding: + +```bash +curl -X GET http://localhost:8880/v1/embeddings/researcher1/literature-analysis/hamlet-act1-scene1 \ + -H "Authorization: Bearer $USER_KEY" +``` + +## Step 5: Search Similar Documents + +### Basic Similarity Search + +Find passages similar to Hamlet Act 1: + +```bash +curl -X GET "http://localhost:8880/v1/similars/researcher1/literature-analysis/hamlet-act1-scene1?count=5&threshold=0.7" \ + -H "Authorization: Bearer $USER_KEY" +``` + +**Response:** + +```json +{ + "user_handle": "researcher1", + "project_handle": "literature-analysis", + "results": [ + { + "id": "hamlet-act2-scene1", + "similarity": 0.89 + }, + { + "id": "macbeth-act1-scene3", + "similarity": 0.82 + }, + { + "id": "othello-act3-scene3", + "similarity": 0.76 + } + ] +} +``` + +### Search with Metadata Filtering + +Exclude passages from the same work: + +```bash +curl -X GET 
"http://localhost:8880/v1/similars/researcher1/literature-analysis/hamlet-act1-scene1?count=5&metadata_path=title&metadata_value=Hamlet" \ + -H "Authorization: Bearer $USER_KEY" +``` + +This excludes all documents where `metadata.title` equals "Hamlet". + +### Search with Raw Embeddings + +Search using a new embedding without storing it: + +```bash +curl -X POST "http://localhost:8880/v1/similars/researcher1/literature-analysis?count=5&threshold=0.7" \ + -H "Authorization: Bearer $USER_KEY" \ + -H "Content-Type: application/json" \ + -d '{ + "vector": [0.034, -0.021, 0.092, ...] + }' +``` + +## Step 6: Share Your Project + +### Share with Collaborators + +Grant read-only access to another user: + +```bash +curl -X POST http://localhost:8880/v1/projects/researcher1/literature-analysis/share \ + -H "Authorization: Bearer $USER_KEY" \ + -H "Content-Type: application/json" \ + -d '{ + "share_with_handle": "colleague1", + "role": "reader" + }' +``` + +Grant edit access: + +```bash +curl -X POST http://localhost:8880/v1/projects/researcher1/literature-analysis/share \ + -H "Authorization: Bearer $USER_KEY" \ + -H "Content-Type: application/json" \ + -d '{ + "share_with_handle": "colleague2", + "role": "editor" + }' +``` + +### Make Project Public + +Enable public read access (no authentication required): + +```bash +curl -X PATCH http://localhost:8880/v1/projects/researcher1/literature-analysis \ + -H "Authorization: Bearer $USER_KEY" \ + -H "Content-Type: application/json" \ + -d '{ + "public_read": true + }' +``` + +Now anyone can read embeddings and search without authentication: + +```bash +# No Authorization header needed +curl -X GET http://localhost:8880/v1/embeddings/researcher1/literature-analysis/hamlet-act1-scene1 +``` + +### View Shared Users + +List all users with access to your project: + +```bash +curl -X GET http://localhost:8880/v1/projects/researcher1/literature-analysis/shared-with \ + -H "Authorization: Bearer $USER_KEY" +``` + +## Step 7: Manage Your 
Project + +### Update Project Description + +```bash +curl -X PATCH http://localhost:8880/v1/projects/researcher1/literature-analysis \ + -H "Authorization: Bearer $USER_KEY" \ + -H "Content-Type: application/json" \ + -d '{ + "description": "Updated: Shakespearean and Renaissance literature analysis" + }' +``` + +### Update Metadata Schema + +```bash +curl -X PATCH http://localhost:8880/v1/projects/researcher1/literature-analysis \ + -H "Authorization: Bearer $USER_KEY" \ + -H "Content-Type: application/json" \ + -d '{ + "metadataScheme": "{\"type\":\"object\",\"properties\":{\"author\":{\"type\":\"string\"},\"title\":{\"type\":\"string\"},\"year\":{\"type\":\"integer\"},\"genre\":{\"type\":\"string\"},\"language\":{\"type\":\"string\"},\"act\":{\"type\":\"integer\"},\"scene\":{\"type\":\"integer\"}},\"required\":[\"author\",\"title\",\"year\"]}" + }' +``` + +### Delete Specific Embeddings + +```bash +curl -X DELETE http://localhost:8880/v1/embeddings/researcher1/literature-analysis/hamlet-act1-scene1 \ + -H "Authorization: Bearer $USER_KEY" +``` + +### Delete All Embeddings + +```bash +curl -X DELETE http://localhost:8880/v1/embeddings/researcher1/literature-analysis \ + -H "Authorization: Bearer $USER_KEY" +``` + +## Common Patterns + +### Batch Upload Script + +```bash +#!/bin/bash + +USER_KEY="your-vdb-key" +PROJECT="researcher1/literature-analysis" +API_URL="http://localhost:8880" + +# Process multiple files +for file in data/*.json; do + echo "Uploading $file..." + curl -X POST "$API_URL/v1/embeddings/$PROJECT" \ + -H "Authorization: Bearer $USER_KEY" \ + -H "Content-Type: application/json" \ + -d @"$file" +done +``` + +### Search and Filter Workflow + +```bash +# 1. Find similar documents +SIMILAR=$(curl -s -X GET "$API_URL/v1/similars/$PROJECT/doc1?count=20" \ + -H "Authorization: Bearer $USER_KEY") + +# 2. Extract IDs +IDS=$(echo $SIMILAR | jq -r '.results[].id') + +# 3. 
Retrieve full embeddings for similar documents +for id in $IDS; do + curl -X GET "$API_URL/v1/embeddings/$PROJECT/$id" \ + -H "Authorization: Bearer $USER_KEY" +done +``` + +## Troubleshooting + +### Validation Errors + +If metadata validation fails: + +```json +{ + "title": "Bad Request", + "status": 400, + "detail": "metadata validation failed for text_id 'doc1': year is required" +} +``` + +Check your metadata schema and ensure all required fields are present. + +### Dimension Mismatches + +If vector dimensions don't match: + +```json +{ + "title": "Bad Request", + "status": 400, + "detail": "dimension validation failed: expected 3072 dimensions, got 1536" +} +``` + +Verify your LLM service configuration and embedding dimensions. + +### Authentication Errors + +If you get 401 Unauthorized: + +- Check your API key is correct +- Ensure `Authorization: Bearer` prefix is included +- Verify the user owns the resource or has been granted access + +## Next Steps + +- [Learn about metadata validation](../guides/metadata-validation/) +- [Explore batch operations](../guides/batch-operations/) +- [Understand similarity search](../concepts/similarity-search/) +- [Review API documentation](../api/) diff --git a/docs/content/getting-started/installation.md b/docs/content/getting-started/installation.md new file mode 100644 index 0000000..ae9bcb1 --- /dev/null +++ b/docs/content/getting-started/installation.md @@ -0,0 +1,129 @@ +--- +title: "Installation" +weight: 1 +--- + +# Installation + +Install dhamps-vdb by compiling from source. + +## Prerequisites + +- **Go 1.21 or later** +- **PostgreSQL 11+** with pgvector extension +- **sqlc** for code generation + +## Quick Install + +```bash +# Clone the repository +git clone https://github.com/mpilhlt/dhamps-vdb.git +cd dhamps-vdb + +# Install dependencies and generate code +go get ./... +sqlc generate --no-remote + +# Build the binary +go build -o build/dhamps-vdb main.go +``` + +## Detailed Steps + +### 1. 
Install Dependencies + +Download all Go module dependencies: + +```bash +go get ./... +``` + +### 2. Generate Database Code + +Generate type-safe database queries using sqlc: + +```bash +sqlc generate --no-remote +``` + +This creates Go code from SQL queries in `internal/database/queries/`. + +### 3. Build the Application + +Compile the application: + +```bash +go build -o build/dhamps-vdb main.go +``` + +The binary will be created at `build/dhamps-vdb`. + +## Running Without Building + +You can run the application directly without building a binary: + +```bash +go run main.go +``` + +This is useful during development but slower than running a pre-built binary. + +## Verify Installation + +Check that the binary was created successfully: + +```bash +./build/dhamps-vdb --help +``` + +You should see the available command-line options. + +## Next Steps + +After installation, you need to: + +1. [Set up the database](../deployment/database/) +2. [Configure environment variables](configuration/) +3. [Run the service](quick-start/) + +## System Requirements + +- **Memory**: Minimum 512MB RAM (2GB+ recommended for production) +- **Disk**: Minimal (< 50MB for binary, database size varies) +- **CPU**: Any modern CPU (multi-core recommended for concurrent requests) + +## Troubleshooting + +### sqlc Command Not Found + +Install sqlc: + +```bash +go install github.com/sqlc-dev/sqlc/cmd/sqlc@latest +``` + +Make sure `$GOPATH/bin` is in your PATH. + +### Build Errors + +Ensure you're using Go 1.21 or later: + +```bash +go version +``` + +Clean the build cache if you encounter issues: + +```bash +go clean -cache +go build -o build/dhamps-vdb main.go +``` + +### Missing Dependencies + +Force update all dependencies: + +```bash +go mod download +go get -u ./... 
+``` diff --git a/docs/content/getting-started/quick-start.md b/docs/content/getting-started/quick-start.md new file mode 100644 index 0000000..5237db1 --- /dev/null +++ b/docs/content/getting-started/quick-start.md @@ -0,0 +1,292 @@ +--- +title: "Quick Start" +weight: 3 +--- + +# Quick Start + +Complete walkthrough from installation to searching similar documents using curl. + +## Prerequisites + +- dhamps-vdb installed and running +- Admin API key configured +- PostgreSQL with pgvector ready + +## 1. Create a User + +Create a new user with the admin API key: + +```bash +curl -X POST http://localhost:8880/v1/users \ + -H "Authorization: Bearer YOUR_ADMIN_KEY" \ + -H "Content-Type: application/json" \ + -d '{ + "user_handle": "alice", + "name": "Alice Smith", + "email": "alice@example.com" + }' +``` + +**Response:** + +```json +{ + "user_handle": "alice", + "name": "Alice Smith", + "email": "alice@example.com", + "vdb_key": "024v2013621509245f2e24...", + "created_at": "2024-01-15T10:30:00Z" +} +``` + +**Save the `vdb_key`** - it cannot be recovered later. + +## 2. Create an LLM Service Instance + +Create an LLM service configuration: + +```bash +curl -X PUT http://localhost:8880/v1/llm-services/alice/my-openai \ + -H "Authorization: Bearer alice_vdb_key" \ + -H "Content-Type: application/json" \ + -d '{ + "endpoint": "https://api.openai.com/v1/embeddings", + "api_standard": "openai", + "model": "text-embedding-3-large", + "dimensions": 3072, + "description": "OpenAI large embeddings", + "api_key_encrypted": "sk-proj-your-openai-key" + }' +``` + +**Response:** + +```json +{ + "instance_id": 1, + "instance_handle": "my-openai", + "owner": "alice", + "endpoint": "https://api.openai.com/v1/embeddings", + "model": "text-embedding-3-large", + "dimensions": 3072 +} +``` + +## 3. 
Create a Project + +Create a project to organize your embeddings: + +```bash +curl -X POST http://localhost:8880/v1/projects/alice \ + -H "Authorization: Bearer alice_vdb_key" \ + -H "Content-Type: application/json" \ + -d '{ + "project_handle": "research-docs", + "description": "Research document embeddings", + "instance_owner": "alice", + "instance_handle": "my-openai" + }' +``` + +**Response:** + +```json +{ + "project_id": 1, + "project_handle": "research-docs", + "owner": "alice", + "description": "Research document embeddings", + "instance_id": 1, + "created_at": "2024-01-15T10:35:00Z" +} +``` + +## 4. Upload Embeddings + +Upload document embeddings to your project: + +```bash +curl -X POST http://localhost:8880/v1/embeddings/alice/research-docs \ + -H "Authorization: Bearer alice_vdb_key" \ + -H "Content-Type: application/json" \ + -d '{ + "embeddings": [ + { + "text_id": "doc1", + "instance_handle": "my-openai", + "text": "Introduction to machine learning", + "vector": [0.1, 0.2, 0.3, ..., 0.5], + "vector_dim": 3072, + "metadata": { + "title": "ML Intro", + "author": "Alice", + "year": 2024 + } + }, + { + "text_id": "doc2", + "instance_handle": "my-openai", + "text": "Deep learning fundamentals", + "vector": [0.15, 0.25, 0.35, ..., 0.55], + "vector_dim": 3072, + "metadata": { + "title": "DL Fundamentals", + "author": "Bob", + "year": 2024 + } + } + ] + }' +``` + +**Response:** + +```json +{ + "message": "2 embeddings uploaded successfully" +} +``` + +## 5. 
Search for Similar Documents + +### Option A: Search Using Stored Document + +Find documents similar to an already-stored document: + +```bash +curl -X GET "http://localhost:8880/v1/similars/alice/research-docs/doc1?count=5&threshold=0.7" \ + -H "Authorization: Bearer alice_vdb_key" +``` + +**Response:** + +```json +{ + "user_handle": "alice", + "project_handle": "research-docs", + "results": [ + { + "id": "doc2", + "similarity": 0.92 + }, + { + "id": "doc5", + "similarity": 0.85 + } + ] +} +``` + +### Option B: Search Using Raw Embeddings + +Search without storing the query embedding: + +```bash +curl -X POST "http://localhost:8880/v1/similars/alice/research-docs?count=5&threshold=0.7" \ + -H "Authorization: Bearer alice_vdb_key" \ + -H "Content-Type: application/json" \ + -d '{ + "vector": [0.12, 0.22, 0.32, ..., 0.52] + }' +``` + +## 6. Filter by Metadata + +Exclude documents from a specific author when searching: + +```bash +curl -X GET "http://localhost:8880/v1/similars/alice/research-docs/doc1?count=5&metadata_path=author&metadata_value=Alice" \ + -H "Authorization: Bearer alice_vdb_key" +``` + +This excludes all documents where `metadata.author` equals "Alice". + +## 7. Retrieve Embeddings + +Get all embeddings in your project: + +```bash +curl -X GET "http://localhost:8880/v1/embeddings/alice/research-docs?limit=10&offset=0" \ + -H "Authorization: Bearer alice_vdb_key" +``` + +Get a specific embedding: + +```bash +curl -X GET http://localhost:8880/v1/embeddings/alice/research-docs/doc1 \ + -H "Authorization: Bearer alice_vdb_key" +``` + +## Complete Workflow Example + +Here's a complete script to get started: + +```bash +#!/bin/bash + +# Configuration +API_URL="http://localhost:8880" +ADMIN_KEY="your-admin-key" + +# 1. 
Create user +USER_RESPONSE=$(curl -s -X POST "$API_URL/v1/users" \ + -H "Authorization: Bearer $ADMIN_KEY" \ + -H "Content-Type: application/json" \ + -d '{"user_handle":"alice","name":"Alice Smith","email":"alice@example.com"}') + +USER_KEY=$(echo $USER_RESPONSE | jq -r '.vdb_key') +echo "User created with key: $USER_KEY" + +# 2. Create LLM service instance +curl -X PUT "$API_URL/v1/llm-services/alice/my-openai" \ + -H "Authorization: Bearer $USER_KEY" \ + -H "Content-Type: application/json" \ + -d '{ + "endpoint": "https://api.openai.com/v1/embeddings", + "api_standard": "openai", + "model": "text-embedding-3-large", + "dimensions": 3072, + "api_key_encrypted": "sk-your-key" + }' + +# 3. Create project +curl -X POST "$API_URL/v1/projects/alice" \ + -H "Authorization: Bearer $USER_KEY" \ + -H "Content-Type: application/json" \ + -d '{ + "project_handle": "research-docs", + "description": "Research documents", + "instance_owner": "alice", + "instance_handle": "my-openai" + }' + +# 4. Upload embeddings +curl -X POST "$API_URL/v1/embeddings/alice/research-docs" \ + -H "Authorization: Bearer $USER_KEY" \ + -H "Content-Type: application/json" \ + -d @embeddings.json + +# 5. Search similar +curl -X GET "$API_URL/v1/similars/alice/research-docs/doc1?count=5" \ + -H "Authorization: Bearer $USER_KEY" + +echo "Setup complete!" +``` + +## API Documentation + +For complete API documentation, visit: + +```bash +curl http://localhost:8880/docs +``` + +Or open http://localhost:8880/docs in your browser. 
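The similarity scores returned in step 5 can be sanity-checked locally. Assuming the service ranks by cosine similarity (an assumption on the client side — the actual metric is determined by the server's pgvector configuration), a minimal sketch:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError(f"dimension mismatch: {len(a)} != {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Compare a locally computed score against the similarity the API reports
score = cosine_similarity([0.1, 0.2, 0.3], [0.12, 0.22, 0.32])
print(f"local cosine similarity: {score:.4f}")  # close to 1.0 for near-identical vectors
```

If your locally computed score for a pair of stored vectors diverges sharply from the API's `similarity` value, the project is likely configured with a different distance metric.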
+ +## Next Steps + +- [Learn about projects](../concepts/projects/) +- [Understand embeddings](../concepts/embeddings/) +- [Explore sharing projects](../guides/project-sharing/) +- [Set up metadata validation](../guides/metadata-validation/) diff --git a/docs/content/guides/_index.md b/docs/content/guides/_index.md new file mode 100644 index 0000000..9b042cf --- /dev/null +++ b/docs/content/guides/_index.md @@ -0,0 +1,23 @@ +--- +title: "Guides" +weight: 3 +--- + +# User Guides + +Step-by-step guides for common tasks and workflows with dhamps-vdb. + +## Available Guides + +This section contains practical guides for using dhamps-vdb in real-world scenarios: + +- **RAG Workflows** - Implement Retrieval Augmented Generation +- **Project Sharing** - Collaborate with other users +- **Public Projects** - Enable unauthenticated access +- **Ownership Transfer** - Move projects between users +- **Metadata Validation** - Ensure data quality with schemas +- **Metadata Filtering** - Exclude documents from similarity search +- **Batch Operations** - Work with multiple embeddings efficiently +- **Instance Management** - Configure LLM service instances + +These guides complement the API reference by providing context and best practices for common use cases. diff --git a/docs/content/guides/batch-operations.md b/docs/content/guides/batch-operations.md new file mode 100644 index 0000000..aaac38c --- /dev/null +++ b/docs/content/guides/batch-operations.md @@ -0,0 +1,718 @@ +--- +title: "Batch Operations Guide" +weight: 7 +--- + +# Batch Operations Guide + +This guide explains how to efficiently upload multiple embeddings, manage large datasets, and implement best practices for batch operations in dhamps-vdb. + +## Overview + +For production workloads and large datasets, efficient batch operations are essential. 
This guide covers: +- Uploading multiple embeddings in a single request +- Pagination strategies for large result sets +- Best practices for performance and reliability +- Error handling in batch operations + +## Batch Upload Basics + +### Single Request with Multiple Embeddings + +Upload multiple embeddings in one API call using the `embeddings` array: + +```bash +curl -X POST "https://api.example.com/v1/embeddings/alice/my-project" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "embeddings": [ + { + "text_id": "doc001", + "instance_handle": "openai-large", + "vector": [0.1, 0.2, 0.3, ...], + "vector_dim": 3072, + "text": "First document content", + "metadata": {"category": "science"} + }, + { + "text_id": "doc002", + "instance_handle": "openai-large", + "vector": [0.11, 0.21, 0.31, ...], + "vector_dim": 3072, + "text": "Second document content", + "metadata": {"category": "history"} + }, + { + "text_id": "doc003", + "instance_handle": "openai-large", + "vector": [0.12, 0.22, 0.32, ...], + "vector_dim": 3072, + "text": "Third document content", + "metadata": {"category": "literature"} + } + ] + }' +``` + +**Response:** +```json +{ + "message": "Embeddings uploaded successfully", + "count": 3 +} +``` + +## Optimal Batch Sizes + +### Recommended Batch Sizes + +Based on typical embedding dimensions and network constraints: + +| Embedding Dimensions | Recommended Batch Size | Maximum Batch Size | +|---------------------|------------------------|-------------------| +| 384 (small models) | 500-1000 | 2000 | +| 768 (BERT-base) | 300-500 | 1000 | +| 1536 (OpenAI small) | 100-300 | 500 | +| 3072 (OpenAI large) | 50-100 | 200 | + +**Factors to consider:** +- Network bandwidth and latency +- API gateway timeout limits +- Database transaction size +- Memory constraints +- Client-side serialization limits + +### Finding Your Optimal Batch Size + +Test different batch sizes to find the sweet spot: + +```python +import time 
+import requests + +def test_batch_size(embeddings, batch_size): + """Test upload performance with given batch size""" + start_time = time.time() + + for i in range(0, len(embeddings), batch_size): + batch = embeddings[i:i+batch_size] + response = requests.post( + "https://api.example.com/v1/embeddings/alice/my-project", + headers={ + "Authorization": "Bearer alice_api_key", + "Content-Type": "application/json" + }, + json={"embeddings": batch} + ) + response.raise_for_status() + + elapsed = time.time() - start_time + throughput = len(embeddings) / elapsed + print(f"Batch size {batch_size}: {throughput:.1f} embeddings/sec") + +# Test different batch sizes +for size in [50, 100, 200, 500]: + test_batch_size(my_embeddings, size) +``` + +## Pagination for Large Datasets + +### Retrieving All Embeddings with Pagination + +Use `limit` and `offset` parameters to paginate through large result sets: + +```bash +# Get first page (embeddings 0-99) +curl -X GET "https://api.example.com/v1/embeddings/alice/my-project?limit=100&offset=0" \ + -H "Authorization: Bearer alice_api_key" + +# Get second page (embeddings 100-199) +curl -X GET "https://api.example.com/v1/embeddings/alice/my-project?limit=100&offset=100" \ + -H "Authorization: Bearer alice_api_key" + +# Get third page (embeddings 200-299) +curl -X GET "https://api.example.com/v1/embeddings/alice/my-project?limit=100&offset=200" \ + -H "Authorization: Bearer alice_api_key" +``` + +### Pagination Best Practices + +**Default Values:** +- `limit`: 10 (if not specified) +- `offset`: 0 (if not specified) +- Maximum `limit`: 200 + +**Example: Download Entire Project** + +```python +import requests + +def download_all_embeddings(owner, project): + """Download all embeddings from a project""" + all_embeddings = [] + offset = 0 + limit = 100 + + while True: + response = requests.get( + f"https://api.example.com/v1/embeddings/{owner}/{project}", + headers={"Authorization": "Bearer api_key"}, + params={"limit": limit, "offset": 
offset} + ) + response.raise_for_status() + + batch = response.json()['embeddings'] + if not batch: + break # No more results + + all_embeddings.extend(batch) + offset += len(batch) + + print(f"Downloaded {len(all_embeddings)} embeddings...") + + return all_embeddings + +# Usage +embeddings = download_all_embeddings("alice", "my-project") +print(f"Total: {len(embeddings)} embeddings") +``` + +## Efficient Batch Upload Strategies + +### Strategy 1: Simple Sequential Upload + +Good for small to medium datasets (< 10,000 embeddings): + +```python +def upload_sequential(embeddings, batch_size=100): + """Upload embeddings sequentially in batches""" + for i in range(0, len(embeddings), batch_size): + batch = embeddings[i:i+batch_size] + response = requests.post( + "https://api.example.com/v1/embeddings/alice/my-project", + headers={ + "Authorization": "Bearer alice_api_key", + "Content-Type": "application/json" + }, + json={"embeddings": batch} + ) + response.raise_for_status() + print(f"Uploaded batch {i//batch_size + 1}, total: {i+len(batch)}") +``` + +### Strategy 2: Parallel Upload with Threading + +Good for larger datasets with stable network: + +```python +import concurrent.futures +import requests + +def upload_batch(batch, batch_num): + """Upload a single batch""" + try: + response = requests.post( + "https://api.example.com/v1/embeddings/alice/my-project", + headers={ + "Authorization": "Bearer alice_api_key", + "Content-Type": "application/json" + }, + json={"embeddings": batch}, + timeout=60 + ) + response.raise_for_status() + return batch_num, True, None + except Exception as e: + return batch_num, False, str(e) + +def upload_parallel(embeddings, batch_size=100, max_workers=4): + """Upload embeddings in parallel""" + batches = [ + embeddings[i:i+batch_size] + for i in range(0, len(embeddings), batch_size) + ] + + failed = [] + with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor: + futures = { + executor.submit(upload_batch, batch, 
i): i + for i, batch in enumerate(batches) + } + + for future in concurrent.futures.as_completed(futures): + batch_num, success, error = future.result() + if success: + print(f"✓ Batch {batch_num+1}/{len(batches)} uploaded") + else: + print(f"✗ Batch {batch_num+1} failed: {error}") + failed.append((batch_num, batches[batch_num])) + + return failed + +# Usage +failed_batches = upload_parallel(my_embeddings, batch_size=100, max_workers=4) +if failed_batches: + print(f"Failed batches: {len(failed_batches)}") +``` + +### Strategy 3: Retry with Exponential Backoff + +Robust strategy for unreliable networks: + +```python +import time +import random + +def upload_with_retry(batch, max_retries=3): + """Upload batch with exponential backoff retry""" + for attempt in range(max_retries): + try: + response = requests.post( + "https://api.example.com/v1/embeddings/alice/my-project", + headers={ + "Authorization": "Bearer alice_api_key", + "Content-Type": "application/json" + }, + json={"embeddings": batch}, + timeout=60 + ) + response.raise_for_status() + return True + except requests.exceptions.RequestException as e: + if attempt < max_retries - 1: + wait = (2 ** attempt) + random.uniform(0, 1) + print(f"Retry attempt {attempt+1} after {wait:.1f}s: {e}") + time.sleep(wait) + else: + print(f"Failed after {max_retries} attempts: {e}") + return False + +def upload_robust(embeddings, batch_size=100): + """Upload with robust error handling""" + failed = [] + for i in range(0, len(embeddings), batch_size): + batch = embeddings[i:i+batch_size] + if not upload_with_retry(batch): + failed.append((i, batch)) + else: + print(f"✓ Uploaded {i+batch_size}/{len(embeddings)}") + + return failed +``` + +## Progress Tracking and Resumability + +### Checkpoint-Based Upload + +For very large datasets, implement checkpointing to resume after failures: + +```python +import json +import os + +class CheckpointUploader: + def __init__(self, checkpoint_file="upload_progress.json"): + 
self.checkpoint_file = checkpoint_file + self.progress = self.load_progress() + + def load_progress(self): + """Load upload progress from checkpoint file""" + if os.path.exists(self.checkpoint_file): + with open(self.checkpoint_file, 'r') as f: + return json.load(f) + return {"uploaded_count": 0, "failed_batches": []} + + def save_progress(self): + """Save current progress""" + with open(self.checkpoint_file, 'w') as f: + json.dump(self.progress, f) + + def upload(self, embeddings, batch_size=100): + """Upload with checkpointing""" + start_idx = self.progress["uploaded_count"] + + for i in range(start_idx, len(embeddings), batch_size): + batch = embeddings[i:i+batch_size] + + try: + response = requests.post( + "https://api.example.com/v1/embeddings/alice/my-project", + headers={ + "Authorization": "Bearer alice_api_key", + "Content-Type": "application/json" + }, + json={"embeddings": batch}, + timeout=60 + ) + response.raise_for_status() + + self.progress["uploaded_count"] = i + len(batch) + self.save_progress() + print(f"✓ Progress: {self.progress['uploaded_count']}/{len(embeddings)}") + + except Exception as e: + print(f"✗ Failed at index {i}: {e}") + self.progress["failed_batches"].append(i) + self.save_progress() + raise + +# Usage +uploader = CheckpointUploader() +try: + uploader.upload(my_embeddings, batch_size=100) + print("Upload complete!") +except: + print("Upload interrupted. 
Run again to resume.") +``` + +## Error Handling + +### Validation Errors + +If any embedding in a batch fails validation, the entire batch is rejected: + +```bash +curl -X POST "https://api.example.com/v1/embeddings/alice/my-project" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "embeddings": [ + { + "text_id": "doc001", + "instance_handle": "openai-large", + "vector": [0.1, 0.2, 0.3], + "vector_dim": 3072, + "metadata": {"author": "Alice"} + }, + { + "text_id": "doc002", + "instance_handle": "openai-large", + "vector": [0.1, 0.2], + "vector_dim": 3072, + "metadata": {"author": "Bob"} + } + ] + }' +``` + +**Error Response:** +```json +{ + "title": "Bad Request", + "status": 400, + "detail": "dimension validation failed: vector length mismatch for text_id 'doc002': actual vector has 2 elements but vector_dim declares 3072" +} +``` + +**Solution:** Validate all embeddings before batching, or handle errors and retry failed items. + +### Pre-Upload Validation + +```python +def validate_embeddings(embeddings, expected_dim): + """Validate embeddings before upload""" + errors = [] + + for i, emb in enumerate(embeddings): + # Check vector length + if len(emb['vector']) != expected_dim: + errors.append(f"Index {i} (text_id: {emb['text_id']}): " + f"vector length {len(emb['vector'])} != {expected_dim}") + + # Check declared dimension + if emb.get('vector_dim') != expected_dim: + errors.append(f"Index {i} (text_id: {emb['text_id']}): " + f"vector_dim {emb.get('vector_dim')} != {expected_dim}") + + # Check required fields + if not emb.get('text_id'): + errors.append(f"Index {i}: missing text_id") + + if not emb.get('instance_handle'): + errors.append(f"Index {i}: missing instance_handle") + + return errors + +# Usage +errors = validate_embeddings(my_embeddings, 3072) +if errors: + print("Validation errors:") + for error in errors: + print(f" - {error}") +else: + print("All embeddings valid, proceeding with upload...") +``` + 
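Because a single invalid embedding rejects the whole batch, one client-side recovery strategy (a sketch, not an API feature) is to bisect a rejected batch: valid halves upload as they are found, and the offending items are isolated in O(k log n) extra requests for k bad items out of n:

```python
def isolate_failures(batch, try_upload):
    """Recursively bisect a rejected batch to salvage valid embeddings.

    try_upload(items) must attempt the upload and return True on success,
    False (or raise) when the server rejects the batch. Note that valid
    sub-batches are uploaded as a side effect of the probing.
    """
    if not batch:
        return [], []
    try:
        if try_upload(batch):
            return list(batch), []
    except Exception:
        pass  # treat exceptions like a rejection and keep bisecting
    if len(batch) == 1:
        return [], list(batch)  # single offending item isolated
    mid = len(batch) // 2
    ok_left, bad_left = isolate_failures(batch[:mid], try_upload)
    ok_right, bad_right = isolate_failures(batch[mid:], try_upload)
    return ok_left + ok_right, bad_left + bad_right
```

Pre-upload validation (above) is still the cheaper first line of defense; bisection is a fallback for rejections the client could not predict, such as server-side metadata schema violations.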
+## Performance Optimization Tips + +### 1. Minimize Payload Size + +Exclude unnecessary fields: + +```python +# Include text only if needed for retrieval +embeddings_with_text = [ + { + "text_id": doc_id, + "instance_handle": "openai-large", + "vector": vector, + "vector_dim": 3072, + "text": text, # Include if needed + "metadata": metadata + } + for doc_id, vector, text, metadata in documents +] + +# Exclude text if not needed (smaller payload) +embeddings_without_text = [ + { + "text_id": doc_id, + "instance_handle": "openai-large", + "vector": vector, + "vector_dim": 3072, + "metadata": metadata + } + for doc_id, vector, metadata in documents +] +``` + +### 2. Compress Requests + +Use gzip compression for large payloads: + +```python +import gzip +import json + +def upload_compressed(embeddings): + """Upload with gzip compression""" + payload = json.dumps({"embeddings": embeddings}) + compressed = gzip.compress(payload.encode('utf-8')) + + response = requests.post( + "https://api.example.com/v1/embeddings/alice/my-project", + headers={ + "Authorization": "Bearer alice_api_key", + "Content-Type": "application/json", + "Content-Encoding": "gzip" + }, + data=compressed + ) + return response +``` + +### 3. Use Connection Pooling + +Reuse HTTP connections for multiple requests: + +```python +session = requests.Session() +session.headers.update({ + "Authorization": "Bearer alice_api_key", + "Content-Type": "application/json" +}) + +for batch in batches: + response = session.post( + "https://api.example.com/v1/embeddings/alice/my-project", + json={"embeddings": batch} + ) + response.raise_for_status() +``` + +### 4. 
Monitor Upload Rate + +Track and display upload progress: + +```python +import time + +class ProgressTracker: + def __init__(self, total): + self.total = total + self.uploaded = 0 + self.start_time = time.time() + + def update(self, count): + self.uploaded += count + elapsed = time.time() - self.start_time + rate = self.uploaded / elapsed if elapsed > 0 else 0 + percent = (self.uploaded / self.total) * 100 + eta = (self.total - self.uploaded) / rate if rate > 0 else 0 + + print(f"\rProgress: {self.uploaded}/{self.total} ({percent:.1f}%) " + f"Rate: {rate:.1f} emb/s ETA: {eta:.0f}s", end="") + +# Usage +tracker = ProgressTracker(len(all_embeddings)) +for batch in batches: + upload_batch(batch) + tracker.update(len(batch)) +print() # New line after completion +``` + +## Complete Example: Production-Grade Uploader + +```python +import requests +import time +import json +import logging +from concurrent.futures import ThreadPoolExecutor, as_completed +from typing import List, Dict, Tuple + +class ProductionUploader: + def __init__(self, api_base: str, api_key: str, + owner: str, project: str): + self.api_base = api_base + self.api_key = api_key + self.owner = owner + self.project = project + self.session = requests.Session() + self.session.headers.update({ + "Authorization": f"Bearer {api_key}", + "Content-Type": "application/json" + }) + + logging.basicConfig(level=logging.INFO) + self.logger = logging.getLogger(__name__) + + def upload_batch(self, batch: List[Dict], batch_num: int, + max_retries: int = 3) -> Tuple[int, bool, str]: + """Upload a single batch with retry logic""" + url = f"{self.api_base}/v1/embeddings/{self.owner}/{self.project}" + + for attempt in range(max_retries): + try: + response = self.session.post( + url, + json={"embeddings": batch}, + timeout=60 + ) + response.raise_for_status() + return batch_num, True, "" + except Exception as e: + if attempt < max_retries - 1: + wait = 2 ** attempt + self.logger.warning( + f"Batch {batch_num} attempt 
{attempt+1} failed: {e}. " + f"Retrying in {wait}s..." + ) + time.sleep(wait) + else: + return batch_num, False, str(e) + + def upload(self, embeddings: List[Dict], batch_size: int = 100, + max_workers: int = 4) -> Dict: + """Upload embeddings with parallel processing and progress tracking""" + batches = [ + embeddings[i:i+batch_size] + for i in range(0, len(embeddings), batch_size) + ] + + results = { + "total": len(embeddings), + "uploaded": 0, + "failed": [], + "start_time": time.time() + } + + self.logger.info(f"Uploading {len(embeddings)} embeddings in " + f"{len(batches)} batches...") + + with ThreadPoolExecutor(max_workers=max_workers) as executor: + futures = { + executor.submit(self.upload_batch, batch, i): i + for i, batch in enumerate(batches) + } + + for future in as_completed(futures): + batch_num, success, error = future.result() + + if success: + results["uploaded"] += len(batches[batch_num]) + percent = (results["uploaded"] / results["total"]) * 100 + self.logger.info( + f"✓ Batch {batch_num+1}/{len(batches)} " + f"({percent:.1f}% complete)" + ) + else: + results["failed"].append({ + "batch_num": batch_num, + "error": error, + "embeddings": batches[batch_num] + }) + self.logger.error(f"✗ Batch {batch_num+1} failed: {error}") + + results["elapsed"] = time.time() - results["start_time"] + results["rate"] = results["uploaded"] / results["elapsed"] + + self.logger.info( + f"\nUpload complete: {results['uploaded']}/{results['total']} " + f"in {results['elapsed']:.1f}s ({results['rate']:.1f} emb/s)" + ) + + if results["failed"]: + self.logger.warning(f"Failed batches: {len(results['failed'])}") + + return results + +# Usage +uploader = ProductionUploader( + api_base="https://api.example.com", + api_key="alice_api_key", + owner="alice", + project="my-project" +) + +results = uploader.upload( + embeddings=my_embeddings, + batch_size=100, + max_workers=4 +) + +# Save failed batches for retry +if results["failed"]: + with open("failed_batches.json", "w") as 
f: + json.dump(results["failed"], f) +``` + +## Best Practices Summary + +1. **Batch Size**: Test to find optimal size (typically 50-500 depending on dimensions) +2. **Parallelism**: Use 2-8 parallel workers for large uploads +3. **Retry Logic**: Implement exponential backoff for network errors +4. **Validation**: Pre-validate embeddings before upload +5. **Progress Tracking**: Monitor upload rate and ETA +6. **Checkpointing**: Save progress for resumable uploads +7. **Error Logging**: Log all errors with sufficient context +8. **Connection Reuse**: Use session objects for connection pooling +9. **Compression**: Use gzip for large payloads +10. **Testing**: Test with small batches before full upload + +## Related Documentation + +- [RAG Workflow Guide](./rag-workflow.md) - Complete RAG implementation +- [Metadata Validation Guide](./metadata-validation.md) - Schema validation +- [Instance Management Guide](./instance-management.md) - Managing LLM instances + +## Troubleshooting + +### Timeout Errors + +**Problem:** Requests timing out with large batches + +**Solution:** Reduce batch size or increase timeout value + +### Memory Issues + +**Problem:** Out of memory when processing large datasets + +**Solution:** Process embeddings in streaming fashion, don't load all into memory + +### Rate Limiting + +**Problem:** Getting rate limited by API + +**Solution:** Reduce parallelism (max_workers) or add delays between requests diff --git a/docs/content/guides/instance-management.md b/docs/content/guides/instance-management.md new file mode 100644 index 0000000..e09a871 --- /dev/null +++ b/docs/content/guides/instance-management.md @@ -0,0 +1,634 @@ +--- +title: "Instance Management Guide" +weight: 8 +--- + +# Instance Management Guide + +This guide explains how to create, configure, and share LLM service instances for generating and managing embeddings. 
+ +## Overview + +LLM service instances define the configuration for connecting to embedding services (like OpenAI, Cohere, or Gemini). Each instance includes: +- API endpoint and credentials +- Model name and version +- Vector dimensions +- API standard (protocol) + +Instances can be created from system templates, user-defined templates, or as standalone configurations. They can also be shared with other users for collaborative work. + +## Instance Architecture + +### System Definitions + +The system provides pre-configured templates for common LLM services: + +```bash +# List available system definitions (no auth required) +curl -X GET "https://api.example.com/v1/llm-service-definitions/_system" +``` + +**Default System Definitions:** +- `openai-large`: OpenAI text-embedding-3-large (3072 dimensions) +- `openai-small`: OpenAI text-embedding-3-small (1536 dimensions) +- `cohere-v4`: Cohere Embed v4 (1536 dimensions) +- `gemini-embedding-001`: Google Gemini embedding-001 (3072 dimensions, default size) + +### User Instances + +Users create instances for their own use. 
Instances contain: +- Configuration (endpoint, model, dimensions) +- Encrypted API keys (write-only, never returned) +- Optional reference to a definition template + +## Creating LLM Service Instances + +### Option 1: Standalone Instance + +Create an instance by specifying all configuration fields: + +```bash +curl -X PUT "https://api.example.com/v1/llm-services/alice/my-openai" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "instance_handle": "my-openai", + "endpoint": "https://api.openai.com/v1/embeddings", + "api_standard": "openai", + "model": "text-embedding-3-large", + "dimensions": 3072, + "description": "OpenAI large embedding model for research", + "api_key_encrypted": "sk-proj-your-openai-api-key-here" + }' +``` + +**Response:** +```json +{ + "instance_id": 123, + "instance_handle": "my-openai", + "owner": "alice", + "endpoint": "https://api.openai.com/v1/embeddings", + "api_standard": "openai", + "model": "text-embedding-3-large", + "dimensions": 3072, + "description": "OpenAI large embedding model for research" +} +``` + +**Note:** The `api_key_encrypted` field is not returned in the response for security reasons. + +### Option 2: From System Definition + +Create an instance based on a system template (only requires API key): + +```bash +curl -X POST "https://api.example.com/v1/llm-services/alice/my-openai-instance" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "instance_handle": "my-openai-instance", + "definition_owner": "_system", + "definition_handle": "openai-large", + "api_key_encrypted": "sk-proj-your-openai-api-key-here" + }' +``` + +This inherits configuration from the `_system/openai-large` definition and only requires you to provide your API key. 
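When scripting against the API, the request from Option 2 can be assembled programmatically. A minimal sketch, assuming the same hypothetical base URL and handles used above — it only builds the request; sending it with an HTTP client of your choice is left out:

```python
import json

API_BASE = "https://api.example.com/v1"  # placeholder deployment URL

def build_instance_from_definition(owner, instance_handle,
                                   definition_owner, definition_handle,
                                   api_key):
    """Assemble the POST request that creates an LLM service instance
    from an existing definition (system- or user-owned)."""
    return {
        "method": "POST",
        "url": f"{API_BASE}/llm-services/{owner}/{instance_handle}",
        "body": {
            "instance_handle": instance_handle,
            "definition_owner": definition_owner,
            "definition_handle": definition_handle,
            "api_key_encrypted": api_key,
        },
    }

req = build_instance_from_definition(
    "alice", "my-openai-instance", "_system", "openai-large",
    "sk-proj-your-openai-api-key-here",
)
print(req["method"], req["url"])
print(json.dumps(req["body"], indent=2))
```

Only the `api_key_encrypted` field is instance-specific here; everything else is inherited from the referenced definition, which is what makes this the lowest-effort way to create instances for well-known services.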
+ +### Option 3: From User Definition + +Users can create their own definitions as templates: + +```bash +# Step 1: Create a custom definition +curl -X PUT "https://api.example.com/v1/llm-service-definitions/alice/my-custom-config" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "definition_handle": "my-custom-config", + "endpoint": "https://custom-api.example.com/embeddings", + "api_standard": "openai", + "model": "custom-model-v2", + "dimensions": 2048, + "description": "Custom embedding service" + }' + +# Step 2: Create instance from that definition +curl -X POST "https://api.example.com/v1/llm-services/alice/custom-instance" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "instance_handle": "custom-instance", + "definition_owner": "alice", + "definition_handle": "my-custom-config", + "api_key_encrypted": "your-api-key-here" + }' +``` + +## Managing Instances + +### List Your Instances + +Get all instances you own or have access to: + +```bash +curl -X GET "https://api.example.com/v1/llm-services/alice" \ + -H "Authorization: Bearer alice_api_key" +``` + +**Response:** +```json +{ + "owned_instances": [ + { + "instance_id": 123, + "instance_handle": "my-openai", + "endpoint": "https://api.openai.com/v1/embeddings", + "model": "text-embedding-3-large", + "dimensions": 3072 + }, + { + "instance_id": 124, + "instance_handle": "my-cohere", + "endpoint": "https://api.cohere.ai/v1/embed", + "model": "embed-english-v4.0", + "dimensions": 1536 + } + ], + "shared_instances": [ + { + "instance_id": 456, + "instance_handle": "team-openai", + "owner": "bob", + "endpoint": "https://api.openai.com/v1/embeddings", + "model": "text-embedding-3-large", + "dimensions": 3072, + "role": "reader" + } + ] +} +``` + +### Get Instance Details + +Retrieve details for a specific instance: + +```bash +curl -X GET "https://api.example.com/v1/llm-services/alice/my-openai" \ + -H "Authorization: 
Bearer alice_api_key" +``` + +### Update Instance + +Update instance configuration (owner only): + +```bash +curl -X PATCH "https://api.example.com/v1/llm-services/alice/my-openai" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "description": "Updated description", + "api_key_encrypted": "sk-proj-new-api-key-here" + }' +``` + +### Delete Instance + +Delete an instance (owner only): + +```bash +curl -X DELETE "https://api.example.com/v1/llm-services/alice/my-openai" \ + -H "Authorization: Bearer alice_api_key" +``` + +**Note:** Cannot delete instances that are used by existing projects. + +## API Key Encryption + +### How Encryption Works + +API keys are encrypted using AES-256-GCM encryption: + +1. **Encryption Key Source**: `ENCRYPTION_KEY` environment variable +2. **Key Derivation**: SHA256 hash of the environment variable (ensures 32-byte key) +3. **Algorithm**: AES-256-GCM (authenticated encryption) +4. **Storage**: Encrypted bytes stored in `api_key_encrypted` column + +### Setting Up Encryption + +Add to your environment configuration: + +```bash +# .env file +ENCRYPTION_KEY="your-secure-random-32-character-key-or-longer" +``` + +**Important Security Notes:** +- Keep this key secure and backed up +- Losing the key means losing access to encrypted API keys +- Use a strong, random string (32+ characters) +- Never commit the key to version control +- Rotate the key periodically (requires re-encrypting all API keys) + +### API Key Security + +API keys are **write-only** in the API: + +```bash +# Upload API key (works) +curl -X PUT "https://api.example.com/v1/llm-services/alice/my-instance" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{"api_key_encrypted": "sk-..."}' + +# Retrieve instance (API key NOT returned) +curl -X GET "https://api.example.com/v1/llm-services/alice/my-instance" \ + -H "Authorization: Bearer alice_api_key" + +# Response does NOT include 
api_key_encrypted field +{ + "instance_id": 123, + "instance_handle": "my-instance", + "endpoint": "https://api.openai.com/v1/embeddings", + "model": "text-embedding-3-large", + "dimensions": 3072 + // No api_key_encrypted field! +} +``` + +This protects API keys from: +- Accidental exposure in logs +- Unauthorized access via shared instances +- Client-side data breaches + +## Instance Sharing + +### Share an Instance + +Grant another user access to your instance: + +```bash +curl -X POST "https://api.example.com/v1/llm-services/alice/my-openai/share" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "share_with_handle": "bob", + "role": "reader" + }' +``` + +**Roles:** +- `reader`: Can use the instance in projects (read-only) +- `editor`: Can use the instance (currently same as reader) +- `owner`: Full control (only one owner, the creator) + +### List Shared Users + +See who has access to your instance: + +```bash +curl -X GET "https://api.example.com/v1/llm-services/alice/my-openai/shared-with" \ + -H "Authorization: Bearer alice_api_key" +``` + +**Response:** +```json +{ + "instance_handle": "my-openai", + "owner": "alice", + "shared_with": [ + {"user_handle": "bob", "role": "reader"}, + {"user_handle": "charlie", "role": "reader"} + ] +} +``` + +### Unshare an Instance + +Revoke access: + +```bash +curl -X DELETE "https://api.example.com/v1/llm-services/alice/my-openai/share/bob" \ + -H "Authorization: Bearer alice_api_key" +``` + +### Using Shared Instances + +Bob can reference Alice's shared instance in his projects: + +```bash +# Bob creates a project using Alice's instance +curl -X PUT "https://api.example.com/v1/projects/bob/my-project" \ + -H "Authorization: Bearer bob_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "project_handle": "my-project", + "description": "Using shared instance", + "instance_owner": "alice", + "instance_handle": "my-openai" + }' + +# Bob uploads embeddings using the 
shared instance +curl -X POST "https://api.example.com/v1/embeddings/bob/my-project" \ + -H "Authorization: Bearer bob_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "embeddings": [{ + "text_id": "doc001", + "instance_handle": "alice/my-openai", + "vector": [0.1, 0.2, ...], + "vector_dim": 3072 + }] + }' +``` + +**Important:** Bob can use the instance but cannot see Alice's API key. + +## Instance Sharing Patterns + +### Team Shared Instance + +A team lead creates and shares an instance for the team: + +```bash +# Team lead creates instance +curl -X PUT "https://api.example.com/v1/llm-services/team_lead/team-openai" \ + -H "Authorization: Bearer team_lead_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "instance_handle": "team-openai", + "endpoint": "https://api.openai.com/v1/embeddings", + "api_standard": "openai", + "model": "text-embedding-3-large", + "dimensions": 3072, + "api_key_encrypted": "sk-proj-team-api-key" + }' + +# Share with team members +for member in alice bob charlie; do + curl -X POST "https://api.example.com/v1/llm-services/team_lead/team-openai/share" \ + -H "Authorization: Bearer team_lead_api_key" \ + -H "Content-Type: application/json" \ + -d "{\"share_with_handle\": \"$member\", \"role\": \"reader\"}" +done +``` + +### Organization-Wide Instance + +Create a shared instance for organization-wide use: + +```bash +# Organization admin creates instance +curl -X PUT "https://api.example.com/v1/llm-services/org_admin/org-embeddings" \ + -H "Authorization: Bearer org_admin_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "instance_handle": "org-embeddings", + "endpoint": "https://api.openai.com/v1/embeddings", + "api_standard": "openai", + "model": "text-embedding-3-large", + "dimensions": 3072, + "description": "Organization-wide embedding service", + "api_key_encrypted": "sk-proj-org-api-key" + }' +``` + +### Per-Project Instance + +Each project maintainer creates their own instance: + +```bash +# Alice 
creates her own instance for her project +curl -X PUT "https://api.example.com/v1/llm-services/alice/research-embeddings" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "instance_handle": "research-embeddings", + "endpoint": "https://api.openai.com/v1/embeddings", + "api_standard": "openai", + "model": "text-embedding-3-large", + "dimensions": 3072, + "api_key_encrypted": "sk-proj-alice-research-key" + }' +``` + +## Common Configurations + +### OpenAI Configuration + +```json +{ + "instance_handle": "openai-large", + "endpoint": "https://api.openai.com/v1/embeddings", + "api_standard": "openai", + "model": "text-embedding-3-large", + "dimensions": 3072, + "api_key_encrypted": "sk-proj-..." +} +``` + +### Cohere Configuration + +```json +{ + "instance_handle": "cohere-english", + "endpoint": "https://api.cohere.ai/v1/embed", + "api_standard": "cohere", + "model": "embed-english-v4.0", + "dimensions": 1536, + "api_key_encrypted": "your-cohere-api-key" +} +``` + +### Google Gemini Configuration + +```json +{ + "instance_handle": "gemini-embedding", + "endpoint": "https://generativelanguage.googleapis.com/v1beta/models/embedding-001:embedContent", + "api_standard": "gemini", + "model": "embedding-001", + "dimensions": 3072, + "api_key_encrypted": "your-gemini-api-key" +} +``` + +### Custom/Self-Hosted Service + +```json +{ + "instance_handle": "custom-service", + "endpoint": "https://custom-api.example.com/v1/embeddings", + "api_standard": "openai", + "model": "custom-model-v2", + "dimensions": 2048, + "description": "Self-hosted embedding service", + "api_key_encrypted": "custom-auth-token" +} +``` + +## Projects and Instances + +### 1:1 Relationship + +Each project must reference exactly one instance: + +```bash +curl -X PUT "https://api.example.com/v1/projects/alice/my-project" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "project_handle": "my-project", + 
"description": "Project with instance", + "instance_id": 123 + }' +``` + +### Changing Instance + +Update a project to use a different instance: + +```bash +curl -X PATCH "https://api.example.com/v1/projects/alice/my-project" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{"instance_id": 456}' +``` + +**Note:** Only switch to instances with matching dimensions, or you'll get validation errors on future uploads. + +### Finding Instance for Project + +```bash +curl -X GET "https://api.example.com/v1/projects/alice/my-project" \ + -H "Authorization: Bearer alice_api_key" +``` + +The response includes the `instance_id` field. + +## Best Practices + +### 1. Use System Definitions + +Start with system definitions for common services: + +```bash +# Easiest approach +curl -X POST "https://api.example.com/v1/llm-services/alice/my-instance" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "instance_handle": "my-instance", + "definition_owner": "_system", + "definition_handle": "openai-large", + "api_key_encrypted": "sk-..." + }' +``` + +### 2. Descriptive Instance Names + +Use clear, descriptive names: + +```bash +# Good names +"research-openai-large" +"prod-cohere-english" +"test-gemini-embedding" + +# Avoid generic names +"instance1" +"my-instance" +"test" +``` + +### 3. Separate Production and Development + +Create separate instances for different environments: + +```bash +# Development instance +curl -X PUT "https://api.example.com/v1/llm-services/alice/dev-openai" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{"instance_handle": "dev-openai", ...}' + +# Production instance +curl -X PUT "https://api.example.com/v1/llm-services/alice/prod-openai" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{"instance_handle": "prod-openai", ...}' +``` + +### 4. 
Document Instance Purpose + +Use the description field: + +```bash +curl -X PUT "https://api.example.com/v1/llm-services/alice/team-openai" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "instance_handle": "team-openai", + "description": "Shared OpenAI instance for research team. Contact alice@example.com for access.", + ... + }' +``` + +### 5. Regular Key Rotation + +Periodically update API keys: + +```bash +curl -X PATCH "https://api.example.com/v1/llm-services/alice/my-openai" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{"api_key_encrypted": "sk-proj-new-key-here"}' +``` + +### 6. Monitor Instance Usage + +Track which projects use each instance to avoid deleting in-use instances. + +## Troubleshooting + +### Cannot Delete Instance + +**Error:** "Instance is in use by existing projects" + +**Solution:** Delete or update projects using this instance first. + +### Dimension Mismatch + +**Error:** "vector dimension mismatch" + +**Solution:** Ensure embeddings match the instance's configured dimensions. + +### API Key Not Working + +**Problem:** Embeddings uploads fail with authentication errors + +**Solution:** +1. Verify API key is correct +2. Check API key permissions with the LLM provider +3. Update the API key in the instance + +### Cannot Access Shared Instance + +**Problem:** Getting "Instance not found" errors + +**Solution:** Verify you've been granted access. Contact the instance owner. + +## Related Documentation + +- [RAG Workflow Guide](./rag-workflow.md) - Complete RAG implementation +- [Project Sharing Guide](./project-sharing.md) - Share projects with users +- [Batch Operations Guide](./batch-operations.md) - Upload embeddings efficiently + +## Security Summary + +1. **API keys are encrypted** at rest using AES-256-GCM +2. **API keys are never returned** via GET requests +3. **Shared users cannot see API keys** (write-only field) +4. 
**Encryption key must be secured** (loss means cannot decrypt keys) +5. **Regular key rotation recommended** for production use diff --git a/docs/content/guides/metadata-filtering.md b/docs/content/guides/metadata-filtering.md new file mode 100644 index 0000000..3a27b48 --- /dev/null +++ b/docs/content/guides/metadata-filtering.md @@ -0,0 +1,468 @@ +--- +title: "Metadata Filtering Guide" +weight: 6 +--- + +# Metadata Filtering Guide + +This guide explains how to use metadata filtering to exclude specific documents from similarity search results. + +## Overview + +When searching for similar documents, you may want to exclude results that share certain metadata values with your query. For example: +- Exclude documents from the same author when finding similar writing styles +- Filter out documents from the same source when finding related content +- Exclude documents with the same category when exploring diversity + +dhamps-vdb provides metadata filtering using query parameters that perform **negative matching** - they exclude documents where the metadata field matches the specified value. + +## Query Parameters + +Both similarity search endpoints support metadata filtering: + +- `metadata_path`: The JSON path to the metadata field (e.g., `author`, `source.id`, `tags[0]`) +- `metadata_value`: The value to exclude from results + +Both parameters must be used together. If you specify one without the other, the API returns an error. + +## Basic Filtering Examples + +### Exclude Documents by Author + +Find similar documents but exclude those from the same author: + +```bash +curl -X GET "https://api.example.com/v1/similars/alice/literary-corpus/hamlet-soliloquy?count=10&metadata_path=author&metadata_value=William%20Shakespeare" \ + -H "Authorization: Bearer alice_api_key" +``` + +This returns similar documents, excluding any with `metadata.author == "William Shakespeare"`. 
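When building these URLs in code, it is easiest to let a library handle the percent-encoding of the query parameters (note the encoded space in `William%20Shakespeare` above). A stdlib-only sketch, assuming the same hypothetical host and handles:

```python
from urllib.parse import quote, urlencode

def filtered_similars_url(owner, project, text_id,
                          exclude_path, exclude_value, count=10):
    """Build a similarity-search URL that excludes documents whose
    metadata field `exclude_path` equals `exclude_value`."""
    base = f"https://api.example.com/v1/similars/{owner}/{project}/{text_id}"
    query = urlencode(
        {
            "count": count,
            "metadata_path": exclude_path,
            "metadata_value": exclude_value,  # spaces become %20, etc.
        },
        quote_via=quote,
    )
    return f"{base}?{query}"

url = filtered_similars_url("alice", "literary-corpus", "hamlet-soliloquy",
                            "author", "William Shakespeare")
print(url)
# → https://api.example.com/v1/similars/alice/literary-corpus/hamlet-soliloquy?count=10&metadata_path=author&metadata_value=William%20Shakespeare
```

Passing `quote_via=quote` makes `urlencode` emit `%20` for spaces rather than `+`; both are valid in a query string, but `%20` matches the examples in this guide.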
+ +### Exclude Documents from Same Source + +Find similar content from different sources: + +```bash +curl -X GET "https://api.example.com/v1/similars/alice/news-articles/article123?count=10&metadata_path=source&metadata_value=NYTimes" \ + -H "Authorization: Bearer alice_api_key" +``` + +This excludes any documents with `metadata.source == "NYTimes"`. + +### Exclude by Category + +Find documents in different categories: + +```bash +curl -X GET "https://api.example.com/v1/similars/alice/products/product456?count=10&metadata_path=category&metadata_value=electronics" \ + -H "Authorization: Bearer alice_api_key" +``` + +This excludes any documents with `metadata.category == "electronics"`. + +## Filtering with Raw Embeddings + +Metadata filtering also works when searching with raw embedding vectors: + +```bash +curl -X POST "https://api.example.com/v1/similars/alice/literary-corpus?count=10&metadata_path=author&metadata_value=William%20Shakespeare" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "vector": [0.032, -0.018, 0.056, ...] + }' +``` + +This searches using the provided vector but excludes documents where `metadata.author == "William Shakespeare"`. 
+ +## Nested Metadata Paths + +For nested metadata objects, use dot notation: + +### Example Metadata Structure + +```json +{ + "author": { + "name": "Jane Doe", + "id": "author123", + "affiliation": "University" + }, + "publication": { + "journal": "Science", + "year": 2023 + } +} +``` + +### Filter by Nested Field + +```bash +# Exclude documents from same author ID +curl -X GET "https://api.example.com/v1/similars/alice/papers/paper001?count=10&metadata_path=author.id&metadata_value=author123" \ + -H "Authorization: Bearer alice_api_key" + +# Exclude documents from same journal +curl -X GET "https://api.example.com/v1/similars/alice/papers/paper001?count=10&metadata_path=publication.journal&metadata_value=Science" \ + -H "Authorization: Bearer alice_api_key" +``` + +## Combining with Other Parameters + +Metadata filtering works seamlessly with other search parameters: + +```bash +curl -X GET "https://api.example.com/v1/similars/alice/documents/doc123?count=20&threshold=0.8&limit=10&offset=0&metadata_path=source_id&metadata_value=src_456" \ + -H "Authorization: Bearer alice_api_key" +``` + +Parameters: +- `count=20`: Consider top 20 similar documents +- `threshold=0.8`: Only include documents with similarity ≥ 0.8 +- `limit=10`: Return at most 10 results +- `offset=0`: Start from first result (for pagination) +- `metadata_path=source_id`: Filter on this metadata field +- `metadata_value=src_456`: Exclude documents with this value + +## Use Cases + +### 1. 
Finding Similar Writing Styles Across Authors + +When analyzing writing styles, you want similar texts from different authors: + +```bash +# Upload documents with author metadata +curl -X POST "https://api.example.com/v1/embeddings/alice/writing-styles" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "embeddings": [{ + "text_id": "tolstoy-passage-1", + "instance_handle": "openai-large", + "vector": [0.1, 0.2, ...], + "vector_dim": 3072, + "text": "Happy families are all alike...", + "metadata": { + "author": "Leo Tolstoy", + "work": "Anna Karenina", + "language": "Russian" + } + }] + }' + +# Find similar writing styles from other authors +curl -X GET "https://api.example.com/v1/similars/alice/writing-styles/tolstoy-passage-1?count=10&metadata_path=author&metadata_value=Leo%20Tolstoy" \ + -H "Authorization: Bearer alice_api_key" +``` + +### 2. Cross-Source Content Discovery + +Find related news articles from different sources: + +```bash +# Search for similar content, excluding same source +curl -X GET "https://api.example.com/v1/similars/alice/news-corpus/nyt-article-456?count=15&metadata_path=source&metadata_value=New%20York%20Times" \ + -H "Authorization: Bearer alice_api_key" +``` + +This helps discover how different outlets cover similar topics. + +### 3. Product Recommendations Across Categories + +Find similar products in different categories: + +```bash +# User is viewing a laptop +curl -X GET "https://api.example.com/v1/similars/alice/product-catalog/laptop-001?count=10&threshold=0.7&metadata_path=category&metadata_value=electronics" \ + -H "Authorization: Bearer alice_api_key" +``` + +This could recommend accessories, furniture (for home office), or other complementary items. + +### 4. 
Research Paper Discovery + +Find related papers from different research groups: + +```bash +curl -X GET "https://api.example.com/v1/similars/alice/research-papers/paper123?count=20&metadata_path=lab_id&metadata_value=lab_abc_001" \ + -H "Authorization: Bearer alice_api_key" +``` + +Helps researchers discover related work from other institutions. + +### 5. Avoiding Duplicate Content + +When building a diverse content feed, exclude items from the same collection: + +```bash +curl -X GET "https://api.example.com/v1/similars/alice/blog-posts/post789?count=5&metadata_path=collection_id&metadata_value=series_xyz" \ + -H "Authorization: Bearer alice_api_key" +``` + +### 6. Cross-Language Document Discovery + +Find similar documents in other languages: + +```bash +curl -X GET "https://api.example.com/v1/similars/alice/multilingual-docs/doc_en_123?count=10&metadata_path=language&metadata_value=en" \ + -H "Authorization: Bearer alice_api_key" +``` + +This finds semantically similar documents in languages other than English. + +## Working with Multiple Values + +Currently, you can only filter by one metadata field at a time. To exclude multiple values, you need to: + +1. **Make multiple requests** and merge results in your application +2. **Use more specific metadata fields** that combine multiple attributes +3. 
**Post-process results** on the client side

### Example: Excluding Multiple Authors

```python
import requests

def find_similar_excluding_authors(doc_id, exclude_authors):
    """Find similar docs excluding multiple authors.

    Runs one filtered query per excluded author and keeps only the
    documents returned by every query: a document survives the
    intersection only if its author matches none of the excluded
    values. (A simple union of the per-query results would re-admit
    documents by the other excluded authors.)
    """
    result_sets = []
    similarity = {}

    for author in exclude_authors:
        response = requests.get(
            f"https://api.example.com/v1/similars/alice/corpus/{doc_id}",
            headers={"Authorization": "Bearer alice_api_key"},
            params={
                "count": 20,
                "metadata_path": "author",
                "metadata_value": author
            }
        )
        results = response.json()['results']
        result_sets.append({r['id'] for r in results})
        for r in results:
            similarity[r['id']] = r['similarity']

    # Keep documents present in every result set, ranked by similarity
    surviving = set.intersection(*result_sets) if result_sets else set()
    ranked = sorted(surviving, key=lambda i: similarity[i], reverse=True)
    return [{"id": i, "similarity": similarity[i]} for i in ranked[:10]]

# Usage
similar = find_similar_excluding_authors(
    "doc123",
    ["Author A", "Author B", "Author C"]
)
```

## Combining with Metadata Validation

For reliable filtering, combine with metadata schema validation:

```bash
# Step 1: Create project with metadata schema
curl -X POST "https://api.example.com/v1/projects/alice/validated-corpus" \
  -H "Authorization: Bearer alice_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "project_handle": "validated-corpus",
    "description": "Corpus with validated metadata",
    "instance_id": 123,
    "metadataScheme": "{\"type\":\"object\",\"properties\":{\"author\":{\"type\":\"string\"},\"source_id\":{\"type\":\"string\"}},\"required\":[\"author\",\"source_id\"]}"
  }'

# Step 2: Upload embeddings with metadata
curl -X POST "https://api.example.com/v1/embeddings/alice/validated-corpus" \
  -H "Authorization: Bearer alice_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "embeddings": [{
      "text_id": "doc001",
      "instance_handle": "openai-large",
      "vector": [0.1, 0.2, ...],
      "vector_dim": 3072,
      "metadata": {
        "author": "John Doe",
        "source_id": 
"source_123" + } + }] + }' + +# Step 3: Search with guaranteed metadata field existence +curl -X GET "https://api.example.com/v1/similars/alice/validated-corpus/doc001?count=10&metadata_path=author&metadata_value=John%20Doe" \ + -H "Authorization: Bearer alice_api_key" +``` + +See the [Metadata Validation Guide](./metadata-validation.md) for more details. + +## Understanding the Filter Logic + +The metadata filter uses **negative matching**: + +``` +INCLUDE document IF: + - document.similarity >= threshold + AND + - document.metadata[metadata_path] != metadata_value +``` + +**Important:** Documents without the specified metadata field are included (not filtered out). + +### Example + +Given this query: +```bash +?metadata_path=author&metadata_value=Alice +``` + +**Included:** +- Documents where `metadata.author == "Bob"` +- Documents where `metadata.author == "Charlie"` +- Documents without an `author` field in metadata + +**Excluded:** +- Documents where `metadata.author == "Alice"` + +## Performance Considerations + +Metadata filtering is performed at the database level using efficient indexing: + +1. **Vector similarity** is computed first +2. **Metadata filter** is applied to the similarity results +3. 
Results are sorted and limited + +For best performance: +- Use indexed metadata fields when possible +- Keep metadata values relatively small (under 1KB per document) +- Consider using IDs instead of full names for filtering + +## Error Handling + +### Missing One Parameter + +```bash +# Missing metadata_value +curl -X GET "https://api.example.com/v1/similars/alice/corpus/doc123?metadata_path=author" \ + -H "Authorization: Bearer alice_api_key" +``` + +**Error:** +```json +{ + "title": "Bad Request", + "status": 400, + "detail": "metadata_path and metadata_value must be used together" +} +``` + +### Non-Existent Metadata Field + +```bash +# Filtering on field that doesn't exist in documents +curl -X GET "https://api.example.com/v1/similars/alice/corpus/doc123?count=10&metadata_path=nonexistent_field&metadata_value=some_value" \ + -H "Authorization: Bearer alice_api_key" +``` + +**Result:** Returns all matching documents (since none have the field, none are excluded). + +### URL Encoding + +Remember to URL-encode metadata values with special characters: + +```bash +# Correct: URL-encoded value +curl -X GET "https://api.example.com/v1/similars/alice/corpus/doc123?metadata_path=author&metadata_value=John%20Doe%20%26%20Jane%20Smith" \ + -H "Authorization: Bearer alice_api_key" + +# Incorrect: Unencoded special characters +curl -X GET "https://api.example.com/v1/similars/alice/corpus/doc123?metadata_path=author&metadata_value=John Doe & Jane Smith" \ + -H "Authorization: Bearer alice_api_key" +``` + +## Complete Example + +Here's a complete workflow demonstrating metadata filtering: + +```bash +# 1. Create project +curl -X POST "https://api.example.com/v1/projects/alice/literature" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "project_handle": "literature", + "instance_id": 123 + }' + +# 2. 
Upload documents with metadata
curl -X POST "https://api.example.com/v1/embeddings/alice/literature" \
  -H "Authorization: Bearer alice_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "embeddings": [
      {
        "text_id": "tolstoy_1",
        "instance_handle": "openai-large",
        "vector": [0.1, 0.2, ...],
        "vector_dim": 3072,
        "text": "All happy families...",
        "metadata": {"author": "Tolstoy", "work": "Anna Karenina"}
      },
      {
        "text_id": "tolstoy_2",
        "instance_handle": "openai-large",
        "vector": [0.11, 0.21, ...],
        "vector_dim": 3072,
        "text": "Well, Prince, so Genoa and Lucca...",
        "metadata": {"author": "Tolstoy", "work": "War and Peace"}
      },
      {
        "text_id": "dickens_1",
        "instance_handle": "openai-large",
        "vector": [0.12, 0.19, ...],
        "vector_dim": 3072,
        "text": "It was the age of wisdom...",
        "metadata": {"author": "Dickens", "work": "Tale of Two Cities"}
      }
    ]
  }'

# 3. Find similar to tolstoy_1, excluding Tolstoy's works
curl -X GET "https://api.example.com/v1/similars/alice/literature/tolstoy_1?count=10&metadata_path=author&metadata_value=Tolstoy" \
  -H "Authorization: Bearer alice_api_key"

# Result: Returns dickens_1, excludes tolstoy_2
```

## Related Documentation

- [RAG Workflow Guide](./rag-workflow.md) - Complete RAG implementation
- [Metadata Validation Guide](./metadata-validation.md) - Schema validation
- [Batch Operations Guide](./batch-operations.md) - Upload large datasets

## Troubleshooting

### No Results Returned

**Problem:** Filter excludes all results

**Solution:**
- Verify the metadata field exists in your documents
- Check that the metadata value matches exactly (case-sensitive)
- Try without the filter to ensure there are similar documents

### Filter Not Working

**Problem:** Still seeing documents you want to exclude

**Solution:**
- Check URL encoding of the metadata value
- Verify the metadata path is correct (use dot notation for nested fields)
- Ensure both `metadata_path` and 
`metadata_value` are specified + +### Want Positive Matching + +**Problem:** Want to include only specific values, not exclude them + +**Solution:** Currently, only negative matching (exclusion) is supported. For positive matching, retrieve all results and filter on the client side, or use multiple negative filters to exclude everything except your target values. diff --git a/docs/content/guides/metadata-validation.md b/docs/content/guides/metadata-validation.md new file mode 100644 index 0000000..c345fd1 --- /dev/null +++ b/docs/content/guides/metadata-validation.md @@ -0,0 +1,665 @@ +--- +title: "Metadata Validation Guide" +weight: 5 +--- + +# Metadata Validation Guide + +This guide explains how to use JSON Schema validation to ensure consistent metadata structure across your embeddings. + +## Overview + +dhamps-vdb supports optional metadata validation using JSON Schema. When you define a metadata schema for a project, the API automatically validates all embedding metadata against that schema, ensuring data quality and consistency. 
+ +Benefits: +- Enforce consistent metadata structure across all embeddings +- Catch data entry errors early +- Document expected metadata fields +- Enable reliable metadata-based filtering + +## Defining a Metadata Schema + +### When Creating a Project + +Include a `metadataScheme` field with a valid JSON Schema when creating a project: + +```bash +curl -X POST "https://api.example.com/v1/projects/alice/validated-project" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "project_handle": "validated-project", + "description": "Project with metadata validation", + "instance_id": 123, + "metadataScheme": "{\"type\":\"object\",\"properties\":{\"author\":{\"type\":\"string\"},\"year\":{\"type\":\"integer\"}},\"required\":[\"author\"]}" + }' +``` + +### Updating an Existing Project + +Add or update a metadata schema using PATCH: + +```bash +curl -X PATCH "https://api.example.com/v1/projects/alice/my-project" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "metadataScheme": "{\"type\":\"object\",\"properties\":{\"title\":{\"type\":\"string\"},\"author\":{\"type\":\"string\"},\"year\":{\"type\":\"integer\"}},\"required\":[\"title\",\"author\"]}" + }' +``` + +**Note:** Schema updates only affect new or updated embeddings. Existing embeddings are not retroactively validated. 
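Because existing embeddings are not re-checked, it can be worthwhile to pre-validate metadata on the client before uploading. A minimal stdlib-only sketch that mirrors just the `required` and top-level property `type` constraints of a schema like the ones above — a full JSON Schema validator library covers the rest (nested objects, enums, and the bool-is-an-int edge case in Python):

```python
# Map JSON Schema type names to the Python types they correspond to.
TYPE_MAP = {"string": str, "integer": int, "number": (int, float),
            "boolean": bool, "object": dict, "array": list}

def precheck(metadata, schema):
    """Return a list of validation errors (empty if metadata passes
    the required-field and top-level type checks)."""
    errors = []
    for field in schema.get("required", []):
        if field not in metadata:
            errors.append(f"missing required field: {field}")
    for field, rules in schema.get("properties", {}).items():
        expected = TYPE_MAP.get(rules.get("type"))
        if field in metadata and expected and not isinstance(metadata[field], expected):
            errors.append(f"{field}: expected {rules['type']}")
    return errors

schema = {
    "type": "object",
    "properties": {"author": {"type": "string"}, "year": {"type": "integer"}},
    "required": ["author"],
}
print(precheck({"author": "William Shakespeare", "year": 1603}, schema))
# → []
print(precheck({"year": "1603"}, schema))
# → ['missing required field: author', 'year: expected integer']
```

The server performs the authoritative validation; a pre-check like this simply lets a batch upload fail fast, before any network round trips.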
+ +## Common Schema Patterns + +### Simple Required Fields + +Require specific fields with basic types: + +```json +{ + "type": "object", + "properties": { + "author": {"type": "string"}, + "year": {"type": "integer"} + }, + "required": ["author"] +} +``` + +Example usage: + +```bash +curl -X POST "https://api.example.com/v1/projects/alice/literary-texts" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "project_handle": "literary-texts", + "description": "Literary texts with structured metadata", + "instance_id": 123, + "metadataScheme": "{\"type\":\"object\",\"properties\":{\"author\":{\"type\":\"string\"},\"year\":{\"type\":\"integer\"}},\"required\":[\"author\"]}" + }' +``` + +### Using Enums for Controlled Values + +Restrict fields to specific allowed values: + +```json +{ + "type": "object", + "properties": { + "genre": { + "type": "string", + "enum": ["poetry", "prose", "drama", "essay"] + }, + "language": { + "type": "string", + "enum": ["en", "de", "fr", "es", "la"] + } + }, + "required": ["genre"] +} +``` + +Example: + +```bash +curl -X POST "https://api.example.com/v1/embeddings/alice/literary-texts" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "embeddings": [{ + "text_id": "hamlet-soliloquy", + "instance_handle": "openai-large", + "vector": [0.1, 0.2, 0.3, ...], + "vector_dim": 3072, + "metadata": { + "author": "William Shakespeare", + "year": 1603, + "genre": "drama" + } + }] + }' +``` + +### Nested Objects + +Define structured metadata with nested objects: + +```json +{ + "type": "object", + "properties": { + "author": { + "type": "object", + "properties": { + "name": {"type": "string"}, + "birth_year": {"type": "integer"}, + "nationality": {"type": "string"} + }, + "required": ["name"] + }, + "publication": { + "type": "object", + "properties": { + "year": {"type": "integer"}, + "publisher": {"type": "string"}, + "city": {"type": "string"} + } + } + }, 
+ "required": ["author"] +} +``` + +Example: + +```bash +curl -X POST "https://api.example.com/v1/embeddings/alice/academic-papers" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "embeddings": [{ + "text_id": "paper001", + "instance_handle": "openai-large", + "vector": [0.1, 0.2, 0.3, ...], + "vector_dim": 3072, + "metadata": { + "author": { + "name": "Jane Smith", + "birth_year": 1975, + "nationality": "USA" + }, + "publication": { + "year": 2023, + "publisher": "Academic Press", + "city": "Boston" + } + } + }] + }' +``` + +### Arrays and Lists + +Define arrays of values: + +```json +{ + "type": "object", + "properties": { + "keywords": { + "type": "array", + "items": {"type": "string"}, + "minItems": 1, + "maxItems": 10 + }, + "categories": { + "type": "array", + "items": { + "type": "string", + "enum": ["philosophy", "literature", "science", "history"] + } + } + } +} +``` + +Example: + +```bash +curl -X POST "https://api.example.com/v1/embeddings/alice/research-docs" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "embeddings": [{ + "text_id": "doc001", + "instance_handle": "openai-large", + "vector": [0.1, 0.2, 0.3, ...], + "vector_dim": 3072, + "metadata": { + "keywords": ["machine learning", "embeddings", "NLP"], + "categories": ["science", "literature"] + } + }] + }' +``` + +### Numeric Constraints + +Apply minimum, maximum, and range constraints: + +```json +{ + "type": "object", + "properties": { + "rating": { + "type": "number", + "minimum": 0, + "maximum": 5 + }, + "page_count": { + "type": "integer", + "minimum": 1 + }, + "confidence": { + "type": "number", + "minimum": 0.0, + "maximum": 1.0 + } + } +} +``` + +### String Constraints + +Apply length and pattern constraints: + +```json +{ + "type": "object", + "properties": { + "title": { + "type": "string", + "minLength": 1, + "maxLength": 200 + }, + "isbn": { + "type": "string", + "pattern": "^[0-9]{13}$" 
+ }, + "doi": { + "type": "string", + "pattern": "^10\\.\\d{4,}/[\\w\\-\\.]+$" + } + } +} +``` + +## Validation Examples + +### Valid Upload + +When metadata conforms to the schema: + +```bash +curl -X POST "https://api.example.com/v1/embeddings/alice/literary-texts" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "embeddings": [{ + "text_id": "kant-critique", + "instance_handle": "openai-large", + "vector": [0.1, 0.2, 0.3, ...], + "vector_dim": 3072, + "metadata": { + "author": "Immanuel Kant", + "year": 1781, + "genre": "prose" + } + }] + }' +``` + +**Response:** +```json +{ + "message": "Embeddings uploaded successfully", + "count": 1 +} +``` + +### Validation Error: Missing Required Field + +```bash +curl -X POST "https://api.example.com/v1/embeddings/alice/literary-texts" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "embeddings": [{ + "text_id": "some-text", + "instance_handle": "openai-large", + "vector": [0.1, 0.2, 0.3, ...], + "vector_dim": 3072, + "metadata": { + "year": 1781 + } + }] + }' +``` + +**Error Response:** +```json +{ + "$schema": "http://localhost:8080/schemas/ErrorModel.json", + "title": "Bad Request", + "status": 400, + "detail": "metadata validation failed for text_id 'some-text': metadata validation failed:\n - author is required" +} +``` + +### Validation Error: Wrong Type + +```bash +curl -X POST "https://api.example.com/v1/embeddings/alice/literary-texts" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "embeddings": [{ + "text_id": "some-text", + "instance_handle": "openai-large", + "vector": [0.1, 0.2, 0.3, ...], + "vector_dim": 3072, + "metadata": { + "author": "John Doe", + "year": "1781" + } + }] + }' +``` + +**Error Response:** +```json +{ + "$schema": "http://localhost:8080/schemas/ErrorModel.json", + "title": "Bad Request", + "status": 400, + "detail": "metadata validation failed for 
text_id 'some-text': metadata validation failed:\n - year: expected integer, got string" +} +``` + +### Validation Error: Invalid Enum Value + +```bash +curl -X POST "https://api.example.com/v1/embeddings/alice/literary-texts" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "embeddings": [{ + "text_id": "some-text", + "instance_handle": "openai-large", + "vector": [0.1, 0.2, 0.3, ...], + "vector_dim": 3072, + "metadata": { + "author": "John Doe", + "year": 1781, + "genre": "novel" + } + }] + }' +``` + +**Error Response:** +```json +{ + "$schema": "http://localhost:8080/schemas/ErrorModel.json", + "title": "Bad Request", + "status": 400, + "detail": "metadata validation failed for text_id 'some-text': metadata validation failed:\n - genre: value must be one of: poetry, prose, drama, essay" +} +``` + +### Validation Error: Value Out of Range + +```bash +curl -X POST "https://api.example.com/v1/embeddings/alice/rated-content" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "embeddings": [{ + "text_id": "review001", + "instance_handle": "openai-large", + "vector": [0.1, 0.2, 0.3, ...], + "vector_dim": 3072, + "metadata": { + "rating": 7.5 + } + }] + }' +``` + +**Error Response:** +```json +{ + "$schema": "http://localhost:8080/schemas/ErrorModel.json", + "title": "Bad Request", + "status": 400, + "detail": "metadata validation failed for text_id 'review001': metadata validation failed:\n - rating: must be <= 5" +} +``` + +## Real-World Schema Examples + +### Academic Publications + +```json +{ + "type": "object", + "properties": { + "doi": { + "type": "string", + "pattern": "^10\\.\\d{4,}/[\\w\\-\\.]+$" + }, + "title": { + "type": "string", + "minLength": 1, + "maxLength": 500 + }, + "authors": { + "type": "array", + "items": {"type": "string"}, + "minItems": 1 + }, + "year": { + "type": "integer", + "minimum": 1900, + "maximum": 2100 + }, + "journal": {"type": "string"}, 
+ "volume": {"type": "integer"}, + "pages": {"type": "string"}, + "keywords": { + "type": "array", + "items": {"type": "string"} + } + }, + "required": ["doi", "title", "authors", "year"] +} +``` + +### Legal Documents + +```json +{ + "type": "object", + "properties": { + "case_number": {"type": "string"}, + "court": {"type": "string"}, + "date": { + "type": "string", + "pattern": "^\\d{4}-\\d{2}-\\d{2}$" + }, + "jurisdiction": { + "type": "string", + "enum": ["federal", "state", "local"] + }, + "category": { + "type": "string", + "enum": ["civil", "criminal", "administrative"] + }, + "parties": { + "type": "array", + "items": {"type": "string"} + } + }, + "required": ["case_number", "court", "date"] +} +``` + +### Product Catalog + +```json +{ + "type": "object", + "properties": { + "sku": { + "type": "string", + "pattern": "^[A-Z]{3}-\\d{6}$" + }, + "name": {"type": "string"}, + "category": { + "type": "string", + "enum": ["electronics", "clothing", "books", "home", "toys"] + }, + "price": { + "type": "number", + "minimum": 0 + }, + "in_stock": {"type": "boolean"}, + "tags": { + "type": "array", + "items": {"type": "string"} + } + }, + "required": ["sku", "name", "category", "price"] +} +``` + +## Admin Sanity Check + +Administrators can verify database integrity using the `/v1/admin/sanity-check` endpoint: + +```bash +curl -X GET "https://api.example.com/v1/admin/sanity-check" \ + -H "Authorization: Bearer admin_api_key" +``` + +**Response:** +```json +{ + "status": "PASSED", + "total_projects": 5, + "issues_count": 0, + "warnings_count": 1, + "warnings": [ + "Project alice/project1 has 100 embeddings but no metadata schema defined" + ] +} +``` + +**Status Values:** +- `PASSED`: No issues or warnings found +- `WARNING`: No critical issues, but warnings exist +- `FAILED`: Validation issues found that need attention + +The sanity check: +- Validates all embeddings have dimensions matching their LLM service +- Validates all metadata against project schemas (if 
defined) +- Reports projects without schemas as warnings + +## Best Practices + +### 1. Start Simple, Add Complexity Later + +Begin with basic required fields: + +```json +{ + "type": "object", + "properties": { + "source": {"type": "string"} + }, + "required": ["source"] +} +``` + +Add more constraints as your needs evolve. + +### 2. Test Schemas Before Deployment + +Use online JSON Schema validators like [jsonschemavalidator.net](https://www.jsonschemavalidator.net/) to test your schemas before deploying them. + +### 3. Document Your Schema + +Include a description in your project: + +```bash +curl -X PATCH "https://api.example.com/v1/projects/alice/my-project" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "description": "Project with metadata schema: author (required string), year (integer), genre (enum)" + }' +``` + +### 4. Version Your Schemas + +If you need to change a schema significantly, consider creating a new project rather than updating the existing one. + +### 5. Optional vs Required + +Be judicious with `required` fields. Too many required fields can make uploads cumbersome. + +### 6. Escape JSON Properly + +When passing JSON schemas in curl commands, escape quotes properly or use single quotes for the outer JSON. 
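Hand-escaping the doubly-encoded `metadataScheme` string is error-prone. A short Python sketch can generate the request body instead (the field names mirror the examples above; `json.dumps` is applied twice because the schema travels as a JSON string inside a JSON body):

```python
import json

schema = {
    "type": "object",
    "properties": {
        "author": {"type": "string"},
        "year": {"type": "integer"},
    },
    "required": ["author"],
}

body = {
    "project_handle": "validated-project",
    "description": "Project with metadata validation",
    "instance_id": 123,
    # metadataScheme is a JSON *string*, so serialize the schema on its own
    # first; serializing the body then escapes the inner quotes correctly.
    "metadataScheme": json.dumps(schema),
}

print(json.dumps(body, indent=2))
```

The printed body can be written to a file and sent with `curl --data @body.json`, sidestepping shell-quoting issues entirely.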
+ +## Projects Without Schemas + +If you don't provide a `metadataScheme` when creating a project: +- Metadata validation is skipped +- You can upload any valid JSON metadata +- This is useful for exploratory work or heterogeneous data + +## Schema Updates and Existing Data + +When you update a project's metadata schema: +- Existing embeddings are **not** revalidated +- The new schema only applies to new or updated embeddings +- Use the admin sanity check to find existing embeddings that don't conform + +## Removing a Schema + +To remove metadata validation from a project: + +```bash +curl -X PATCH "https://api.example.com/v1/projects/alice/my-project" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{"metadataScheme": null}' +``` + +After this, new embeddings can have any metadata structure. + +## Related Documentation + +- [RAG Workflow Guide](./rag-workflow.md) - Complete RAG implementation +- [Metadata Filtering Guide](./metadata-filtering.md) - Filter search results by metadata +- [Batch Operations Guide](./batch-operations.md) - Upload large datasets efficiently + +## Troubleshooting + +### Schema Syntax Errors + +**Error:** "Invalid JSON Schema" + +**Solution:** Validate your schema syntax using an online validator. Common issues: +- Missing commas between properties +- Unescaped quotes +- Invalid JSON structure + +### Uploads Failing After Schema Change + +**Problem:** Uploads worked before, now failing with validation errors + +**Solution:** Check that your metadata matches the new schema requirements. Review the error message for specific validation failures. + +### Want to Fix Non-Conforming Data + +**Problem:** Sanity check shows validation errors in existing data + +**Solution:** Either: +1. Update the schema to accept existing data +2. Re-upload conforming data to replace non-conforming embeddings +3. 
Delete and re-upload the project with correct metadata diff --git a/docs/content/guides/ownership-transfer.md b/docs/content/guides/ownership-transfer.md new file mode 100644 index 0000000..8c47b2b --- /dev/null +++ b/docs/content/guides/ownership-transfer.md @@ -0,0 +1,406 @@ +--- +title: "Ownership Transfer Guide" +weight: 4 +--- + +# Ownership Transfer Guide + +This guide explains how to transfer project ownership between users in dhamps-vdb. + +## Overview + +Project ownership transfer allows you to reassign full control of a project from one user to another. This is useful when: +- A project maintainer is leaving and wants to hand over control +- Organizational changes require reassigning project ownership +- Consolidating projects under a different user account +- Transferring stewardship of research data to a new PI + +## Important Constraints + +Before transferring ownership, understand these constraints: + +1. **Only the current owner can transfer** - Editors and readers cannot initiate transfers +2. **New owner must exist** - The target user must already be registered in the system +3. **No handle conflicts** - The new owner cannot already have a project with the same handle +4. **Old owner loses access** - After transfer, the original owner has no access to the project +5. **Data remains intact** - All embeddings and metadata are preserved during transfer +6. 
**Shared users remain** - Existing sharing relationships are maintained + +## Transferring Ownership + +### Basic Transfer + +Transfer a project to another user: + +```bash +curl -X POST "https://api.example.com/v1/projects/alice/research-data/transfer-ownership" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "new_owner_handle": "bob" + }' +``` + +**Response:** +```json +{ + "message": "Project ownership transferred successfully", + "project_handle": "research-data", + "old_owner": "alice", + "new_owner": "bob" +} +``` + +After this operation: +- The project is now accessible at `/v1/projects/bob/research-data` +- Bob has full owner privileges +- Alice no longer has any access to the project +- All embeddings remain unchanged + +### Complete Transfer Example + +Here's a complete workflow showing before and after transfer: + +```bash +# Before transfer - Alice is the owner +curl -X GET "https://api.example.com/v1/projects/alice/my-project" \ + -H "Authorization: Bearer alice_api_key" + +# Response +{ + "project_handle": "my-project", + "owner": "alice", + "description": "Research project", + "instance_id": 123 +} + +# Alice transfers to Bob +curl -X POST "https://api.example.com/v1/projects/alice/my-project/transfer-ownership" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{"new_owner_handle": "bob"}' + +# After transfer - Bob is now the owner +curl -X GET "https://api.example.com/v1/projects/bob/my-project" \ + -H "Authorization: Bearer bob_api_key" + +# Response +{ + "project_handle": "my-project", + "owner": "bob", + "description": "Research project", + "instance_id": 123 +} + +# Alice can no longer access it +curl -X GET "https://api.example.com/v1/projects/alice/my-project" \ + -H "Authorization: Bearer alice_api_key" +# Returns: 404 Not Found +``` + +## Effects of Transfer + +### Project Access Path Changes + +The project URL changes to reflect the new owner: + 
+**Before:** +``` +/v1/projects/alice/research-data +/v1/embeddings/alice/research-data +/v1/similars/alice/research-data/doc123 +``` + +**After:** +``` +/v1/projects/bob/research-data +/v1/embeddings/bob/research-data +/v1/similars/bob/research-data/doc123 +``` + +**Important:** Update all client code and bookmarks to use the new owner's handle. + +### New Owner Gains Full Control + +Bob (new owner) can now: +- View and modify all embeddings +- Update project settings (description, instance, metadata schema) +- Manage sharing (add/remove shared users) +- Transfer ownership again to someone else +- Delete the project + +### Old Owner Loses All Access + +Alice (old owner) can no longer: +- Access the project in any way +- View or modify embeddings +- See project metadata +- Manage sharing +- Transfer ownership back + +**Note:** If Alice needs continued access, Bob should share the project with her after the transfer. + +### Shared Users Remain + +If the project was shared with other users, those sharing relationships are preserved: + +```bash +# Before transfer - Alice shares with Charlie (editor) +curl -X POST "https://api.example.com/v1/projects/alice/research-data/share" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{"share_with_handle": "charlie", "role": "editor"}' + +# Transfer to Bob +curl -X POST "https://api.example.com/v1/projects/alice/research-data/transfer-ownership" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{"new_owner_handle": "bob"}' + +# After transfer - Charlie still has editor access +curl -X GET "https://api.example.com/v1/projects/bob/research-data" \ + -H "Authorization: Bearer charlie_api_key" +# Works! 
Charlie can still access as editor +``` + +### Upgrading Shared User to Owner + +If the new owner was previously a shared user, their role is automatically upgraded: + +```bash +# Alice shares project with Bob (editor) +curl -X POST "https://api.example.com/v1/projects/alice/my-project/share" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{"share_with_handle": "bob", "role": "editor"}' + +# Alice transfers ownership to Bob +curl -X POST "https://api.example.com/v1/projects/alice/my-project/transfer-ownership" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{"new_owner_handle": "bob"}' + +# Bob's previous "editor" sharing role is removed +# Bob now has full owner privileges instead +``` + +## Use Cases + +### PI Leaving Institution + +A principal investigator leaving an institution transfers project ownership to a colleague: + +```bash +curl -X POST "https://api.example.com/v1/projects/prof_jones/lab_data/transfer-ownership" \ + -H "Authorization: Bearer prof_jones_api_key" \ + -H "Content-Type: application/json" \ + -d '{"new_owner_handle": "prof_smith"}' +``` + +### Account Consolidation + +Consolidate multiple projects under a single organizational account: + +```bash +# Transfer Alice's personal projects to organization account +curl -X POST "https://api.example.com/v1/projects/alice/project1/transfer-ownership" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{"new_owner_handle": "org_datascience"}' + +curl -X POST "https://api.example.com/v1/projects/alice/project2/transfer-ownership" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{"new_owner_handle": "org_datascience"}' +``` + +### Graduated Student Handoff + +A graduating student transfers their research project to their advisor: + +```bash +curl -X POST 
"https://api.example.com/v1/projects/student_bob/thesis_embeddings/transfer-ownership" \ + -H "Authorization: Bearer student_bob_api_key" \ + -H "Content-Type: application/json" \ + -d '{"new_owner_handle": "advisor_carol"}' + +# Advisor can then share it back with the student if needed +curl -X POST "https://api.example.com/v1/projects/advisor_carol/thesis_embeddings/share" \ + -H "Authorization: Bearer advisor_carol_api_key" \ + -H "Content-Type: application/json" \ + -d '{"share_with_handle": "student_bob", "role": "reader"}' +``` + +### Department Reorganization + +Projects move to a new department owner: + +```bash +curl -X POST "https://api.example.com/v1/projects/old_dept/resource_library/transfer-ownership" \ + -H "Authorization: Bearer old_dept_api_key" \ + -H "Content-Type: application/json" \ + -d '{"new_owner_handle": "new_dept"}' +``` + +## Error Conditions + +### New Owner Doesn't Exist + +```bash +curl -X POST "https://api.example.com/v1/projects/alice/my-project/transfer-ownership" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{"new_owner_handle": "nonexistent_user"}' +``` + +**Error:** +```json +{ + "title": "Bad Request", + "status": 400, + "detail": "User 'nonexistent_user' does not exist" +} +``` + +**Solution:** Ensure the target user is registered first. Contact admin to create the user. + +### Handle Conflict + +```bash +# Bob already has a project called "research-data" +curl -X POST "https://api.example.com/v1/projects/alice/research-data/transfer-ownership" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{"new_owner_handle": "bob"}' +``` + +**Error:** +```json +{ + "title": "Conflict", + "status": 409, + "detail": "User 'bob' already has a project with handle 'research-data'" +} +``` + +**Solution:** Either: +1. 
Rename Alice's project before transferring: + ```bash + curl -X PATCH "https://api.example.com/v1/projects/alice/research-data" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{"project_handle": "research-data-alice"}' + ``` +2. Ask Bob to rename or delete their existing project +3. Choose a different target user + +### Not the Owner + +```bash +# Charlie tries to transfer Alice's project +curl -X POST "https://api.example.com/v1/projects/alice/research-data/transfer-ownership" \ + -H "Authorization: Bearer charlie_api_key" \ + -H "Content-Type: application/json" \ + -d '{"new_owner_handle": "bob"}' +``` + +**Error:** +```json +{ + "title": "Forbidden", + "status": 403, + "detail": "Only the project owner can transfer ownership" +} +``` + +**Solution:** Only the current owner (Alice) can initiate the transfer. + +## Best Practices + +### Before Transferring + +1. **Communicate with New Owner**: Ensure they're willing to accept ownership +2. **Document Current State**: Export or document current embeddings and metadata +3. **Review Shared Users**: Check who has access and whether sharing should continue +4. **Update Client Code**: Identify all systems accessing the project that need updating +5. **Backup Data**: Consider exporting important data before transfer + +### During Transfer + +1. **Transfer at Low-Activity Time**: Minimize disruption by transferring during quiet periods +2. **Test Access First**: Verify new owner can access their other projects +3. **Use Correct Handle**: Double-check the new owner's handle before submitting + +### After Transferring + +1. **Verify New Ownership**: Confirm the transfer succeeded +2. **Update Client Applications**: Change all API calls to use new owner handle +3. **Grant Back Access if Needed**: New owner can share project back to old owner +4. **Update Documentation**: Update any documentation referencing the project path +5. 
**Notify Shared Users**: Inform shared users about the path change + +## Maintaining Access After Transfer + +If the original owner needs continued access, the new owner should share the project: + +```bash +# Step 1: Alice transfers to Bob +curl -X POST "https://api.example.com/v1/projects/alice/research-data/transfer-ownership" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{"new_owner_handle": "bob"}' + +# Step 2: Bob shares back with Alice as editor +curl -X POST "https://api.example.com/v1/projects/bob/research-data/share" \ + -H "Authorization: Bearer bob_api_key" \ + -H "Content-Type: application/json" \ + -d '{"share_with_handle": "alice", "role": "editor"}' + +# Now Alice can still access (but as editor, not owner) +curl -X GET "https://api.example.com/v1/embeddings/bob/research-data" \ + -H "Authorization: Bearer alice_api_key" +``` + +## Checking Current Owner + +To verify current project ownership: + +```bash +curl -X GET "https://api.example.com/v1/projects/{owner}/{project}" \ + -H "Authorization: Bearer your_api_key" +``` + +The `owner` field in the response shows the current owner. 
+ +## Related Documentation + +- [Project Sharing Guide](./project-sharing.md) - Share projects with specific users +- [Public Projects Guide](./public-projects.md) - Make projects publicly accessible + +## Troubleshooting + +### Cannot Find Project After Transfer + +**Problem:** Getting 404 after transfer + +**Solution:** Update the owner in your API calls: +- Old: `/v1/projects/alice/my-project` +- New: `/v1/projects/bob/my-project` + +### Need to Reverse Transfer + +**Problem:** Transferred by mistake, need to reverse + +**Solution:** New owner must transfer back: +```bash +curl -X POST "https://api.example.com/v1/projects/bob/my-project/transfer-ownership" \ + -H "Authorization: Bearer bob_api_key" \ + -H "Content-Type: application/json" \ + -d '{"new_owner_handle": "alice"}' +``` + +### Client Applications Failing + +**Problem:** Applications can't access project after transfer + +**Solution:** Update all hardcoded owner references in your code to use the new owner's handle. diff --git a/docs/content/guides/project-sharing.md b/docs/content/guides/project-sharing.md new file mode 100644 index 0000000..b9276e1 --- /dev/null +++ b/docs/content/guides/project-sharing.md @@ -0,0 +1,333 @@ +--- +title: "Project Sharing Guide" +weight: 2 +--- + +# Project Sharing Guide + +This guide explains how to share projects with specific users for collaborative work in dhamps-vdb. + +## Overview + +Project sharing allows you to grant other users access to your projects with different permission levels: +- **reader**: Read-only access to embeddings and similar documents +- **editor**: Read and write access to embeddings (can add/modify/delete embeddings) + +Only the project owner can manage sharing settings and delete the project. 
+ +## Sharing During Project Creation + +You can specify users to share with when creating a new project using the `shared_with` field: + +```bash +curl -X PUT "https://api.example.com/v1/projects/alice/collaborative-project" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "project_handle": "collaborative-project", + "description": "A project shared with team members", + "instance_id": 123, + "shared_with": [ + { + "user_handle": "bob", + "role": "reader" + }, + { + "user_handle": "charlie", + "role": "editor" + } + ] + }' +``` + +In this example: +- `bob` can read embeddings and search for similar documents +- `charlie` can read embeddings, search, and also add/modify/delete embeddings +- `alice` (the owner) has full control including managing sharing and deleting the project + +## Managing Sharing After Creation + +### Share a Project with a User + +Add a user to an existing project: + +```bash +curl -X POST "https://api.example.com/v1/projects/alice/collaborative-project/share" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "share_with_handle": "david", + "role": "reader" + }' +``` + +**Response:** +```json +{ + "message": "Project shared successfully", + "user": "david", + "role": "reader" +} +``` + +### Update User's Role + +To change a user's role, simply share again with the new role: + +```bash +curl -X POST "https://api.example.com/v1/projects/alice/collaborative-project/share" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "share_with_handle": "david", + "role": "editor" + }' +``` + +This updates David's access from reader to editor. 
+ +### Unshare a Project from a User + +Remove a user's access to a project: + +```bash +curl -X DELETE "https://api.example.com/v1/projects/alice/collaborative-project/share/david" \ + -H "Authorization: Bearer alice_api_key" +``` + +**Response:** +```json +{ + "message": "Project unshared successfully", + "user": "david" +} +``` + +### List Shared Users + +View all users a project is shared with: + +```bash +curl -X GET "https://api.example.com/v1/projects/alice/collaborative-project/shared-with" \ + -H "Authorization: Bearer alice_api_key" +``` + +**Response:** +```json +{ + "project_handle": "collaborative-project", + "owner": "alice", + "shared_with": [ + { + "user_handle": "bob", + "role": "reader" + }, + { + "user_handle": "charlie", + "role": "editor" + } + ] +} +``` + +**Note:** Only the project owner can view the list of shared users. Users who have been granted access cannot see which other users also have access. + +## What Shared Users Can Do + +### As a Reader + +Bob (with reader access) can: + +```bash +# View project metadata +curl -X GET "https://api.example.com/v1/projects/alice/collaborative-project" \ + -H "Authorization: Bearer bob_api_key" + +# Retrieve embeddings +curl -X GET "https://api.example.com/v1/embeddings/alice/collaborative-project" \ + -H "Authorization: Bearer bob_api_key" + +# Get a specific embedding +curl -X GET "https://api.example.com/v1/embeddings/alice/collaborative-project/doc123" \ + -H "Authorization: Bearer bob_api_key" + +# Search for similar documents +curl -X GET "https://api.example.com/v1/similars/alice/collaborative-project/doc123?count=5" \ + -H "Authorization: Bearer bob_api_key" +``` + +### As an Editor + +Charlie (with editor access) can do everything a reader can, plus: + +```bash +# Upload new embeddings +curl -X POST "https://api.example.com/v1/embeddings/alice/collaborative-project" \ + -H "Authorization: Bearer charlie_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "embeddings": [{ + 
"text_id": "doc456", + "instance_handle": "openai-large", + "vector": [0.1, 0.2, 0.3, ...], + "vector_dim": 3072, + "metadata": {"author": "Charlie"} + }] + }' + +# Delete specific embeddings +curl -X DELETE "https://api.example.com/v1/embeddings/alice/collaborative-project/doc456" \ + -H "Authorization: Bearer charlie_api_key" + +# Delete all embeddings +curl -X DELETE "https://api.example.com/v1/embeddings/alice/collaborative-project" \ + -H "Authorization: Bearer charlie_api_key" +``` + +## What Shared Users Cannot Do + +Even with editor access, shared users **cannot**: + +- Delete the project +- Change project settings (description, instance, metadata schema) +- Manage sharing (add/remove other users) +- View the list of other shared users +- Transfer project ownership + +These operations require owner privileges. + +## Permission Summary Table + +| Operation | Owner | Editor | Reader | +|-----------|-------|--------|--------| +| View project metadata | ✅ | ✅ | ✅ | +| Retrieve embeddings | ✅ | ✅ | ✅ | +| Search similar documents | ✅ | ✅ | ✅ | +| Add embeddings | ✅ | ✅ | ❌ | +| Modify embeddings | ✅ | ✅ | ❌ | +| Delete embeddings | ✅ | ✅ | ❌ | +| Update project settings | ✅ | ❌ | ❌ | +| Delete project | ✅ | ❌ | ❌ | +| Manage sharing | ✅ | ❌ | ❌ | +| View shared users list | ✅ | ❌ | ❌ | +| Transfer ownership | ✅ | ❌ | ❌ | + +## Use Cases + +### Research Team Collaboration + +A research team can share a project where: +- The principal investigator (PI) is the owner +- Research assistants have editor access to upload new data +- External collaborators have reader access to query the data + +```bash +# PI creates and shares the project +curl -X PUT "https://api.example.com/v1/projects/pi/research-corpus" \ + -H "Authorization: Bearer pi_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "project_handle": "research-corpus", + "description": "Research team corpus", + "instance_id": 123, + "shared_with": [ + {"user_handle": "assistant1", "role": "editor"}, + 
{"user_handle": "assistant2", "role": "editor"}, + {"user_handle": "external_collab", "role": "reader"} + ] + }' +``` + +### Data Pipeline with Read-Only Access + +Share processed embeddings with downstream consumers: + +```bash +# Data processor creates project +curl -X PUT "https://api.example.com/v1/projects/processor/processed-data" \ + -H "Authorization: Bearer processor_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "project_handle": "processed-data", + "description": "Processed embeddings for consumption", + "instance_id": 456, + "shared_with": [ + {"user_handle": "app_backend", "role": "reader"}, + {"user_handle": "analytics_team", "role": "reader"} + ] + }' +``` + +### Temporary Access + +Grant temporary access to a consultant and revoke it later: + +```bash +# Grant access +curl -X POST "https://api.example.com/v1/projects/alice/sensitive-project/share" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{"share_with_handle": "consultant", "role": "reader"}' + +# Revoke access when consultation is complete +curl -X DELETE "https://api.example.com/v1/projects/alice/sensitive-project/share/consultant" \ + -H "Authorization: Bearer alice_api_key" +``` + +## Combining with Public Access + +You can combine user-specific sharing with public access (see [Public Projects Guide](./public-projects.md)): + +```bash +curl -X PUT "https://api.example.com/v1/projects/alice/mixed-access-project" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "project_handle": "mixed-access-project", + "description": "Public read, specific editors", + "instance_id": 123, + "public_read": true, + "shared_with": [ + {"user_handle": "bob", "role": "editor"} + ] + }' +``` + +In this case: +- Anyone can read the project (no authentication required) +- `bob` can also edit (add/modify/delete embeddings) +- `alice` retains full owner privileges + +## Security Considerations + +1. 
**Choose Roles Carefully**: Only grant editor access to trusted users who need to modify data
2. **Audit Access**: Regularly review the shared users list to ensure appropriate access levels
3. **Revoke Promptly**: Remove access immediately when users no longer need it
4. **Use Reader by Default**: Start with reader access and upgrade to editor only when necessary
5. **Consider Public Access**: For truly open data, use `public_read: true` instead of sharing with many users

## Related Documentation

- [Ownership Transfer Guide](./ownership-transfer.md) - Transfer project ownership
- [Public Projects Guide](./public-projects.md) - Make projects publicly accessible
- [Instance Management Guide](./instance-management.md) - Share LLM service instances

## Troubleshooting

### Cannot Share Project

**Error:** "Only the owner can share projects"

**Solution:** Only the project owner can manage sharing. If you need to grant someone access, ask the owner to add them, or consider [transferring ownership](./ownership-transfer.md).

### User Not Found

**Error:** "User not found"

**Solution:** The user must exist in the system before you can share a project with them. Ask an administrator to create the user first.

### Cannot View Shared Users List

**Error:** "Forbidden"

**Solution:** Only the project owner can view the list of shared users. Shared users cannot see other shared users for privacy reasons.

diff --git a/docs/content/guides/public-projects.md b/docs/content/guides/public-projects.md
new file mode 100644
index 0000000..c2854fe
--- /dev/null
+++ b/docs/content/guides/public-projects.md
@@ -0,0 +1,420 @@
---
title: "Public Projects Guide"
weight: 3
---

# Public Projects Guide

This guide explains how to make projects publicly accessible, allowing anyone to read embeddings and search for similar documents without authentication.
+ +## Overview + +Projects can be configured to allow unauthenticated (public) read access by setting the `public_read` field to `true`. This is useful for: +- Open datasets and research data +- Public APIs and services +- Shared knowledge bases +- Educational resources + +**Important:** Public access only applies to read operations. Write operations (creating, updating, or deleting embeddings) always require authentication. + +## Creating a Public Project + +Set `public_read` to `true` when creating a project: + +```bash +curl -X PUT "https://api.example.com/v1/projects/alice/public-knowledge" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "project_handle": "public-knowledge", + "description": "Publicly accessible knowledge base", + "instance_id": 123, + "public_read": true + }' +``` + +**Response:** +```json +{ + "project_handle": "public-knowledge", + "owner": "alice", + "description": "Publicly accessible knowledge base", + "instance_id": 123, + "public_read": true +} +``` + +## Making an Existing Project Public + +Update an existing project using PATCH: + +```bash +curl -X PATCH "https://api.example.com/v1/projects/alice/my-project" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{"public_read": true}' +``` + +## Making a Public Project Private + +To disable public access: + +```bash +curl -X PATCH "https://api.example.com/v1/projects/alice/public-knowledge" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{"public_read": false}' +``` + +## Accessing Public Projects Without Authentication + +Once a project has `public_read` enabled, anyone can access it without providing an API key. 
+ +### Get Project Metadata + +```bash +curl -X GET "https://api.example.com/v1/projects/alice/public-knowledge" +``` + +**Response:** +```json +{ + "project_handle": "public-knowledge", + "owner": "alice", + "description": "Publicly accessible knowledge base", + "instance_id": 123, + "public_read": true +} +``` + +### Retrieve All Embeddings + +```bash +curl -X GET "https://api.example.com/v1/embeddings/alice/public-knowledge?limit=100" +``` + +**Response:** +```json +{ + "embeddings": [ + { + "text_id": "doc001", + "text": "Public document content", + "metadata": {"category": "science"}, + "vector_dim": 3072 + }, + { + "text_id": "doc002", + "text": "Another public document", + "metadata": {"category": "history"}, + "vector_dim": 3072 + } + ] +} +``` + +### Get a Specific Embedding + +```bash +curl -X GET "https://api.example.com/v1/embeddings/alice/public-knowledge/doc001" +``` + +**Response:** +```json +{ + "text_id": "doc001", + "text": "Public document content", + "metadata": {"category": "science"}, + "vector_dim": 3072, + "vector": [0.021, -0.015, 0.043, ...] +} +``` + +### Search for Similar Documents + +```bash +# Search by existing document ID +curl -X GET "https://api.example.com/v1/similars/alice/public-knowledge/doc001?count=5&threshold=0.7" +``` + +**Response:** +```json +{ + "user_handle": "alice", + "project_handle": "public-knowledge", + "results": [ + {"id": "doc002", "similarity": 0.92}, + {"id": "doc003", "similarity": 0.85}, + {"id": "doc004", "similarity": 0.78} + ] +} +``` + +### Search with Raw Embeddings + +```bash +curl -X POST "https://api.example.com/v1/similars/alice/public-knowledge?count=5" \ + -H "Content-Type: application/json" \ + -d '{ + "vector": [0.032, -0.018, 0.056, ...] 
+ }' +``` + +## Operations Still Requiring Authentication + +Even for public projects, these operations require authentication: + +### Creating Embeddings (Requires Auth) + +```bash +# This will fail with 401 Unauthorized +curl -X POST "https://api.example.com/v1/embeddings/alice/public-knowledge" \ + -H "Content-Type: application/json" \ + -d '{"embeddings": [...]}' + +# This succeeds with valid API key +curl -X POST "https://api.example.com/v1/embeddings/alice/public-knowledge" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "embeddings": [{ + "text_id": "doc123", + "instance_handle": "openai-large", + "vector": [0.1, 0.2, 0.3, ...], + "vector_dim": 3072 + }] + }' +``` + +### Deleting Embeddings (Requires Auth) + +```bash +# Delete specific embedding (requires auth) +curl -X DELETE "https://api.example.com/v1/embeddings/alice/public-knowledge/doc001" \ + -H "Authorization: Bearer alice_api_key" + +# Delete all embeddings (requires auth) +curl -X DELETE "https://api.example.com/v1/embeddings/alice/public-knowledge" \ + -H "Authorization: Bearer alice_api_key" +``` + +### Modifying Project Settings (Requires Auth) + +```bash +# Update project description (requires auth) +curl -X PATCH "https://api.example.com/v1/projects/alice/public-knowledge" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{"description": "Updated description"}' +``` + +### Deleting Project (Requires Auth) + +```bash +curl -X DELETE "https://api.example.com/v1/projects/alice/public-knowledge" \ + -H "Authorization: Bearer alice_api_key" +``` + +## Combining Public Access with User Sharing + +You can combine public read access with user-specific editor permissions: + +```bash +curl -X PUT "https://api.example.com/v1/projects/alice/collaborative-public" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "project_handle": "collaborative-public", + "description": 
"Public read, restricted write", + "instance_id": 123, + "public_read": true, + "shared_with": [ + { + "user_handle": "bob", + "role": "editor" + }, + { + "user_handle": "charlie", + "role": "editor" + } + ] + }' +``` + +In this configuration: +- **Anyone** can read embeddings and search (no auth required) +- **bob** and **charlie** can add/modify/delete embeddings (with auth) +- **alice** (owner) has full control (with auth) + +## Use Cases + +### Open Research Dataset + +Share research data publicly while maintaining write control: + +```bash +curl -X PUT "https://api.example.com/v1/projects/university/research-corpus" \ + -H "Authorization: Bearer university_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "project_handle": "research-corpus", + "description": "Open research corpus for academic use", + "instance_id": 456, + "public_read": true, + "metadataScheme": "{\"type\":\"object\",\"properties\":{\"doi\":{\"type\":\"string\"},\"year\":{\"type\":\"integer\"}},\"required\":[\"doi\"]}" + }' +``` + +Researchers worldwide can access the data without credentials, but only authorized users can add new data. + +### Public API Backend + +Build a public search API on top of dhamps-vdb: + +```python +import requests + +def public_search_api(query_vector, count=10): + """Public search function requiring no authentication""" + response = requests.post( + "https://api.example.com/v1/similars/company/product-docs", + json={"vector": query_vector}, + params={"count": count, "threshold": 0.6} + ) + return response.json() + +# No API key needed for public projects! 
+results = public_search_api(query_embedding) +``` + +### Educational Resources + +Share educational content publicly: + +```bash +curl -X PUT "https://api.example.com/v1/projects/edu/learning-materials" \ + -H "Authorization: Bearer edu_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "project_handle": "learning-materials", + "description": "Free educational content embeddings", + "instance_id": 789, + "public_read": true + }' +``` + +Students and educators can access the materials without creating accounts. + +### Community-Driven Knowledge Base + +Open knowledge base with restricted editors: + +```bash +curl -X PUT "https://api.example.com/v1/projects/community/wiki-embeddings" \ + -H "Authorization: Bearer community_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "project_handle": "wiki-embeddings", + "description": "Community wiki embeddings", + "instance_id": 321, + "public_read": true, + "shared_with": [ + {"user_handle": "moderator1", "role": "editor"}, + {"user_handle": "moderator2", "role": "editor"} + ] + }' +``` + +## Security Considerations + +### What is Publicly Visible + +When `public_read` is enabled: +- ✅ Project metadata (name, description, owner) +- ✅ All embedding vectors and text content +- ✅ All embedding metadata +- ✅ Vector dimensions and instance references +- ❌ API keys (never exposed) +- ❌ User passwords or credentials + +### Best Practices + +1. **Review Content First**: Ensure no sensitive information is in embeddings or metadata before enabling public access +2. **Use Metadata Schemas**: Enforce consistent metadata structure with validation +3. **Monitor Usage**: Track access patterns to your public projects +4. **Set Clear Descriptions**: Provide clear project descriptions explaining the data's purpose and licensing +5. 
**Consider Rate Limiting**: For high-traffic public APIs, implement rate limiting at the application level + +### What to Avoid + +❌ **Don't** make projects public that contain: +- Personal identifiable information (PII) +- Proprietary or confidential data +- Sensitive research data not yet published +- Internal company information + +✅ **Do** make projects public that contain: +- Already-published research data +- Open educational resources +- Public domain content +- Creative Commons licensed materials + +## Disabling Public Access + +If you need to make a public project private again: + +```bash +curl -X PATCH "https://api.example.com/v1/projects/alice/public-knowledge" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{"public_read": false}' +``` + +After this change: +- All read operations require authentication +- Existing anonymous access is immediately revoked +- No data is deleted, just access is restricted + +## Checking if a Project is Public + +View project metadata to check the `public_read` flag: + +```bash +curl -X GET "https://api.example.com/v1/projects/alice/public-knowledge" +``` + +Look for `"public_read": true` in the response. + +## Related Documentation + +- [Project Sharing Guide](./project-sharing.md) - Share with specific users +- [RAG Workflow Guide](./rag-workflow.md) - Complete RAG implementation +- [Metadata Validation Guide](./metadata-validation.md) - Enforce data quality + +## Troubleshooting + +### Public Access Not Working + +**Symptom:** Still getting 401 Unauthorized for public project + +**Solutions:** +1. Verify `public_read: true` is set: + ```bash + curl -X GET "https://api.example.com/v1/projects/alice/my-project" + ``` +2. Check you're using GET/POST for similars (not other methods) +3. 
Ensure the project exists and handle is correct + +### Accidentally Made Project Public + +**Solution:** Immediately disable public access: +```bash +curl -X PATCH "https://api.example.com/v1/projects/alice/my-project" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{"public_read": false}' +``` + +### Want to Track Public Usage + +**Solution:** Anonymous requests are logged with user set to "public". Review server logs to monitor public access patterns. diff --git a/docs/content/guides/rag-workflow.md b/docs/content/guides/rag-workflow.md new file mode 100644 index 0000000..30c5138 --- /dev/null +++ b/docs/content/guides/rag-workflow.md @@ -0,0 +1,393 @@ +--- +title: "RAG Workflow Guide" +weight: 1 +--- + +# Complete RAG Workflow Guide + +This guide demonstrates a complete Retrieval Augmented Generation (RAG) workflow using dhamps-vdb as your vector database. + +## Overview + +A typical RAG workflow involves: +1. Generate embeddings from your text content (using an external LLM service) +2. Upload embeddings to dhamps-vdb +3. Search for similar documents based on a query +4. Retrieve the relevant context +5. Use the context with an LLM to generate responses + +## Prerequisites + +- Access to dhamps-vdb API with a valid API key +- An external LLM service for generating embeddings (e.g., OpenAI, Cohere) +- Text content you want to process + +## Step 1: Generate Embeddings Externally + +First, use your chosen LLM service to generate embeddings for your text content. 
Here's an example using OpenAI's API: + +```python +import openai + +# Initialize OpenAI client +client = openai.OpenAI(api_key="your-openai-key") + +# Generate embeddings for your text +text = "The quick brown fox jumps over the lazy dog" +response = client.embeddings.create( + model="text-embedding-3-large", + input=text, + dimensions=3072 +) + +embedding_vector = response.data[0].embedding +``` + +## Step 2: Create LLM Service Instance + +Before uploading embeddings, create an LLM service instance in dhamps-vdb that matches your embedding configuration: + +```bash +curl -X PUT "https://api.example.com/v1/llm-services/alice/my-openai" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "endpoint": "https://api.openai.com/v1/embeddings", + "api_standard": "openai", + "model": "text-embedding-3-large", + "dimensions": 3072, + "description": "OpenAI large embedding model", + "api_key_encrypted": "sk-proj-your-openai-key" + }' +``` + +**Response:** +```json +{ + "instance_id": 123, + "instance_handle": "my-openai", + "owner": "alice", + "endpoint": "https://api.openai.com/v1/embeddings", + "model": "text-embedding-3-large", + "dimensions": 3072 +} +``` + +## Step 3: Create a Project + +Create a project to organize your embeddings: + +```bash +curl -X PUT "https://api.example.com/v1/projects/alice/my-documents" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "project_handle": "my-documents", + "description": "Document embeddings for RAG", + "instance_id": 123 + }' +``` + +## Step 4: Upload Embeddings to dhamps-vdb + +Upload your pre-generated embeddings along with metadata and optional text content: + +```bash +curl -X POST "https://api.example.com/v1/embeddings/alice/my-documents" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "embeddings": [{ + "text_id": "doc001", + "instance_handle": "my-openai", + "vector": [0.021, 
-0.015, 0.043, ...], + "vector_dim": 3072, + "text": "The quick brown fox jumps over the lazy dog", + "metadata": { + "source": "example.txt", + "author": "Alice", + "category": "animals" + } + }] + }' +``` + +**Tip:** Upload multiple embeddings in batches for efficiency (see [Batch Operations Guide](./batch-operations.md)). + +## Step 5: Search for Similar Documents + +When you need to retrieve relevant context for a query: + +### Option A: Search by Stored Document ID + +If you already have a document in your database that represents your query: + +```bash +curl -X GET "https://api.example.com/v1/similars/alice/my-documents/doc001?count=5&threshold=0.7" \ + -H "Authorization: Bearer alice_api_key" +``` + +### Option B: Search with Raw Query Embedding + +Generate an embedding for your query and search without storing it: + +```python +# Generate query embedding +query = "What animals are mentioned?" +query_response = client.embeddings.create( + model="text-embedding-3-large", + input=query, + dimensions=3072 +) +query_vector = query_response.data[0].embedding +``` + +```bash +curl -X POST "https://api.example.com/v1/similars/alice/my-documents?count=5&threshold=0.7" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "vector": [0.032, -0.018, 0.056, ...] 
+ }' +``` + +**Response:** +```json +{ + "user_handle": "alice", + "project_handle": "my-documents", + "results": [ + { + "id": "doc001", + "similarity": 0.95 + }, + { + "id": "doc042", + "similarity": 0.87 + }, + { + "id": "doc103", + "similarity": 0.82 + } + ] +} +``` + +## Step 6: Retrieve Context Documents + +Retrieve the full content and metadata for the most similar documents: + +```bash +curl -X GET "https://api.example.com/v1/embeddings/alice/my-documents/doc001" \ + -H "Authorization: Bearer alice_api_key" +``` + +**Response:** +```json +{ + "text_id": "doc001", + "text": "The quick brown fox jumps over the lazy dog", + "metadata": { + "source": "example.txt", + "author": "Alice", + "category": "animals" + }, + "vector_dim": 3072 +} +``` + +## Step 7: Use Context with LLM + +Combine the retrieved context with your original query to generate an informed response: + +```python +# Collect context from similar documents +context_docs = [] +for result in similarity_results['results'][:3]: + doc = get_document(result['id']) # Your function to fetch document + context_docs.append(doc['text']) + +# Build context string +context = "\n\n".join(context_docs) + +# Generate response with context +response = client.chat.completions.create( + model="gpt-4", + messages=[ + {"role": "system", "content": "Answer based on the provided context."}, + {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {query}"} + ] +) + +answer = response.choices[0].message.content +``` + +## Complete Python Example + +Here's a complete example combining all steps: + +```python +import openai +import requests + +# Configuration +DHAMPS_API = "https://api.example.com" +DHAMPS_KEY = "your-dhamps-api-key" +OPENAI_KEY = "your-openai-key" + +# Initialize OpenAI +client = openai.OpenAI(api_key=OPENAI_KEY) + +def embed_and_store(text_id, text, metadata=None): + """Generate embedding and store in dhamps-vdb""" + # Generate embedding + response = client.embeddings.create( + 
model="text-embedding-3-large", + input=text, + dimensions=3072 + ) + vector = response.data[0].embedding + + # Upload to dhamps-vdb + requests.post( + f"{DHAMPS_API}/v1/embeddings/alice/my-documents", + headers={ + "Authorization": f"Bearer {DHAMPS_KEY}", + "Content-Type": "application/json" + }, + json={ + "embeddings": [{ + "text_id": text_id, + "instance_handle": "my-openai", + "vector": vector, + "vector_dim": 3072, + "text": text, + "metadata": metadata or {} + }] + } + ) + +def search_similar(query, count=5): + """Search for similar documents using query text""" + # Generate query embedding + response = client.embeddings.create( + model="text-embedding-3-large", + input=query, + dimensions=3072 + ) + query_vector = response.data[0].embedding + + # Search in dhamps-vdb + result = requests.post( + f"{DHAMPS_API}/v1/similars/alice/my-documents?count={count}", + headers={ + "Authorization": f"Bearer {DHAMPS_KEY}", + "Content-Type": "application/json" + }, + json={"vector": query_vector} + ) + return result.json()['results'] + +def retrieve_context(doc_ids): + """Retrieve full document content""" + docs = [] + for doc_id in doc_ids: + response = requests.get( + f"{DHAMPS_API}/v1/embeddings/alice/my-documents/{doc_id}", + headers={"Authorization": f"Bearer {DHAMPS_KEY}"} + ) + docs.append(response.json()) + return docs + +def rag_query(query): + """Complete RAG workflow""" + # Search for similar documents + similar = search_similar(query, count=3) + + # Retrieve context + context_docs = retrieve_context([r['id'] for r in similar]) + context = "\n\n".join([doc['text'] for doc in context_docs]) + + # Generate answer with LLM + response = client.chat.completions.create( + model="gpt-4", + messages=[ + {"role": "system", "content": "Answer based on the provided context."}, + {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {query}"} + ] + ) + + return response.choices[0].message.content + +# Usage +embed_and_store("doc001", "The quick brown fox jumps 
over the lazy dog", + {"category": "animals"}) +answer = rag_query("What animals are mentioned?") +print(answer) +``` + +## Best Practices + +1. **Batch Upload**: Upload embeddings in batches of 100-1000 for better performance +2. **Use Metadata**: Include rich metadata for better filtering and organization +3. **Set Thresholds**: Use similarity thresholds (e.g., 0.7) to filter low-quality matches +4. **Cache Embeddings**: Cache generated embeddings to avoid redundant API calls +5. **Monitor Dimensions**: Ensure all embeddings use consistent dimensions (3072 for text-embedding-3-large) + +## Advanced Features + +### Metadata Filtering + +Exclude certain documents from search results using metadata filters: + +```bash +# Exclude documents from the same author as the query +curl -X GET "https://api.example.com/v1/similars/alice/my-documents/doc001?metadata_path=author&metadata_value=Alice" \ + -H "Authorization: Bearer alice_api_key" +``` + +See the [Metadata Filtering Guide](./metadata-filtering.md) for more details. + +### Metadata Validation + +Enforce consistent metadata structure using JSON Schema validation: + +```bash +curl -X PATCH "https://api.example.com/v1/projects/alice/my-documents" \ + -H "Authorization: Bearer alice_api_key" \ + -H "Content-Type: application/json" \ + -d '{ + "metadataScheme": "{\"type\":\"object\",\"properties\":{\"author\":{\"type\":\"string\"},\"category\":{\"type\":\"string\"}},\"required\":[\"author\"]}" + }' +``` + +See the [Metadata Validation Guide](./metadata-validation.md) for more details. 
+ +## Related Documentation + +- [Batch Operations Guide](./batch-operations.md) - Efficiently upload large datasets +- [Metadata Filtering Guide](./metadata-filtering.md) - Advanced search filtering +- [Metadata Validation Guide](./metadata-validation.md) - Schema validation +- [Instance Management Guide](./instance-management.md) - Managing LLM service instances + +## Troubleshooting + +### Dimension Mismatch Error + +```json +{ + "title": "Bad Request", + "status": 400, + "detail": "dimension validation failed: vector dimension mismatch" +} +``` + +**Solution**: Ensure the `vector_dim` field matches the dimensions configured in your LLM service instance. + +### No Similar Results + +If searches return no results, try: +- Lowering the similarity threshold (e.g., from 0.8 to 0.5) +- Increasing the count parameter +- Verifying embeddings are uploaded correctly +- Checking that query embeddings use the same model and dimensions diff --git a/docs/content/reference/_index.md b/docs/content/reference/_index.md new file mode 100644 index 0000000..99baa22 --- /dev/null +++ b/docs/content/reference/_index.md @@ -0,0 +1,20 @@ +--- +title: "Reference" +weight: 7 +--- + +# Reference Documentation + +Technical reference materials and specifications. 
+ +## Contents + +- [Configuration Reference](configuration/) - Complete configuration options +- [Database Schema](database-schema/) - Database structure +- [Roadmap](roadmap/) - Planned features and improvements + +## Additional Resources + +- **OpenAPI Specification**: Available at `/openapi.yaml` on any running instance +- **Go Package Documentation**: Coming soon +- **Source Code**: [github.com/mpilhlt/dhamps-vdb](https://github.com/mpilhlt/dhamps-vdb) diff --git a/docs/content/reference/configuration.md b/docs/content/reference/configuration.md new file mode 100644 index 0000000..ef561a1 --- /dev/null +++ b/docs/content/reference/configuration.md @@ -0,0 +1,524 @@ +--- +title: "Configuration Reference" +weight: 1 +--- + +# Configuration Reference + +Complete reference for configuring dhamps-vdb. This guide consolidates all configuration options, including environment variables, command-line flags, and Docker configuration. + +## Overview + +dhamps-vdb is configured through a combination of: + +1. **Environment variables** (recommended) +2. **Command-line flags** (overrides environment variables) +3. **`.env` files** (for Docker and local development) + +Configuration is loaded in the following priority order (highest to lowest): + +1. Command-line flags +2. Environment variables +3. `.env` file values +4. Default values from `options.go` + +## Configuration Options + +### Service Configuration + +Options for controlling the API service behavior. 
+ +| Option | Environment Variable | CLI Flag | Type | Default | Required | Description | +|--------|---------------------|----------|------|---------|----------|-------------| +| Debug | `SERVICE_DEBUG` | `-d`, `--debug` | Boolean | `true` | No | Enable verbose debug logging | +| Host | `SERVICE_HOST` | `--host` | String | `localhost` | No | Hostname or IP to bind to | +| Port | `SERVICE_PORT` | `-p`, `--port` | Integer | `8880` | No | Port number to listen on | + +**Debug Mode:** +- `true` - Detailed logs including SQL queries, request details, internal operations +- `false` - Minimal logs for production use + +**Host Configuration:** +- `localhost` - Local development only +- `0.0.0.0` - Listen on all interfaces (required for Docker) +- Specific IP - Bind to particular network interface + +**Port Configuration:** +- Default: `8880` +- Ports below 1024 require elevated privileges +- Ensure port is not in use by another service + +### Database Configuration + +Options for connecting to the PostgreSQL database with pgvector extension. 
+ +| Option | Environment Variable | CLI Flag | Type | Default | Required | Description | +|--------|---------------------|----------|------|---------|----------|-------------| +| DB Host | `SERVICE_DBHOST` | `--db-host` | String | `localhost` | Yes | PostgreSQL server hostname | +| DB Port | `SERVICE_DBPORT` | `--db-port` | Integer | `5432` | No | PostgreSQL server port | +| DB User | `SERVICE_DBUSER` | `--db-user` | String | `postgres` | Yes | Database username | +| DB Password | `SERVICE_DBPASSWORD` | `--db-password` | String | `password` | Yes | Database password | +| DB Name | `SERVICE_DBNAME` | `--db-name` | String | `postgres` | Yes | Database name | + +**Database Requirements:** +- PostgreSQL 12+ (16+ recommended) +- pgvector extension installed and enabled +- User must have CREATE, ALTER, DROP, INSERT, SELECT, UPDATE, DELETE privileges +- Database must exist before starting dhamps-vdb + +**Common Database Hosts:** +- `localhost` - Local PostgreSQL instance +- `postgres` - Docker Compose service name +- `db.example.com` - Remote database server +- IP address - Direct connection to database + +### Security Configuration + +Critical security settings for authentication and encryption. 
+ +| Option | Environment Variable | CLI Flag | Type | Default | Required | Description | +|--------|---------------------|----------|------|---------|----------|-------------| +| Admin Key | `SERVICE_ADMINKEY` | `--admin-key` | String | - | **Yes** | Master API key for admin operations | +| Encryption Key | `ENCRYPTION_KEY` | - | String | - | **Yes** | AES-256 key for API key encryption | + +**Admin Key (`SERVICE_ADMINKEY`):** +- Master API key with full administrative privileges +- Used to create users and manage global resources +- Generate with: `openssl rand -base64 32` +- Must be kept secure and rotated regularly +- Transmitted via `Authorization: Bearer` header + +**Encryption Key (`ENCRYPTION_KEY`):** +- Used to encrypt user API keys in the database +- Minimum 32 characters required +- Uses AES-256-GCM encryption with SHA-256 hashing +- **CRITICAL:** If lost, all stored API keys become unrecoverable +- Cannot be changed without re-encrypting all existing API keys +- Generate with: `openssl rand -hex 32` +- Must be backed up securely and separately from database + +## Configuration Files + +### .env File + +The recommended way to configure dhamps-vdb. 
Create a `.env` file in the project root: + +```bash +# Copy template +cp template.env .env + +# Edit with your values +nano .env +``` + +**Example `.env` file:** + +```bash +# Service Configuration +SERVICE_DEBUG=false +SERVICE_HOST=0.0.0.0 +SERVICE_PORT=8880 + +# Database Configuration +SERVICE_DBHOST=localhost +SERVICE_DBPORT=5432 +SERVICE_DBUSER=dhamps_user +SERVICE_DBPASSWORD=secure_password_here +SERVICE_DBNAME=dhamps_vdb + +# Security Configuration +SERVICE_ADMINKEY=generated_admin_key_here +ENCRYPTION_KEY=generated_encryption_key_min_32_chars +``` + +**Security Notes:** +- `.env` files are in `.gitignore` by default +- Never commit `.env` files to version control +- Set restrictive permissions: `chmod 600 .env` +- Use different keys for dev/staging/production + +### template.env + +Starting template for configuration: + +```bash +#!/usr/bin/env bash + +SERVICE_DEBUG=true +SERVICE_HOST=localhost +SERVICE_PORT=8888 +SERVICE_DBHOST=localhost +SERVICE_DBPORT=5432 +SERVICE_DBUSER=postgres +SERVICE_DBPASSWORD=postgres +SERVICE_DBNAME=postgres +SERVICE_ADMINKEY=Ch4ngeM3! + +# Encryption key for API keys in LLM service instances +# Must be secure random string, at least 32 characters +# Generate with: openssl rand -hex 32 +ENCRYPTION_KEY=ChangeThisToASecureRandomKey123456789012 +``` + +## Docker Configuration + +### Docker Compose + +The `docker-compose.yml` file defines the full stack including PostgreSQL: + +```yaml +services: + postgres: + image: pgvector/pgvector:0.7.4-pg16 + environment: + POSTGRES_USER: ${SERVICE_DBUSER:-postgres} + POSTGRES_PASSWORD: ${SERVICE_DBPASSWORD:-postgres} + POSTGRES_DB: ${SERVICE_DBNAME:-dhamps_vdb} + ports: + - "${POSTGRES_PORT:-5432}:5432" + volumes: + - postgres_data:/var/lib/postgresql/data + + dhamps-vdb: + build: + context: . 
+ dockerfile: Dockerfile + depends_on: + postgres: + condition: service_healthy + environment: + SERVICE_DEBUG: ${SERVICE_DEBUG:-false} + SERVICE_HOST: ${SERVICE_HOST:-0.0.0.0} + SERVICE_PORT: ${SERVICE_PORT:-8880} + SERVICE_DBHOST: ${SERVICE_DBHOST:-postgres} + SERVICE_DBPORT: ${SERVICE_DBPORT:-5432} + SERVICE_DBUSER: ${SERVICE_DBUSER:-postgres} + SERVICE_DBPASSWORD: ${SERVICE_DBPASSWORD:-postgres} + SERVICE_DBNAME: ${SERVICE_DBNAME:-dhamps_vdb} + SERVICE_ADMINKEY: ${SERVICE_ADMINKEY} + ENCRYPTION_KEY: ${ENCRYPTION_KEY} + ports: + - "${API_PORT:-8880}:8880" +``` + +**Docker-Specific Variables:** +- `POSTGRES_PORT` - External port for PostgreSQL (default: 5432) +- `API_PORT` - External port for dhamps-vdb API (default: 8880) + +### Docker Setup Script + +Automated setup using `docker-setup.sh`: + +```bash +# Run automated setup (generates secure keys) +./docker-setup.sh + +# Start services +docker-compose up -d + +# View logs +docker-compose logs -f dhamps-vdb +``` + +The script automatically: +- Generates secure `SERVICE_ADMINKEY` +- Generates secure `ENCRYPTION_KEY` +- Creates `.env` file with proper configuration +- Validates Docker and docker-compose installation + +### Docker Run Command + +For standalone container deployment: + +```bash +docker run -d \ + --name dhamps-vdb \ + -e SERVICE_DEBUG=false \ + -e SERVICE_HOST=0.0.0.0 \ + -e SERVICE_PORT=8880 \ + -e SERVICE_DBHOST=db.example.com \ + -e SERVICE_DBPORT=5432 \ + -e SERVICE_DBUSER=dhamps_user \ + -e SERVICE_DBPASSWORD=secure_password \ + -e SERVICE_DBNAME=dhamps_vdb \ + -e SERVICE_ADMINKEY=admin_key_here \ + -e ENCRYPTION_KEY=encryption_key_here \ + -p 8880:8880 \ + dhamps-vdb:latest +``` + +### External Database + +Using `docker-compose.external-db.yml` for external PostgreSQL: + +```bash +# Set database connection in .env +SERVICE_DBHOST=db.external.com +SERVICE_DBPORT=5432 +SERVICE_DBUSER=dhamps_user +SERVICE_DBPASSWORD=secure_password +SERVICE_DBNAME=dhamps_vdb + +# Start without bundled PostgreSQL 
+docker-compose -f docker-compose.external-db.yml up -d +``` + +## Configuration Scenarios + +### Development Environment + +Optimized for local development with verbose logging: + +```bash +# .env for development +SERVICE_DEBUG=true +SERVICE_HOST=localhost +SERVICE_PORT=8880 +SERVICE_DBHOST=localhost +SERVICE_DBPORT=5432 +SERVICE_DBUSER=postgres +SERVICE_DBPASSWORD=postgres +SERVICE_DBNAME=dhamps_vdb_dev +SERVICE_ADMINKEY=dev-admin-key-not-for-production +ENCRYPTION_KEY=dev-encryption-key-at-least-32-chars +``` + +**Start service:** +```bash +./dhamps-vdb +``` + +### Docker Development + +Docker-based development with hot reload: + +```bash +# .env for Docker development +SERVICE_DEBUG=true +SERVICE_HOST=0.0.0.0 +SERVICE_PORT=8880 +SERVICE_DBHOST=postgres +SERVICE_DBPORT=5432 +SERVICE_DBUSER=postgres +SERVICE_DBPASSWORD=postgres +SERVICE_DBNAME=dhamps_vdb +SERVICE_ADMINKEY=dev-admin-key +ENCRYPTION_KEY=dev-encryption-32-chars-minimum +``` + +**Start with:** +```bash +docker-compose up +``` + +### Production Environment + +Production-ready configuration with security hardening: + +```bash +# .env for production +SERVICE_DEBUG=false +SERVICE_HOST=0.0.0.0 +SERVICE_PORT=8880 +SERVICE_DBHOST=prod-db.internal.example.com +SERVICE_DBPORT=5432 +SERVICE_DBUSER=dhamps_prod_user +SERVICE_DBPASSWORD= +SERVICE_DBNAME=dhamps_vdb_prod +SERVICE_ADMINKEY= +ENCRYPTION_KEY= +``` + +**Production Best Practices:** +- Use secrets management (Vault, AWS Secrets Manager, etc.) +- Disable debug logging (`SERVICE_DEBUG=false`) +- Use dedicated database user (not superuser) +- Enable SSL/TLS for database connections +- Deploy behind reverse proxy (nginx, Traefik) +- Set up monitoring and alerting +- Regular key rotation (except ENCRYPTION_KEY) +- Firewall rules to restrict access + +### Testing Environment + +Configuration for running tests: + +```bash +# Tests use testcontainers - no external config needed +go test -v ./... 
+
+```
+
+**Test-specific setup:**
+```bash
+# Enable Docker for testcontainers
+systemctl --user start podman.socket
+export DOCKER_HOST=unix://$XDG_RUNTIME_DIR/podman/podman.sock
+
+# Run tests
+go test -v ./...
+```
+
+## Validation and Verification
+
+### Startup Validation
+
+dhamps-vdb validates configuration on startup:
+
+1. **Required variables check** - Fails if missing
+2. **Database connection test** - Verifies connectivity
+3. **Schema migration** - Applies pending migrations
+4. **Extension check** - Verifies pgvector is available
+
+### Configuration Test
+
+Verify configuration is working:
+
+```bash
+# Check service health
+curl http://localhost:8880/docs
+
+# Test admin authentication
+curl -X GET http://localhost:8880/v1/users \
+  -H "Authorization: Bearer ${SERVICE_ADMINKEY}"
+
+# Check database connectivity (pg_isready ships with the postgres image)
+docker-compose exec postgres pg_isready -U ${SERVICE_DBUSER:-postgres}
+```
+
+### Common Issues
+
+**Missing Required Variables:**
+```
+Error: SERVICE_ADMINKEY environment variable is not set
+```
+Solution: Set all required variables.
+
+**Database Connection Failed:**
+```
+Error: failed to connect to database
+```
+Solution: Verify `SERVICE_DBHOST`, credentials, and that PostgreSQL is running.
+
+**Invalid Encryption Key:**
+```
+Error: ENCRYPTION_KEY must be at least 32 characters
+```
+Solution: Generate a proper key with `openssl rand -hex 32`.
+
+**Port Already in Use:**
+```
+Error: bind: address already in use
+```
+Solution: Change `SERVICE_PORT` or stop the conflicting service.
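
The startup checks above can be approximated with a small pre-flight script run before launching the binary. The `check_config` helper below is purely illustrative (it is not part of dhamps-vdb); it covers the required-variable check and the documented 32-character minimum for `ENCRYPTION_KEY`:

```shell
#!/bin/sh
# Illustrative pre-flight check (not part of dhamps-vdb): mirrors the
# startup validation for required variables and the documented
# 32-character minimum for ENCRYPTION_KEY.
check_config() {
  missing=""
  for var in SERVICE_ADMINKEY ENCRYPTION_KEY SERVICE_DBHOST SERVICE_DBUSER SERVICE_DBNAME; do
    # Indirectly read the variable named in $var
    eval "val=\${$var:-}"
    if [ -z "$val" ]; then
      missing="$missing $var"
    fi
  done
  if [ -n "$missing" ]; then
    echo "missing required variables:$missing" >&2
    return 1
  fi
  if [ "${#ENCRYPTION_KEY}" -lt 32 ]; then
    echo "ENCRYPTION_KEY must be at least 32 characters" >&2
    return 1
  fi
  echo "config ok"
}
```

Run it as `check_config && ./dhamps-vdb` so the service only starts when the basics are in place; the database connection test and migration steps still happen inside the service itself.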
+ +## Generating Secure Keys + +### Admin Key Generation + +```bash +# Base64 encoded (recommended) +openssl rand -base64 32 + +# Hex encoded (alternative) +openssl rand -hex 24 + +# Output example: +# Kx7mP9nQ2rT5vY8zA1bC4dF6gH9jK0lM3nP5qR7sT9u= +``` + +### Encryption Key Generation + +```bash +# 32-byte hex key (required format) +openssl rand -hex 32 + +# Output example: +# a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6q7r8s9t0u1v2w3x4y5z6a1b2c3d4e5f6 +``` + +### Secure Key Storage + +**Development:** +- Store in `.env` file (not committed) +- Use password manager for team sharing + +**Production:** +- Use secrets management system +- Rotate admin key every 90 days +- Never rotate encryption key (breaks existing API keys) +- Store encryption key backup separately from database + +## Environment-Specific Examples + +### Local with External PostgreSQL + +```bash +SERVICE_DEBUG=true +SERVICE_HOST=localhost +SERVICE_PORT=8880 +SERVICE_DBHOST=192.168.1.100 +SERVICE_DBPORT=5432 +SERVICE_DBUSER=dhamps_user +SERVICE_DBPASSWORD=user_password +SERVICE_DBNAME=dhamps_vdb +SERVICE_ADMINKEY=$(openssl rand -base64 32) +ENCRYPTION_KEY=$(openssl rand -hex 32) +``` + +### Kubernetes ConfigMap + Secrets + +ConfigMap: +```yaml +apiVersion: v1 +kind: ConfigMap +metadata: + name: dhamps-vdb-config +data: + SERVICE_DEBUG: "false" + SERVICE_HOST: "0.0.0.0" + SERVICE_PORT: "8880" + SERVICE_DBHOST: "postgres-service" + SERVICE_DBPORT: "5432" + SERVICE_DBUSER: "dhamps_user" + SERVICE_DBNAME: "dhamps_vdb" +``` + +Secrets: +```yaml +apiVersion: v1 +kind: Secret +metadata: + name: dhamps-vdb-secrets +type: Opaque +stringData: + SERVICE_DBPASSWORD: "secure_db_password" + SERVICE_ADMINKEY: "secure_admin_key" + ENCRYPTION_KEY: "secure_encryption_key_32_chars_min" +``` + +### Docker Swarm Secrets + +```bash +# Create secrets +echo "admin_key_here" | docker secret create dhamps_admin_key - +echo "encryption_key" | docker secret create dhamps_encryption_key - + +# Reference in stack file +services: + dhamps-vdb: 
+ secrets: + - dhamps_admin_key + - dhamps_encryption_key + environment: + SERVICE_ADMINKEY_FILE: /run/secrets/dhamps_admin_key + ENCRYPTION_KEY_FILE: /run/secrets/dhamps_encryption_key +``` + +## Related Documentation + +- [Getting Started - Installation](../getting-started/installation/) +- [Getting Started - Quick Start](../getting-started/quick-start/) +- [Deployment - Docker](../deployment/docker/) +- [Deployment - Database Setup](../deployment/database/) +- [Deployment - Security](../deployment/security/) +- [Reference - Database Schema](database-schema/) diff --git a/docs/content/reference/database-schema.md b/docs/content/reference/database-schema.md new file mode 100644 index 0000000..797917a --- /dev/null +++ b/docs/content/reference/database-schema.md @@ -0,0 +1,514 @@ +--- +title: "Database Schema Reference" +weight: 2 +--- + +# Database Schema Reference + +Complete reference for the dhamps-vdb PostgreSQL database schema. This document describes all tables, columns, types, constraints, relationships, and indexes. + +## Overview + +The database uses PostgreSQL 12+ with the pgvector extension for vector similarity search. The schema is managed through migrations in `internal/database/migrations/`. + +**Key Features:** +- Vector embeddings stored as `halfvec` for efficient storage +- HNSW indexes for fast approximate nearest neighbor search +- Automatic timestamp tracking (`created_at`, `updated_at`) +- Foreign key constraints with CASCADE deletion +- Role-based access control through association tables +- Multi-tenancy support (user-owned resources) + +## Schema Migrations + +Current schema version is defined by 4 migration files: + +1. **001_create_initial_scheme.sql** - Core tables and relationships +2. **002_create_emb_index.sql** - HNSW vector indexes +3. **003_add_public_read_flag.sql** - Public access support +4. 
**004_refactor_llm_services_architecture.sql** - LLM service architecture refactor + +## Core Tables + +### users + +Stores user accounts with API authentication. + +| Column | Type | Constraints | Description | +|--------|------|-------------|-------------| +| `user_handle` | `VARCHAR(20)` | PRIMARY KEY | Unique user identifier (username) | +| `name` | `TEXT` | | Full name or display name | +| `email` | `TEXT` | UNIQUE, NOT NULL | Email address | +| `vdb_key` | `CHAR(64)` | UNIQUE, NOT NULL | API key (64-character hex) | +| `created_at` | `TIMESTAMP` | NOT NULL | Creation timestamp | +| `updated_at` | `TIMESTAMP` | NOT NULL | Last update timestamp | + +**Indexes:** +- Primary key on `user_handle` +- Unique constraint on `email` +- Unique constraint on `vdb_key` + +**Special Users:** +- `_system` - Reserved system user for global LLM service definitions + +**Relationships:** +- Owns projects (1:N) +- Owns LLM service instances (1:N) +- Owns embeddings (1:N) +- Can share projects (N:M via `users_projects`) +- Can share LLM service instances (N:M via `instances_shared_with`) + +**Notes:** +- `vdb_key` is generated during user creation and returned once +- Cannot be recovered if lost +- Transmitted as `Authorization: Bearer` token + +### projects + +Stores embedding projects owned by users. 
+ +| Column | Type | Constraints | Description | +|--------|------|-------------|-------------| +| `project_id` | `SERIAL` | PRIMARY KEY | Auto-incrementing project ID | +| `project_handle` | `VARCHAR(20)` | NOT NULL | Project identifier (unique per owner) | +| `owner` | `VARCHAR(20)` | NOT NULL, FK→users | Project owner user handle | +| `description` | `TEXT` | | Project description | +| `metadata_scheme` | `TEXT` | | JSON Schema for validating embedding metadata | +| `public_read` | `BOOLEAN` | DEFAULT FALSE | Allow unauthenticated read access | +| `instance_id` | `INTEGER` | FK→instances | LLM service instance used for embeddings | +| `created_at` | `TIMESTAMP` | NOT NULL | Creation timestamp | +| `updated_at` | `TIMESTAMP` | NOT NULL | Last update timestamp | + +**Constraints:** +- `UNIQUE (owner, project_handle)` - Project handles unique per owner +- `ON DELETE CASCADE` - Delete project when owner deleted +- `ON DELETE RESTRICT` - Prevent instance deletion if used by project + +**Relationships:** +- Owned by one user (N:1) +- Uses one LLM service instance (N:1) +- Has many embeddings (1:N) +- Can be shared with users (N:M via `users_projects`) + +**Metadata Schema:** +- Stored as JSON Schema string +- Used to validate embedding metadata +- Optional (can be NULL) +- See [Data Validation](../../concepts/metadata/) for details + +**Public Access:** +- When `public_read=TRUE`, allows unauthenticated embedding queries +- Controlled via API or by setting shared_with to `["*"]` + +### embeddings + +Stores vector embeddings with text identifiers and metadata. 
+ +| Column | Type | Constraints | Description | +|--------|------|-------------|-------------| +| `embeddings_id` | `SERIAL` | PRIMARY KEY | Auto-incrementing embedding ID | +| `text_id` | `TEXT` | INDEXED | Text identifier (URL, DOI, custom ID) | +| `owner` | `VARCHAR(20)` | NOT NULL, FK→users | Embedding owner user handle | +| `project_id` | `SERIAL` | NOT NULL, FK→projects | Project the embedding belongs to | +| `instance_id` | `SERIAL` | NOT NULL, FK→instances | LLM service instance used | +| `text` | `TEXT` | | Optional full text of the embedded content | +| `vector` | `halfvec` | NOT NULL | Embedding vector (half-precision float) | +| `vector_dim` | `INTEGER` | NOT NULL | Vector dimensionality | +| `metadata` | `jsonb` | | Optional metadata (validated if schema defined) | +| `created_at` | `TIMESTAMP` | NOT NULL | Creation timestamp | +| `updated_at` | `TIMESTAMP` | NOT NULL | Last update timestamp | + +**Constraints:** +- `UNIQUE (text_id, owner, project_id, instance_id)` - Unique text IDs per project/instance +- `ON DELETE CASCADE` - Delete embeddings when owner, project, or instance deleted + +**Indexes:** +- Primary key on `embeddings_id` +- B-tree index on `text_id` +- HNSW vector indexes for dimensions: 384, 768, 1024, 1536, 3072 (see [Vector Indexes](#vector-indexes)) + +**Vector Storage:** +- Uses `halfvec` type (16-bit floating point) for efficient storage +- Dimensions must match LLM service instance configuration +- Validated on upload against instance dimensions + +**Relationships:** +- Owned by one user (N:1) +- Belongs to one project (N:1) +- Created with one LLM service instance (N:1) + +## LLM Service Architecture + +The LLM service architecture separates service definitions (templates) from user-specific instances. + +### definitions + +Templates for LLM embedding services (shared or user-specific). 
+ +| Column | Type | Constraints | Description | +|--------|------|-------------|-------------| +| `definition_id` | `SERIAL` | PRIMARY KEY | Auto-incrementing definition ID | +| `definition_handle` | `VARCHAR(20)` | NOT NULL | Definition identifier (unique per owner) | +| `owner` | `VARCHAR(20)` | NOT NULL, FK→users | Definition owner (_system for global) | +| `endpoint` | `TEXT` | NOT NULL | API endpoint URL | +| `description` | `TEXT` | | Service description | +| `api_standard` | `VARCHAR(20)` | NOT NULL, FK→api_standards | API standard handle | +| `model` | `TEXT` | NOT NULL | Model name (e.g., text-embedding-3-large) | +| `dimensions` | `INTEGER` | NOT NULL | Vector dimensions produced by model | +| `context_limit` | `INTEGER` | NOT NULL | Maximum context length (tokens/chars) | +| `is_public` | `BOOLEAN` | NOT NULL, DEFAULT FALSE | Share with all users if true | +| `created_at` | `TIMESTAMP` | NOT NULL | Creation timestamp | +| `updated_at` | `TIMESTAMP` | NOT NULL | Last update timestamp | + +**Constraints:** +- `UNIQUE (owner, definition_handle)` - Definition handles unique per owner +- `ON DELETE CASCADE` - Delete definition when owner deleted + +**Built-in Definitions:** + +System-provided definitions (owned by `_system`, `is_public=TRUE`): + +| Handle | Model | Dimensions | Context Limit | API Standard | +|--------|-------|------------|---------------|--------------| +| `openai-large` | text-embedding-3-large | 3072 | 8192 | openai | +| `openai-small` | text-embedding-3-small | 1536 | 8191 | openai | +| `cohere-v4` | embed-v4.0 | 1536 | 128000 | cohere | +| `gemini-embedding-001` | gemini-embedding-001 | 3072 | 2048 | gemini | + +**Relationships:** +- Owned by one user (N:1) +- References one API standard (N:1) +- Can be shared with users (N:M via `definitions_shared_with`) +- Used by instances (1:N) + +### instances + +User-specific configurations of LLM services with API keys. 
+ +| Column | Type | Constraints | Description | +|--------|------|-------------|-------------| +| `instance_id` | `SERIAL` | PRIMARY KEY | Auto-incrementing instance ID | +| `instance_handle` | `VARCHAR(20)` | NOT NULL | Instance identifier (unique per owner) | +| `owner` | `VARCHAR(20)` | NOT NULL, FK→users | Instance owner user handle | +| `endpoint` | `TEXT` | NOT NULL | API endpoint URL | +| `description` | `TEXT` | | Instance description | +| `api_standard` | `VARCHAR(20)` | NOT NULL, FK→api_standards | API standard handle | +| `model` | `TEXT` | NOT NULL | Model name | +| `dimensions` | `INTEGER` | NOT NULL | Vector dimensions | +| `context_limit` | `INTEGER` | NOT NULL | Maximum context length | +| `definition_id` | `INTEGER` | FK→definitions | Reference to definition template | +| `api_key_encrypted` | `BYTEA` | | Encrypted API key for service authentication | +| `created_at` | `TIMESTAMP` | NOT NULL | Creation timestamp | +| `updated_at` | `TIMESTAMP` | NOT NULL | Last update timestamp | + +**Constraints:** +- `UNIQUE (owner, instance_handle)` - Instance handles unique per owner +- `ON DELETE CASCADE` - Delete instance when owner deleted +- `ON DELETE SET NULL` - Keep instance if definition deleted + +**Indexes:** +- Composite index on `(owner, instance_handle)` + +**API Key Encryption:** +- Encrypted using AES-256-GCM +- Encryption key from `ENCRYPTION_KEY` environment variable +- Cannot be recovered if encryption key is lost +- Stored as base64-encoded BYTEA + +**Relationships:** +- Owned by one user (N:1) +- References one API standard (N:1) +- Based on one definition template (N:1, optional) +- Used by projects (1:N) +- Used by embeddings (1:N) +- Can be shared with users (N:M via `instances_shared_with`) + +### api_standards + +Defines API specifications for LLM embedding services. 
+ +| Column | Type | Constraints | Description | +|--------|------|-------------|-------------| +| `api_standard_handle` | `VARCHAR(20)` | PRIMARY KEY | Standard identifier (e.g., openai, cohere) | +| `description` | `TEXT` | | API description and version info | +| `key_method` | `VARCHAR(20)` | NOT NULL, FK→key_methods | Authentication method | +| `key_field` | `VARCHAR(20)` | | Header/field name for API key | +| `created_at` | `TIMESTAMP` | NOT NULL | Creation timestamp | +| `updated_at` | `TIMESTAMP` | NOT NULL | Last update timestamp | + +**Built-in Standards:** + +| Handle | Description | Key Method | Key Field | +|--------|-------------|------------|-----------| +| `openai` | OpenAI Embeddings API v1 | auth_bearer | Authorization | +| `cohere` | Cohere Embed API v2 | auth_bearer | Authorization | +| `gemini` | Gemini Embeddings API | auth_bearer | x-goog-api-key | + +**Relationships:** +- Used by definitions (1:N) +- Used by instances (1:N) +- References one key_method (N:1) + +## Association Tables + +### users_projects + +Defines project sharing and access control. + +| Column | Type | Constraints | Description | +|--------|------|-------------|-------------| +| `user_handle` | `VARCHAR(20)` | FK→users, PK | User being granted access | +| `project_id` | `SERIAL` | FK→projects, PK | Project being shared | +| `role` | `VARCHAR(20)` | NOT NULL, FK→vdb_roles | Access level (owner/editor/reader) | +| `created_at` | `TIMESTAMP` | NOT NULL | Share creation timestamp | +| `updated_at` | `TIMESTAMP` | NOT NULL | Last update timestamp | + +**Constraints:** +- `PRIMARY KEY (user_handle, project_id)` - One role per user per project +- `ON DELETE CASCADE` - Remove sharing when user or project deleted + +**Roles:** +- `owner` - Full control (only project creator) +- `editor` - Can add/modify/delete embeddings +- `reader` - Read-only access to embeddings + +### instances_shared_with + +Defines LLM service instance sharing. 
+ +| Column | Type | Constraints | Description | +|--------|------|-------------|-------------| +| `user_handle` | `VARCHAR(20)` | FK→users, PK | User being granted access | +| `instance_id` | `INTEGER` | FK→instances, PK | Instance being shared | +| `role` | `VARCHAR(20)` | NOT NULL, FK→vdb_roles | Access level | +| `created_at` | `TIMESTAMP` | NOT NULL | Share creation timestamp | +| `updated_at` | `TIMESTAMP` | NOT NULL | Last update timestamp | + +**Constraints:** +- `PRIMARY KEY (user_handle, instance_id)` - One role per user per instance +- `ON DELETE CASCADE` - Remove sharing when user or instance deleted + +### definitions_shared_with + +Defines LLM service definition sharing. + +| Column | Type | Constraints | Description | +|--------|------|-------------|-------------| +| `user_handle` | `VARCHAR(20)` | FK→users, PK | User being granted access | +| `definition_id` | `INTEGER` | FK→definitions, PK | Definition being shared | +| `created_at` | `TIMESTAMP` | NOT NULL | Share creation timestamp | +| `updated_at` | `TIMESTAMP` | NOT NULL | Last update timestamp | + +**Constraints:** +- `PRIMARY KEY (user_handle, definition_id)` - One share per user per definition +- `ON DELETE CASCADE` - Remove sharing when user or definition deleted + +**Indexes:** +- Index on `user_handle` for efficient user lookups +- Index on `definition_id` for efficient definition lookups + +## Reference Tables + +### vdb_roles + +Enumeration of valid access roles. + +| Column | Type | Constraints | Description | +|--------|------|-------------|-------------| +| `vdb_role` | `VARCHAR(20)` | PRIMARY KEY | Role name | + +**Values:** +- `owner` - Full control over resource +- `editor` - Read and write access +- `reader` - Read-only access + +### key_methods + +Enumeration of API authentication methods. 
+ +| Column | Type | Constraints | Description | +|--------|------|-------------|-------------| +| `key_method` | `VARCHAR(20)` | PRIMARY KEY | Authentication method | + +**Values:** +- `auth_bearer` - Bearer token in Authorization header +- `body_form` - API key in request body +- `query_param` - API key in URL query parameter +- `custom_header` - API key in custom header + +## Vector Indexes + +HNSW (Hierarchical Navigable Small World) indexes for fast approximate nearest neighbor search. + +### Index Configuration + +All vector indexes use HNSW with these parameters: +- `m = 24` - Number of neighbors per node +- `ef_construction = 200` - Build-time accuracy parameter +- `ef_search = 100` - Query-time accuracy parameter +- Expected recall: ~99.8% + +### Dimension-Specific Indexes + +| Index Name | Dimensions | Common Models | +|-----------|------------|---------------| +| `embeddings_vector_384` | 384 | Cohere embed-multilingual-light-v3.0, embed-english-light-v3.0 | +| `embeddings_vector_768` | 768 | BERT base, Cohere embed-multilingual-v2.0, Gemini Embeddings | +| `embeddings_vector_1024` | 1024 | BERT large, SBERT, Cohere embed-multilingual-v3.0, embed-english-v3.0 | +| `embeddings_vector_1536` | 1536 | OpenAI text-embedding-ada-002, text-embedding-3-small, Cohere embed-v4.0 | +| `embeddings_vector_3072` | 3072 | OpenAI text-embedding-3-large, Gemini embedding-001 | + +**Index Structure:** +```sql +CREATE INDEX embeddings_vector_1536 ON embeddings +USING hnsw ((vector::halfvec(1536)) halfvec_cosine_ops) +WITH (m = 24, ef_construction = 200) +WHERE (vector_dim = 1536); +``` + +**Usage:** +- Indexes are partial - only include vectors of matching dimension +- Automatically used for similarity queries with matching dimensions +- Use cosine distance for similarity calculation + +## Relationships Diagram + +``` +users (1) ──owns──> (N) projects (1) ──contains──> (N) embeddings + │ │ │ + │ └─> uses (1) instances (N) <───┘ + │ │ + │ └─> based on (1) definitions + │ │ 
+ ├──owns──> (N) instances ──uses──> (1) api_standards + │ │ │ + │ └─> references (1) definitions + │ + └──owns──> (N) definitions ──references──> (1) api_standards + +Sharing: + users (N) <──shares─> (M) projects (via users_projects) + users (N) <──shares─> (M) instances (via instances_shared_with) + users (N) <──shares─> (M) definitions (via definitions_shared_with) +``` + +## Data Validation + +### Dimension Validation + +Embeddings must have dimensions matching their LLM service instance: +- `embeddings.vector_dim` must equal `instances.dimensions` +- Enforced at API level during upload +- Similarity queries automatically filter by matching dimensions + +### Metadata Validation + +Projects can define JSON Schema in `metadata_scheme`: +- Validates all embedding metadata on upload +- Optional - if NULL, metadata not validated +- Enforced at API level +- Admin sanity check validates all existing metadata + +### Sanity Check Queries + +The `/v1/admin/sanity-check` endpoint verifies: +1. All embeddings have dimensions matching their instance +2. 
All metadata conforms to project schemas (if defined)
+
+## Common Queries
+
+### Find User's Projects
+
+```sql
+SELECT p.project_handle, p.description, p.created_at
+FROM projects p
+WHERE p.owner = 'alice'
+ORDER BY p.created_at DESC;
+```
+
+### Find Shared Projects
+
+```sql
+SELECT p.project_handle, p.owner, up.role
+FROM users_projects up
+JOIN projects p ON up.project_id = p.project_id
+WHERE up.user_handle = 'bob'
+ORDER BY p.owner, p.project_handle;
+```
+
+### Find Similar Embeddings
+
+```sql
+SELECT text_id, vector::halfvec(1536) <=> '[0.1, 0.2, ...]'::halfvec(1536) AS distance
+FROM embeddings
+WHERE project_id = 123 AND vector_dim = 1536
+ORDER BY vector::halfvec(1536) <=> '[0.1, 0.2, ...]'::halfvec(1536)
+LIMIT 10;
+```
+
+The `halfvec(1536)` cast and the cosine-distance operator (`<=>`) match the partial HNSW index expression, so the `embeddings_vector_1536` index is used for this query.
+
+### Count Embeddings by Project
+
+```sql
+SELECT p.project_handle, COUNT(e.embeddings_id) AS embedding_count
+FROM projects p
+LEFT JOIN embeddings e ON p.project_id = e.project_id
+WHERE p.owner = 'alice'
+GROUP BY p.project_id, p.project_handle
+ORDER BY embedding_count DESC;
+```
+
+## Database Maintenance
+
+### Backup Recommendations
+
+**Critical Data:**
+- User accounts (`users`)
+- API keys (`vdb_key` in users, `api_key_encrypted` in instances)
+- ENCRYPTION_KEY environment variable (backup separately!)
+- Projects and embeddings + +**Backup Strategy:** +```bash +# Full database backup +pg_dump -U postgres dhamps_vdb > backup.sql + +# Backup encryption key separately +echo "$ENCRYPTION_KEY" > encryption_key.backup +chmod 400 encryption_key.backup +``` + +### Vacuum and Analyze + +```sql +-- Regular maintenance +VACUUM ANALYZE embeddings; +VACUUM ANALYZE projects; + +-- Rebuild vector indexes if needed +REINDEX INDEX embeddings_vector_1536; +``` + +### Monitoring Queries + +```sql +-- Check index usage +SELECT schemaname, tablename, indexname, idx_scan +FROM pg_stat_user_indexes +WHERE tablename = 'embeddings' +ORDER BY idx_scan DESC; + +-- Check table sizes +SELECT relname, pg_size_pretty(pg_total_relation_size(oid)) +FROM pg_class +WHERE relname IN ('embeddings', 'projects', 'users') +ORDER BY pg_total_relation_size(oid) DESC; +``` + +## Related Documentation + +- [Concepts - Architecture](../../concepts/architecture/) +- [Concepts - Projects](../../concepts/projects/) +- [Concepts - Metadata](../../concepts/metadata/) +- [Deployment - Database Setup](../../deployment/database/) +- [Reference - Configuration](configuration/) +- [API - Endpoints](../../api/endpoints/) diff --git a/docs/content/reference/roadmap.md b/docs/content/reference/roadmap.md new file mode 100644 index 0000000..469aafb --- /dev/null +++ b/docs/content/reference/roadmap.md @@ -0,0 +1,346 @@ +--- +title: "Product Roadmap" +weight: 3 +--- + +# Product Roadmap + +Development roadmap for dhamps-vdb, tracking completed features, in-progress work, and planned enhancements. + +## Overview + +This roadmap outlines the development priorities for dhamps-vdb. Items marked with [x] are completed, items in progress are noted, and planned features are listed by priority. 
+ +## Completed Features + +### Core Functionality + +- [x] **User authentication & restrictions on some API calls** + - Bearer token authentication + - Role-based access control + - Admin vs user permissions + +- [x] **API versioning** + - Version 1 API with `/v1/` prefix + - Backward compatibility support + +- [x] **Better options handling** + - Command-line flags via Huma CLI + - Environment variable configuration + - `.env` file support + +- [x] **Handle metadata** + - JSONB storage for flexible metadata + - Metadata attached to embeddings + +- [x] **Validation with metadata schema** + - JSON Schema validation for embedding metadata + - Project-level schema definitions + - Automatic validation on upload + +- [x] **Filter similar passages by metadata field** + - Metadata-based filtering in similarity queries + - Exclude documents by metadata value + - Query parameters: `metadata_path` and `metadata_value` + +### Data Management + +- [x] **Use transactions** + - Atomic operations for multi-step actions + - Consistency for project creation with sharing + - Rollback on errors + +- [x] **Catch POST to existing resources** + - Prevent duplicate creation + - Return appropriate error codes + - Suggest using PUT for updates + +- [x] **Always use specific error messages** + - Detailed error descriptions + - Helpful troubleshooting information + - Consistent error response format + +### Testing & Quality + +- [x] **Tests** + - Integration tests for all major operations + - Testcontainers for isolated database testing + - Cleanup verification queries + +- [x] **When testing, check cleanup by adding a new query/function to see if all tables are empty** + - Verify test isolation + - Ensure no data leakage between tests + +- [x] **Make sure input is validated consistently** + - Dimension validation for embeddings + - Schema validation for metadata + - Request validation via Huma + +### Collaboration Features + +- [x] **Add project sharing/unsharing functions & API paths** + - 
Share projects with specific users + - Define roles: owner, editor, reader + - API endpoints for managing sharing + +- [x] **Add mechanism to allow anonymous/public reading access to embeddings** + - `public_read` flag on projects + - Wildcard sharing via `"*"` in `shared_with` + - Unauthenticated access to public embeddings + +- [x] **Transfer of projects from one owner to another as new operation** + - Owner-initiated project transfers + - Ownership verification + - Automatic cleanup of old owner associations + +### Service Architecture + +- [x] **Add definition creation/listing/deletion functions & paths** + - LLM service definitions (templates) + - Instances (user-specific configurations) + - System-provided global definitions + +### Deployment & Operations + +- [x] **Dockerization** + - Multi-stage Dockerfile + - Docker Compose with PostgreSQL + - External database support + - Automated setup script + +- [x] **Make sure pagination is supported consistently** + - Limit and offset parameters + - Consistent across all list endpoints + - Documented pagination behavior + +### Security + +- [x] **Prevent acceptance of requests as user "_system"** + - Reserved system user for internal use + - Blocked from external authentication + - Protected system-owned resources + +## In Progress + +### Documentation + +- [ ] **Revisit all documentation** + - Comprehensive reference documentation + - Updated API examples + - Docker deployment guides + +- [ ] **Add documentation for metadata filtering of similars** + - Document `metadata_path` and `metadata_value` parameters + - Provide usage examples + - Explain use cases (exclude same author, etc.) 
+ - **Note:** Query parameters are: `metadata_path` and `metadata_value` as in: `https://xy.org/vdb-api/v1/similars/sal/sal-openai-large/https%3A%2F%2Fid.myproject.net%2Ftexts%2FW0011%3A1.3.1.3.1?threshold=0.7&limit=5&metadata_path=author_id&metadata_value=A0083` + +## Planned Features + +### High Priority + +#### Network Connectivity + +- [ ] **Implement and make consequent use of max_idle (5), max_concurr (5), timeouts, and cancellations** + - Connection pool management + - Maximum idle connections: 5 + - Maximum concurrent connections: 5 + - Request timeouts + - Context cancellation support + +- [ ] **Concurrency (leaky bucket approach) and Rate limiting** + - Leaky bucket algorithm for concurrency control + - Rate limiting using Redis + - Sliding window implementation + - Standard rate limit headers + - See [Huma request limits](https://huma.rocks/features/request-limits/) for implementation + +- [ ] **Caching** + - Response caching for frequently accessed data + - Cache invalidation strategies + - Redis or in-memory caching + - Configurable TTL + +#### API Standards + +- [ ] **Add API standards for anthropic, mistral, llama.cpp, ollama, vllm, llmstudio** + - Anthropic embeddings API + - Mistral embeddings API + - llama.cpp server API + - Ollama embeddings API + - vLLM embeddings API + - LM Studio embeddings API + - Standard authentication methods + - Example definitions in testdata + +### Medium Priority + +#### User Experience + +- [ ] **HTML UI** + - Web-based interface for API management + - User-friendly project creation + - Visual embedding explorer + - API key management + - Alternative to CLI/API usage + +- [ ] **Allow to request verbose information even in list outputs** + - `verbose=yes` query parameter + - Full object details in list endpoints + - Optional vs default minimal output + - Performance considerations + +- [ ] **Add possibility to use PATCH method to change existing resources** + - Partial updates without full replacement + - PATCH support 
for users, projects, and instances
  - Merge semantics for nested objects
  - Validation of partial updates
  - **Status:** Partially implemented via the automatic PATCH handler

#### Logging and Monitoring

- [ ] **Proper logging with `--verbose` and `--quiet` modes**
  - Structured logging (JSON format)
  - Log levels: ERROR, WARN, INFO, DEBUG, TRACE
  - `--verbose` flag for detailed logs
  - `--quiet` flag for minimal logs
  - Request/response logging
  - Performance metrics logging
  - Integration with log aggregation systems

### Future Enhancements

#### Advanced Features

- [ ] **Bulk Operations**
  - Batch embedding upload
  - Bulk deletion
  - Transaction support for large operations

- [ ] **Advanced Search**
  - Combined vector + metadata filtering
  - Hybrid search (vector + keyword)
  - Multi-vector search
  - Weighted search results

- [ ] **Embeddings Management**
  - Update embeddings in place
  - Re-embedding workflows
  - Embedding versioning
  - Dimension conversion utilities

#### Performance

- [ ] **Query Optimization**
  - Query plan analysis
  - Index optimization
  - Materialized views for common queries
  - Database connection pooling improvements

- [ ] **Scaling**
  - Horizontal scaling support
  - Read replicas for query load
  - Partitioning strategies for large datasets
  - Distributed vector search

#### Security & Access Control

- [ ] **Fine-grained Permissions**
  - Custom roles beyond owner/editor/reader
  - Permission inheritance
  - Temporary access grants
  - IP-based access control

- [ ] **Audit Logging**
  - Track all API operations
  - User action history
  - Security event logging
  - Compliance reporting

- [ ] **OAuth/SAML Integration**
  - OAuth 2.0 authentication
  - SAML SSO support
  - Identity provider integration
  - External authentication services

#### Integration

- [ ] **Webhooks**
  - Event notifications for embedding changes
  - Project update notifications
  - Configurable webhook
endpoints
  - Retry logic and delivery guarantees

- [ ] **Export/Import**
  - Project export to standard formats
  - Bulk embedding export
  - Import from other vector databases
  - Migration utilities

- [ ] **SDK Support**
  - Python SDK
  - JavaScript/TypeScript SDK
  - Go SDK
  - CLI improvements

## Development Process

### Release Cycle

- **Minor versions** (0.x.0): New features, API additions
- **Patch versions** (0.0.x): Bug fixes, documentation updates
- **Major versions** (x.0.0): Breaking API changes (future)

### Feature Requests

To request a feature or suggest improvements:

1. Check existing issues on GitHub
2. Open a new issue with:
   - Clear description of the feature
   - Use cases and motivation
   - Proposed implementation (if any)
3. Engage in discussion with the maintainers

### Contribution Guidelines

Contributions are welcome! See the main repository for:
- Development setup instructions
- Code style guidelines
- Testing requirements
- Pull request process

## Version History

### v0.1.0 (2026-02-08)
- Many bug fixes
- Many new features
- API v1 still evolving toward stable

### v0.0.1 (2024-12-10)
- Initial public release
- API v1 (work in progress)
- Core functionality implemented
- Docker support
- Project sharing
- Metadata validation

## Feedback

We value your feedback! Please share:

- **Feature requests** - What would make dhamps-vdb more useful?
- **Bug reports** - Help us improve quality
- **Use cases** - How are you using dhamps-vdb?
- **Documentation** - What needs clarification?

Open an issue on GitHub or contact the maintainers directly.

## Related Documentation

- [Getting Started](../../getting-started/)
- [API Documentation](../../api/)
- [Deployment Guide](../../deployment/)
- [Reference - Configuration](configuration/)
- [Reference - Database Schema](database-schema/)
- [GitHub Repository](https://github.com/mpilhlt/dhamps-vdb)
diff --git a/docs/themes/book b/docs/themes/book
new file mode 160000
index 0000000..81a841c
--- /dev/null
+++ b/docs/themes/book
@@ -0,0 +1 @@
+Subproject commit 81a841c92d62f2ed8d9134b0b18623b8b2471661