Vector Database for the DH at Max Planck Society initiative
EmbAPI (/ɛmˈbɑːpeɪ/) ⚽ is a PostgreSQL-backed vector database with pgvector support, providing a RESTful API for managing embeddings in Retrieval Augmented Generation (RAG) workflows. Store embeddings for text snippets with metadata, then find similar content using cosine similarity search.
The typical use case is as a RAG component: create embeddings for your text collection, upload them with identifiers and optional metadata, then query for similar texts either by identifier (GET) or by posting raw embeddings (POST). The service returns text identifiers with similarity scores for use in your application.
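To make the similarity scores more concrete, here is a minimal pure-Python sketch of cosine similarity ranking. It is only an illustration of the idea, not EmbAPI's implementation (the service delegates vector search to pgvector). The function names, the default `threshold` and `limit` values, and the reading of `threshold` as a minimum score are assumptions made for this example.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


def most_similar(query, stored, threshold=0.5, limit=10):
    """Rank stored vectors (text_id -> vector) against a query vector.

    Hypothetical helper: keeps scores >= threshold, best match first.
    """
    scored = [(text_id, cosine_similarity(query, vec)) for text_id, vec in stored.items()]
    hits = [(text_id, score) for text_id, score in scored if score >= threshold]
    hits.sort(key=lambda pair: pair[1], reverse=True)
    return hits[:limit]


# Toy three-dimensional vectors; real embeddings have far more dimensions.
stored = {
    "doc1": [1.0, 0.0, 0.0],
    "doc2": [0.9, 0.1, 0.0],
    "doc3": [0.0, 1.0, 0.0],
}
hits = most_similar([1.0, 0.0, 0.0], stored, threshold=0.7, limit=5)
print([text_id for text_id, _ in hits])
```

With these toy vectors, `doc3` is orthogonal to the query and falls below the threshold, so only `doc1` and `doc2` are returned.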
- PostgreSQL with pgvector backend - Reliable, scalable vector storage
- RESTful API - OpenAPI-documented endpoints
- Docker deployment ready - Includes PostgreSQL with pgvector
- Comprehensive test coverage - Integration tests with testcontainers
- Multi-user support - Role-based access control (admin, owner, reader, editor)
- Project sharing - Collaborate with specific users
- Public access mode - Enable unauthenticated read access for projects
- Project ownership transfer - Transfer projects between users
- LLM service management - Service definitions and instances with encrypted API keys
- Multiple embedding configurations - Support for different dimensions
- Automatic dimension validation - Ensures vector consistency
- Flexible instance sharing - Share LLM service instances across users
- JSON Schema-based metadata validation - Enforce metadata structure
- Metadata filtering in similarity search - Exclude documents by metadata field values
- PATCH support - Partial updates for projects and embeddings
- Configurable thresholds - Control similarity search results
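
Two of the features above, dimension validation and metadata filtering, can be pictured with a short client-side sketch. This is a rough illustration under stated assumptions, not the server's actual logic: `validate_dimension` and `exclude_by_metadata` are hypothetical helpers, the filter only looks at top-level metadata fields, documents without the field are assumed to be kept, and the sample records are made up.

```python
def validate_dimension(vector, declared_dim, expected_dim):
    """Raise ValueError unless the vector matches the declared and expected size."""
    if declared_dim != expected_dim:
        raise ValueError(f"declared dimension {declared_dim} != expected {expected_dim}")
    if len(vector) != declared_dim:
        raise ValueError(f"vector has {len(vector)} values, declared {declared_dim}")


def exclude_by_metadata(candidates, path, value):
    """Drop candidates whose top-level metadata field `path` equals `value`."""
    return [c for c in candidates if c.get("metadata", {}).get(path) != value]


# Made-up candidate records, shaped like the metadata in the upload example.
candidates = [
    {"text_id": "doc2", "metadata": {"author": "John Doe", "year": 2024}},
    {"text_id": "doc3", "metadata": {"author": "Jane Roe", "year": 2023}},
    {"text_id": "doc4"},  # no metadata at all: kept under this sketch's assumption
]
kept = exclude_by_metadata(candidates, "author", "John Doe")
print([c["text_id"] for c in kept])

# A vector whose length disagrees with its declared dimension is rejected.
try:
    validate_dimension([0.1, 0.2, 0.3], declared_dim=3072, expected_dim=3072)
except ValueError as err:
    print("rejected:", err)
```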
# Automated setup (generates secure keys)
./docker-setup.sh
# Start services (includes PostgreSQL with pgvector)
docker-compose up -d
# Access the API documentation
curl http://localhost:8880/docs

# Create a user (requires the admin key)
curl -X POST http://localhost:8880/v1/users \
-H "Authorization: Bearer YOUR_ADMIN_KEY" \
-H "Content-Type: application/json" \
-d '{"user_handle": "alice", "name": "Alice Smith"}'
# Response includes: {"embapi_key": "alice_abc123..."}
# ⚠️ Save the embapi_key! It cannot be recovered.

# Use a system-provided definition (openai-large, openai-small, etc.)
curl -X PUT http://localhost:8880/v1/llm-instances/alice/my-openai \
-H "Authorization: Bearer ALICE_EMBAPI_KEY" \
-H "Content-Type: application/json" \
-d '{
"definition_owner": "_system",
"definition_handle": "openai-large",
"api_key_encrypted": "YOUR_OPENAI_API_KEY"
}'

# Create a project that uses this LLM service instance
curl -X POST http://localhost:8880/v1/projects/alice \
-H "Authorization: Bearer ALICE_EMBAPI_KEY" \
-H "Content-Type: application/json" \
-d '{
"project_handle": "my-texts",
"description": "My text embeddings",
"instance_owner": "alice",
"instance_handle": "my-openai"
}'

# Upload embeddings with text identifiers and optional metadata
curl -X POST http://localhost:8880/v1/embeddings/alice/my-texts \
-H "Authorization: Bearer ALICE_EMBAPI_KEY" \
-H "Content-Type: application/json" \
-d '{
"embeddings": [{
"text_id": "doc1",
"instance_handle": "my-openai",
"vector": [0.1, 0.2, 0.3, ...],
"vector_dim": 3072,
"metadata": {"author": "John Doe", "year": 2024}
}]
}'

# Get documents similar to doc1
curl "http://localhost:8880/v1/similars/alice/my-texts/doc1?threshold=0.7&limit=5" \
-H "Authorization: Bearer ALICE_EMBAPI_KEY"

# Exclude documents from the same author
curl "http://localhost:8880/v1/similars/alice/my-texts/doc1?threshold=0.7&metadata_path=author&metadata_value=John%20Doe" \
-H "Authorization: Bearer ALICE_EMBAPI_KEY"

- Installation Guide - Build from source
- Docker Guide - Detailed Docker deployment (also see DOCKER.md)
- Configuration - Environment variables and options
- Your First Project - Complete walkthrough
- API Reference - Complete API documentation
- Architecture - How EmbAPI works
- Users & Authentication - API keys and roles
- Projects - Organizing embeddings
- LLM Services - Definitions vs. instances
- Similarity Search - How it works
- RAG Workflow - End-to-end example
- Project Sharing - Collaborate with others
- Metadata Filtering - Exclude documents in search
- Metadata Validation - Enforce schemas
- Public Projects - Unauthenticated access
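
For readers who would rather script the quick-start calls than type curl commands, the following sketch builds two of the same requests with Python's standard library. It is a hedged illustration: `build_request` and `send` are hypothetical helpers, the base URL and placeholder keys are copied from the curl examples, and the assumption that every response body is JSON is not confirmed here. The requests are only constructed, not sent, so the snippet runs without a server.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8880"  # host and port as used in the curl examples


def build_request(method, path, api_key, body=None, query=None):
    """Build (but do not send) an authenticated request to the API."""
    url = BASE_URL + path
    if query:
        # quote_via=quote encodes spaces as %20, matching the curl example.
        url += "?" + urllib.parse.urlencode(query, quote_via=urllib.parse.quote)
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Authorization": f"Bearer {api_key}"}
    if data is not None:
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


def send(request):
    """Send a request and decode the reply (assumes a JSON response body)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Same payload as the user-creation curl call.
create_user = build_request(
    "POST", "/v1/users", "YOUR_ADMIN_KEY",
    body={"user_handle": "alice", "name": "Alice Smith"},
)

# Same query as the metadata-filtered similarity search.
search = build_request(
    "GET", "/v1/similars/alice/my-texts/doc1", "ALICE_EMBAPI_KEY",
    query={"threshold": 0.7, "metadata_path": "author", "metadata_value": "John Doe"},
)
print(search.full_url)
```

Once the services are running, `send(create_user)` or `send(search)` would issue the request. Replace the placeholder keys with real ones first.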
# Install dependencies and generate code
go get ./...
sqlc generate --no-remote
# Build
go build -o build/embapi main.go
# Or run directly
go run main.go

Tests use testcontainers for integration testing:
# Start container runtime (if using podman)
systemctl --user start podman.socket
export DOCKER_HOST=unix://$XDG_RUNTIME_DIR/podman/podman.sock
# Run tests
go test -v ./...

For more details, see the Testing Guide.
Contributions are welcome! Please see our Contributing Guide for details.
This project is licensed under the terms specified in the LICENSE file.