An implementation of the OpenAI API interface for building compatible LLM API services.
- Drop-in compatible with existing OpenAI API clients
- Supports the Chat Completions API
- Streaming support (Server-Sent Events)
- OpenAI-compatible authentication
- Python 3.13+
- uv for dependency management
Create a `.env` file in the root directory with:

```env
# Environment (development, test, staging, production)
# Valid values: development, test, staging, production
# Default: development
ENVIRONMENT=development

# Authentication
# Comma-separated list of valid API keys (minimum 16 characters each)
# REQUIRED FOR PRODUCTION/STAGING - application will fail to start if missing or invalid.
# Auto-generated for 'development' (dev_ prefix) or 'test' (test_ prefix) if not provided.
API_AUTH_TOKENS=your-production-token-12345678901234567890,another-token-12345678901234567890

# Server Configuration
# REQUIRED FOR PRODUCTION/STAGING
HOST=0.0.0.0
PORT=8000
LOG_LEVEL=INFO

# CORS Settings (comma-separated domains, optional)
# CORS_ORIGINS=http://localhost:3000,https://example.com

# Rate Limiting
# REQUIRED FOR PRODUCTION/STAGING
RATE_LIMIT_PER_MINUTE=10
```
In production-like environments (`ENVIRONMENT=production` or `ENVIRONMENT=staging`), the application has strict configuration requirements enforced by Pydantic validation:

- The following environment variables are required, and the application will fail to start if any are missing or invalid (e.g., `PORT` not an integer):
  - `API_AUTH_TOKENS`: must contain at least one token, each with a minimum of 16 characters.
  - `HOST`: server host address.
  - `PORT`: server port (must be a valid integer).
  - `LOG_LEVEL`: logging level (e.g., DEBUG, INFO, WARNING, ERROR, CRITICAL).
  - `RATE_LIMIT_PER_MINUTE`: request rate limit per minute (must be a valid integer).
- Project metadata (`project_name` and `project_version`) must be available from `pyproject.toml`. The application will fail to start in production/staging if either cannot be read from `pyproject.toml` or is empty.
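The production/staging rules above can be sketched as a plain-Python check. This is illustrative only: the real application enforces these rules through its Pydantic settings model, whose exact field names and error messages may differ.

```python
REQUIRED_PROD_VARS = ["API_AUTH_TOKENS", "HOST", "PORT", "LOG_LEVEL", "RATE_LIMIT_PER_MINUTE"]


def validate_production_config(env: dict) -> list[str]:
    """Return a list of configuration errors for a production/staging env."""
    errors = []
    for name in REQUIRED_PROD_VARS:
        if not env.get(name):
            errors.append(f"{name} is required in production/staging")

    # At least one token, each at least 16 characters long.
    tokens = [t for t in env.get("API_AUTH_TOKENS", "").split(",") if t]
    if not tokens:
        errors.append("API_AUTH_TOKENS must contain at least one token")
    elif any(len(t) < 16 for t in tokens):
        errors.append("each token must be at least 16 characters")

    # PORT and RATE_LIMIT_PER_MINUTE must parse as integers.
    for name in ("PORT", "RATE_LIMIT_PER_MINUTE"):
        if env.get(name) and not env[name].isdigit():
            errors.append(f"{name} must be an integer")
    return errors
```

An empty error list means the configuration would pass; in the actual application, any error here raises at startup.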
In `development` and `test` environments, the application is more permissive:

- Missing or invalid values for `HOST`, `PORT`, `LOG_LEVEL`, or `RATE_LIMIT_PER_MINUTE` will still cause Pydantic validation errors and halt startup, but the error messages will indicate they are non-critical for these environments.
- If `API_AUTH_TOKENS` is not provided or is empty:
  - In `development` mode, a secure random token prefixed with `dev_` is automatically generated and used. A warning is logged.
  - In `test` mode, a fixed token `test_key` is used. A warning is logged.
- If tokens are provided but are too short (less than 16 characters), a warning is logged, but the application will proceed with those tokens.
The application automatically reads the project name and version exclusively from the `pyproject.toml` file.
- In development mode: if no tokens are provided via `API_AUTH_TOKENS`, a secure random token with the prefix `dev_` is generated and a warning is logged.
- In test mode: if no tokens are provided, a consistent token `test_key` with the prefix `test_` is used and a warning is logged.
- In production/staging mode:
  - You MUST provide strong API tokens (minimum 16 characters each) via the `API_AUTH_TOKENS` environment variable.
  - The application will fail to start if tokens are missing or do not meet the length requirement.
  - All tokens should be randomly generated, unique, and kept confidential.
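The fallback behavior described above can be sketched like this (a hypothetical helper for illustration, not the project's actual function name):

```python
import secrets


def generate_fallback_token(environment: str) -> str:
    """Return the auto-generated token the app would use when none is configured."""
    if environment == "development":
        # A secure random token with the documented "dev_" prefix.
        return "dev_" + secrets.token_urlsafe(24)
    if environment == "test":
        # A fixed, predictable token for test runs.
        return "test_key"
    # Production/staging never fall back; tokens must be provided.
    raise ValueError("no fallback tokens in production/staging")
```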
- Create and activate a virtual environment:

  ```bash
  uv venv
  source .venv/bin/activate  # On Windows: .venv\Scripts\activate
  ```

- Install dependencies:

  ```bash
  uv pip install -e ".[dev]"
  ```

- Run the API server:

  ```bash
  uvicorn openai_api_blueprint.main:app --reload
  ```
The API will be available at http://127.0.0.1:8000. Visit http://127.0.0.1:8000/docs for interactive documentation.
To run with Docker:

```bash
# Use BuildKit for faster builds with caching
export DOCKER_BUILDKIT=1

# Build the image
docker build -t openai-api-blueprint .

# Run with proper health checks and as a non-root user
docker run -p 8000:8000 --env-file .env openai-api-blueprint
```
With Docker Compose (recommended for development):

```bash
# Enable BuildKit
export DOCKER_BUILDKIT=1
export COMPOSE_DOCKER_CLI_BUILD=1

# Start the service
docker compose up
```

For modern Docker Compose development with watch mode (live reloading):

```bash
docker compose watch
```
Run tests with pytest:

```bash
pytest
```

For only service tests:

```bash
pytest tests/services
```

For only API tests:

```bash
pytest tests/api
```
```bash
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-production-token-12345678901234567890" \
  -d '{
    "model": "blueprint-standard",
    "messages": [
      {
        "role": "user",
        "content": "Hello!"
      }
    ]
  }'
```
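The same request can be constructed from Python with only the standard library. This is a sketch: `build_chat_request` is a hypothetical helper, and the token is the placeholder from the `.env` example above.

```python
import json
import urllib.request


def build_chat_request(base_url: str, token: str, model: str,
                       messages: list[dict]) -> urllib.request.Request:
    """Build (but do not send) the same request as the curl example above."""
    body = json.dumps({"model": model, "messages": messages}).encode()
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# Send with: urllib.request.urlopen(build_chat_request(...))
req = build_chat_request(
    "http://localhost:8000",
    "your-production-token-12345678901234567890",
    "blueprint-standard",
    [{"role": "user", "content": "Hello!"}],
)
```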
```bash
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-production-token-12345678901234567890" \
  -d '{
    "model": "blueprint-standard",
    "messages": [
      {
        "role": "user",
        "content": "Hello!"
      }
    ],
    "stream": true
  }'
```
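Streaming responses arrive as Server-Sent Events: one JSON chunk per `data:` line, terminated by a `data: [DONE]` sentinel, following the standard OpenAI streaming format. A minimal client-side parser might look like this (the field names inside each chunk are assumed to follow the OpenAI chat-completion chunk schema):

```python
import json


def parse_sse_stream(raw: str) -> list[dict]:
    """Collect the JSON chunks from a raw SSE response body."""
    chunks = []
    for line in raw.splitlines():
        if not line.startswith("data: "):
            continue  # skip blank separator lines
        payload = line[len("data: "):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunks.append(json.loads(payload))
    return chunks
```

Concatenating each chunk's `choices[0].delta.content` reassembles the full completion text.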
To use this with actual LLM backends, modify the service implementations in `src/openai_api_blueprint/services/`.