
Python Serverless Functions

Akshay B edited this page Mar 16, 2026 · 1 revision

Python Serverless Functions (Azure Functions)

This document tracks the Python backend implemented in PluckIt.Processor.

It covers shared runtime and setup behavior, and links to feature-level pages for endpoint contracts and per-feature behavior.

The page intentionally avoids duplicating domain-level behavior that is documented under each service-domain page.

Documentation metadata

  • Audience: external contributors
  • Last reviewed: 2026-03-16
  • Scope: Python runtime contracts and behavior only

Overview

  • Entry file: PluckIt.Processor/function_app.py
  • Runtime: Python (FUNCTIONS_WORKER_RUNTIME=python)
  • Stack: FastAPI application mounted into Azure Functions via func.AsgiFunctionApp
  • HTTP route prefix: /api
  • Health endpoint: GET /api/health
  • Notable characteristics:
    • FastAPI docs disabled (docs_url=None, redoc_url=None)
    • Chat route uses SSE (text/event-stream)
    • Optional telemetry integration via OpenTelemetry and Langfuse
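The SSE framing used by the chat route can be illustrated with a minimal, standard-library sketch of the text/event-stream wire format. The event names (token, done) and the payload shape are hypothetical and are not taken from PluckIt.Processor; only the framing rules (event: and data: fields, blank-line terminator) come from the SSE format itself.

```python
import json
from typing import Any, Iterable, Iterator, Optional


def format_sse_event(data: Any, event: Optional[str] = None) -> str:
    """Serialize one message in the text/event-stream wire format."""
    # Non-string payloads are JSON-encoded (an assumption for this sketch).
    payload = data if isinstance(data, str) else json.dumps(data)
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    # Every line of the payload needs its own "data:" field.
    for line in payload.split("\n"):
        lines.append(f"data: {line}")
    # A blank line terminates the event.
    return "\n".join(lines) + "\n\n"


def stream_tokens(tokens: Iterable[str]) -> Iterator[str]:
    """Yield one SSE frame per token, then a final hypothetical 'done' event."""
    for token in tokens:
        yield format_sse_event({"token": token}, event="token")
    yield format_sse_event("[DONE]", event="done")
```

A generator like stream_tokens is the kind of object a streaming HTTP response with media type text/event-stream would consume; how the actual chat route produces its frames is documented on its domain page.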

Setup checklist

Local onboarding

  • Install Python 3.12+, Azure Functions Core Tools 4.x, Azurite, and the local Cosmos DB emulator.
  • Create local settings from template:
    • cp PluckIt.Processor/local.settings.json.example PluckIt.Processor/local.settings.json
  • Configure local secrets (AZURE_OPENAI_*, metadata auth keys) in PluckIt.Processor/local.settings.json.
  • Set up venv + dependencies:
    • cd PluckIt.Processor
    • python3 -m venv .venv
    • source .venv/bin/activate (or .venv\Scripts\activate on Windows)
    • pip install -r requirements-prod.txt
    • pip install -r requirements-test.txt (for local unit tests)
  • Run service:
    • func start --port 7071
  • Confirm health:
    • curl http://localhost:7071/api/health
  • Run local unit tests:
    • python -m pytest tests/unit/ -v --tb=short -m unit

CI checks

  • python -m pytest tests/unit/ -v --tb=short -m unit --cov=. --cov-report=xml
  • Validate that .github/workflows/function-ci.yml completes the test stage before the deploy stage.

Host configuration entrypoint

  • PluckIt.Processor/function_app.py defines:
    • fastapi_app = FastAPI(...)
    • app = func.AsgiFunctionApp(app=fastapi_app, http_auth_level=func.AuthLevel.ANONYMOUS)
  • Non-HTTP triggers (queue + timer) are declared on the same app.

Authentication and authorization

  • Canonical token and local-auth behavior are documented in Authentication and Identity.
  • This page documents only the shared Python service contract (separate from .NET contracts).
  • Most functional routes depend on Depends(get_user_id).
  • Administrative operations are explicitly called out in the matching domain pages.

Public and protected route model

  • Public contract:
    • GET /api/health
  • Protected routes:
    • All other /api/* routes require a resolved user identity.
  • Admin routes:
    • POST /api/admin/* requires membership in ADMIN_USER_IDS.
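The three tiers above can be sketched as a single decision function. This is an illustrative model only, not the project's implementation: it assumes ADMIN_USER_IDS is a comma-separated list, and the status codes (401, 403, 404) and the function names are hypothetical choices for the sketch.

```python
from typing import FrozenSet, Optional

# The only public contract named on this page.
PUBLIC_ROUTES = frozenset({("GET", "/api/health")})


def parse_admin_ids(raw: Optional[str]) -> FrozenSet[str]:
    # Assumption: ADMIN_USER_IDS holds comma-separated user ids.
    return frozenset(p.strip() for p in (raw or "").split(",") if p.strip())


def authorize(
    method: str,
    path: str,
    user_id: Optional[str],
    admin_ids: FrozenSet[str],
) -> int:
    """Return the HTTP status this route model implies for a request."""
    method = method.upper()
    if (method, path) in PUBLIC_ROUTES:
        return 200  # public: no identity needed
    if not path.startswith("/api/"):
        return 404  # outside the /api route prefix
    if user_id is None:
        return 401  # protected: identity could not be resolved
    is_admin_route = method == "POST" and path.startswith("/api/admin/")
    if is_admin_route and user_id not in admin_ids:
        return 403  # admin: caller is not in ADMIN_USER_IDS
    return 200
```

In the real service the identity is resolved by Depends(get_user_id); here user_id stands in for that result, with None meaning no valid identity.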

Service domain contracts

Policy

  • Route ownership, behavior details, and endpoint-specific constraints are documented in the linked domain pages.
  • The route authentication model on each domain page should align with the shared token flow documented in Authentication and Identity.
  • Endpoint-level policy checks:
    • POST /api/admin/* routes require caller membership in ADMIN_USER_IDS.
    • GET /api/health is intentionally public.
  • Endpoint contracts are versioned implicitly through the domain pages in this backend area.

Configuration

Core runtime/dependencies

  • FUNCTIONS_WORKER_RUNTIME=python
  • Storage / blobs:
    • StorageQueue (or STORAGE_ACCOUNT_NAME + STORAGE_ACCOUNT_KEY)
    • ARCHIVE_CONTAINER_NAME
  • Cosmos:
    • COSMOS_DB_ENDPOINT
    • COSMOS_DB_KEY
    • COSMOS_DB_DATABASE
    • COSMOS_DB_TASTE_JOBS_CONTAINER
    • COSMOS_DB_TASTE_JOB_DEAD_LETTER_CONTAINER
  • AI / metadata:
    • AZURE_OPENAI_ENDPOINT
    • AZURE_OPENAI_API_KEY
    • AZURE_OPENAI_DEPLOYMENT (default: gpt-4.1-mini)
    • SEGMENTATION_ENDPOINT_URL
    • SEGMENTATION_SHARED_TOKEN
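A startup check over these settings might look like the sketch below. Which variables are strictly required is an assumption made for illustration (the list here is not confirmed by the codebase), and the helper names are hypothetical. The two facts taken from this page are the storage alternative (StorageQueue, or account name plus key) and the gpt-4.1-mini default for AZURE_OPENAI_DEPLOYMENT.

```python
from typing import List, Mapping

# Assumption: treated as required for this sketch only.
REQUIRED_SETTINGS = (
    "FUNCTIONS_WORKER_RUNTIME",
    "ARCHIVE_CONTAINER_NAME",
    "COSMOS_DB_ENDPOINT",
    "COSMOS_DB_KEY",
    "COSMOS_DB_DATABASE",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_API_KEY",
)

STORAGE_LABEL = "StorageQueue (or STORAGE_ACCOUNT_NAME + STORAGE_ACCOUNT_KEY)"


def missing_settings(env: Mapping[str, str]) -> List[str]:
    """List settings that are absent or empty in the given environment."""
    missing = [name for name in REQUIRED_SETTINGS if not env.get(name)]
    has_connection = bool(env.get("StorageQueue"))
    has_account = bool(
        env.get("STORAGE_ACCOUNT_NAME") and env.get("STORAGE_ACCOUNT_KEY")
    )
    # Either form of storage configuration is accepted.
    if not (has_connection or has_account):
        missing.append(STORAGE_LABEL)
    return missing


def openai_deployment(env: Mapping[str, str]) -> str:
    """Resolve the deployment name, falling back to the documented default."""
    return env.get("AZURE_OPENAI_DEPLOYMENT") or "gpt-4.1-mini"
```

Passing os.environ as env checks the live process; passing a plain dict keeps the check easy to unit test.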

Security and routing controls

  • ADMIN_USER_IDS
  • CORS_ALLOWED_ORIGINS
  • METADATA_EXTRACT_AUTH_MODE
  • METADATA_EXTRACT_API_KEY
  • METADATA_EXTRACT_AZURE_AD_AUDIENCE
  • METADATA_EXTRACT_AZURE_AD_ISSUER

Async and queue tuning

  • TASTE_JOB_QUEUE_NAME
  • TASTE_JOB_DEAD_LETTER_QUEUE_NAME
  • TASTE_JOB_DEDUPE_TTL_SECONDS
  • TASTE_JOB_COMPLETED_TTL_SECONDS
  • TASTE_JOB_MAX_RETRIES
  • TASTE_JOB_BASE_BACKOFF_SECONDS
  • TASTE_JOB_MAX_BACKOFF_SECONDS
  • TASTE_JOB_JITTER_SECONDS
  • TASTE_JOB_PROFILE_UPDATE_MAX_RETRIES
  • WEEKLY_DIGEST_MAX_CONCURRENCY
  • SCRAPER_MAX_CONCURRENCY
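To show how the retry-related settings could interact, here is a sketch of capped exponential backoff with additive jitter. The formula, the RetryPolicy class, and the fallback numbers in from_env are all assumptions for illustration; the actual defaults and retry algorithm live in the taste-job domain code and may differ.

```python
import random
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class RetryPolicy:
    max_retries: int
    base_backoff_seconds: float
    max_backoff_seconds: float
    jitter_seconds: float

    @classmethod
    def from_env(cls, env: Mapping[str, str]) -> "RetryPolicy":
        # Fallback values are placeholders, not the project's defaults.
        return cls(
            max_retries=int(env.get("TASTE_JOB_MAX_RETRIES", "5")),
            base_backoff_seconds=float(env.get("TASTE_JOB_BASE_BACKOFF_SECONDS", "2")),
            max_backoff_seconds=float(env.get("TASTE_JOB_MAX_BACKOFF_SECONDS", "60")),
            jitter_seconds=float(env.get("TASTE_JOB_JITTER_SECONDS", "1")),
        )

    def delay_for(
        self, attempt: int, rng: Optional[random.Random] = None
    ) -> Optional[float]:
        """Delay before retry `attempt` (1-based); None once retries are exhausted."""
        if attempt > self.max_retries:
            return None  # caller would route the job to the dead-letter queue
        source = rng or random
        # Double the base delay on each attempt, capped at the maximum.
        backoff = min(
            self.base_backoff_seconds * (2 ** (attempt - 1)),
            self.max_backoff_seconds,
        )
        # Additive jitter spreads out retries that fail at the same time.
        return backoff + source.uniform(0, self.jitter_seconds)
```

Accepting an optional random.Random instance keeps the jitter deterministic in tests while leaving production calls unchanged.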

Observability

  • OTEL_EXPORTER_OTLP_ENDPOINT
  • OTEL_EXPORTER_OTLP_TRACES_ENDPOINT
  • OTEL_EXPORTER_OTLP_METRICS_ENDPOINT
  • OTEL_EXPORTER_OTLP_LOGS_ENDPOINT
  • OTEL_SERVICE_NAME
  • OTEL_EXPORTER_OTLP_HEADERS
  • OTEL_EXPORTER_OTLP_PROTOCOL
  • OTEL_TRACES_EXPORTER
  • OTEL_METRICS_EXPORTER
  • OTEL_LOGS_EXPORTER
  • LANGFUSE_PUBLIC_KEY
  • LANGFUSE_SECRET_KEY
  • LANGFUSE_HOST
  • LANGFUSE_BASE_URL
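Because telemetry is optional, a service typically decides at startup whether to wire it up. The sketch below shows one way to make that decision from the variables above. The enablement rules and function names are assumptions, not the project's logic; the key=value,key=value shape of OTEL_EXPORTER_OTLP_HEADERS follows the OpenTelemetry exporter convention.

```python
from typing import Dict, Mapping, Optional

OTLP_ENDPOINT_VARS = (
    "OTEL_EXPORTER_OTLP_ENDPOINT",
    "OTEL_EXPORTER_OTLP_TRACES_ENDPOINT",
    "OTEL_EXPORTER_OTLP_METRICS_ENDPOINT",
    "OTEL_EXPORTER_OTLP_LOGS_ENDPOINT",
)


def parse_otlp_headers(raw: Optional[str]) -> Dict[str, str]:
    """Parse 'key1=value1,key2=value2' into a header dictionary."""
    headers: Dict[str, str] = {}
    for pair in (raw or "").split(","):
        # Split on the first '=' only, so values may contain '='.
        key, sep, value = pair.partition("=")
        if sep and key.strip():
            headers[key.strip()] = value.strip()
    return headers


def otel_enabled(env: Mapping[str, str]) -> bool:
    # Assumption: any configured OTLP endpoint turns export on.
    return any(env.get(name) for name in OTLP_ENDPOINT_VARS)


def langfuse_enabled(env: Mapping[str, str]) -> bool:
    # Assumption: both keys must be present for Langfuse to be used.
    return bool(env.get("LANGFUSE_PUBLIC_KEY") and env.get("LANGFUSE_SECRET_KEY"))
```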

Local settings

  • PluckIt.Processor/local.settings.json
  • PluckIt.Processor/local.settings.local.json
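For scripts that run outside func start, the settings files can be read directly. The sketch below relies on the standard Azure Functions local.settings.json layout, where settings sit under a top-level "Values" object. Treating local.settings.local.json as an overlay that wins on conflicts is an assumption; this page does not state the precedence between the two files.

```python
import json
from pathlib import Path
from typing import Dict


def read_values(path: Path) -> Dict[str, str]:
    """Return the "Values" section of a settings file, or {} if it is absent."""
    if not path.exists():
        return {}
    with path.open(encoding="utf-8") as handle:
        document = json.load(handle)
    return dict(document.get("Values", {}))


def load_local_settings(directory: Path) -> Dict[str, str]:
    """Merge both local settings files found in `directory`."""
    settings = read_values(directory / "local.settings.json")
    # Assumption: the .local.json file overrides the base file.
    settings.update(read_values(directory / "local.settings.local.json"))
    return settings
```

Calling load_local_settings(Path("PluckIt.Processor")) from the repository root returns a plain dictionary that can be passed to any helper expecting an environment mapping.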

External documentation policy

  • Keep variable names, but do not document concrete secret values or tenant-specific hostnames.
  • If auth behavior changes, update Authentication and Identity first, then synchronize this page.
