Platform for managing, deploying, monitoring, and optimizing the NEXUS multi-agent trading system. Built for enterprise AI operations with full lifecycle management from development to production.
| Category | Capabilities |
|---|---|
| Registry | Agent, Model, Prompt, Tool, Config registries with semantic versioning and dependency tracking |
| Deployment | Rolling, Blue-Green, Canary strategies with health checking and automatic rollback |
| Monitoring | Prometheus metrics, distributed tracing, SLO tracking, anomaly detection, cost analysis |
| Experimentation | A/B testing, multi-armed bandits, feature flags with statistical analysis |
| Human-in-the-Loop | Review queues, approval workflows, feedback collection, RLHF training |
| Security | JWT/API key auth, RBAC, Fernet encryption, secret scanning, compliance checking, audit logging |
| Performance | Query optimization, LRU caching, profiling, LLM cost optimization |
| CLI | Full-featured command-line interface with interactive setup wizard |
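
The Deployment row mentions canary releases with health checking and automatic rollback. A minimal sketch of that control loop is shown here, assuming a hypothetical `is_healthy` callback and hypothetical traffic percentages; it illustrates the idea only and is not the platform's actual deployment code.

```python
from typing import Callable, Sequence, Tuple


def canary_rollout(
    traffic_steps: Sequence[int],
    is_healthy: Callable[[int], bool],
) -> Tuple[str, int]:
    """Raise the canary's traffic share step by step.

    Returns ("promoted", final_percent) if every health check passes,
    or ("rolled_back", 0) as soon as one check fails.
    """
    for percent in traffic_steps:
        # Hypothetical hook: after routing `percent`% of traffic to the new
        # version, ask the health checker whether it is still healthy.
        if not is_healthy(percent):
            return ("rolled_back", 0)
    final = traffic_steps[-1] if traffic_steps else 0
    return ("promoted", final)


# A version that degrades once it receives half of the traffic.
print(canary_rollout([5, 25, 50, 100], lambda p: p < 50))  # ('rolled_back', 0)
# A version that stays healthy at every step.
print(canary_rollout([5, 25, 50, 100], lambda p: True))    # ('promoted', 100)
```

Keeping the health check as an injected callback means the same loop can be exercised in tests without touching real infrastructure.
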
# Install
pip install poetry
poetry install
# Interactive setup (configures DB, Redis, S3, LLM providers)
nexusops setup
# Or start with Docker
docker compose -f docker/docker-compose.dev.yml up -d
# Run migrations
nexusops admin db-migrate
# Start API server
uvicorn nexusops.api.main:app --reload --port 8000

# Agent management
nexusops agent register my-agent --type research --version 1.0.0
nexusops agent list
# Deployment
nexusops deploy create my-agent production --strategy blue-green --replicas 3
nexusops deploy status <deployment-id>
# Monitoring
nexusops monitor status
nexusops monitor alerts --severity critical
# Experiments
nexusops experiment create "test-name" --variant control:50 --variant treatment:50 --metric accuracy --duration 14
# Administration
nexusops admin health
nexusops admin security-scan
nexusops admin backup

┌─────────────────────────────────────────────────────────┐
│                     CLI / REST API                      │
├──────────┬──────────┬──────────┬──────────┬─────────────┤
│ Registry │Deployment│Monitoring│  Exper-  │  Security   │
│ Service  │ Pipeline │  Stack   │imentation│   Module    │
├──────────┴──────────┴──────────┴──────────┴─────────────┤
│                Core (Auth, RBAC, Config)                │
├──────────┬──────────┬──────────┬────────────────────────┤
│PostgreSQL│  Redis   │    S3    │         Vault          │
└──────────┴──────────┴──────────┴────────────────────────┘
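
The Core layer in the diagram provides RBAC to every service above it. The sketch below shows one common way such a check can work; the role names, permission strings, and the `is_allowed` helper are all hypothetical and are not taken from the nexusops codebase.

```python
# Hypothetical role-to-permission mapping; the real roles are not listed in this README.
ROLE_PERMISSIONS = {
    "viewer": {"agent:read", "deployment:read"},
    "operator": {"agent:read", "deployment:read", "deployment:create"},
    "admin": {"*"},  # wildcard grants every permission
}


def is_allowed(roles, permission):
    """Return True if any of the caller's roles grants the permission."""
    for role in roles:
        granted = ROLE_PERMISSIONS.get(role, set())  # unknown roles grant nothing
        if "*" in granted or permission in granted:
            return True
    return False


print(is_allowed(["operator"], "deployment:create"))  # True
print(is_allowed(["viewer"], "deployment:create"))    # False
```
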
nexusops/
├── api/ # FastAPI app, routes, middleware
├── cli/ # Click CLI commands and utilities
├── core/ # Auth, RBAC, config, encryption
├── deployment/ # Deployment strategies, health checking
├── experimentation/ # A/B testing, bandits, HITL, RLHF
├── monitoring/ # Metrics, alerts, SLOs, cost tracking
├── performance/ # Profiling, caching, query optimization
├── pipeline/ # Build pipeline, artifact management
├── registry/ # Agent, Model, Prompt registries
├── scheduler/ # Job scheduling, worker pools
├── security/ # Scanning, compliance, audit
├── services/ # Business logic layer
└── storage/ # PostgreSQL, Redis, S3, Vault
docs/
├── architecture/ # System design and data flows
├── api/ # REST API, CLI, SDK reference
├── guides/ # Quickstart, deployment, monitoring
├── runbooks/ # Incident response, DR, scaling
└── tutorials/ # Step-by-step walkthroughs
| Component | Technology |
|---|---|
| Language | Python 3.11+ |
| API Framework | FastAPI |
| CLI | Click + Rich |
| ORM | SQLAlchemy 2.0 (async) |
| Database | PostgreSQL 15 (asyncpg) |
| Cache | Redis 7 |
| Object Store | S3 / MinIO |
| Secrets | HashiCorp Vault |
| Auth | JWT + API Keys + RBAC |
| Encryption | Fernet (cryptography) |
| Validation | Pydantic v2 |
| Monitoring | Prometheus + Langfuse |
| Containers | Docker + Kubernetes |
| CI/CD | GitHub Actions |
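
The Auth row lists API keys alongside JWT. As an illustration of how API keys are commonly handled, the sketch below stores only a SHA-256 digest of each key and verifies it with a constant-time comparison. The function names are hypothetical, and the README does not state that the platform stores its keys this way.

```python
import hashlib
import hmac
import secrets


def generate_api_key():
    """Return (plaintext_key, stored_digest); only the digest should be persisted."""
    key = secrets.token_urlsafe(32)  # random, URL-safe key built from 32 random bytes
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return key, digest


def verify_api_key(presented, stored_digest):
    """Hash the presented key and compare it to the stored digest in constant time."""
    candidate = hashlib.sha256(presented.encode("utf-8")).hexdigest()
    return hmac.compare_digest(candidate, stored_digest)


key, digest = generate_api_key()
print(verify_api_key(key, digest))        # True
print(verify_api_key(key + "x", digest))  # False
```

An unsalted hash is acceptable here only because the key is long and random; this would not be appropriate for user-chosen passwords.
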
make dev # Install dev dependencies
make lint # Ruff linter
make format # Black formatter
make typecheck # mypy strict mode
make test # All tests
make test-cov # Tests with coverage
make security-scan # Security scanner
make health # Platform health check

- Architecture: Overview · Components · Data Flow
- API Reference: REST API · CLI Reference · Python SDK
- Guides: Quickstart · Deployment · Monitoring · Experimentation · Troubleshooting
- Runbooks: Incident Response · Disaster Recovery · Scaling · Backup & Restore
- Tutorials: First Agent · Experiments · Monitoring · HITL Workflow
Proprietary - All rights reserved.