Context:
To meet the 99.9% uptime requirement, we need comprehensive health monitoring of all critical services (database, Redis, Stellar network).
Problem:
Implement health check endpoints that verify all external dependencies are operational.
What Done Looks Like:
- /health endpoint returning overall status
- /health/stellar checking Horizon connectivity
- /health/soroban checking Soroban RPC
- /health/db checking PostgreSQL
- /health/cache checking Redis
- Prometheus metrics endpoint
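As a rough illustration of what the metrics endpoint could serve, the sketch below renders probe results in the Prometheus text exposition format. The metric names (health_dependency_up, health_dependency_latency_seconds), the DependencyState shape, and the renderMetrics helper are all hypothetical; the issue does not specify them, and a real implementation would more likely use an existing Prometheus client library.

```typescript
// Hypothetical sketch: metric names and types are assumptions, not part of the issue.
type DependencyState = { up: boolean; latencyMs: number };

// Escape backslash, double quote, and newline as required for label values.
function escapeLabel(value: string): string {
  return value.replace(/\\/g, "\\\\").replace(/"/g, '\\"').replace(/\n/g, "\\n");
}

// Render one gauge per dependency for availability and for probe latency.
function renderMetrics(states: Record<string, DependencyState>): string {
  const lines: string[] = [
    "# HELP health_dependency_up Whether the last probe succeeded (1 = up, 0 = down).",
    "# TYPE health_dependency_up gauge",
  ];
  for (const [name, state] of Object.entries(states)) {
    lines.push(`health_dependency_up{dependency="${escapeLabel(name)}"} ${state.up ? 1 : 0}`);
  }
  lines.push(
    "# HELP health_dependency_latency_seconds Duration of the last probe in seconds.",
    "# TYPE health_dependency_latency_seconds gauge",
  );
  for (const [name, state] of Object.entries(states)) {
    const seconds = state.latencyMs / 1000;
    lines.push(`health_dependency_latency_seconds{dependency="${escapeLabel(name)}"} ${seconds}`);
  }
  // The exposition format expects a trailing newline after the last sample.
  return lines.join("\n") + "\n";
}
```

The body would be served with a text/plain content type; latency is reported in seconds here only because that is the usual Prometheus base unit.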
Folder Structure:
src/
├── health/
│   ├── health.controller.ts
│   ├── health.module.ts
│   └── indicators/
│       ├── stellar.health.ts
│       ├── soroban.health.ts
│       └── database.health.ts
Implementation Guidelines:
- Use @nestjs/terminus for health checks
- Return proper HTTP status codes (200 for healthy, 503 for unhealthy)
- Include response time metrics
- Test actual connectivity, not just service existence
- Add custom health indicators for Stellar-specific checks
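The guidelines on measuring response time and testing real connectivity can be sketched as a small probe wrapper. This is a framework-agnostic sketch, not the Terminus API: in the actual module the same logic would presumably sit inside a custom health indicator. The names probe, withTimeout, and ProbeResult are hypothetical, and the check callback stands in for a real round trip (for example a SELECT 1, a Redis PING, or a request to Horizon).

```typescript
// Hypothetical sketch: names and the 1000 ms default are assumptions.
type ProbeResult = { status: "up" | "down"; latency: string; message?: string };

// Reject if `work` does not settle within `timeoutMs`; always clear the timer.
function withTimeout<T>(work: Promise<T>, timeoutMs: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`timed out after ${timeoutMs}ms`)),
      timeoutMs,
    );
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

// Run a real connectivity check and report status plus measured latency.
async function probe(
  check: () => Promise<unknown>,
  timeoutMs = 1000,
): Promise<ProbeResult> {
  const started = Date.now();
  try {
    await withTimeout(check(), timeoutMs);
    return { status: "up", latency: `${Date.now() - started}ms` };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { status: "down", latency: `${Date.now() - started}ms`, message };
  }
}
```

Bounding each probe with a timeout keeps a hung dependency from pushing the whole health response past the one-second budget; the probe reports "down" instead of waiting indefinitely.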
Response Format:
{
  "status": "ok",
  "info": {
    "database": { "status": "up" },
    "redis": { "status": "up" },
    "stellar": { "status": "up", "latency": "45ms" },
    "soroban": { "status": "up", "latency": "120ms" }
  }
}
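One way to combine individual indicator results into this response shape and pick the HTTP status code is sketched below. The buildHealthResponse helper and its types are hypothetical. The "error" map for failing dependencies is an assumption, since the format above only shows "status" and "info" for the healthy case.

```typescript
// Hypothetical sketch: helper name, types, and the "error" map are assumptions.
type IndicatorResult = { status: "up" | "down"; latency?: string; message?: string };

interface HealthResponse {
  status: "ok" | "error";
  info: Record<string, IndicatorResult>;  // dependencies that are up
  error: Record<string, IndicatorResult>; // dependencies that are down
}

function buildHealthResponse(
  results: Record<string, IndicatorResult>,
): { httpStatus: 200 | 503; body: HealthResponse } {
  const info: Record<string, IndicatorResult> = {};
  const error: Record<string, IndicatorResult> = {};
  for (const [name, result] of Object.entries(results)) {
    if (result.status === "up") {
      info[name] = result;
    } else {
      error[name] = result;
    }
  }
  // A single failing dependency makes the overall status unhealthy.
  const healthy = Object.keys(error).length === 0;
  return {
    httpStatus: healthy ? 200 : 503,
    body: { status: healthy ? "ok" : "error", info, error },
  };
}
```

With all four dependencies up, the body matches the example above apart from an empty "error" object, and the status code is 200; marking any one of them "down" switches the result to 503.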
Validation:
- All health checks pass in healthy state
- Returns 503 when a checked service is down
- Health endpoints respond in under 1 second