Description
Problem
When a gateway container is restarted via envctl.sh restart, the gateway process receives SIGTERM, triggers graceful shutdown, and notifies Ganymede to deallocate (POST /gateway/stop). This sets ended_at on the allocation row. When the gateway comes back up, it's idle — the user must trigger a fresh allocation from the frontend, causing an unnecessary cold-start delay.
Current Behavior
envctl.sh restart
→ trigger-reload.sh sends SIGTERM to Node.js
→ shutdownGateway() runs (gateway-init.ts:383)
→ POST /gateway/stop to Ganymede
→ proc_organizations_gateways_stop() sets ended_at
→ Gateway allocation ENDED
→ Node.js restarts
→ POST /gateway/config returns 404 (no active allocation)
→ Gateway comes up IDLE
→ User must reload page → POST /gateway/start → full cold-start pipeline
Expected Behavior (for restart)
The gateway should come back up and reclaim its previous allocation, avoiding the cold-start pipeline for the user.
Design Considerations
The gateway currently can't distinguish between:
- Permanent stop — "I'm being decommissioned, deallocate me"
- Restart/reload — "I'm restarting, keep my allocation"
Option A: Flag file to skip deallocation
start-app-gateway.sh already uses /tmp/gateway-reloading to detect reload vs crash. The shutdownGateway() function could check this flag and skip POST /gateway/stop when it's a reload.
// In shutdownGateway():
if (fs.existsSync('/tmp/gateway-reloading')) {
  log('Skipping deallocation (reload in progress)');
} else {
  await ganymedeClient.request({ url: '/gateway/stop', ... });
}
Option B: Ganymede-side grace period
Instead of immediately ending the allocation, Ganymede could keep the allocation alive for N seconds after /gateway/stop. If the same gateway calls /gateway/ready within that window, the allocation is restored.
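A rough sketch of the grace-period idea, modeled in memory rather than against the real database. The `Allocation` shape, the `AllocationStore` class, the `pendingStopAt` field, and the 30-second window are all assumptions made for illustration; the actual schema and `proc_organizations_gateways_stop()` logic are not shown in this issue.

```typescript
// Hypothetical in-memory stand-in for the allocation table.
interface Allocation {
  gatewayId: string;
  endedAt: number | null;       // epoch ms; null while the allocation is active
  pendingStopAt: number | null; // epoch ms when /gateway/stop was received (assumed new field)
}

// Assumed value for "N seconds"; the issue does not specify one.
const GRACE_PERIOD_MS = 30_000;

class AllocationStore {
  private rows = new Map<string, Allocation>();

  // Models POST /gateway/start: create an active allocation.
  start(gatewayId: string): void {
    this.rows.set(gatewayId, { gatewayId, endedAt: null, pendingStopAt: null });
  }

  // Models POST /gateway/stop: record the request instead of setting ended_at right away.
  requestStop(gatewayId: string, now: number): void {
    const row = this.rows.get(gatewayId);
    if (row && row.endedAt === null) {
      row.pendingStopAt = now;
    }
  }

  // Models POST /gateway/ready: cancel the pending stop if the window is still open.
  // Returns true when the allocation is (still) active for this gateway.
  ready(gatewayId: string, now: number): boolean {
    this.expire(now);
    const row = this.rows.get(gatewayId);
    if (!row || row.endedAt !== null) {
      return false;
    }
    row.pendingStopAt = null;
    return true;
  }

  // Finalizes every pending stop whose grace period has elapsed.
  expire(now: number): void {
    this.rows.forEach((row) => {
      if (
        row.endedAt === null &&
        row.pendingStopAt !== null &&
        now - row.pendingStopAt >= GRACE_PERIOD_MS
      ) {
        row.endedAt = row.pendingStopAt + GRACE_PERIOD_MS;
        row.pendingStopAt = null;
      }
    });
  }
}
```

One trade-off of this approach is that a genuinely decommissioned gateway keeps its allocation for the full window, so the value of N bounds how stale the allocation table can be.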
Option C: Separate stop vs restart signals
Add a /gateway/restart endpoint that preserves the allocation, distinct from /gateway/stop which ends it. The gateway calls the appropriate one based on shutdown context.
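On the gateway side, Option C could look roughly like the sketch below. The `ShutdownReason` type, the `GatewayClient` interface, and both function names are hypothetical, and `/gateway/restart` is the endpoint this option proposes rather than one that exists today. How the gateway learns the reason (for example from the `/tmp/gateway-reloading` flag) is left to the caller.

```typescript
// Hypothetical shutdown context; the gateway would need some way to determine this.
type ShutdownReason = "stop" | "restart";

// Maps the shutdown context to the endpoint to notify.
// "/gateway/restart" is the proposed endpoint and does not exist yet.
function shutdownEndpoint(reason: ShutdownReason): string {
  return reason === "restart" ? "/gateway/restart" : "/gateway/stop";
}

// Minimal assumed client shape; the real ganymedeClient.request options are not shown in full.
interface GatewayClient {
  request(opts: { url: string; method: "POST" }): Promise<void>;
}

// Notifies Ganymede using the endpoint that matches the shutdown context
// and returns the URL that was called.
async function notifyShutdown(
  client: GatewayClient,
  reason: ShutdownReason
): Promise<string> {
  const url = shutdownEndpoint(reason);
  await client.request({ url, method: "POST" });
  return url;
}
```

Keeping the endpoint choice in a small pure function makes the stop/restart distinction easy to unit-test without a live Ganymede instance.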
Option D: Kill without graceful shutdown on restart
envctl.sh restart could use SIGKILL instead of SIGTERM for restarts, skipping shutdownGateway() entirely. The allocation would remain active in the DB, and the restarted gateway would reclaim it via /gateway/config.
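Because SIGKILL cannot be caught by the process, any other cleanup inside shutdownGateway() would be skipped as well, so this option relies entirely on the startup path. A minimal sketch of that startup decision follows; the `StartupState` type and `stateFromConfigStatus` function are hypothetical, and treating any 2xx response as "allocation still active" is an assumption, since the issue only documents the 404 case.

```typescript
// Hypothetical startup states for the gateway after a restart.
type StartupState = "reclaimed" | "idle";

// Interprets the HTTP status of POST /gateway/config at startup.
// 404 means no active allocation (per the issue); 2xx is assumed to mean
// the previous allocation is still active and can be reclaimed.
function stateFromConfigStatus(status: number): StartupState {
  if (status === 404) {
    return "idle";
  }
  if (status >= 200 && status < 300) {
    return "reclaimed";
  }
  // Anything else is unexpected here; surface it rather than guessing.
  throw new Error(`Unexpected status ${status} from /gateway/config`);
}
```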
Files
- packages/app-gateway/src/main.ts:168-194 — SIGTERM handler calls shutdownGateway()
- packages/app-gateway/src/initialization/gateway-init.ts:383-437 — shutdownGateway() sends POST /gateway/stop
- packages/app-ganymede/src/routes/gateway/index.ts:398-445 — /gateway/stop handler sets ended_at
- docker-images/backend-images/gateway/app/lib/start-app-gateway.sh — /tmp/gateway-reloading flag check
- scripts/local-dev/envctl.sh — restart logic