Description
Problem
When envctl.sh restart re-extracts the gateway tarball while Node.js is still running inside the container, the update-nginx-locations periodic task (which runs every 5 seconds) can try to execute /opt/gateway/app/main.sh at a moment when extraction has temporarily removed it. The task then crashes:
Error: Error executing [/opt/gateway/app/main.sh -r bin/update-nginx-locations.sh]:
spawnSync /opt/gateway/app/main.sh ENOENT
The auto-restart loop in start-app-gateway.sh recovers after 3 seconds, but there's a brief outage window.
Root Cause
The reload script (trigger-reload.sh / envctl.sh restart) extracts the new tarball over the existing directory while the Node.js process is still running and executing scripts from that directory. The 5-second update-nginx-locations interval makes collisions frequent.
Proposed Fix
Stop Node.js before extracting the tarball. The reload sequence should be:
- Signal Node.js to stop gracefully
- Wait for process exit
- Extract new tarball
- Start Node.js
The reload script already handles this in part, but its timing may not be reliable.
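The four steps above can be sketched as follows. This is a minimal illustration, not the contents of trigger-reload.sh or envctl.sh: the function names (wait_for_exit, reload_gateway), the argument order, and the 10-second and 5-second timeouts are all assumptions, and the caller is assumed to know the Node.js PID and the command that starts it.

```shell
# Hypothetical helpers illustrating the proposed order: stop, wait, extract, start.

# Wait until PID has exited or TIMEOUT seconds have passed.
# Returns 0 once the process is gone, 1 on timeout.
wait_for_exit() {
  local pid="$1" timeout="${2:-10}" waited=0
  while kill -0 "$pid" 2>/dev/null; do
    if [ "$waited" -ge "$timeout" ]; then
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}

# Usage: reload_gateway PID TARBALL APP_DIR START_CMD [ARGS...]
reload_gateway() {
  local pid="$1" tarball="$2" app_dir="$3"
  shift 3

  # 1. Signal Node.js to stop gracefully.
  kill -TERM "$pid" 2>/dev/null || true

  # 2. Wait for process exit; escalate to SIGKILL if it does not stop in time
  #    (the timeouts are assumed values).
  if ! wait_for_exit "$pid" 10; then
    kill -KILL "$pid" 2>/dev/null || true
    wait_for_exit "$pid" 5 || return 1
  fi

  # 3. Extract the new tarball only after nothing is running from app_dir.
  mkdir -p "$app_dir"
  tar -xf "$tarball" -C "$app_dir" || return 1

  # 4. Start Node.js again with the caller-supplied command.
  "$@"
}
```

Because extraction happens only after the old process has exited, the periodic task cannot fire while /opt/gateway/app/main.sh is missing. The trade-off is that the gateway is down for the duration of the extraction rather than for the crash-and-restart window.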
Priority
Low — the auto-restart loop recovers, but during the ~3s outage window, active WebSocket connections drop and reconnect.
Files
- scripts/local-dev/envctl.sh: restart logic
- docker-images/backend-images/gateway/app/lib/start-app-gateway.sh: auto-restart loop
- docker-images/backend-images/gateway/app/lib/trigger-reload.sh: reload trigger