fix: merge ambient auto-resume feature to main (TASK-004)#9
fix: merge ambient auto-resume feature to main (TASK-004)#9tiwillia-ai-bot wants to merge 3 commits intoopenshift-online:mainfrom
Conversation
…eive messages When an Ambient session is stopped due to inactivity, agents cannot receive messages until manually restarted. This change automatically resumes stopped Ambient sessions during message delivery via SingleAgentCheckIn. **Changes:** - Modified SingleAgentCheckIn to detect missing Ambient sessions and auto-restart - Added logging for auto-resume events - Added comprehensive test coverage for auto-resume behavior - Tests verify Ambient-only restart (tmux sessions still skip as expected) Fixes TASK-013. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Add SupportsAutoResume() capability method to SessionBackend interface - Implement SupportsAutoResume() in AmbientSessionBackend (true) and TmuxSessionBackend (false) - Move auto-resume logic from tmux.go to lifecycle.go as maybeAutoResumeAgent helper - Update SingleAgentCheckIn to use capability interface instead of hardcoded backend name - Fix test docstrings to accurately describe test behavior - Add BackendType preservation assertion to TestAutoResumeAmbientSession - Rename TestAutoResumeFailureHandling to TestSingleAgentCheckInNonexistentAgent - Update mock backends to implement SupportsAutoResume() Addresses review feedback: - [IMPORTANT] Backend detection now uses capability interface vs hardcoded string - [IMPORTANT] Auto-resume logic moved from check-in layer to lifecycle layer - [SUGGESTION] Test comments now match actual test implementation - [SUGGESTION] Test verifies BackendType preservation after restart
Code Review: PR #9fix: merge ambient auto-resume feature to main (TASK-004) Review SummaryOverall: CONCERNS
This PR adds auto-resume functionality for Ambient sessions through a clean interface extension ( GeneralVerdict: APPROVE — Auto-resume feature is well-architected with clean interface extension, comprehensive testing, and proper backend capability gating. DetailsFindings
Positive
Tmux BackendVerdict: APPROVE — The auto-resume feature is properly gated to ambient-only, preserving tmux's existing skip behavior with no impact on tmux backend logic. DetailsFindings
Positive
Ambient BackendVerdict: CONCERNS — Auto-resume is architecturally sound for Ambient's cloud session model, but loses agent context (initial prompt, workflow config, repos) on restart. DetailsFindings
Positive
QualityVerdict: CONCERNS — Test coverage bypasses production code path; grandfathered file grows without extraction plan. DetailsFindings
Positive
|
Resolves all 7 concerns raised in PR review:
**Critical Issues (1-3):**
1. lifecycle: restartAgentService now uses req.InitialMessage and req.TaskID
- Previously passed empty spawnRequest{}, losing context
- Now delivers initial messages and assigns tasks as intended
2. auto_resume_test: test now exercises maybeAutoResumeAgent
- Previously called restartAgentService directly
- Now tests the actual auto-resume code path
3. lifecycle.go: reduced from 969 to 507 lines (target: <600)
- Extracted spawn/restart/stop service functions to lifecycle_service.go
- Kept lifecycle handlers in original file for clarity
**Fixes (4+7):**
4. lifecycle: fixed session ID handling in maybeAutoResumeAgent
- Return empty string on error instead of stale sessionID
- Prevents caller from using invalid session references
7. auto_resume_test: fixed fragile mock ID generation
- Changed from rune arithmetic to fmt.Sprintf
- Now works correctly for all counts (not just 0-9)
**Documentation (5+6):**
5. session_backend.go: documented SupportsAutoResume() breaking change
- Added BREAKING CHANGE note for external implementations
- Recommends returning false for backward compatibility
6. tmux.go: documented message delivery after auto-resume
- Clarified that ignition happens automatically
- Check-in message delivered separately after session ready
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
OpenDispatch PR #9 ReviewReview SummaryOverall: CHANGES REQUESTED
This PR adds auto-resume functionality for Ambient sessions and successfully refactors lifecycle.go to reduce file size. The auto-resume feature is properly gated via a capability interface ( GeneralVerdict: CHANGES REQUESTED — Breaking interface change blocks merge; error handling edge case needs correction. DetailsFindings
Positive
Tmux BackendVerdict: APPROVE — Changes correctly implement the interface contract and preserve existing tmux behavior while enabling auto-resume for ambient-only. DetailsFindings
Positive
Ambient BackendVerdict: CHANGES REQUESTED — Critical error handling gap and breaking interface change must be resolved before merge. DetailsFindings
Positive
QualityVerdict: CONCERNS — File size compliance achieved, but test coverage gaps and security implications warrant discussion before merge. DetailsFindings
Positive
|
|
Closing as redundant: auto-resume feature already exists in upstream/main. Evidence:
The feature is fully implemented in main branch with complete test coverage. This PR was created based on the incorrect assumption that upstream was missing the feature, when in reality only the fork (tiwillia/open-dispatch) was out of sync with upstream. TASK-004 complete - feature exists as intended in upstream. |
Addresses code-reviewer major concerns openshift-online#4 and openshift-online#9: Major openshift-online#4: Leader Election Lock Pattern - Fix ON CONFLICT WHERE clause to handle expired locks atomically - Add OR condition for same-instance re-acquisition - Use < operator instead of <= per spec - Pattern now matches spec: acquire when expired OR same instance Major openshift-online#9: Foreign Key Constraints - Add FK constraint: agent_check_in_configs → agents (ON DELETE CASCADE) - Add FK constraint: check_in_events → agent_check_in_configs (ON DELETE CASCADE) - Create migration SQL files in db/migrations/ - Update rollback and verification scripts Major openshift-online#7: Metric count cleared at 16 (per code-reviewer) Migration files: - check_in_constraints_migration.sql: 7 CHECK + 2 FK constraints - check_in_constraints_rollback.sql: Safe rollback with IF EXISTS - verify_check_in_indexes.sql: Verification queries for indexes, constraints, FKs Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Summary
Merges the ambient session auto-resume feature that was previously implemented but never merged to main.
When a message is sent to an agent whose ambient session has been stopped due to idle timeout, the system will now automatically resume that session.
Root Cause
The auto-resume feature was successfully implemented in commits 663a7af and 66a3391, merged in PR #1, but those commits only exist on feature branches and never made it to main. The main branch still has the old code that skips agents when sessions don't exist.
Changes
maybeAutoResumeAgent()function in lifecycle.goSingleAgentCheckIn()to call auto-resume logic before checking session existenceSupportsAutoResume()capability method to SessionBackend interfaceFiles changed: 7 files, +269 lines, -1 line
Testing
✅ All auto-resume tests passing
✅ Full coordinator test suite passing (10.892s)
TestAutoResumeAmbientSession- verifies ambient sessions auto-resumeTestAutoResumeOnlyForAmbient- verifies tmux sessions do NOT auto-resumeImplementation Details
When
SingleAgentCheckIn()is called for message delivery:SupportsAutoResume()restartAgentService()Fixes TASK-004
🤖 Generated by ambient-debugger (via CEO)