fix(session): propagate extractor failures to async task error#511
Merged
qin-ctx merged 5 commits intovolcengine:mainfrom Mar 12, 2026
Merged
Conversation
Follow-up to volcengine#472. When `wait=false`, background commit failures were silently lost — callers had no way to know if memory extraction succeeded. This adds a lightweight in-memory TaskTracker that returns a `task_id` on async commit, which callers can poll via new `/tasks` endpoints to check completion status, results, or errors. Key changes: - New TaskTracker singleton with TTL-based cleanup (24h completed, 7d failed) - New API: GET /api/v1/tasks/{task_id} and GET /api/v1/tasks (with filters) - Atomic duplicate commit detection (eliminates race condition) - Error message sanitization (keys/tokens redacted) - Defensive copies on all public reads (thread safety) - 35 tests (26 unit + 9 integration), all existing tests pass Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
qin-ctx
reviewed
Mar 10, 2026
|
|
||
| async def create_memory( | ||
| self, | ||
| candidate: CandidateMemory, |
Collaborator
There was a problem hiding this comment.
[Bug] extract_strict() 是 extract() 的完整复制粘贴(~115 行),仅在 except 块中有行为差异(raise RuntimeError vs return [])。
这意味着后续对 extract() 的任何 bug 修复或功能变更都必须同步到 extract_strict(),否则两者会产生行为分歧(silent divergence)。
建议通过在 extract() 中添加一个 strict 参数来消除重复:
async def extract(
self, context, user, session_id, *, strict: bool = False
) -> List[CandidateMemory]:
# ... existing logic ...
except Exception as e:
logger.error(f"Memory extraction failed: {e}")
if strict:
raise RuntimeError(f"memory_extraction_failed: {e}") from e
return []然后 extract_strict 可以只做一层委托,或直接在 compressor 中传 strict=True。
| return [] | ||
|
|
||
| candidates = await self.extractor.extract(context, user, session_id) | ||
| if strict_extract_errors: |
Collaborator
There was a problem hiding this comment.
[Suggestion] strict_extract_errors=True 时异常会直接从 extract_long_term_memories 抛出,调用方 commit_async()(session.py:338)没有 try/catch,而是依赖外层 task tracker 来捕获。
这个假设本身是合理的,但建议至少在此处或 commit_async 中加一行注释说明 "extraction errors intentionally propagate to caller (task tracker)",避免后续维护者误以为遗漏了异常处理而加上 try/catch 吞掉错误。
Contributor
Author
|
Addressed review feedback in
Thanks for the review. |
qin-ctx
approved these changes
Mar 12, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Background
In async session commit (
wait=false), extractor exceptions were swallowed byMemoryExtractor.extract()and converted to an empty list.This caused task tracking to incorrectly report:
status=completedmemories_extracted=0error=nullwhich made downstream monitoring (e.g. Slack alerting via task polling) unable to detect real extraction failures.
Changes
MemoryExtractor.extract_strict()extract()RuntimeError("memory_extraction_failed: ...")on extraction errorsSessionCompressor.extract_long_term_memories(..., strict_extract_errors=False)extract_strict()Session.commit_async()now passesstrict_extract_errors=Truetests/test_session_task_tracking.py::test_task_failed_when_memory_extraction_raisesfailedand exposes error messageValidation
Unit/integration tests
Ran:
Result: 40 passed
Local production verification
Verified on local running OV service before opening this PR:
claude-haiku-4-5-20251001) -> async task failed with error containingmemory_extraction_failedThis confirms failure signals now propagate to task API as expected.
Compatibility
extract()behavior is unchanged (still non-throwing).