feat: agent team improvements #1

Merged
hericlesferraz merged 9 commits into main from feat/agent-team-improvements on Apr 4, 2026

Conversation

@hericlesferraz
Owner

No description provided.

hericlesferraz and others added 3 commits April 4, 2026 12:57
…stency

Address findings from parallel agent team audit across 4 areas:

Security: require explicit JWT secret (no default), fail-closed injection
detection, LLM injection check on streaming endpoint, sanitize DocVaultError
responses, remove infra details from public health endpoint.
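A minimal sketch of what "fail-closed" means for the injection check, assuming a hypothetical `detector` callable (not taken from this PR) that returns True when it flags the input. If the detector itself fails, the request is rejected rather than passed through unchecked:

```python
from typing import Callable


def is_request_allowed(text: str, detector: Callable[[str], bool]) -> bool:
    """Return True only when the detector ran and reported no injection.

    `detector` is a hypothetical callable used for illustration; it returns
    True when it believes `text` contains a prompt injection.
    """
    try:
        flagged = detector(text)
    except Exception:
        # Fail closed: a broken or unreachable detector blocks the request
        # instead of letting it through unchecked.
        return False
    return not flagged
```

A fail-open variant would return True in the `except` branch, which is the behavior this change moves away from.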

RBAC: add auth to 19 unprotected endpoints (tags, versions, tasks, feedback,
observability), fix session ownership check in sharing.

RAG: fix BM25 multi-tenant leak (user_id filtering), async CrossEncoder via
to_thread, selective cache invalidation per document, eliminate double
embedding on cache miss, clamp scores before averaging, compute grounding
score from verdicts.
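Two of the RAG items above can be illustrated with a short sketch. The `predict` callable stands in for a blocking reranker call and the helper names are hypothetical; the clamp range of [0, 1] is also an assumption, since the commit does not state it:

```python
import asyncio
from typing import Callable, Sequence


def clamp01(score: float) -> float:
    """Clamp a score into [0, 1] (assumed range) so outliers cannot skew a mean."""
    return max(0.0, min(1.0, score))


def mean_clamped(scores: Sequence[float]) -> float:
    """Clamp each score first, then average; an empty input yields 0.0."""
    if not scores:
        return 0.0
    return sum(clamp01(s) for s in scores) / len(scores)


async def rerank_async(
    predict: Callable[[list[tuple[str, str]]], list[float]],
    query: str,
    passages: list[str],
) -> list[float]:
    """Run a blocking scoring function in a worker thread.

    asyncio.to_thread keeps the event loop free while the CPU-bound
    `predict` call (hypothetical stand-in for the reranker) is running.
    """
    pairs = [(query, passage) for passage in passages]
    return await asyncio.to_thread(predict, pairs)
```

Clamping before averaging matters because a single out-of-range score would otherwise shift the mean for the whole result set.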

API: standardize DELETE→200 with body, POST→201 for creates, Pydantic model
for tag requests, feedback response envelope, generic RBAC error message.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Agent mode was not passing filename_map to extract_citations(), causing
citation badges to show wrong document names (positional fallback picked
whichever chunk happened to be at that index).

Citation normalization short-circuited too eagerly: if the answer already
contained any [citation:N] marker, bare [N] and Passage N formats in the
same answer were silently ignored.
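A rough sketch of normalization without the early exit, assuming the three marker shapes named above. The regexes and the function name are illustrative, not the project's actual code:

```python
import re

# Bare numeric markers such as [2]; this does not match [citation:2]
# because the bracket must be followed directly by digits.
_BARE_MARKER = re.compile(r"\[(\d+)\]")
# Prose references such as "Passage 3".
_PASSAGE_REF = re.compile(r"\bPassage\s+(\d+)\b")


def normalize_citations(answer: str) -> str:
    """Rewrite every bare [N] and 'Passage N' to [citation:N].

    Both substitutions always run, even when the answer already contains
    canonical [citation:N] markers, so mixed formats are all normalized.
    """
    answer = _BARE_MARKER.sub(r"[citation:\1]", answer)
    answer = _PASSAGE_REF.sub(r"[citation:\1]", answer)
    return answer
```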

PDF viewer used w-fit allowing pages to overflow the container. Now uses
ResizeObserver to measure container width and passes it to react-pdf Page
component so the PDF scales to fit.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The citation system was relying on positional fallback (citation N maps
to retrieval result N-1) which produced wrong document references when
the LLM cited lazily or when document coverage reordering changed positions.

Now _extract_citation_quotes extracts the sentence surrounding each
[citation:N] marker and uses it for fuzzy matching against chunks.
Matching uses bidirectional word overlap (max of forward/reverse ratio)
with _WORD_SPLIT tokenization to handle punctuation differences.
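The bidirectional overlap idea can be sketched as follows. The `_WORD_SPLIT` pattern shown here is an assumed definition (the commit names it but does not show it), and `overlap_score` and `best_chunk` are hypothetical helper names:

```python
import re

# Assumed definition: split on runs of non-word characters so that
# punctuation differences do not affect the token sets.
_WORD_SPLIT = re.compile(r"\W+")


def _words(text: str) -> set[str]:
    return {w for w in _WORD_SPLIT.split(text.lower()) if w}


def overlap_score(quote: str, chunk: str) -> float:
    """Max of the forward ratio (shared / quote words) and the
    reverse ratio (shared / chunk words)."""
    q, c = _words(quote), _words(chunk)
    if not q or not c:
        return 0.0
    shared = len(q & c)
    return max(shared / len(q), shared / len(c))


def best_chunk(quote: str, chunks: list[str]) -> int:
    """Index of the chunk that overlaps most with the quoted context."""
    return max(range(len(chunks)), key=lambda i: overlap_score(quote, chunks[i]))
```

Taking the maximum of both directions lets a short quote match a long chunk and a short chunk match a long quote, at the cost of being more permissive than a one-directional ratio.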

Also improved the RAG prompt to explicitly instruct the LLM to verify
passage content before citing, and added 7 tests covering multi-document
citation accuracy scenarios.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@hericlesferraz self-assigned this on Apr 4, 2026
@hericlesferraz added the enhancement label on Apr 4, 2026
hericlesferraz and others added 5 commits April 4, 2026 15:02
Restore a dev-only JWT secret default so tests can instantiate Settings
without env vars. Production validation now rejects secrets containing
"dev-only" or "change" keywords in addition to the length check.
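A sketch of the production check described above. The minimum length of 32 and the function and parameter names are assumptions for illustration; only the two keywords come from the commit message:

```python
# Keywords named in the commit message.
_FORBIDDEN_KEYWORDS = ("dev-only", "change")
# Assumed threshold; the commit mentions a length check but not its value.
_MIN_SECRET_LENGTH = 32


def validate_jwt_secret(secret: str, environment: str) -> None:
    """Raise ValueError if a production secret looks like a placeholder.

    Non-production environments are not checked, so tests can use the
    dev-only default without setting environment variables.
    """
    if environment != "production":
        return
    if len(secret) < _MIN_SECRET_LENGTH:
        raise ValueError("JWT secret is too short for production")
    lowered = secret.lower()
    if any(keyword in lowered for keyword in _FORBIDDEN_KEYWORDS):
        raise ValueError("JWT secret looks like a development placeholder")
```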

Fix line-too-long in test files, format PdfViewer.tsx with Prettier,
and update injection test to expect fail-closed behavior.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Add auth headers to observability, task, feedback, and traces tests
- Update status code assertions (POST /sessions now 201)
- Simplify health endpoint test expectations (no infra details)
- Update injection tests for fail-closed behavior
- Fix semantic cache tests for tuple return type
- Update hallucination test for verdict-computed grounding score
- Fix error sanitizer test for sanitized DocVaultError messages
- Fix feedback.py User import for runtime compatibility

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Backend: bump litellm >=1.83.0, onnx >=1.21.0, and pin transitive deps
(aiohttp >=3.13.4, cryptography >=46.0.6, requests >=2.33.0,
pygments >=2.20.0, pyasn1 >=0.6.3, ecdsa >=0.19.2).

Frontend: add pnpm override for lodash-es >=4.18.0 to fix code injection
vulnerability in transitive dep via react-force-graph-2d.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…al tests

Citation context extraction now uses clause boundaries (previous marker
to current marker) instead of full sentence, preventing multi-citation
sentences from matching all citations to the same chunk.
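A simplified sketch of clause-boundary extraction: each marker gets the text between the previous marker and itself. The function name and the choice to start the first clause at the beginning of the answer are assumptions made for illustration:

```python
import re

_CITATION_MARKER = re.compile(r"\[citation:(\d+)\]")


def extract_citation_clauses(answer: str) -> list[tuple[int, str]]:
    """Pair each [citation:N] with the clause that precedes it.

    The clause runs from the end of the previous marker (or the start of
    the answer, a simplification) up to the current marker.
    """
    clauses: list[tuple[int, str]] = []
    start = 0
    for match in _CITATION_MARKER.finditer(answer):
        # Trim whitespace and leading/trailing clause punctuation.
        clause = answer[start:match.start()].strip(" ,;.")
        clauses.append((int(match.group(1)), clause))
        start = match.end()
    return clauses
```

With full-sentence context, both markers in "Revenue grew 10% [citation:1], while costs fell [citation:2]." would receive the same text; here each marker gets only its own clause.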

Update adversarial tests: injection fail-closed assertion, and
hallucination test uses majority-unsupported scenario so the filter
actually strips fabricated claims with verdict-computed grounding score.
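One way a verdict-computed grounding score and filter could fit together is sketched below. The `Verdict` shape, the 0.5 threshold, and the rule of returning 0.0 for an empty verdict list are all assumptions, not details confirmed by this PR:

```python
from dataclasses import dataclass


@dataclass
class Verdict:
    """Hypothetical per-claim verdict used only for this illustration."""
    claim: str
    supported: bool


def grounding_score(verdicts: list[Verdict]) -> float:
    """Fraction of claims marked as supported; 0.0 when there are none."""
    if not verdicts:
        return 0.0
    return sum(1 for v in verdicts if v.supported) / len(verdicts)


def filter_claims(verdicts: list[Verdict], threshold: float = 0.5) -> list[str]:
    """Keep all claims when grounding is high enough; otherwise keep
    only the supported ones (assumed filter behavior)."""
    if grounding_score(verdicts) >= threshold:
        return [v.claim for v in verdicts]
    return [v.claim for v in verdicts if v.supported]
```

Under these assumptions, a majority-unsupported answer falls below the threshold, which is the case where the filter removes the unsupported claims.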

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Add 4 screenshots showing chat with citations, PDF highlighting,
knowledge graph visualization, and admin panel. Create SECURITY.md
with vulnerability reporting policy, credential management guide,
and production hardening checklist. Update development process section
to document multi-agent orchestration workflow with Claude Code.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@hericlesferraz force-pushed the feat/agent-team-improvements branch from 52c4eca to 0607d8a on April 4, 2026 19:59
Pin to latest patch release to ensure no older versions with known
security issues can be resolved during installation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@hericlesferraz merged commit 263d79b into main on Apr 4, 2026
8 checks passed
@hericlesferraz deleted the feat/agent-team-improvements branch on April 4, 2026 20:42