feat: adversarial red team engine, custom persona builder, claim-level extraction, engine sweep by marceloceccon · Pull Request #4 · entropyvortex/roundtable

marceloceccon · 2026-04-25T13:32:05Z

- Adversarial Red Team engine: rotating attacker stress-tests positions
  before a parallel post-stress synthesis. Attacker confidence is
  out-of-band for the consensus formula; defenders run in parallel.
- Custom persona builder: six-axis (risk, optimism, evidence, formality,
  verbosity, contrarian) slider UI. Server composes the system prompt
  from vetted phrase fragments; no user free-text reaches the LLM.
- Claim-level disagreement extraction: post-final LLM pass emits
  structured contradictions with verbatim quotes per side. Quotes are
  verified against actual response content; fabrications are dropped.
- Engine sweep mode: one-click sequential run across all three engines
  with a side-by-side comparison panel and explicit Cancel Sweep.
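
The quote verification step in claim-level extraction can be pictured roughly as below. This is a minimal sketch, not the code in this PR: the `ClaimSide` and `Contradiction` shapes, the `verifyContradictions` name, and the whitespace normalization are all assumptions made for illustration.

```typescript
// Hypothetical shapes; the PR description does not show the real types.
interface ClaimSide {
  participantId: string;
  quote: string;
}

interface Contradiction {
  topic: string;
  sides: ClaimSide[];
}

// Collapse runs of whitespace so that line wrapping alone does not
// cause a genuine quote to be rejected (an assumption, not from the PR).
function normalize(text: string): string {
  return text.replace(/\s+/g, " ").trim();
}

// Keep a contradiction only if it has at least two sides and every quote
// appears verbatim in the response of the participant it is attributed to.
function verifyContradictions(
  contradictions: Contradiction[],
  responses: Map<string, string>,
): Contradiction[] {
  return contradictions.filter(
    (c) =>
      c.sides.length >= 2 &&
      c.sides.every((side) => {
        const response = responses.get(side.participantId);
        const quote = normalize(side.quote);
        return (
          response !== undefined &&
          quote.length > 0 &&
          normalize(response).includes(quote)
        );
      }),
  );
}
```

Dropping the whole contradiction when any one quote fails is one possible policy; the PR only states that fabrications are dropped, so the actual granularity may differ.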

Code-quality bundle: HMR-safe rate-limit cleanup, last-occurrence
CONFIDENCE matching, snapshot usage reconstruction, judge/claim stream
cleanup on cancel, hard-abort cost cap, tighter extractUsage type
guards, fix for mockImplementation pollution across tests.
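
As an illustration of last-occurrence CONFIDENCE matching, here is a minimal sketch. It assumes the marker looks like `CONFIDENCE: <0-100>`, which the PR description does not confirm, and `parseLastConfidence` is a hypothetical name. The likely motivation is that a response may echo another participant's marker before stating its own, so the final marker is the one to trust.

```typescript
// Assumed marker format: "CONFIDENCE: <integer 0-100>".
// The real format used by the project is not shown in the PR.
function parseLastConfidence(response: string): number | null {
  // The regex is created per call so the global flag's lastIndex
  // state never leaks between invocations.
  const pattern = /CONFIDENCE:\s*(\d{1,3})\b/gi;
  let last: number | null = null;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(response)) !== null) {
    const value = Number(match[1]);
    // Ignore out-of-range values rather than clamping them.
    if (value <= 100) {
      last = value;
    }
  }
  return last;
}
```

With this sketch, a response that quotes `CONFIDENCE: 40` mid-text and ends with `CONFIDENCE: 85` yields 85, and a response with no marker yields `null`.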

Docs: README updated for the new engines and affordances; CHANGELOG.md
and SECURITY.md added; newfeatures.md tracks per-feature rationale and
QA notes.

Tests: 207 -> 255 (+48 covering all new features and QA bundle).

github-advanced-security AI found potential problems Apr 25, 2026

Comment thread lib/consensus-engine.ts Fixed

marceloceccon and others added 2 commits April 25, 2026 10:35

Potential fix for pull request finding 'CodeQL / Use of externally-co…

bb810cc

…ntrolled format string' Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

Linting fixes

075f4ca
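
The CodeQL finding named in the first commit ("Use of externally-controlled format string") typically fires when an untrusted value ends up as the first argument of a `console.*` or `util.format` call, where `%s`, `%d`, and similar sequences are interpreted. The actual change in `lib/consensus-engine.ts` is not shown here; the sketch below only illustrates the usual remedy, and `logEngineFailure` and `LogFn` are hypothetical names.

```typescript
// Hypothetical logger signature, injected so the behavior is easy to test.
type LogFn = (format: string, ...args: unknown[]) => void;

const defaultLog: LogFn = (format, ...args) => console.error(format, ...args);

// Flagged pattern (for contrast, not executed):
//   console.error(`${engineLabel} failed:`, err);
// Here engineLabel becomes the format string, so a label such as
// "50%d off" would be interpreted rather than printed literally.

// Remedy: keep the format string constant and pass untrusted values
// only as arguments.
function logEngineFailure(
  engineLabel: string,
  err: unknown,
  log: LogFn = defaultLog,
): void {
  log("%s failed: %o", engineLabel, err);
}
```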

marceloceccon merged commit 68e596b into main Apr 25, 2026
7 checks passed

marceloceccon merged 3 commits into main from features/new-enhancements
