RAI Planner Agent — Session Feedback #1263
raymond-nassar
started this conversation in General
Replies: 1 comment
We might also have to revise or drop the Restricted Uses part of the RAI Planner review, since our company's definition of Restricted Uses probably doesn't match other organizations'.
---
Strengths
1. Progressive disclosure worked well
The agent didn't dump all questions at once. It asked scoped questions per phase and let you provide context incrementally — README first, then demo slides, then transcript, then evaluation screenshots. This kept cognitive load manageable across a complex 6-phase assessment.
2. Responsive to new information mid-phase
When you revealed JIRA Service Management auto-captures operator actions, the agent immediately downgraded T-RAI-005 from Critical to Medium and recalculated downstream impacts. Same with the training/disclosure mitigations in Phase 2 — it revised sensitive uses scores in real time rather than locking them in. This felt genuinely collaborative rather than checklist-driven.
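To illustrate the kind of mid-phase re-scoring described here, a minimal sketch follows. The severity bands, the likelihood/impact values, and the specific downgrade rule are assumptions for illustration, not the agent's actual logic.

```python
# Hypothetical sketch of mid-phase threat re-scoring. The severity
# thresholds and the effect of the auto-capture mitigation are
# assumptions, not the RAI Planner's documented rules.

def severity(likelihood: int, impact: int) -> str:
    """Map a likelihood x impact product (each 1-5) onto a severity band."""
    score = likelihood * impact
    if score >= 20:
        return "Critical"
    if score >= 12:
        return "High"
    if score >= 6:
        return "Medium"
    return "Low"

threat = {"id": "T-RAI-005", "likelihood": 4, "impact": 5}
print(severity(threat["likelihood"], threat["impact"]))   # Critical

# New information: JIRA Service Management auto-captures operator actions,
# so undetected misuse becomes far less likely.
threat["likelihood"] = 2
print(severity(threat["likelihood"], threat["impact"]))   # Medium
```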
3. Scoring transparency
The agent consistently showed its math — why a category scored 1 vs 0, how the tier was derived, and what the composite score formula was. This made it possible to challenge or validate decisions at each step.
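For context, the kind of transparent breakdown this refers to could look like the sketch below. The category names, weights, and raw scores are illustrative assumptions; the framework's actual composite formula is not reproduced here.

```python
# Hypothetical sketch of a transparent composite-score breakdown.
# Categories, weights, and raw scores are placeholders.

category_scores = {          # raw scores, 0-100
    "fairness": 80,
    "transparency": 70,
    "accountability": 75,
    "reliability": 78,
}
weights = {                  # relative weights, summing to 1.0
    "fairness": 0.30,
    "transparency": 0.20,
    "accountability": 0.25,
    "reliability": 0.25,
}

composite = sum(category_scores[c] * weights[c] for c in weights)
for c in weights:
    print(f"{c:15s} {category_scores[c]:5.1f} x {weights[c]:.2f}")
print(f"composite = {composite:.2f}")
```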
4. Artifact quality
The generated artifacts (system-definition-pack, stakeholder-impact-map, evidence-register, scorecard) were detailed, internally consistent, and structured for downstream use. The 14 work items in GitHub issue format with acceptance criteria are immediately actionable.
Areas for Improvement
1. Phase 1 questioning could be more efficient
The agent asked 5+ open questions before it had fully consumed the README and demo script you'd already attached. Several answers were discoverable from the provided materials. The agent should extract maximum context from attached files before asking clarifying questions, reducing back-and-forth.
2. Over-reliance on user "ready" signals
The agent waited for explicit "ready" confirmation between every phase. For a user who is clearly engaged and has no objections, this adds friction. A lighter touch — "Moving to Phase 3 unless you have anything to add" — would maintain user control while improving flow.
3. Sensitive uses scoring scale wasn't explained proactively
I had to ask what the 0-1 scoring scale meant. For a framework where the scores directly determine the assessment tier (Standard vs Comprehensive), the scoring semantics should be explained upfront before presenting results, not after.
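To make the semantics concrete, a tier-derivation rule of the kind at issue might look like this sketch. The category names and the "any score of 1 triggers Comprehensive" rule are assumptions about how such frameworks commonly work, not the agent's documented behavior; the point is that this is exactly what should be stated before results are shown.

```python
# Hypothetical sketch of mapping 0/1 sensitive-use scores to an
# assessment tier. Categories and the triggering rule are assumptions.

sensitive_uses = {
    "consequential_decisions": 1,        # 1 = the use applies, 0 = it does not
    "physical_or_psychological_harm": 0,
    "legal_or_human_rights_impact": 0,
}

tier = "Comprehensive" if any(sensitive_uses.values()) else "Standard"
print(tier)   # Comprehensive
```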
4. Standards mapping phase was dense and hard to validate
Phase 3 produced a large mapping of 5 components × 6 principles × NIST subcategories. The output was thorough but difficult to review in one pass. Breaking this into component-by-component summaries with a final cross-cutting view would improve reviewability.
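As a rough illustration of the suggested restructuring, the same mapping data could be regrouped component by component before a final cross-cutting view. Component, principle, and NIST subcategory names below are placeholders, not the actual Phase 3 output.

```python
# Hypothetical sketch of regrouping a flat standards mapping so it can be
# reviewed one component at a time. All identifiers are placeholders.

from collections import defaultdict

mappings = [
    ("intake-classifier", "Reliability & Safety", "NIST MAP 1.1"),
    ("intake-classifier", "Transparency", "NIST GOVERN 4.2"),
    ("planner-core", "Accountability", "NIST MANAGE 2.3"),
]

by_component = defaultdict(list)
for component, principle, subcategory in mappings:
    by_component[component].append((principle, subcategory))

for component, rows in by_component.items():
    print(component)
    for principle, subcategory in rows:
        print(f"  {principle} -> {subcategory}")
```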
5. Threat model could better distinguish AI-specific vs general threats
Several threats in Phase 4 (e.g., operational detail leakage, LLM latency) are standard application security concerns rather than AI-specific RAI threats. The AI STRIDE extensions are valuable, but the agent should more clearly separate AI-unique risks from general operational risks to keep the RAI assessment focused.
6. Gap between evaluation data and confidence presentation
The agent identified the 55% accuracy vs 0.95 confidence disconnect as a tradeoff (TO-001), but could have surfaced this tension earlier — during Phase 1 when the evaluation screenshots were provided — rather than waiting until Phase 5. Early flagging of this kind of systemic issue would give it more weight throughout the assessment.
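The kind of early check that would surface this is cheap to run as soon as evaluation data arrives; a sketch is below. The prediction records and the overconfidence threshold are illustrative assumptions, not the project's actual evaluation results.

```python
# Hypothetical sketch of flagging an accuracy-vs-confidence gap early.
# The data and the 0.15 threshold are illustrative assumptions.

predictions = [
    # (reported_confidence, was_correct)
    (0.95, True), (0.95, False), (0.96, True), (0.94, False), (0.95, False),
]

mean_confidence = sum(c for c, _ in predictions) / len(predictions)
accuracy = sum(ok for _, ok in predictions) / len(predictions)
gap = mean_confidence - accuracy

print(f"mean confidence {mean_confidence:.2f}, accuracy {accuracy:.2f}, gap {gap:.2f}")
if gap > 0.15:
    print("FLAG: model is substantially overconfident; raise during Phase 1")
```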
7. No intermediate checkpoints on cumulative score trajectory
The composite score (75.65) appeared for the first time in Phase 6. Showing a running estimate after Phases 3-5 would help the user understand whether mitigations are moving the needle and where to focus additional context or corrections.
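A running checkpoint could be as simple as the sketch below. The phase labels and provisional scores are illustrative assumptions; only the final 75.65 appeared in the actual session, and only at Phase 6.

```python
# Hypothetical sketch of per-phase score checkpoints. Provisional values
# are placeholders; only the final composite (75.65) is from the session.

provisional = {
    "Phase 3 (standards mapping)": 68.0,
    "Phase 4 (threat model)":      71.5,
    "Phase 5 (tradeoffs)":         74.0,
    "Phase 6 (final scorecard)":   75.65,
}

previous = None
for phase, score in provisional.items():
    delta = "" if previous is None else f" ({score - previous:+.2f})"
    print(f"{phase}: {score:.2f}{delta}")
    previous = score
```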
Process Suggestions