}
+
+ ```
+3. Save as `YYYY-MM-DD-report.html` (or `YYYY-MM-DD-executive-summary.html` for exec audience)
+4. Mermaid code blocks in HTML: wrap in `<div class="mermaid">...</div>` for browser rendering
+
+No external CSS files, no CSS changes to other files — inline only.
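The dated-filename convention above can be sketched in shell; the variable names here are illustrative, not part of the spec:

```shell
# Minimal sketch of the save-name convention (report vs. executive-summary variants)
DATE=$(date +%F)                          # %F expands to YYYY-MM-DD
REPORT_FILE="${DATE}-report.html"
EXEC_FILE="${DATE}-executive-summary.html"
echo "$REPORT_FILE"
echo "$EXEC_FILE"
```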
## Output Formats
@@ -380,45 +518,46 @@ After documentation review, run the human-voice plugin to ensure report language
If file is missing for a dimension, fall back to `blackboard_read(scope="{topic_slug}", key="findings_{dimension}")`.
Merge all available findings with state.json for complete coverage.
3. **Recall Atlatl Memories**: `recall_memories(query="sigint {topic}", tags=["sigint-research"])`
-4. **Organize Content**: Map findings to report sections
-5. **Generate Narrative**: Write flowing prose connecting findings
-6. **Create Visualizations**: Generate all Mermaid diagrams
-7. **Write Report**: Produce complete document
-8. **Format Outputs**: Generate requested formats
-9. **Save Files**: Write to reports directory
-10. **Run Documentation Review** (if plugin available): Execute `/documentation-review:doc-review` on reports directory
-11. **Fix Issues** (if plugin available): All markdown must pass review before completing
-12. **Run Human Voice Review** (if plugin available): Execute `/human-voice:voice-review` on each report file with emoji preservation instruction
-13. **Fix Voice Issues** (if plugin available): Rewrite flagged sections for natural, human-sounding language while preserving emojis
-14. **Post-Report Codex Review Gate (BLOCKING):**
+4. **Organize Content**: Map findings to report sections using Section → Data Mapping
+5. **Generate Sections**: Execute Section Iterator for each section (generate content or placeholder)
+6. **Create Visualizations**: Generate Mermaid diagrams where conditions are met (see Section Generation Protocol)
+7. **Apply Audience Transform**: Reorder sections and apply content transforms per Audience Transform Protocol
+8. **Write Report**: Produce complete markdown document
+9. **Format Outputs**: Generate requested formats (HTML if `--format html` or `--format both`)
+10. **Save Files**: Write to reports directory
+11. **Run Documentation Review** (if plugin available): Execute `/documentation-review:doc-review` on reports directory
+12. **Fix Issues** (if plugin available): All markdown must pass review before completing
+13. **Run Human Voice Review** (if plugin available): Execute `/human-voice:voice-review` on each report file with emoji preservation instruction
+14. **Fix Voice Issues** (if plugin available): Rewrite flagged sections for natural, human-sounding language while preserving emojis
+15. **Post-Report Codex Review Gate (BLOCKING):**
Self-review the report against the findings data before delivering:
- **Step 14a: Load findings for cross-reference**
+ **Step 15a: Load findings for cross-reference**
Read `./reports/{topic_slug}/state.json` to get the authoritative findings array.
- **Step 14b: Verify claim traceability**
+ **Step 15b: Verify claim traceability**
For each factual assertion in the report:
- Check: does it trace to a specific finding ID in state.json?
- Check: does the finding have provenance (sources with URLs)?
- Flag untraced claims
- **Step 14c: Verify no hallucinated statistics**
+ **Step 15c: Verify no hallucinated statistics**
For each number/statistic in the report:
- Check: does it appear in a finding's summary, evidence, or provenance snippet?
- Flag numbers not traceable to findings data
- **Step 14d: Check balanced representation**
+ **Step 15d: Check balanced representation**
- Compare section coverage against `elicitation.priorities` ranking
- Flag if any priority dimension is missing or under-represented
- **Step 14e: Remediate or warn**
+ **Step 15e: Remediate or warn**
- If flagged issues found: revise the report to fix traceable issues (max 1 revision pass)
- If issues remain after revision: append a "Provenance Warnings" section listing unresolved claims
- If no issues: proceed
**Fallback:** If spawned with a `team_name` and a team lead is available, send flagged issues via SendMessage for awareness. Do not wait for a response — the self-review is authoritative.
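Step 15c's statistic check can be sketched with jq. This is a hedged illustration, not the mandated implementation — the file path, `findings[].summary`, and `findings[].provenance[].snippet` fields are assumed from the state.json schema described above:

```shell
# Demo findings file (assumed schema) so the sketch is self-contained
cat > /tmp/state_demo.json <<'EOF'
{"findings":[{"id":"f1","summary":"Market grew 42% YoY","provenance":[{"url":"https://example.com","snippet":"42% growth in 2024"}]}]}
EOF
STAT="42%"
# Traceable if the statistic appears in any finding's summary or provenance snippet
if jq -e --arg s "$STAT" \
     '[.findings[] | .summary, (.provenance[]?.snippet // empty)] | any(tostring | contains($s))' \
     /tmp/state_demo.json > /dev/null; then
  echo "traced"
else
  echo "untraced"   # candidate for the Provenance Warnings section
fi
```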
-15. **Capture Summary**: `capture_memory(namespace="_semantic/knowledge", tags=["sigint-research", "report"], title="Report generated: {topic}", ...)` then `enrich_memory(id)`
-16. **Signal Completion** (required when spawned as a swarm teammate with `team_name`):
+16. **Capture Summary**: `capture_memory(namespace="_semantic/knowledge", tags=["sigint-research", "report"], title="Report generated: {topic}", ...)` then `enrich_memory(id)`
+17. **Signal Completion** (required when spawned as a swarm teammate with `team_name`):
```
TaskUpdate(taskId, status: "completed")
SendMessage(
diff --git a/agents/research-orchestrator.md b/agents/research-orchestrator.md
index e90997e..4b2a5de 100644
--- a/agents/research-orchestrator.md
+++ b/agents/research-orchestrator.md
@@ -1,6 +1,6 @@
---
name: research-orchestrator
-version: 0.5.0
+version: 0.5.1
description: |
Orchestrator agent for sigint research sessions. Owns all phase management: team lifecycle,
dimension-analyst spawning, methodology verification, codex review gates, finding merge,
@@ -176,6 +176,87 @@ After elicitation:
---
+## Phase 1.5: Smart Dimension Selection (Full Mode Only)
+
+Skip this phase in `update` and `augment` modes — those modes use pre-determined dimensions. In `full` mode, run after elicitation completes.
+
+**Skip condition**: If `--dimensions` flag was passed in the spawn prompt (from `/sigint:start --dimensions ...`), use those dimensions directly and skip to Phase 2 with a progress note: "Dimension selection bypassed — using caller-specified dimensions: {list}".
+
+### Step 1.5.1: Assess Dimension Relevance
+
+For each of the 8 standard dimensions, evaluate relevance based on:
+- The elicited `topic`, `decision_context`, `scope`, and `priorities`
+- Cap pre-selected dimensions at `max_dimensions` (from config, default 5)
+
+Default relevance preference order for general business topics: competitive → sizing → trends → customer → regulatory → financial → tech → trend_modeling
+
+For technology topics: tech → competitive → trends → sizing → regulatory → customer → financial → trend_modeling
+
+Adjust based on elicitation priorities — dimensions that appear in `elicitation.priorities` should be included regardless of defaults.
+
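One way to sketch this ordering rule — elicited priorities first, then the default order, deduplicated and capped — is a small jq pipeline. The priority values here are illustrative:

```shell
# Priorities are promoted ahead of the default order, deduped, capped at max_dimensions
DEFAULTS='["competitive","sizing","trends","customer","regulatory","financial","tech","trend_modeling"]'
PRIORITIES='["regulatory","tech"]'
jq -nc --argjson d "$DEFAULTS" --argjson p "$PRIORITIES" --argjson max 5 \
  '($p + $d) | reduce .[] as $x ([]; if index($x) then . else . + [$x] end) | .[0:$max]'
# → ["regulatory","tech","competitive","sizing","trends"]
```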
+### Step 1.5.2: Present Dimension Selection UI
+
+```
+Research dimensions for "{topic}":
+
+Use ✅ for included, ❌ for excluded:
+
+✅ competitive — [1-line rationale]
+✅ sizing — [1-line rationale]
+✅ trends — [1-line rationale]
+❌ customer — [1-line rationale explaining why excluded]
+✅ regulatory — [1-line rationale]
+❌ tech — [1-line rationale]
+❌ financial — [1-line rationale]
+❌ trend_modeling — Requires trend data; run after trends dimension
+
+Max dimensions: {max_dimensions} (from config)
+
+Type dimension names to add, or type 'confirm' to proceed with this selection.
+```
+
+Use `AskUserQuestion` to present this and capture the response.
+
+### Step 1.5.3: Apply User Input
+
+Parse the user's response:
+- `confirm` or blank → use pre-selected dimensions as-is
+- Dimension names to add → append to selected list (if not already included)
+- `remove <name>` → remove that dimension from the selected list
+- Custom names (not in standard 8) → add the custom name as a string to the dimensions list AND record it separately in `elicitation.custom_dimensions` array (so downstream consumers know which dimensions lack SKILL.md). The analyst for custom dimensions is spawned with `SKILL_OVERRIDE: null`.
+
+Final selected list must not exceed `max_dimensions`.
+
+### Step 1.5.4: Persist Final Dimension Selection
+
+Update elicitation with confirmed dimensions using jq (per Structured Data Protocol). `elicitation.dimensions` is always an array of strings (to pass schema validation). Custom dimension metadata is stored separately in `elicitation.custom_dimensions`:
+```bash
+jq --argjson dims "$SELECTED_DIMS_JSON" \
+ --argjson custom "$CUSTOM_DIMS_JSON" \
+ '.elicitation.dimensions = $dims | .elicitation.custom_dimensions = $custom' \
+ "./reports/$TOPIC_SLUG/state.json" > tmp.$$ && mv tmp.$$ "./reports/$TOPIC_SLUG/state.json"
+jq -e -f schemas/state.jq "./reports/$TOPIC_SLUG/state.json" > /dev/null
+```
+Where `$SELECTED_DIMS_JSON` is `["competitive", "sizing", ...]` (string array) and `$CUSTOM_DIMS_JSON` is `["custom_dim_name", ...]` (string array of non-standard dimension names, empty `[]` if none).
+
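Building the two string arrays before the jq update can be sketched as follows; the dimension names (including the custom one) are illustrative:

```shell
# Build flat JSON string arrays for the jq update above; -R reads raw lines, [inputs] collects them
SELECTED_DIMS_JSON=$(printf '%s\n' competitive sizing gpu_supply_chain | jq -Rnc '[inputs]')
CUSTOM_DIMS_JSON=$(printf '%s\n' gpu_supply_chain | jq -Rnc '[inputs]')
# Both must be flat string arrays to pass schema validation
echo "$SELECTED_DIMS_JSON" | jq -e 'type == "array" and all(.[]; type == "string")'
```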
+Also update `elicitation.json` and blackboard:
+```bash
+jq '.elicitation' "./reports/$TOPIC_SLUG/state.json" > "./reports/$TOPIC_SLUG/elicitation.json"
+```
+```
+blackboard_write(scope="{topic_slug}", key="elicitation", value={updated elicitation})
+```
+
+Update progress file:
+```markdown
+## {ISO_DATE} — Dimension Selection Complete
+- Selected: {N} dimensions: {list}
+- User confirmed: yes|modified
+- Custom dimensions: {list or "none"}
+```
+
+---
+
## Phase 2: Spawn Dimension-Analysts
### Step 2.1: Create Tasks
@@ -206,7 +287,22 @@ Agent(
CRITICAL: Use REPORTS_DIR exactly as provided for ALL file writes.
Do NOT derive or re-slugify the output directory from the topic title.
- ..."
+
+ {If dimension is in elicitation.custom_dimensions:
+ SKILL_OVERRIDE: null
+ This is a custom dimension — no SKILL.md exists. Follow your custom dimension protocol (skip Steps 2-4, use generic methodology, enforce provenance).
+ Else:
+ Follow your MANDATORY Methodology Gating Protocol (Steps 1-6) from your agent definition:
+ - Step 1: Read elicitation from $REPORTS_DIR/state.json (or elicitation.json)
+ - Step 2: Load skills/{skill-directory}/SKILL.md — REQUIRED before any research
+ - Step 3: Extract Required Frameworks table from the skill
+ - Step 4: Write methodology_plan_{dimension}.json before proceeding
+ - Step 5: Conduct web research following the skill methodology
+ - Step 6: Self-reflect, write findings, signal completion
+
+ Do NOT proceed with research until Step 4 (methodology plan written) succeeds.
+ Do NOT substitute your own methodology for the skill's Required Frameworks.
+ }"
)
```
@@ -379,12 +475,70 @@ Update progress file:
5. **If gate = pass:** Proceed with findings as-is.
+### Step 2.75.2: Methodology Hard-Fail and Retry
+
+After receiving the codex review JSON response:
+
+**Hard-fail check:**
+If `methodology_gaps` is non-empty:
+1. Read `$REPORTS_DIR/methodology_plan_{dimension}.json` to get the list of `required: "yes"` frameworks
+2. Cross-reference: which methodology gaps are required frameworks (not conditional)?
+3. If any **required** framework is in `methodology_gaps`: override `gate` to `"fail"` — this is a hard methodology failure regardless of other criteria
+
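The required-framework cross-reference can be sketched with jq. The `frameworks[]` shape with `name`/`required` fields is assumed from the methodology plan format; the gap values mirror the eval fixtures:

```shell
# Demo methodology plan (assumed schema) so the sketch is self-contained
cat > /tmp/plan_demo.json <<'EOF'
{"frameworks":[
  {"name":"TAM/SAM/SOM Hierarchy","required":"yes"},
  {"name":"Scenario Modeling","required":"conditional"}
]}
EOF
GAPS='["TAM/SAM/SOM Hierarchy","Scenario Modeling"]'
# Hard failures = gaps whose framework is required:"yes" (conditional gaps do not hard-fail)
HARD_FAILS=$(jq -c --argjson gaps "$GAPS" \
  '[.frameworks[] | select(.required == "yes" and ([.name] | inside($gaps))) | .name]' \
  /tmp/plan_demo.json)
[ "$HARD_FAILS" != "[]" ] && GATE="fail" || GATE="pass"
echo "$GATE"
```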
+**Retry logic (max 2 retries per dimension):**
+
+Track retries with a local counter `methodology_retry_count` initialized to 0 per dimension.
+
+```
+WHILE gate == "fail" due to methodology gaps AND methodology_retry_count < 2:
+ methodology_retry_count += 1
+
+ 1. Build gap list: required frameworks missing from findings
+ 2. Spawn gap-fill analyst:
+ Agent(
+ subagent_type="sigint:dimension-analyst",
+ team_name="sigint-{topic_slug}-research",
+ name="dimension-analyst-{dimension}-retry{methodology_retry_count}",
+ prompt="Gap-fill retry #{methodology_retry_count} for {dimension} analysis on '{topic}'.
+
+ BLACKBOARD: {topic_slug}
+ TOPIC_SLUG: {topic_slug}
+ REPORTS_DIR: ./reports/{topic_slug}
+ Skill to load: skills/{skill-directory}/SKILL.md
+
+ PRIORITY: Address these missing required frameworks:
+ {numbered list of missing required framework names from methodology_gaps}
+
+ Step 1: Read existing findings from $REPORTS_DIR/findings_{dimension}.json
+ Step 2: Load skills/{skill-directory}/SKILL.md and find the section for each missing framework
+ Step 3: Run targeted WebSearch for each missing framework (minimum 3 searches per gap)
+ Step 4: Add new findings that cover the missing frameworks
+ Step 5: Write the COMPLETE updated findings back to $REPORTS_DIR/findings_{dimension}.json
+ Step 6: Validate with schema, signal completion via SendMessage to team-lead
+
+ Follow your MANDATORY Methodology Gating Protocol."
+ )
+ 3. Wait for retry analyst SendMessage
+ 4. Re-run codex review (same criteria) on updated findings
+ 5. Parse new gate response → update gate variable
+
+IF gate still "fail" after 2 retries:
+ - Accept current findings (do not block merge)
+ - Append unresolved gap note to findings file using jq:
+ jq --argjson gaps "$METHODOLOGY_GAPS_JSON" \
+ '.methodology_gaps_unresolved = $gaps' \
+ "$REPORTS_DIR/findings_{dimension}.json" > tmp.$$ && mv tmp.$$ "$REPORTS_DIR/findings_{dimension}.json"
+ - Log in progress file: "Methodology gaps unresolved after 2 retries: {list}"
+```
+
Update progress file:
```markdown
## {ISO_DATE} — Post-Findings Review: {dimension}
- Findings reviewed: {N}
- Quarantined: {N} ({reasons})
- Sources verified: {N}/{total} alive
+- Methodology gaps: {N} ({list or "none"})
+- Methodology retries: {N}
- Gate: {pass|fail}
```
diff --git a/skills/augment/SKILL.md b/skills/augment/SKILL.md
index b3f697b..d125e9a 100644
--- a/skills/augment/SKILL.md
+++ b/skills/augment/SKILL.md
@@ -1,7 +1,7 @@
---
name: augment
description: Deep-dive into a specific area of current research. Orchestrates a single dimension-analyst using full swarm pattern (TeamCreate, TaskCreate, SendMessage). Use when the user wants to augment current research with deeper analysis of a specific area.
-argument-hint: "<area> [--methodology <dimension>]"
+argument-hint: " [--dimension competitive|sizing|trends|customer|tech|financial|regulatory|trend_modeling]"
allowed-tools:
- Agent
- AskUserQuestion
@@ -41,7 +41,7 @@ research state.
**Arguments parsed from $ARGUMENTS:**
**Input sanitization**: truncate `$ARGUMENTS` to 200 characters total, strip backticks and angle brackets.
- `$1` — area to investigate (e.g., "competitor pricing", "regulatory landscape")
- `--methodology <dimension>` — optional: competitive, sizing, trends, customer, tech, financial, regulatory
+- `--dimension <dimension>` — optional: competitive, sizing, trends, customer, tech, financial, regulatory, trend_modeling
---
@@ -80,11 +80,12 @@ Map `area` to dimension and skill directory:
| technology, tech, feasibility, stack, build vs buy | tech | tech-assessment |
| revenue, economics, pricing, unit economics, SaaS | financial | financial-analysis |
| compliance, regulatory, legal, privacy, GDPR | regulatory | regulatory-review |
+| scenario, causal model, three-valued logic, trade-offs | trend_modeling | trend-modeling |
-If `--methodology` flag was provided, use that dimension directly.
+If `--dimension` flag was provided, use that dimension directly.
If the area doesn't map clearly, use `AskUserQuestion`:
-> "Which research methodology best fits '{area}'? Options: competitive / sizing / trends / customer / tech / financial / regulatory"
+> "Which research methodology best fits '{area}'? Options: competitive / sizing / trends / customer / tech / financial / regulatory / trend_modeling"
Store resolved values as `dimension` and `skill_dir`.
diff --git a/skills/report-autoresearch/evals.json b/skills/report-autoresearch/evals.json
new file mode 100644
index 0000000..da79d3d
--- /dev/null
+++ b/skills/report-autoresearch/evals.json
@@ -0,0 +1,35 @@
+{
+ "skill_name": "report-autoresearch",
+ "evals": [
+ {
+ "id": "9-section-report",
+ "name": "eval-9-section-report",
+ "description": "Verify report generates all 9 sections with not-assessed placeholders for missing dimensions",
+ "path": "iteration-0/eval-9-section-report"
+ },
+ {
+ "id": "audience-executive",
+ "name": "eval-audience-executive",
+ "description": "Verify --audience executives transform reorders sections and generates executive-summary.md",
+ "path": "iteration-0/eval-audience-executive"
+ },
+ {
+ "id": "swot-mermaid",
+ "name": "eval-swot-mermaid",
+ "description": "Verify SWOT quadrant Mermaid diagram is generated when 2+ dimensions have findings",
+ "path": "iteration-0/eval-swot-mermaid"
+ },
+ {
+ "id": "positioning-map",
+ "name": "eval-positioning-map",
+ "description": "Verify competitive positioning map is generated when 2+ competitors with 2+ attributes",
+ "path": "iteration-0/eval-positioning-map"
+ },
+ {
+ "id": "html-format",
+ "name": "eval-html-format",
+ "description": "Verify --format html produces valid HTML output with inline CSS",
+ "path": "iteration-0/eval-html-format"
+ }
+ ]
+}
diff --git a/skills/report-autoresearch/iteration-0/eval-9-section-report/grading.json b/skills/report-autoresearch/iteration-0/eval-9-section-report/grading.json
new file mode 100644
index 0000000..075b631
--- /dev/null
+++ b/skills/report-autoresearch/iteration-0/eval-9-section-report/grading.json
@@ -0,0 +1,23 @@
+{
+ "eval_id": "9-section-report",
+ "skill": "report",
+ "description": "Verify report generates all 9 sections with not-assessed placeholders for missing dimensions",
+ "input": "Generate report with competitive and sizing findings only",
+ "context": {
+ "findings_present": ["competitive", "sizing"],
+ "findings_absent": ["trends", "customer", "tech", "financial", "regulatory"]
+ },
+ "expected": {
+ "sections_generated": 9,
+ "sections_with_data": ["executive-summary", "market-overview", "market-sizing", "competitive"],
+ "sections_with_placeholder": ["trends", "swot", "recommendations", "risk", "appendix"],
+ "placeholder_contains_augment_suggestion": true
+ },
+ "grading_criteria": [
+ "All 9 sections are present in the generated report",
+ "Sections with findings contain real content",
+ "Sections without findings contain 'not assessed' placeholder and /sigint:augment suggestion",
+ "Executive summary is always generated regardless of dimension coverage",
+ "SWOT is generated (partial) from available competitive and sizing data"
+ ]
+}
diff --git a/skills/report-autoresearch/iteration-0/eval-9-section-report/outputs/.gitkeep b/skills/report-autoresearch/iteration-0/eval-9-section-report/outputs/.gitkeep
new file mode 100644
index 0000000..e69de29
diff --git a/skills/report-autoresearch/iteration-0/eval-audience-executive/grading.json b/skills/report-autoresearch/iteration-0/eval-audience-executive/grading.json
new file mode 100644
index 0000000..9a06526
--- /dev/null
+++ b/skills/report-autoresearch/iteration-0/eval-audience-executive/grading.json
@@ -0,0 +1,23 @@
+{
+ "eval_id": "audience-executive",
+ "skill": "report",
+ "description": "Verify --audience executives transform reorders sections and generates executive-summary.md",
+ "input": "--audience executives",
+ "context": {
+ "findings_present": ["competitive", "sizing", "trends", "regulatory"]
+ },
+ "expected": {
+ "section_order_starts_with": ["executive-summary", "recommendations", "risk"],
+ "standalone_executive_summary_generated": true,
+ "technical_jargon_replaced": true,
+ "strategic_implication_labels_present": true,
+ "methodology_notes_in_appendix": true
+ },
+ "grading_criteria": [
+ "Section order places executive-summary, recommendations, risk first",
+ "Standalone YYYY-MM-DD-executive-summary.md is generated",
+ "TAM is replaced with 'total market opportunity' in executive output",
+ "Strategic Implication labels prefix key findings",
+ "Methodology notes are moved to appendix"
+ ]
+}
diff --git a/skills/report-autoresearch/iteration-0/eval-audience-executive/outputs/.gitkeep b/skills/report-autoresearch/iteration-0/eval-audience-executive/outputs/.gitkeep
new file mode 100644
index 0000000..e69de29
diff --git a/skills/report-autoresearch/iteration-0/eval-html-format/grading.json b/skills/report-autoresearch/iteration-0/eval-html-format/grading.json
new file mode 100644
index 0000000..3737075
--- /dev/null
+++ b/skills/report-autoresearch/iteration-0/eval-html-format/grading.json
@@ -0,0 +1,23 @@
+{
+ "eval_id": "html-format",
+ "skill": "report",
+ "description": "Verify --format html produces valid HTML output with inline CSS",
+ "input": "--format html",
+ "context": {
+ "findings_present": ["competitive", "sizing"]
+ },
+ "expected": {
+ "html_file_generated": true,
+ "valid_html_structure": true,
+ "inline_css_only": true,
+ "mermaid_in_div_tags": true,
+ "no_external_css_files": true
+ },
+ "grading_criteria": [
+ "YYYY-MM-DD-report.html is generated in the reports directory",
+ "File contains valid HTML with DOCTYPE, head, body tags",
+ "Styles are inline only — no external .css file references",
+ "Mermaid code blocks are wrapped in <div class=\"mermaid\"> tags",
+ "Markdown tables are converted to HTML <table> elements"
+ ]
+}
diff --git a/skills/report-autoresearch/iteration-0/eval-html-format/outputs/.gitkeep b/skills/report-autoresearch/iteration-0/eval-html-format/outputs/.gitkeep
new file mode 100644
index 0000000..e69de29
diff --git a/skills/report-autoresearch/iteration-0/eval-positioning-map/grading.json b/skills/report-autoresearch/iteration-0/eval-positioning-map/grading.json
new file mode 100644
index 0000000..a2248ac
--- /dev/null
+++ b/skills/report-autoresearch/iteration-0/eval-positioning-map/grading.json
@@ -0,0 +1,26 @@
+{
+ "eval_id": "positioning-map",
+ "skill": "report",
+ "description": "Verify competitive positioning map is generated when 2+ competitors with 2+ attributes",
+ "input": "Generate competitive section with 3 competitors",
+ "context": {
+ "findings_present": ["competitive"],
+ "competitors": [
+ {"name": "Competitor A", "feature_score": 0.8, "price_score": 0.9},
+ {"name": "Competitor B", "feature_score": 0.6, "price_score": 0.4},
+ {"name": "Competitor C", "feature_score": 0.4, "price_score": 0.3}
+ ]
+ },
+ "expected": {
+ "positioning_map_generated": true,
+ "diagram_type": "quadrantChart",
+ "all_competitors_plotted": true,
+ "axes_labeled": true
+ },
+ "grading_criteria": [
+ "Competitive positioning quadrantChart is generated with 2+ competitors",
+ "All 3 competitors appear as data points",
+ "Axes are labeled (feature set vs price)",
+ "Diagram is omitted when fewer than 2 competitors have comparable attributes"
+ ]
+}
diff --git a/skills/report-autoresearch/iteration-0/eval-positioning-map/outputs/.gitkeep b/skills/report-autoresearch/iteration-0/eval-positioning-map/outputs/.gitkeep
new file mode 100644
index 0000000..e69de29
diff --git a/skills/report-autoresearch/iteration-0/eval-swot-mermaid/grading.json b/skills/report-autoresearch/iteration-0/eval-swot-mermaid/grading.json
new file mode 100644
index 0000000..e6dceb3
--- /dev/null
+++ b/skills/report-autoresearch/iteration-0/eval-swot-mermaid/grading.json
@@ -0,0 +1,22 @@
+{
+ "eval_id": "swot-mermaid",
+ "skill": "report",
+ "description": "Verify SWOT quadrant Mermaid diagram is generated when 2+ dimensions have findings",
+ "input": "Generate report with competitive and trends findings",
+ "context": {
+ "findings_present": ["competitive", "trends"],
+ "cross_dimension_synthesis": true
+ },
+ "expected": {
+ "swot_section_generated": true,
+ "mermaid_diagram_present": true,
+ "diagram_type": "quadrantChart",
+ "swot_quadrants": ["Strengths", "Weaknesses", "Opportunities", "Threats"]
+ },
+ "grading_criteria": [
+ "SWOT quadrantChart diagram is generated when cross-dimension synthesis is possible",
+ "All 4 SWOT quadrants are labeled correctly",
+ "Findings from multiple dimensions are synthesized into SWOT entries",
+ "Diagram follows the existing quadrantChart template pattern"
+ ]
+}
diff --git a/skills/report-autoresearch/iteration-0/eval-swot-mermaid/outputs/.gitkeep b/skills/report-autoresearch/iteration-0/eval-swot-mermaid/outputs/.gitkeep
new file mode 100644
index 0000000..e69de29
diff --git a/skills/report/SKILL.md b/skills/report/SKILL.md
index cdba3e4..796eb75 100644
--- a/skills/report/SKILL.md
+++ b/skills/report/SKILL.md
@@ -1,7 +1,7 @@
---
name: report
description: Generate a comprehensive market research report from current findings. Orchestrates report-synthesizer using full swarm pattern with TeamCreate, TaskCreate, SendMessage, and TeamDelete.
-argument-hint: "[--format <format>] [--audience <audience>] [--sections <sections>]"
+argument-hint: "[--format markdown|html|both] [--audience executives|pm|investors|dev|all] [--sections executive-summary,market-overview,market-sizing,competitive,trends,swot,recommendations,risk,appendix|all]"
allowed-tools: Agent, AskUserQuestion, Glob, Grep, Read, SendMessage, TaskCreate, TaskGet, TaskList, TaskUpdate, TeamCreate, TeamDelete, Write
---
diff --git a/skills/start-autoresearch/evals.json b/skills/start-autoresearch/evals.json
new file mode 100644
index 0000000..b4acbf5
--- /dev/null
+++ b/skills/start-autoresearch/evals.json
@@ -0,0 +1,23 @@
+{
+ "skill_name": "start-autoresearch",
+ "evals": [
+ {
+ "id": "dimension-selection",
+ "name": "eval-dimension-selection",
+ "description": "Verify Phase 1.5 presents dimension selection UI and respects user confirmation",
+ "path": "iteration-0/eval-dimension-selection"
+ },
+ {
+ "id": "methodology-loading",
+ "name": "eval-methodology-loading",
+ "description": "Verify dimension-analysts load SKILL.md methodology before conducting research",
+ "path": "iteration-0/eval-methodology-loading"
+ },
+ {
+ "id": "methodology-gate",
+ "name": "eval-methodology-gate",
+ "description": "Verify Phase 2.75 hard-fails on missing required frameworks and retries with gap targeting",
+ "path": "iteration-0/eval-methodology-gate"
+ }
+ ]
+}
diff --git a/skills/start-autoresearch/iteration-0/eval-dimension-selection/grading.json b/skills/start-autoresearch/iteration-0/eval-dimension-selection/grading.json
new file mode 100644
index 0000000..df3adc9
--- /dev/null
+++ b/skills/start-autoresearch/iteration-0/eval-dimension-selection/grading.json
@@ -0,0 +1,26 @@
+{
+ "eval_id": "dimension-selection",
+ "skill": "start",
+ "description": "Verify Phase 1.5 presents dimension selection UI and respects user confirmation",
+ "input": "Research the market for enterprise Kubernetes security tooling",
+ "context": {
+ "max_dimensions": 5,
+ "elicitation_complete": true,
+ "topic": "enterprise kubernetes security tooling",
+ "topic_slug": "enterprise-kubernetes-security-tooling"
+ },
+ "expected": {
+ "dimension_selection_presented": true,
+ "dimensions_shown": ["competitive", "sizing", "trends", "tech", "regulatory", "customer", "financial", "trend_modeling"],
+ "rationale_per_dimension": true,
+ "selected_count_lte_max": true,
+ "user_confirm_requested": true
+ },
+ "grading_criteria": [
+ "Phase 1.5 dimension selection UI is presented after elicitation",
+ "All 8 standard dimensions are shown with include/exclude rationale",
+ "Selected count does not exceed max_dimensions (5)",
+ "AskUserQuestion is used to capture user confirmation",
+ "State.json elicitation.dimensions is updated with confirmed selection"
+ ]
+}
diff --git a/skills/start-autoresearch/iteration-0/eval-dimension-selection/outputs/.gitkeep b/skills/start-autoresearch/iteration-0/eval-dimension-selection/outputs/.gitkeep
new file mode 100644
index 0000000..e69de29
diff --git a/skills/start-autoresearch/iteration-0/eval-methodology-gate/grading.json b/skills/start-autoresearch/iteration-0/eval-methodology-gate/grading.json
new file mode 100644
index 0000000..de69463
--- /dev/null
+++ b/skills/start-autoresearch/iteration-0/eval-methodology-gate/grading.json
@@ -0,0 +1,23 @@
+{
+ "eval_id": "methodology-gate",
+ "skill": "start",
+ "description": "Verify Phase 2.75 hard-fails on missing required frameworks and retries with gap targeting",
+ "input": "Research market sizing for edge computing platforms",
+ "context": {
+ "dimension": "sizing",
+ "methodology_gaps": ["TAM/SAM/SOM Hierarchy", "Scenario Modeling"],
+ "retry_count": 0
+ },
+ "expected": {
+ "gate_overridden_to_fail": true,
+ "retry_spawned": true,
+ "retry_prompt_contains_gap_list": true,
+ "max_retries_respected": 2
+ },
+ "grading_criteria": [
+ "Codex review gate overrides to fail when required frameworks are missing",
+ "Retry analyst is spawned with specific gap list in prompt",
+ "Retry analyst focuses research on missing TAM/SAM/SOM and Scenario Modeling frameworks",
+ "After 2 failed retries, findings proceed with methodology_gaps_unresolved field set"
+ ]
+}
diff --git a/skills/start-autoresearch/iteration-0/eval-methodology-gate/outputs/.gitkeep b/skills/start-autoresearch/iteration-0/eval-methodology-gate/outputs/.gitkeep
new file mode 100644
index 0000000..e69de29
diff --git a/skills/start-autoresearch/iteration-0/eval-methodology-loading/grading.json b/skills/start-autoresearch/iteration-0/eval-methodology-loading/grading.json
new file mode 100644
index 0000000..24c0b37
--- /dev/null
+++ b/skills/start-autoresearch/iteration-0/eval-methodology-loading/grading.json
@@ -0,0 +1,22 @@
+{
+ "eval_id": "methodology-loading",
+ "skill": "start",
+ "description": "Verify dimension-analysts load SKILL.md methodology before conducting research",
+ "input": "Research competitive landscape of AI observability platforms",
+ "context": {
+ "dimension": "competitive",
+ "skill_dir": "competitive-analysis"
+ },
+ "expected": {
+ "skill_md_loaded": true,
+ "methodology_plan_written": true,
+ "required_frameworks_extracted": ["Porter's 5 Forces", "Competitor Matrix", "Positioning Map", "Trend Indicators"],
+ "research_starts_after_plan": true
+ },
+ "grading_criteria": [
+ "Analyst reads skills/competitive-analysis/SKILL.md before starting research",
+ "methodology_plan_competitive.json is written before any WebSearch calls",
+ "methodology_plan includes all 4 required frameworks from the skill",
+ "Analyst proceeds to research only after Step 4 STOP CHECK passes"
+ ]
+}
diff --git a/skills/start-autoresearch/iteration-0/eval-methodology-loading/outputs/.gitkeep b/skills/start-autoresearch/iteration-0/eval-methodology-loading/outputs/.gitkeep
new file mode 100644
index 0000000..e69de29
diff --git a/skills/start/SKILL.md b/skills/start/SKILL.md
index 52c8a02..4e0a498 100644
--- a/skills/start/SKILL.md
+++ b/skills/start/SKILL.md
@@ -1,7 +1,7 @@
---
name: start
description: Begin a new market research session. Thin launcher that delegates to the research-orchestrator agent for all phase management.
-argument-hint: "[--quick] [<topic>]"
+argument-hint: "[--quick] [--dimensions (competitive,sizing,trends,customer,tech,financial,regulatory,trend_modeling)] [<topic>]"
allowed-tools:
- Agent
- AskUserQuestion
@@ -37,6 +37,7 @@ This skill initializes a research session and delegates to the `research-orchest
Parse `$ARGUMENTS` before any other processing. **Input sanitization**: truncate `$ARGUMENTS` to 200 characters total, strip backticks and angle brackets.
- `--quick` — Abbreviated elicitation (3 questions instead of 8)
+- `--dimensions <list>` — Optional: pre-select specific dimensions (comma-separated). Valid values: `competitive`, `sizing`, `trends`, `customer`, `tech`, `financial`, `regulatory`, `trend_modeling`. Passed to the orchestrator as `REQUESTED_DIMENSIONS` — Phase 1.5 skips interactive selection when this is set.
- Remaining text after flag extraction is the initial topic hint (may be empty)
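Validating the comma-separated flag value can be sketched in portable shell; the input string here is illustrative:

```shell
# Check each --dimensions entry against the 8 standard dimension names
VALID="competitive sizing trends customer tech financial regulatory trend_modeling"
INPUT="competitive,sizing,bogus"
for d in $(printf '%s' "$INPUT" | tr ',' ' '); do
  case " $VALID " in
    *" $d "*) echo "ok: $d" ;;
    *)        echo "invalid: $d" ;;   # reject before passing to the orchestrator
  esac
done
```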
---
@@ -115,6 +116,7 @@ Agent(
MAX_DIMENSIONS: {max_dimensions}
CONTEXT_FILE_CONTENT: {context_content if non-null, else ""}
QUICK_MODE: {true if --quick flag}
+ REQUESTED_DIMENSIONS: {comma-separated dimension list from --dimensions flag, or "interactive" if omitted}
{If resuming: PRIOR_ELICITATION: {prior elicitation JSON}}
Execute the full research orchestration:
diff --git a/skills/trend-analysis-autoresearch/evals.json b/skills/trend-analysis-autoresearch/evals.json
new file mode 100644
index 0000000..a6d9655
--- /dev/null
+++ b/skills/trend-analysis-autoresearch/evals.json
@@ -0,0 +1,17 @@
+{
+ "skill_name": "trend-analysis-autoresearch",
+ "evals": [
+ {
+ "id": "mermaid-scenario",
+ "name": "eval-mermaid-scenario",
+ "description": "Verify report generates trend scenario state diagram when INC/DEC/CONST signals present",
+ "path": "iteration-0/eval-mermaid-scenario"
+ },
+ {
+ "id": "trend-tables",
+ "name": "eval-trend-tables",
+ "description": "Verify trends section generates macro and micro trend tables from findings",
+ "path": "iteration-0/eval-trend-tables"
+ }
+ ]
+}
diff --git a/skills/trend-analysis-autoresearch/iteration-0/eval-mermaid-scenario/grading.json b/skills/trend-analysis-autoresearch/iteration-0/eval-mermaid-scenario/grading.json
new file mode 100644
index 0000000..67bf3af
--- /dev/null
+++ b/skills/trend-analysis-autoresearch/iteration-0/eval-mermaid-scenario/grading.json
@@ -0,0 +1,26 @@
+{
+ "eval_id": "mermaid-scenario",
+ "skill": "trend-analysis",
+ "description": "Verify report generates trend scenario state diagram when INC/DEC/CONST signals present",
+ "input": "Generate report with trend findings containing INC and DEC signals",
+ "context": {
+ "findings_present": ["trends"],
+ "trend_signals": [
+ {"signal": "INC", "driver": "AI adoption"},
+ {"signal": "DEC", "driver": "Legacy spend"},
+ {"signal": "CONST", "driver": "Regulatory pace"}
+ ]
+ },
+ "expected": {
+ "mermaid_diagram_generated": true,
+ "diagram_type": "stateDiagram-v2",
+ "states_present": ["Current", "GrowthScenario", "ConsolidationScenario"],
+ "transitions_labeled_with_signals": true
+ },
+ "grading_criteria": [
+ "Trend scenario graph is generated when INC/DEC/CONST signals exist in findings",
+ "Mermaid stateDiagram-v2 format is used",
+ "Each transition is labeled with the signal direction and driver",
+ "Terminal scenario states are present"
+ ]
+}
diff --git a/skills/trend-analysis-autoresearch/iteration-0/eval-mermaid-scenario/outputs/.gitkeep b/skills/trend-analysis-autoresearch/iteration-0/eval-mermaid-scenario/outputs/.gitkeep
new file mode 100644
index 0000000..e69de29
diff --git a/skills/trend-analysis-autoresearch/iteration-0/eval-trend-tables/grading.json b/skills/trend-analysis-autoresearch/iteration-0/eval-trend-tables/grading.json
new file mode 100644
index 0000000..cf91f4c
--- /dev/null
+++ b/skills/trend-analysis-autoresearch/iteration-0/eval-trend-tables/grading.json
@@ -0,0 +1,22 @@
+{
+ "eval_id": "trend-tables",
+ "skill": "trend-analysis",
+ "description": "Verify trends section generates macro and micro trend tables from findings",
+ "input": "Generate trends section from trend findings",
+ "context": {
+ "findings_present": ["trends"],
+ "macro_trends": ["AI regulation", "Cloud cost pressure"],
+ "micro_trends": ["Edge inference", "Model compression"]
+ },
+ "expected": {
+ "macro_trends_table": true,
+ "micro_trends_table": true,
+ "trend_indicators_present": true
+ },
+ "grading_criteria": [
+ "Macro Trends table is generated with findings from trends dimension",
+ "Micro Trends table is generated separately from macro",
+ "INC/DEC/CONST indicators are present on each trend",
+ "Not-assessed placeholder is NOT generated when trend findings exist"
+ ]
+}
diff --git a/skills/trend-analysis-autoresearch/iteration-0/eval-trend-tables/outputs/.gitkeep b/skills/trend-analysis-autoresearch/iteration-0/eval-trend-tables/outputs/.gitkeep
new file mode 100644
index 0000000..e69de29
diff --git a/skills/update/SKILL.md b/skills/update/SKILL.md
index 29bf41b..076f543 100644
--- a/skills/update/SKILL.md
+++ b/skills/update/SKILL.md
@@ -1,7 +1,7 @@
---
name: update
description: Refresh existing research with latest data using swarm orchestration and delta detection. Delegates to the research-orchestrator agent in update mode.
-argument-hint: "[--topic <topic>] [--area <area>] [--since <date>] [--no-delta] [--dimensions <list>]"
+argument-hint: "[--topic <topic>] [--area <area>] [--since <date>] [--no-delta] [--dimensions (competitive,sizing,trends,customer,tech,financial,regulatory,trend_modeling)]"
allowed-tools:
- Agent
- AskUserQuestion