
feat(ui): first-run setup wizard + live-edit config drawer#150

Merged
mabry1985 merged 6 commits into main from feat/config-drawer-live-reload
Apr 23, 2026

Conversation


mabry1985 (Contributor) commented Apr 23, 2026

Summary

Two related surfaces that together turn the template into a download-run-wizard-chat experience instead of a fork-sed-build-deploy one.

  1. First-run setup wizard (chat_ui.py, server.py, graph/config_io.py, autostart.py). A fresh clone boots without env vars and lands on a 4-step wizard (Connect / Identity / Tools / Profile+Security+Autostart). Launch writes config/langgraph-config.yaml, config/SOUL.md, and a .setup-complete marker, then hot-swaps the wizard pane for the chat pane. No restart.
  2. Live-edit config drawer (right-side Gradio sidebar). All wizard fields plus the advanced ones (temperature, max_tokens, max_iterations, knowledge store, subagent max_turns) edit → Save & Reload → graph rebuilt in place. Includes a "Re-run setup wizard" action that flips the marker back and re-opens the wizard with every current value pre-filled.

Ships ruamel.yaml for comment-preserving YAML round-trip; preserves the fork-shipped memory/skills sections the dataclass doesn't model.

What's in the diff

  • Wizard — chat_ui.py adds a visibility-toggled wizard column in front of a wrapped chat column. app.load re-syncs visibility on every page load so external POST /api/config/setup calls and browser refreshes don't desync.
  • Drawer — typed form sections for Model (with /v1/models dropdown populated from the live gateway), Worker Subagent, Middleware, Knowledge Store, Persona. Save & Reload does the full write-yaml + write-soul + graph-rebuild cycle.
  • Lazy graph init — server.py::_init_langgraph_agent skips create_agent_graph when the setup marker is absent; chat endpoints return a friendly "setup required" message. finish_setup flips the marker BEFORE the reload so the graph actually compiles on wizard completion (caught and fixed during end-to-end test).
  • Identity/auth/runtime sections in LangGraphConfig + YAML so the wizard's agent name, operator, A2A token, and autostart flag round-trip cleanly. agent_name() resolver prefers YAML identity.name → env → "protoagent", so the agent card + OpenAI-compat model id reflect the wizard value without a restart.
  • SOUL presets — config/soul-presets/{generic-assistant,research,coding,blank}.md. The wizard dropdown lists them; the "Load preset" button pastes the selection into the textarea.
  • Autostart — autostart.py with macOS LaunchAgent install/uninstall (Linux/Windows stubs return a clear "not yet supported"). Captures sys.executable so venv-based runs survive a reboot. Opt-in via wizard checkbox; toggleable from the drawer.
  • Dockerfile — VOLUME /opt/protoagent/config so setup persists across docker run without a -v flag; ruamel.yaml added to the pip install list.
  • New HTTP endpoints — GET /api/config, POST /api/config, GET /api/config/models, GET /api/config/setup-status, POST /api/config/setup, POST /api/config/reset-setup, GET /api/config/presets/{name}.
  • Docs — docs/tutorials/first-agent.md rewritten for clone→pip→run→wizard. Old fork/sed/docker content moved to the new docs/guides/customize-and-deploy.md. README + TEMPLATE + docs index + VitePress sidebar updated.

Test plan

  • pytest tests/test_config_io.py — 29/29 passing (7 new tests: setup marker lifecycle, preset discovery, preset content, shipped-starter presence, directory fallback)
  • Smoke-test: boot server with no marker → wizard mode, no graph compiled, /api/chat returns setup-required message
  • End-to-end: POST /api/config/setup writes YAML + SOUL + marker, reloads graph in correct order, agent card reflects wizard name (product-director tested), OpenAI-compat /v1/models returns the new id, chat unblocks
  • Reset flow: POST /api/config/reset-setup removes marker; GET /api/config/setup-status flips to false
  • Round-trip integrity: git diff config/langgraph-config.yaml is empty after two YAML writes that change → restore values (ruamel preserves every comment and blank line)
  • Clone test: cloned this branch to ~/dev/product-director, python -m venv .venv && pip install -r requirements.txt && python server.py, walked the full wizard via curl as if it were a new user. Fresh clone → wizard → product-director named agent → chat unblocked. All docs paths resolve.
  • Reviewer: click through the wizard in a browser (not just via API) against a reachable gateway and confirm token-usage / cost-v1 DataPart emits correctly on the post-setup chat turn.

Known tradeoffs

  • Gradio initial boot ~14s on a fresh venv (component tree construction). Not a blocker but worth a follow-up pass once this ships.
  • Prometheus metric prefix + <AGENT>_API_KEY env name still require a process restart to pick up a wizard rename — documented. Agent card + chat UI + OpenAI-compat id update live.

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • First-run setup wizard, live-edit configuration drawer, SOUL persona presets, gateway model discovery, autostart support, dynamic agent identity, and runtime endpoints to view/edit config, SOUL, list models, and manage setup
  • Documentation

    • Guides and tutorial rewritten for zero‑setup wizard and a “Customize & deploy” workflow
  • Tests

    • New tests covering config persistence, SOUL I/O, model listing, presets, and setup markers
  • Chores

    • Round‑trip YAML support added (preserve comments), YAML dependency, Docker config volume, and expanded .gitignore rules

Elevates langgraph-config.yaml + SOUL.md into a typed form inside the
Gradio sidebar so forks can iterate on model / temperature / tools /
middleware / persona without a code edit + restart. Save rebuilds the
compiled graph in place; in-flight turns finish on the prior graph.

The model dropdown is populated from the connected gateway's
`/v1/models` endpoint so forks always see what's actually available
through the configured api_base + api_key, no hardcoded list to drift.

graph/config_io.py is the new I/O layer: YAML round-trip preserves
comments and unknown top-level sections (the shipped YAML's
memory/skills blocks that the dataclass doesn't model), dual-location
SOUL.md handling writes to both /sandbox/SOUL.md (runtime) and
config/SOUL.md (source), and gateway model discovery returns a
readable error string instead of raising when the endpoint is down.

Also exposes GET/POST /api/config + GET /api/config/models for
external control, and adds a config/SOUL.md fallback to
graph/prompts.py so local dev without a /sandbox mount still picks
up drawer edits.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
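The dual-location SOUL.md handling described above can be sketched like this (paths and signatures are illustrative; the real implementation is graph/config_io.py):

```python
from pathlib import Path

def write_soul(text: str, workspace: Path, source: Path) -> list[Path]:
    """Sketch of the dual-location write: always update the repo source
    copy, and mirror into the runtime workspace when it is mounted."""
    source.parent.mkdir(parents=True, exist_ok=True)
    source.write_text(text)
    written = [source]
    if workspace.is_dir():  # /sandbox may not exist in local dev
        runtime = workspace / "SOUL.md"
        runtime.write_text(text)
        written.append(runtime)
    return written

def read_soul(workspace: Path, source: Path) -> str:
    """Workspace-first read, falling back to the repo source copy."""
    for candidate in (workspace / "SOUL.md", source):
        if candidate.is_file():
            return candidate.read_text()
    return ""
```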

coderabbitai Bot commented Apr 23, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

Adds runtime-editable YAML config and persona (SOUL) management with round‑trip persistence, gateway model discovery, setup‑wizard state and UI integration; server gains config endpoints, hot-reload and guarded chat behavior; introduces autostart support, SOUL presets, and related docs/tests.

Changes

Cohort / File(s) Summary
Configuration I/O & Tests
graph/config_io.py, tests/test_config_io.py
New module for YAML round-trip load/save (ruamel.yaml with fallback), LangGraphConfig↔dict conversion, nested-update merging, config validation, SOUL read/write (workspace-first, dual-write when applicable), gateway model listing, tool introspection, setup-marker helpers, SOUL preset discovery, and comprehensive tests.
Graph Config Model
graph/config.py
Expanded LangGraphConfig with identity, auth_token, and autostart_on_boot fields; YAML loader now parses identity, auth, and runtime sections.
Server & API
server.py
Server becomes setup-aware (defers graph compile until setup complete), adds endpoints for config/SOUL read/write, model probing, setup finish/reset/presets; adds save+reload flow, _reload_langgraph_agent, _apply_settings_changes, dynamic agent_name(), and guards chat endpoints when graph is uninitialized.
Chat UI Integration
chat_ui.py
Introduces a first-run setup wizard, visibility-gated settings/configuration drawer (live-edit inputs, Fetch/Save & Reload/Reload from Disk), SOUL preset loading, gated legacy panels, and callbacks to finish/restart/reset setup.
A2A Auth Runtime
a2a_handler.py
Registration accepts auth_token, adds set_a2a_token() to support runtime token updates; bearer validation reads mutable token at call time.
Autostart & Container
autostart.py, Dockerfile
New macOS autostart module (install/uninstall/status); Dockerfile adds ruamel.yaml dependency, adjusts config ownership in container, and declares /opt/protoagent/config as a VOLUME.
Dependencies & Ignore
requirements.txt, .gitignore
Add ruamel.yaml>=0.18; extend .gitignore to include .venv-*/, logs/, and .proto/.
SOUL Presets & Prompt Behavior
config/soul-presets/*.md, graph/prompts.py
Add SOUL presets (blank.md, coding.md, generic-assistant.md, research.md); prompt builder now prefers workspace /sandbox/SOUL.md with repo config/SOUL.md fallback.
Docs & Tutorials
docs/... (docs/guides/customize-and-deploy.md, docs/tutorials/first-agent.md, docs/index.md, docs/guides/index.md, docs/.vitepress/config.mts)
Docs updated for zero‑setup wizard, local Python workflow, deployment/customization guidance, and sidebar nav changes.
Minor UI & LLM Edits
chat_ui.py, graph/llm.py
UI rearrangements and new callbacks; create_llm now sets a custom User-Agent via default_headers.
Repo Ignore & Minor Files
.gitignore
Expanded ignore rules for virtualenv variants and local artifacts.
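The expanded LangGraphConfig shape described in the table can be sketched like this (field and section names follow the PR text; the real dataclass in graph/config.py carries the full model/middleware/knowledge surface as well):

```python
from dataclasses import dataclass, field

# Illustrative shape of the new identity/auth/runtime sections only.
@dataclass
class Identity:
    name: str = "protoagent"
    operator: str = ""

@dataclass
class LangGraphConfig:
    identity: Identity = field(default_factory=Identity)
    auth_token: str = ""             # A2A bearer token captured by the wizard
    autostart_on_boot: bool = False  # runtime section: LaunchAgent opt-in
```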

Sequence Diagram(s)

sequenceDiagram
    participant User as User
    participant UI as chat_ui (Wizard/Config Drawer)
    participant Server as server.py (FastAPI)
    participant ConfigIO as graph/config_io.py
    participant Disk as YAML/Disk
    participant Graph as LangGraph

    User->>UI: Open Settings / Config Drawer
    UI->>Server: GET /api/config
    Server->>ConfigIO: config_to_dict()
    ConfigIO->>Disk: load_yaml_doc()
    Disk-->>ConfigIO: YAML doc
    ConfigIO-->>Server: config dict
    Server-->>UI: Editable config payload

    User->>UI: Edit fields & click "Save & Reload"
    UI->>Server: POST /api/config (updates, optional SOUL)
    Server->>ConfigIO: validate_config_dict(updates)
    ConfigIO-->>Server: validation result
    alt valid
        Server->>ConfigIO: apply_updates_to_yaml(doc, updates)
        ConfigIO->>Disk: save_yaml_doc()
        Disk-->>ConfigIO: saved
        Server->>Server: _apply_settings_changes -> _reload_langgraph_agent()
        Server->>Graph: rebuild _graph if setup complete
        Graph-->>Server: updated
        Server-->>UI: success
    else invalid
        Server-->>UI: error message
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage — ⚠️ Warning: docstring coverage is 56.31%, below the required 80.00% threshold. Resolution: write docstrings for the functions missing them.
✅ Passed checks (4 passed)
  • Title check — ✅ Passed: the title accurately summarizes the main changes, a setup wizard UI and a live-edit config drawer, which are the two primary features introduced.
  • Description check — ✅ Passed: the PR description is comprehensive and well-structured with Summary and Test plan sections, though it exceeds typical conciseness. It clearly explains the two main features, the test coverage, and known tradeoffs. All required template sections are meaningfully filled.
  • Linked Issues check — ✅ Passed: check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes check — ✅ Passed: check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 8

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@chat_ui.py`:
- Around line 498-500: The helper load_provider_choices calls
settings["get_current_provider"]() without verifying the callback exists; update
load_provider_choices to first check that both settings["get_provider_choices"]
and settings["get_current_provider"] are present and callable before invoking
them (or alternatively gate the entire provider dropdown rendering on the
presence of both callbacks); specifically, guard calls to
settings["get_provider_choices"]() and settings["get_current_provider"]() in
load_provider_choices and handle the missing-callback case (e.g., return an
empty list or skip rendering) to avoid KeyError.
- Around line 406-414: The payload currently forces zero defaults that fail
validation: replace the int(... or 0) pattern for "max_tokens", "max_iterations"
(variables max_tokens, max_iter) and "max_turns" (variable worker_max_turns
inside the "worker" subagent) with either a sensible minimum (e.g. use max(1,
int(...)) or similar) or pass None/omit the field when the input is empty;
update the code that builds the payload in chat_ui.py where these keys are set
so the API receives values >=1 or no key instead of 0 to satisfy validation.

In `@graph/config_io.py`:
- Around line 99-114: The _FIELD_MAP dictionary in graph/config_io.py is defined
but unused; either remove the _FIELD_MAP variable declaration entirely or, if
you intend to keep it for future use, add a clear TODO comment above the
definition like "TODO: keep for future config mapping; remove if unused" and
mark it with a FIXME/TODO tag; update or run a quick search to confirm no other
code references _FIELD_MAP (look for the symbol "_FIELD_MAP") before committing
the removal or commenting.
- Line 186: The error message returned where the temperature is validated uses
an en-dash character in the string f"temperature must be 0.0–2.0, got {temp}";
update that literal to use an ASCII hyphen instead (f"temperature must be
0.0-2.0, got {temp}") so the returned message uses standard ASCII characters;
locate the temperature validation return that references the variable temp and
make this one-character change.

In `@graph/prompts.py`:
- Around line 52-58: The fallback file-reading logic that sets soul in
prompts.py duplicates the precedence already implemented by
config_io.read_soul(); replace the manual _read_file/Path(...) logic in
prompts.py with a call to config_io.read_soul() (e.g., soul =
config_io.read_soul(workspace) or the appropriate signature) and remove the
duplicated reads; to avoid a circular import between prompts.py and
config_io.py, import config_io inside the function or block where soul is
resolved (rather than at module top level) so you get the single source of truth
without introducing import cycles.

In `@server.py`:
- Around line 487-515: The config endpoints are unauthenticated; protect them by
reusing the same API-key check as the A2A routes: require the
{AGENT_NAME}_API_KEY (or configured _graph_config.api_key) and validate incoming
requests in _api_get_config, _api_post_config and _api_list_models before
performing actions (or restrict access to localhost/internal network). Implement
this by adding an authentication dependency or inline header check that verifies
the provided api_key or Authorization header against the expected key and
returns 401 on mismatch; apply it to the ConfigReloadRequest-handling POST and
both GET handlers so calls to _apply_settings_changes, config_to_dict/read_soul,
and list_gateway_models only run after successful auth.

In `@tests/test_config_io.py`:
- Around line 276-288: The test reveals that config_io.list_gateway_models is
catching httpx.HTTPError but should catch httpx.RequestError (ConnectError is a
subclass of RequestError), so update the except clause in the
list_gateway_models function to catch httpx.RequestError (or the more general
httpx.RequestError | httpx.HTTPStatusError if status errors are also intended)
and ensure the error variable is passed into the returned/logged message; adjust
the except block in list_gateway_models to reference httpx.RequestError instead
of httpx.HTTPError and include the caught exception in the err string.
- Around line 140-148: Change the first argument to pytest.mark.parametrize from
a comma-separated string to a tuple of argument names to follow pytest best
practices: update the decorator usage in the test (the pytest.mark.parametrize
call that currently has "bad_value,expected_error_fragment") to use a tuple like
("bad_value", "expected_error_fragment") so test discovery/parsing is robust.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: b04b45a0-e62f-420b-a548-2013b19047f8

📥 Commits

Reviewing files that changed from the base of the PR and between f1dcd3f and f49c012.

📒 Files selected for processing (7)
  • .gitignore
  • chat_ui.py
  • graph/config_io.py
  • graph/prompts.py
  • requirements.txt
  • server.py
  • tests/test_config_io.py

Comment thread chat_ui.py Outdated
Comment thread chat_ui.py Outdated
Comment thread graph/config_io.py Outdated
Comment thread graph/config_io.py Outdated
Comment thread graph/prompts.py
Comment on lines +52 to +58
# 1. Identity — prefer the runtime workspace (entrypoint.sh copies
# config/SOUL.md to /sandbox/SOUL.md at container start). Fall back
# to the repo source so local `python server.py` runs without a
# /sandbox mount still pick up persona edits made via the drawer.
soul = _read_file(f"{workspace}/SOUL.md")
if not soul:
soul = _read_file(Path(__file__).parent.parent / "config" / "SOUL.md")

🧹 Nitpick | 🔵 Trivial

Consider reusing config_io.read_soul() to reduce duplication.

This fallback logic duplicates the precedence implemented in graph/config_io.read_soul(). While the current implementation is correct, consolidating to a single source of truth would prevent drift.

♻️ Proposed refactor
+from graph.config_io import read_soul
+
 def build_system_prompt(
     workspace: str = "/sandbox",
     include_subagents: bool = True,
     context: str = "",
 ) -> str:
     ...
     parts = []
 
-    # 1. Identity — prefer the runtime workspace (entrypoint.sh copies
-    # config/SOUL.md to /sandbox/SOUL.md at container start). Fall back
-    # to the repo source so local `python server.py` runs without a
-    # /sandbox mount still pick up persona edits made via the drawer.
-    soul = _read_file(f"{workspace}/SOUL.md")
-    if not soul:
-        soul = _read_file(Path(__file__).parent.parent / "config" / "SOUL.md")
+    # 1. Identity — prefer runtime, fall back to repo source
+    soul = read_soul()
     if soul:

Note: This would require ensuring no circular import between prompts.py and config_io.py.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@graph/prompts.py` around lines 52 - 58, The fallback file-reading logic that
sets soul in prompts.py duplicates the precedence already implemented by
config_io.read_soul(); replace the manual _read_file/Path(...) logic in
prompts.py with a call to config_io.read_soul() (e.g., soul =
config_io.read_soul(workspace) or the appropriate signature) and remove the
duplicated reads; to avoid a circular import between prompts.py and
config_io.py, import config_io inside the function or block where soul is
resolved (rather than at module top level) so you get the single source of truth
without introducing import cycles.

Comment thread server.py
Comment on lines +487 to +515
# --- Live config / SOUL editing ----------------------------------------
# GET returns the current config + persona so external clients (the
# Gradio drawer is one; curl is another) can mirror what's running.
# POST accepts partial edits — pass only the sections you want to
# change. Reload is automatic.
class ConfigReloadRequest(PydanticBaseModel):
config: dict | None = None
soul: str | None = None

@fastapi_app.get("/api/config")
async def _api_get_config():
from graph.config_io import config_to_dict, read_soul
return {
"config": config_to_dict(_graph_config),
"soul": read_soul(),
}

@fastapi_app.post("/api/config")
async def _api_post_config(req: ConfigReloadRequest):
ok, messages = _apply_settings_changes(config=req.config, soul=req.soul)
return {"ok": ok, "messages": messages}

@fastapi_app.get("/api/config/models")
async def _api_list_models(api_base: str = "", api_key: str = ""):
from graph.config_io import list_gateway_models
base = api_base or (_graph_config.api_base if _graph_config else "")
key = api_key or (_graph_config.api_key if _graph_config else "")
models, error = list_gateway_models(base, key)
return {"models": models, "error": error}

🧹 Nitpick | 🔵 Trivial

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check if there's any auth middleware applied to these routes
rg -n -A5 'api/config' server.py | head -40
rg -n 'middleware|auth|api_key' server.py | grep -v '#'

Repository: protoLabsAI/protoAgent

Length of output: 1828


🏁 Script executed:

#!/bin/bash
# Check for FastAPI middleware definitions and global auth setup
rg -n 'add_middleware|@app\.|@fastapi_app\.' server.py | head -50

# Look for A2A endpoints to compare auth patterns
rg -n -i 'a2a|agent.*agent' server.py | head -30

# Check initialization of fastapi_app and any global auth setup
rg -n 'fastapi_app.*=' server.py | head -20

Repository: protoLabsAI/protoAgent

Length of output: 3330


🏁 Script executed:

#!/bin/bash
# Look for any auth decorators or checks in the file
rg -n 'Depends|require|protect|permission' server.py | head -20

# Check if config endpoints are marked as internal or documented
rg -n -B2 'api/config' server.py | grep -E '(#|""")'

Repository: protoLabsAI/protoAgent

Length of output: 146


🏁 Script executed:

#!/bin/bash
# Find a2a_handler location
fd -type f -name '*a2a*' | head -20

# Check if config endpoints have any auth in their implementation
rg -n -B5 'async def _api_get_config|async def _api_post_config' server.py

Repository: protoLabsAI/protoAgent

Length of output: 683


🏁 Script executed:

#!/bin/bash
# Look for global middleware setup on fastapi_app
rg -n 'fastapi_app\.add_middleware|CORSMiddleware|AuthenticationMiddleware' server.py

# Check the beginning of the server initialization to see if middleware is setup elsewhere
head -500 server.py | tail -100

Repository: protoLabsAI/protoAgent

Length of output: 3861


🏁 Script executed:

#!/bin/bash
# Find where a2a_handler.register_a2a_routes is called and how auth is applied
rg -n 'register_a2a_routes|a2a_handler' server.py | head -20

# Look for imports and setup of a2a_handler
rg -n 'from.*a2a|import.*a2a' server.py

Repository: protoLabsAI/protoAgent

Length of output: 415


🏁 Script executed:

#!/bin/bash
# Check if there's a main application setup that applies middleware globally
# Look further in _main function
sed -n '460,650p' server.py

Repository: protoLabsAI/protoAgent

Length of output: 7722


Consider access control for config endpoints.

The /api/config endpoints (POST /api/config for modification, GET /api/config for retrieval, and GET /api/config/models) operate without authentication. While this appears intentional to support the internal Gradio UI, if these endpoints are exposed to untrusted networks, add protection—either restrict network access or gate them behind the same authentication as the A2A routes (which use api_key from the {AGENT_NAME}_API_KEY env var).

🧰 Tools
🪛 Ruff (0.15.10)

[warning] 497-497: Missing return type annotation for private function _api_get_config

(ANN202)


[warning] 505-505: Missing return type annotation for private function _api_post_config

(ANN202)


[warning] 510-510: Missing return type annotation for private function _api_list_models

(ANN202)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server.py` around lines 487 - 515, The config endpoints are unauthenticated;
protect them by reusing the same API-key check as the A2A routes: require the
{AGENT_NAME}_API_KEY (or configured _graph_config.api_key) and validate incoming
requests in _api_get_config, _api_post_config and _api_list_models before
performing actions (or restrict access to localhost/internal network). Implement
this by adding an authentication dependency or inline header check that verifies
the provided api_key or Authorization header against the expected key and
returns 401 on mismatch; apply it to the ConfigReloadRequest-handling POST and
both GET handlers so calls to _apply_settings_changes, config_to_dict/read_soul,
and list_gateway_models only run after successful auth.

Comment thread tests/test_config_io.py
Comment on lines +140 to +148
@pytest.mark.parametrize("bad_value,expected_error_fragment", [
({"model": {"temperature": 3.0}}, "temperature"),
({"model": {"temperature": -0.1}}, "temperature"),
({"model": {"max_tokens": 0}}, "max_tokens"),
({"model": {"max_iterations": 0}}, "max_iterations"),
({"subagents": {"worker": {"max_turns": 0}}}, "max_turns"),
({"subagents": {"worker": {"tools": "not-a-list"}}}, "list"),
({"knowledge": {"top_k": 0}}, "top_k"),
])

🧹 Nitpick | 🔵 Trivial

Minor: Use tuple for parametrize argument names.

Per pytest best practices, use a tuple for the first argument of @pytest.mark.parametrize rather than a comma-separated string.

🔧 Proposed fix
-@pytest.mark.parametrize("bad_value,expected_error_fragment", [
+@pytest.mark.parametrize(("bad_value", "expected_error_fragment"), [
     ({"model": {"temperature": 3.0}}, "temperature"),
     ...
 ])
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
@pytest.mark.parametrize("bad_value,expected_error_fragment", [
({"model": {"temperature": 3.0}}, "temperature"),
({"model": {"temperature": -0.1}}, "temperature"),
({"model": {"max_tokens": 0}}, "max_tokens"),
({"model": {"max_iterations": 0}}, "max_iterations"),
({"subagents": {"worker": {"max_turns": 0}}}, "max_turns"),
({"subagents": {"worker": {"tools": "not-a-list"}}}, "list"),
({"knowledge": {"top_k": 0}}, "top_k"),
])
@pytest.mark.parametrize(("bad_value", "expected_error_fragment"), [
({"model": {"temperature": 3.0}}, "temperature"),
({"model": {"temperature": -0.1}}, "temperature"),
({"model": {"max_tokens": 0}}, "max_tokens"),
({"model": {"max_iterations": 0}}, "max_iterations"),
({"subagents": {"worker": {"max_turns": 0}}}, "max_turns"),
({"subagents": {"worker": {"tools": "not-a-list"}}}, "list"),
({"knowledge": {"top_k": 0}}, "top_k"),
])
🧰 Tools
🪛 Ruff (0.15.10)

[warning] 140-140: Wrong type passed to first argument of pytest.mark.parametrize; expected tuple

Use a tuple for the first argument

(PT006)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_config_io.py` around lines 140 - 148, Change the first argument to
pytest.mark.parametrize from a comma-separated string to a tuple of argument
names to follow pytest best practices: update the decorator usage in the test
(the pytest.mark.parametrize call that currently has
"bad_value,expected_error_fragment") to use a tuple like ("bad_value",
"expected_error_fragment") so test discovery/parsing is robust.

Comment thread tests/test_config_io.py
Comment on lines +276 to +288
def test_list_gateway_models_http_error(monkeypatch):
from graph import config_io

fake_client = MagicMock()
fake_client.__enter__ = lambda self: fake_client
fake_client.__exit__ = lambda *args: None
fake_client.get.side_effect = httpx.ConnectError("no route to host")

monkeypatch.setattr("httpx.Client", lambda **kw: fake_client)

models, err = config_io.list_gateway_models("http://bad-host/v1")
assert models == []
assert "connection failed" in err

⚠️ Potential issue | 🟠 Major

Test exposes exception handling bug in production code.

This test correctly simulates a connection error with httpx.ConnectError, but the production code at config_io.py:286 catches httpx.HTTPError instead of httpx.RequestError. This test will fail because ConnectError is not a subclass of HTTPError.

See the related issue flagged in graph/config_io.py lines 283-287.

🧰 Tools
🪛 Ruff (0.15.10)

[warning] 280-280: Unused lambda argument: self

(ARG005)


[warning] 281-281: Unused lambda argument: args

(ARG005)


[warning] 284-284: Unused lambda argument: kw

(ARG005)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_config_io.py` around lines 276 - 288, The test reveals that
config_io.list_gateway_models is catching httpx.HTTPError but should catch
httpx.RequestError (ConnectError is a subclass of RequestError), so update the
except clause in the list_gateway_models function to catch httpx.RequestError
(or the more general httpx.RequestError | httpx.HTTPStatusError if status errors
are also intended) and ensure the error variable is passed into the
returned/logged message; adjust the except block in list_gateway_models to
reference httpx.RequestError instead of httpx.HTTPError and include the caught
exception in the err string.

mabry1985 and others added 2 commits April 22, 2026 17:34
Turns the "fork a template and edit code" onboarding into a
download-and-run flow. A fresh clone boots without any env vars,
lands in a 4-step wizard (Connect / Identity / Tools / Profile),
and writes out config + SOUL.md + a .setup-complete marker on
Launch — the chat UI then appears on the same page, drawer
pre-populated with the wizard's values.

Key pieces:

- Wizard UI in chat_ui.py: visibility-toggled wizard pane vs
  chat pane, populated from the live config so re-runs pre-fill.
  4 ship-with presets in config/soul-presets/ (generic-assistant,
  research, coding, blank) power the persona dropdown.
- Lazy graph init in server.py: no model required at boot. The
  chat endpoints return a friendly "setup required" message
  until the wizard completes. After wizard save, the marker is
  flipped BEFORE the graph reload so the rebuild actually
  compiles (this order matters — earlier iteration reloaded
  before marking complete and left _graph=None).
- Identity/auth/runtime sections added to LangGraphConfig so
  the wizard-captured name, operator, A2A token, and autostart
  flag round-trip through the existing YAML infrastructure.
  agent_name() resolver prefers YAML identity.name → env →
  "protoagent" so the agent card + OpenAI-compat model id
  reflect the wizard value without a process restart.
- autostart.py: macOS LaunchAgent install/uninstall with
  Linux/Windows stubs. Captures sys.executable at install time
  so venv-based runs survive a reboot. Opt-in via wizard
  checkbox; toggle from drawer anytime.
- Dockerfile: config volume declared so setup persistence
  survives docker run without a -v flag.
- Docs: first-agent.md rewritten for clone→pip→run→wizard
  flow; old fork/sed/docker content moved to new
  customize-and-deploy.md guide.

Tests: 29 passing (7 new — setup marker lifecycle, preset
discovery, preset content shape, shipped starter presence).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Without this, a browser refresh after the marker was written
externally (POST /api/config/setup, or /api/config/reset-setup
from another tab) kept Gradio serving its initial visibility
snapshot — wizard visible even though setup is done, or vice
versa. app.load runs per page visit so visibility tracks
is_setup_complete() live.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@mabry1985 mabry1985 changed the title feat(ui): live-edit config drawer with model discovery + SOUL.md editor feat(ui): first-run setup wizard + live-edit config drawer Apr 23, 2026

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 7

♻️ Duplicate comments (6)
chat_ui.py (2)

656-658: ⚠️ Potential issue | 🟡 Minor

Missing callback existence check for get_current_provider.

settings["get_current_provider"]() is called without first checking if the key exists. If get_provider_choices is provided but get_current_provider is missing, this raises a KeyError.

🔧 Proposed fix
                         def load_provider_choices():
                             choices = settings["get_provider_choices"]()
-                            current = settings["get_current_provider"]()
+                            current = settings.get("get_current_provider", lambda: None)()
                             return gr.update(choices=choices, value=current)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@chat_ui.py` around lines 656 - 658, In load_provider_choices, avoid calling
settings["get_current_provider"]() unguarded; first check that the key exists
and is callable (e.g., "get_current_provider" in settings and
callable(settings["get_current_provider"])) before invoking it, and provide a
safe fallback (such as None or the first entry from
settings["get_provider_choices"]()) when it's missing or not callable so no
KeyError is raised.
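The guarded-callback pattern the prompt describes can be sketched in isolation (the `settings` contents below are a hypothetical minimal repro, not the real wiring):

```python
# A fork may supply get_provider_choices without get_current_provider;
# guard the optional key instead of indexing it unconditionally.
settings = {
    "get_provider_choices": lambda: ["openai", "anthropic"],
    # "get_current_provider" deliberately absent, as in the failure mode
}

def load_provider_choices():
    choices = settings["get_provider_choices"]()
    getter = settings.get("get_current_provider")
    # Only invoke the getter when it exists and is callable; else fall back.
    current = getter() if callable(getter) else None
    return {"choices": choices, "value": current}
```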

565-572: ⚠️ Potential issue | 🟡 Minor

Zero defaults may trigger validation errors.

The int(value or 0) pattern for max_tokens, max_iterations, and max_turns produces 0 when fields are cleared, which fails validation (>= 1 required).

🔧 Proposed fix
-                                    "max_tokens": int(max_tokens or 0),
-                                    "max_iterations": int(max_iter or 0),
+                                    "max_tokens": int(max_tokens or 1),
+                                    "max_iterations": int(max_iter or 1),
                                 },
                                 "subagents": {
                                     "worker": {
                                         "enabled": bool(worker_enabled),
                                         "tools": list(worker_tools or []),
-                                        "max_turns": int(worker_max_turns or 0),
+                                        "max_turns": int(worker_max_turns or 1),
                                     },
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@chat_ui.py` around lines 565 - 572, The current construction of the payload
uses int(max_tokens or 0), int(max_iter or 0), and int(worker_max_turns or 0)
which turns empty/cleared fields into 0 and fails validation (requires >=1);
update those fields to emit None (or omit) when the input is empty so validation
treats them as absent: replace int(max_tokens or 0) with (int(max_tokens) if
max_tokens not in (None, '') else None), replace int(max_iter or 0) with
(int(max_iter) if max_iter not in (None, '') else None), and replace
int(worker_max_turns or 0) with (int(worker_max_turns) if worker_max_turns not
in (None, '') else None) in the code that builds the dict (the entries for
"max_tokens", "max_iterations" and the "worker" -> "max_turns").
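A small helper captures the coercion fix (default values here follow the 4096/50/20 fallbacks the later commit on this PR settled on; the helper name is illustrative):

```python
def to_int_or_default(value, default: int) -> int:
    # Treat cleared form fields ("" or None) as the field's default rather
    # than 0, which the >= 1 validation rejects with a confusing error.
    if value in (None, ""):
        return default
    return int(value)
```

Usage: `to_int_or_default(max_tokens, 4096)` in place of `int(max_tokens or 0)`.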
tests/test_config_io.py (1)

148-156: 🧹 Nitpick | 🔵 Trivial

Minor: Use tuple for parametrize argument names.

Per pytest best practices, use a tuple for the first argument of @pytest.mark.parametrize rather than a comma-separated string.

🔧 Proposed fix
-@pytest.mark.parametrize("bad_value,expected_error_fragment", [
+@pytest.mark.parametrize(("bad_value", "expected_error_fragment"), [
     ({"model": {"temperature": 3.0}}, "temperature"),
     ...
 ])
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_config_io.py` around lines 148 - 156, Change the pytest
parametrize decorator to use a tuple of argument names instead of a
comma-separated string: replace the first argument
`"bad_value,expected_error_fragment"` with a tuple `("bad_value",
"expected_error_fragment")` in the `@pytest.mark.parametrize` decorator in
tests/test_config_io.py so pytest recognizes the parameter names correctly.
server.py (1)

686-714: 🧹 Nitpick | 🔵 Trivial

Consider access control for config endpoints.

The /api/config endpoints operate without authentication. While this is intentional for the Gradio UI, if these endpoints are exposed to untrusted networks they could allow unauthorized configuration changes.

If the server will be exposed beyond localhost, consider gating these endpoints behind the same auth as A2A routes, or restricting to localhost access only via middleware.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server.py` around lines 686 - 714, The config endpoints currently allow
unauthenticated access; update the handlers (_api_get_config, _api_post_config,
_api_list_models) to enforce access control by reusing the existing
auth/middleware used for A2A routes or by restricting requests to localhost: add
an authentication dependency or middleware check that validates the same
token/session used by A2A routes (or checks request.client.host ==
"127.0.0.1"/"::1" for localhost-only mode) before calling
ConfigReloadRequest/_apply_settings_changes/list_gateway_models; return a
401/403 when auth fails so unauthorized clients cannot read or modify config.
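The localhost-only variant can hinge on a pure helper that a FastAPI dependency would call with `request.client.host` before returning 403 (stdlib-only sketch; the function name is an assumption):

```python
import ipaddress

def is_loopback_peer(host: str) -> bool:
    """True when the TCP peer address is IPv4 or IPv6 loopback."""
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Hostnames / malformed peers are treated as remote, i.e. denied.
        return False
```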
graph/config_io.py (2)

208-208: ⚠️ Potential issue | 🟡 Minor

Use ASCII hyphen in validation error text.

Line [208] still contains an en-dash (0.0–2.0), which triggers ambiguous-character lint and can be confusing in logs/errors.

Proposed fix
-            return False, f"temperature must be 0.0–2.0, got {temp}"
+            return False, f"temperature must be 0.0-2.0, got {temp}"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@graph/config_io.py` at line 208, The validation error string that returns
"temperature must be 0.0–2.0, got {temp}" uses an en-dash; locate the validation
code that produces this message (the return statement referencing temp in
graph/config_io.py, e.g., the temperature validation function) and replace the
en-dash with a plain ASCII hyphen so it reads "0.0-2.0"; update the string and
run the linter to confirm the ambiguous-character warning is resolved.

111-126: 🧹 Nitpick | 🔵 Trivial

Remove stale unused _FIELD_MAP mapping.

Line [111] defines _FIELD_MAP, but it is still unused and adds dead maintenance surface.

Proposed cleanup
-# Nested dotted path → LangGraphConfig attribute.
-_FIELD_MAP: dict[str, str] = {
-    "model.provider": "model_provider",
-    "model.name": "model_name",
-    "model.api_base": "api_base",
-    "model.api_key": "api_key",
-    "model.temperature": "temperature",
-    "model.max_tokens": "max_tokens",
-    "model.max_iterations": "max_iterations",
-    "middleware.knowledge": "knowledge_middleware",
-    "middleware.audit": "audit_middleware",
-    "middleware.memory": "memory_middleware",
-    "knowledge.db_path": "knowledge_db_path",
-    "knowledge.embed_model": "embed_model",
-    "knowledge.top_k": "knowledge_top_k",
-}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@graph/config_io.py` around lines 111 - 126, Remove the dead _FIELD_MAP
mapping declaration from graph/config_io.py: delete the _FIELD_MAP dict and its
entries (the symbol name _FIELD_MAP) and ensure there are no remaining
references to it elsewhere (search for _FIELD_MAP and LangGraphConfig related
mapping code). After removing the constant, run tests/lint to confirm no missing
imports or usages remain and adjust any code that expected this mapping to use
the canonical LangGraphConfig attributes directly if needed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@autostart.py`:
- Around line 195-245: The plist is built via f-strings in
_render_launchagent_plist and user-controlled fields (notably agent_name, also
label, server_py, working_dir, stdout_log, stderr_log) must be XML-escaped to
avoid malformed or injectable plists; fix by escaping XML entities before
interpolation (e.g., use a standard escaper like xml.sax.saxutils.escape or a
small helper that replaces &, <, >, " and '), and ensure the escaped values are
used in the returned string (or alternatively validate/reject names with
disallowed characters earlier in the flow such as during wizard input).
- Around line 59-72: install_autostart is defined to take a port but callers
invoke it without passing the actual port, so the generated plist always uses
the default 7870; update the call sites to forward the chosen port. Locate the
calls to install_autostart in server.py (the places that currently call
install_autostart() around startup/CLI handling) and change them to pass the
runtime port variable (e.g., install_autostart(agent_name, port)). Ensure the
port value you pass is the same one used to start the server so that
_install_macos_launchagent receives the correct port to embed in the plist.

In `@chat_ui.py`:
- Around line 802-808: The outputs list passed to app.load duplicates
wizard_pane when sidebar_block is None causing duplicate updates; update the
call site to avoid returning wizard_pane twice by either (A) replacing the
fallback with a no-op state component (e.g., a gr.State instance) so outputs
become [wizard_pane, chat_pane, some_noop_state] when sidebar_block is None, or
(B) remove the third output entirely and adjust the _sync_visibility handler
signature/return paths to support only two outputs when sidebar_block is None;
ensure changes reference _sync_visibility, wizard_pane, chat_pane, and
sidebar_block accordingly.

In `@Dockerfile`:
- Around line 47-51: Document that the Dockerfile's VOLUME declaration (VOLUME
["/opt/protoagent/config"]) creates an anonymous volume by default and explain
the lifecycle semantics: that anonymous volumes persist across container/image
upgrades and can retain stale config; instruct operators to either mount an
explicit named volume or host path with -v /opt/protoagent/config:... to control
persistence, include recommended cleanup commands (docker volume rm /
docker-compose volume names) and a note to update runbooks to advise
version-upgrade workflows when replacing images.
- Around line 29-31: The Dockerfile currently installs an unconstrained
ruamel.yaml which mismatches requirements.txt; update the package entry
"ruamel.yaml" in the pip install list to include the same constraint (>=0.18)
used in requirements.txt so Docker builds use the same version range (locate the
ruamel.yaml token in the long pip install line and change it to include >=0.18).

In `@graph/config_io.py`:
- Around line 405-407: The code constructs a path with PRESETS_DIR /
f"{name}.md" allowing path traversal via `name`; fix by validating/sanitizing
`name` before building or by resolving and checking the final path stays inside
PRESETS_DIR. Either enforce a safe-name pattern (e.g. only allow alphanumerics,
hyphen/underscore) for `name` or compute path = (PRESETS_DIR /
f"{name}.md").resolve() and then verify path.is_file() and
str(path).startswith(str(PRESETS_DIR.resolve()) + os.sep); if the check fails,
return "" or raise an error. Use the PRESETS_DIR symbol and the `name` variable
in the fix so lookups remain safe.

In `@server.py`:
- Around line 814-826: register_a2a_routes currently ignores the api_key
parameter and always reads A2A_AUTH_TOKEN from the environment; change
register_a2a_routes (in a2a_handler.py) to prefer the passed api_key argument
when setting the module-level _a2a_token (fall back to
os.environ.get("A2A_AUTH_TOKEN") only if api_key is empty/None), so that the
api_key passed from the caller (the resolved auth_key from YAML or env) is
actually used for bearer token validation.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 2e13db2a-2bd6-4581-8ca3-d1a908508868

📥 Commits

Reviewing files that changed from the base of the PR and between f49c012 and 4f74c7f.

⛔ Files ignored due to path filters (2)
  • README.md is excluded by !*.md
  • TEMPLATE.md is excluded by !*.md
📒 Files selected for processing (17)
  • .gitignore
  • Dockerfile
  • autostart.py
  • chat_ui.py
  • config/soul-presets/blank.md
  • config/soul-presets/coding.md
  • config/soul-presets/generic-assistant.md
  • config/soul-presets/research.md
  • docs/.vitepress/config.mts
  • docs/guides/customize-and-deploy.md
  • docs/guides/index.md
  • docs/index.md
  • docs/tutorials/first-agent.md
  • graph/config.py
  • graph/config_io.py
  • server.py
  • tests/test_config_io.py

Comment on lines +814 to +826
# A2A bearer token: YAML ``auth.token`` wins if set, else falls back
# to the legacy ``<AGENT>_API_KEY`` env var — so the wizard can set
# auth without an env restart.
auth_env = f"{AGENT_NAME_ENV.upper()}_API_KEY"
auth_key = (
    (_graph_config.auth_token if _graph_config else "")
    or os.environ.get(auth_env, "")
)
register_a2a_routes(
    app=fastapi_app,
    chat_stream_fn_factory=_chat_langgraph_stream,
    chat_fn=chat,
-   api_key=os.environ.get(auth_env, ""),
+   api_key=auth_key,

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check how api_key parameter is used in register_a2a_routes
rg -n -A20 'def register_a2a_routes' a2a_handler.py | head -40

# Check where _a2a_token is set
rg -n '_a2a_token\s*=' a2a_handler.py

Repository: protoLabsAI/protoAgent

Length of output: 1092


A2A authentication ignores api_key parameter, making YAML auth.token ineffective.

At lines 814–821, auth_key is resolved from _graph_config.auth_token (YAML) with env fallback and passed to register_a2a_routes(api_key=auth_key). However, in a2a_handler.py lines 938–939, register_a2a_routes sets _a2a_token directly from the A2A_AUTH_TOKEN environment variable, ignoring the api_key parameter entirely. Bearer token validation uses _a2a_token from the environment only.

Result: Wizard-configured auth.token values do not protect A2A endpoints. The feature is non-functional.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server.py` around lines 814 - 826, register_a2a_routes currently ignores the
api_key parameter and always reads A2A_AUTH_TOKEN from the environment; change
register_a2a_routes (in a2a_handler.py) to prefer the passed api_key argument
when setting the module-level _a2a_token (fall back to
os.environ.get("A2A_AUTH_TOKEN") only if api_key is empty/None), so that the
api_key passed from the caller (the resolved auth_key from YAML or env) is
actually used for bearer token validation.

mabry1985 and others added 2 commits April 22, 2026 19:40
Cloudflare's managed WAF in front of api.proto-labs.ai (and likely
other gateways behind a default WAF config) blocks the OpenAI
Python SDK's `OpenAI/Python <ver>` User-Agent with a 403 "Your
request was blocked". /v1/models went through fine because the
gateway's model-list handler doesn't gate on UA the same way —
only /v1/chat/completions 403'd, which made this look like a key
or model-alias problem rather than what it actually was.

tools/lg_tools.py already sets a custom UA on its outbound httpx
fetches for exactly this reason; graph/llm.py had no equivalent,
so ChatOpenAI fell back to the SDK default. Threading the same
identifier through default_headers makes every protoAgent egress
present a consistent allowlisted UA.

Verified: product-director wizard → chat turn → 200 OK from
api.proto-labs.ai with the groq-llama-70b alias.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Critical — path traversal in preset loader (graph/config_io.py:
read_soul_preset):
  Inputs like "../secret" escaped config/soul-presets/ and read
  arbitrary .md files anywhere on disk. Resolve both the preset
  root and the candidate path and require the latter live inside
  the former before reading. 7 parametrized tests cover the
  malicious inputs I could think of (../, ../../, absolute paths,
  bare "..", mid-path ../../).
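A containment check along the lines the commit describes (`PRESETS_DIR` and the empty-string fallback are assumptions drawn from the message above, not the verbatim implementation):

```python
from pathlib import Path

# Assumed preset root, per the commit message.
PRESETS_DIR = Path("config/soul-presets")

def read_soul_preset(name: str) -> str:
    root = PRESETS_DIR.resolve()
    candidate = (root / f"{name}.md").resolve()
    # The resolved candidate must still live under the resolved preset root;
    # this rejects "../", "../../", absolute paths, and mid-path traversal.
    if root not in candidate.parents:
        return ""
    if not candidate.is_file():
        return ""
    return candidate.read_text(encoding="utf-8")
```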

Major — YAML auth.token was non-functional for A2A bearer:
  register_a2a_routes captured _a2a_token at register time, so
  wizard-set tokens were ignored until process restart. Promoted
  _a2a_token to a module-level mutable holder (_A2A_TOKEN: list)
  that the closure reads on every request, added set_a2a_token()
  as the public mutator, and a new auth_token= arg to
  register_a2a_routes as the seed source (env still the fallback).
  server.py's reload path now calls set_a2a_token on every YAML
  change so the wizard → live bearer enforcement flow works with
  no restart — verified: fresh boot open → wizard token set →
  401 on wrong token / 200 on right → drawer clears token → open
  again.
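The mutable-holder pattern can be sketched minimally (`_A2A_TOKEN` and `set_a2a_token` follow the names above; `make_auth_check` stands in for the real route closure):

```python
import os

# Single-slot module-level holder: closures read index 0 on every request,
# so set_a2a_token() takes effect without re-registering routes or restarting.
_A2A_TOKEN: list = [os.environ.get("A2A_AUTH_TOKEN") or None]

def set_a2a_token(token):
    _A2A_TOKEN[0] = token or None

def make_auth_check():
    def is_authorized(bearer: str) -> bool:
        expected = _A2A_TOKEN[0]  # live read, not captured at register time
        return expected is None or bearer == expected
    return is_authorized
```

With an empty token the check is open, matching the "drawer clears token → open again" flow verified above.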

Major — plist XML injection in autostart.py:
  Agent names containing <, >, & produced malformed plists (and
  could theoretically inject nodes). xml.sax.saxutils.escape()
  every interpolated string field before embedding.
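The escaping step looks roughly like this (the helper name is illustrative):

```python
from xml.sax.saxutils import escape

def plist_string(value: str) -> str:
    # Escape &, <, > before interpolation so an agent name like "a <b> & c"
    # cannot malform the plist or inject extra XML nodes.
    return f"<string>{escape(value)}</string>"
```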

Major — install_autostart defaulted to port 7870 regardless of
--port flag (autostart.py / server.py):
  Captured the active port in a module-level _active_port at
  _main() time and threaded it through both finish_setup's
  autostart sync and the drawer's toggle_autostart callback. The
  generated LaunchAgent now reboots on whatever port the operator
  launched with.

Minor — chat_ui polish:
  * Numeric fields (max_tokens, max_iterations, worker_max_turns)
    fall back to sensible defaults (4096/50/20) instead of 0 when
    cleared — validate_config_dict rejects zero, so "or 0" blocked
    legitimate saves with a confusing validation error.
  * _sync_visibility no longer aliases the sidebar output slot to
    wizard_pane when the sidebar is absent; split into two closures
    with matching output arities so Gradio doesn't receive duplicate
    updates to the same component.
  * Legacy load_provider_choices handler guards get_current_provider
    existence — KeyError risk when a fork provides get_provider_choices
    alone.

Nitpicks:
  * Remove unused _FIELD_MAP from config_io.py.
  * ASCII hyphen (U+002D) instead of en-dash (U+2013) in the
    temperature validation error.
  * Pin ruamel.yaml>=0.18 in Dockerfile to match requirements.txt.
  * Document the VOLUME anonymous-volume lifecycle and named-volume
    recommendation in the Dockerfile comment.

Not addressed (deliberate):
  * CodeRabbit flagged test_list_gateway_models_http_error as
    expecting httpx.ConnectError to be caught by except httpx.HTTPError
    — false positive, ConnectError → NetworkError → TransportError →
    RequestError → HTTPError, test already passes.
  * "Reuse config_io.read_soul() in graph/prompts.py" — kept the
    inline check to avoid introducing an import dependency from
    prompts.py (loaded early, widely used) into config_io.py.
  * "Use tuple form for @pytest.parametrize" — stylistic; comma-
    separated string works identically.

Test surface: 36 passing (7 new — the path-traversal parametrize set).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
a2a_handler.py (1)

1022-1027: ⚠️ Potential issue | 🟡 Minor

Keep the built-in agent card in sync with live token changes.

set_a2a_token() updates request auth immediately, but securitySchemes["bearer"] is still only seeded during route registration. With register_card_route=True, clients fetching the card after a token add/remove will get stale auth metadata until restart.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@a2a_handler.py` around lines 1022 - 1027, The agent_card's static seeding of
securitySchemes based on _A2A_TOKEN causes stale metadata; update the card
whenever the token changes by modifying set_a2a_token() (or the token setter) to
add or remove agent_card["securitySchemes"]["bearer"] depending on _A2A_TOKEN
state, or alternatively make the card route return a fresh card that computes
securitySchemes from current _A2A_TOKEN; touch the symbols _A2A_TOKEN,
set_a2a_token(), agent_card, and securitySchemes["bearer"] to keep the live card
in sync with token adds/removals when register_card_route=True.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@autostart.py`:
- Around line 116-124: Sanitize the incoming agent_name to prevent path
traversal and unsafe characters before building the plist name: in
_macos_label(agent_name) (and thus _macos_plist_path(agent_name)) replace any
character that is not a lowercase letter, digit, hyphen or underscore (and
condense runs of invalid chars to a single hyphen), strip leading/trailing
punctuation/dots/hyphens, and ensure the result is non-empty (fallback to a safe
default like "agent"); then use that sanitized token when returning
f"ai.protolabs.{safe}" and when constructing the plist filename to guarantee no
slashes or “..” can escape the LaunchAgents directory.

In `@chat_ui.py`:
- Around line 376-490: The drawer UI is missing identity.name,
identity.operator, auth.token and runtime.autostart_on_boot so those fields
never round-trip; add UI controls (e.g., Textbox for identity.name, Textbox for
identity.operator, Textbox type="password" for auth.token, and Checkbox for
runtime.autostart_on_boot) to the same accordion area and include those control
variables in the _config_components list and in the _save()/_load_all wiring so
they are passed into config_to_dict()/save_all and restored on reload; update
any fetch/load handlers that build the form from saved config to read/write
these new controls (references: _config_components and the _save/_load_all
methods).

In `@server.py`:
- Around line 731-737: The GET handler _api_list_models currently accepts
api_key (and api_base) as query parameters which leaks secrets; change it to
stop binding secrets from the query and instead read them from a request header
(e.g. Authorization or X-Api-Key) or require a POST body; update the endpoint
signature for _api_list_models to remove api_key/api_base query params, extract
credentials from fastapi request.headers (or request.json() if switching to
POST), then call list_gateway_models(base, key) with those values and return the
same {"models": models, "error": error} shape; ensure any client-side usage and
tests are adjusted to pass the key in the header (or body) rather than query
string.
- Around line 126-153: The code currently assigns _graph_config and updates the
A2A bearer token before attempting create_agent_graph(new_config), so a failed
graph build can leave the running _graph bound to a new config/auth; to fix,
defer committing runtime state until after a successful graph build: first check
is_setup_complete() and if it indicates building is required, call
create_agent_graph(new_config) and let it succeed (handle exceptions as now),
then on success assign _graph_config = new_config, call
set_a2a_token(new_config.auth_token or None) inside the existing
ImportError-safe block, set _graph = new_graph, and then log the reload; ensure
the early-return path for "setup not complete" does not mutate _graph_config or
the A2A token.
- Around line 285-319: The code currently calls mark_setup_complete() before
calling _reload_langgraph_agent(), so if the reload fails the setup marker
remains and the wizard is hidden; modify finish_setup() so that after calling
_reload_langgraph_agent() you check the returned ok flag and, on failure, undo
the marker (e.g., call a new or existing inverse like unmark_setup_complete() or
explicitly remove the .setup-complete marker) and append/log a clear message
before returning; ensure the rollback happens before the final return and uses
the same messages list for user-visible output.

---

Outside diff comments:
In `@a2a_handler.py`:
- Around line 1022-1027: The agent_card's static seeding of securitySchemes
based on _A2A_TOKEN causes stale metadata; update the card whenever the token
changes by modifying set_a2a_token() (or the token setter) to add or remove
agent_card["securitySchemes"]["bearer"] depending on _A2A_TOKEN state, or
alternatively make the card route return a fresh card that computes
securitySchemes from current _A2A_TOKEN; touch the symbols _A2A_TOKEN,
set_a2a_token(), agent_card, and securitySchemes["bearer"] to keep the live card
in sync with token adds/removals when register_card_route=True.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 2a5a8a2e-44d3-42a1-b829-cceac029df25

📥 Commits

Reviewing files that changed from the base of the PR and between a2c5b5b and acf2bd6.

📒 Files selected for processing (7)
  • Dockerfile
  • a2a_handler.py
  • autostart.py
  • chat_ui.py
  • graph/config_io.py
  • server.py
  • tests/test_config_io.py

Major — atomic graph reload (server.py::_reload_langgraph_agent):
  Previously swapped _graph_config + set_a2a_token BEFORE calling
  create_agent_graph, so a failed build would leave the running
  _graph pinned to the OLD agent but reporting the NEW config and
  rotated bearer token. Now build first; commit config/auth/graph
  state only on success.
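The build-first, commit-on-success ordering, sketched generically (all names here are illustrative stand-ins, not the actual server.py symbols):

```python
def reload_agent(state: dict, new_config: dict, build):
    """Attempt the build before touching any running state."""
    try:
        new_graph = build(new_config)          # may raise: nothing committed yet
    except Exception as exc:
        return False, f"reload failed: {exc}"  # old config/token/graph untouched
    # Commit config, auth token, and graph together, only after a clean build.
    state["config"] = new_config
    state["auth_token"] = new_config.get("auth_token")
    state["graph"] = new_graph
    return True, "reloaded"
```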

Major — rollback marker on failed first-run reload (finish_setup):
  mark_setup_complete fires before the reload so the graph compiles.
  If the reload fails, the marker stays and the next page load
  drops the user into chat with _graph=None and no obvious recovery
  path. finish_setup now calls reset_setup() on reload failure, so
  the wizard returns for a retry.

Major — sanitize agent_name for plist path (autostart.py::_macos_label):
  Prior sanitization only lowercased + replaced spaces. `/` and `..`
  survived, so an agent name with path-traversal chars could target
  arbitrary paths relative to ~/Library/LaunchAgents/ on install /
  status / uninstall. Strip input to [a-z0-9_.-] and fall back to
  "protoagent" when the result is empty/dots-only. Verified resolved
  plist path stays inside LaunchAgents/ for every path-traversal
  payload I could think of.

Major — gateway api_key off query string (POST /api/config/models):
  GET with ?api_key=... leaks credentials into browser history,
  reverse-proxy access logs, and uvicorn's own request log. Switched
  to POST taking a small ModelsProbeRequest body. Empty body still
  falls back to stored config so the drawer's initial render works.
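The empty-body fallback can be illustrated with a small stand-in; `ModelsProbeRequest` here is a plain dataclass, whereas in the server it would be a Pydantic model bound to POST /api/config/models:

```python
from dataclasses import dataclass

@dataclass
class ModelsProbeRequest:
    api_base: str = ""
    api_key: str = ""  # travels in the POST body, never in ?api_key=... query strings

# Stand-in for the stored config the server falls back to.
STORED = ModelsProbeRequest(api_base="http://gateway:4000/v1", api_key="stored-key")

def resolve_probe(req: "ModelsProbeRequest | None") -> tuple[str, str]:
    """Prefer form-local values; fall back to stored config so the
    drawer's initial render works with an empty body."""
    req = req or ModelsProbeRequest()
    return (req.api_base or STORED.api_base, req.api_key or STORED.api_key)

assert resolve_probe(None) == ("http://gateway:4000/v1", "stored-key")
assert resolve_probe(ModelsProbeRequest(api_base="http://other/v1"))[0] == "http://other/v1"
```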

Major — round-trip identity/security/autostart through the drawer
(chat_ui.py + server.py):
  Drawer previously only edited model/worker/middleware/knowledge/
  SOUL, leaving the wizard's agent name, operator, bearer token,
  and autostart flag with no post-setup edit path. Added three new
  accordion sections (Identity / Security / Autostart), wired them
  into _config_components, _load_all, and _save.

  Extracted _sync_autostart_with_config so the wizard's finish_setup
  and the drawer's save_all both drive the LaunchAgent install/
  uninstall from the same code path — flipping the drawer's
  Autostart checkbox + Save & Reload now installs/removes the plist
  the same way the wizard does.

Verified end-to-end on product-director:
  * Fresh clone → wizard → "pd-renamed" via drawer → agent card
    says pd-renamed, old bearer token → 401, new bearer token → 200
  * Invalid temperature (99) → rejected at validation; YAML +
    marker untouched
  * Path traversal via /api/config/presets/../secret → 404

Tests: 36 passing (no new cases — existing coverage was already
sufficient for these fixes).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@server.py`:
- Around line 248-397: The nested callback implementations in
_build_settings_callbacks (get_config, list_models, save_all, finish_setup,
restart_setup, autostart_info, toggle_autostart) should be extracted to
module-level private functions to reduce function length and improve
testability; create module-level functions named e.g. _cfg_get_config,
_cfg_list_models, _cfg_save_all, _cfg_finish_setup, _cfg_restart_setup,
_cfg_autostart_info, _cfg_toggle_autostart that accept any external dependencies
they need (like _graph_config, _apply_settings_changes, _reload_langgraph_agent,
_sync_autostart_with_config, _active_port, log) or import their local helpers
internally, move the internal config_io/import blocks into those functions (or
to module imports), then simplify _build_settings_callbacks to return the
mapping to these new symbols (and still reference
read_soul/read_soul_preset/list_soul_presets/list_gateway_models/list_available_tools/is_setup_complete/mark_setup_complete/reset_setup
where appropriate). Ensure behavior is identical (preserve validation, exception
handling, mark_setup_complete order, and return types).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 83d2c0bf-0ebb-4f97-9612-d3db269f08ec

📥 Commits

Reviewing files that changed from the base of the PR and between acf2bd6 and 6f708f8.

📒 Files selected for processing (3)
  • autostart.py
  • chat_ui.py
  • server.py

Comment thread server.py
Comment on lines +248 to +397
```python
def _build_settings_callbacks() -> dict[str, Any]:
    """Callbacks consumed by the Gradio Configuration drawer + wizard."""
    from graph.config_io import (
        config_to_dict,
        is_setup_complete,
        list_available_tools,
        list_gateway_models,
        list_soul_presets,
        mark_setup_complete,
        read_soul,
        read_soul_preset,
        reset_setup,
    )

    def get_config() -> dict[str, Any]:
        return config_to_dict(_graph_config)

    def list_models(api_base: str = "", api_key: str = "") -> tuple[list[str], str]:
        """UI-friendly model lookup.

        Uses the form-local api_base/api_key when the user is trying a
        different endpoint before saving; falls back to the currently
        loaded graph config so the initial render works without
        arguments.
        """
        base = api_base or (_graph_config.api_base if _graph_config else "")
        key = api_key or (_graph_config.api_key if _graph_config else "")
        return list_gateway_models(base, key)

    def save_all(config: dict | None, soul: str | None) -> tuple[bool, str]:
        ok, messages = _apply_settings_changes(config=config, soul=soul)
        return ok, " • ".join(messages)

    def finish_setup(config: dict | None, soul: str | None) -> tuple[bool, str]:
        """Wizard terminal action — write everything, mark complete, reload.

        Ordering matters:

        1. Write config YAML + SOUL.md (no reload yet).
        2. ``mark_setup_complete()`` — flip the marker BEFORE the
           reload so ``_reload_langgraph_agent`` actually compiles
           the graph. Doing it after means the reload sees
           setup-incomplete and stays ``_graph = None``.
        3. Sync autostart (LaunchAgent plist is independent of the
           graph, so it can happen any time after the config is
           written).
        4. Reload — marker present, graph compiles, chat works.

        Returns a single status string joining per-step messages.
        """
        from graph.config_io import (
            apply_updates_to_yaml,
            load_yaml_doc,
            save_yaml_doc,
            validate_config_dict,
            write_soul,
        )

        messages: list[str] = []

        # 1. Persist
        if config is not None:
            ok, err = validate_config_dict(config)
            if not ok:
                return False, f"validation: {err}"
            try:
                doc = load_yaml_doc()
                apply_updates_to_yaml(doc, config)
                save_yaml_doc(doc)
                messages.append("config saved")
            except Exception as e:
                log.exception("[setup] YAML write failed: %s", e)
                return False, f"config write: {e}"

        if soul is not None:
            try:
                paths = write_soul(soul)
                messages.append(f"SOUL saved ({len(paths)} path{'s' if len(paths) != 1 else ''})")
            except Exception as e:
                log.exception("[setup] SOUL write failed: %s", e)
                return False, f"soul write: {e}"

        # 2. Flip the marker — MUST be before reload so the graph builds
        mark_setup_complete()
        messages.append("setup marked complete")

        # 3. Autostart sync (shared helper — drawer path runs the same)
        as_msg = _sync_autostart_with_config(config)
        if as_msg:
            messages.append(as_msg)

        # 4. Reload — now picks up setup_complete=True and compiles.
        #    On failure, roll back the marker so the next page load
        #    drops the user back into the wizard instead of landing
        #    them in the chat UI with the "setup required" fallback
        #    and no obvious way to retry.
        ok, reload_msg = _reload_langgraph_agent()
        messages.append(reload_msg)
        if not ok:
            reset_setup()
            messages.append("setup marker rolled back — re-run the wizard after fixing the error above")

        return ok, " • ".join(messages)

    def restart_setup() -> str:
        """Drawer action — delete the marker so the wizard runs again."""
        reset_setup()
        log.info("[setup] marker removed — wizard will run on next page load")
        return "setup marker removed • reload the page to run the wizard"

    def autostart_info() -> dict[str, Any]:
        """Report platform support + current on-disk state. The drawer
        uses this to render the toggle correctly and to print the
        plist path for debugging."""
        try:
            from autostart import autostart_status

            name = (_graph_config.identity_name if _graph_config else "") or "protoagent"
            return autostart_status(name)
        except Exception as e:
            return {"supported": False, "installed": False, "reason": str(e)}

    def toggle_autostart(enabled: bool) -> tuple[bool, str]:
        """Install or uninstall the OS autostart artifact, mirroring
        the YAML field. Called from the drawer's checkbox handler so
        toggling takes effect immediately without waiting for Save."""
        try:
            from autostart import install_autostart, uninstall_autostart

            name = (_graph_config.identity_name if _graph_config else "") or "protoagent"
            if enabled:
                return install_autostart(agent_name=name, port=_active_port)
            return uninstall_autostart(agent_name=name)
        except Exception as e:
            return False, str(e)

    return {
        "get_config": get_config,
        "get_soul": read_soul,
        "list_models": list_models,
        "list_tools": list_available_tools,
        "list_soul_presets": list_soul_presets,
        "read_soul_preset": read_soul_preset,
        "save_all": save_all,
        "finish_setup": finish_setup,
        "restart_setup": restart_setup,
        "is_setup_complete": is_setup_complete,
        "autostart_info": autostart_info,
        "toggle_autostart": toggle_autostart,
    }
```

@coderabbitai Bot commented Apr 23, 2026


🧹 Nitpick | 🔵 Trivial

Consider extracting nested callbacks into module-level functions.

The function has 56 statements (Ruff PLR0915 warns at 50). While functional, extracting the nested callbacks (get_config, list_models, save_all, finish_setup, etc.) into module-level private functions would improve readability and testability.

🧰 Tools
🪛 Ruff (0.15.10)

[warning] 248-248: Too many statements (56 > 50)

(PLR0915)


[warning] 319-319: Redundant exception object included in logging.exception call

(TRY401)


[warning] 327-327: Redundant exception object included in logging.exception call

(TRY401)


[warning] 367-367: Do not catch blind exception: Exception

(BLE001)


[warning] 370-370: Boolean-typed positional argument in function definition

(FBT001)


[warning] 381-381: Do not catch blind exception: Exception

(BLE001)


Contributor Author


Declining. Reasoning:

  • Testability doesn't improve. Every callback is already reachable via cb = _build_settings_callbacks(); cb["save_all"](...). Extracting to _cfg_* renames the import path, same function to call, same globals referenced.
  • Globals don't go away. get_config, list_models, autostart_info, toggle_autostart read module-level _graph_config / _active_port. As closures they capture; as module-level they'd global-read. Same coupling, more ceremony.
  • Lazy imports migrate to module top, regressing boot cost. The from graph.config_io import … and from autostart import … lines currently only run when the settings dict is actually built. Hoisting them to module scope imports them on every import server, including tests that only exercise _build_agent_card. Not catastrophic but a regression for zero functional gain.
  • Most callbacks are 2–10 lines. get_config is two, restart_setup is three, toggle_autostart is eight. Twelve new _cfg_* module symbols that exist solely to back one dict entry adds noise, not readability.
  • The one genuinely long closure (finish_setup, ~50-line body) has a step-numbered docstring colocated with the code. Extraction would be a move without a rewrite.

PLR0915 fires at 50 statements; _build_settings_callbacks is at 56. The honest way to drop below 50 is exactly this refactor — which I've argued against. The alternative (per-statement micro-reductions) degrades readability to appease the counter.

No change on this one. 36 tests still green.


@mabry1985, that's a well-reasoned case — particularly the lazy-import boot cost point and the closure-captures-globals-anyway argument. The colocated step-numbered docstring in finish_setup is also doing real work that extraction would scatter. Happy to leave it as-is.


✏️ Learnings added
Learnt from: mabry1985
Repo: protoLabsAI/protoAgent PR: 150
File: server.py:248-397
Timestamp: 2026-04-23T04:44:59.625Z
Learning: In `server.py` (Python, FastAPI/Gradio server), the `_build_settings_callbacks()` function intentionally uses nested closures (get_config, list_models, save_all, finish_setup, restart_setup, autostart_info, toggle_autostart) rather than module-level `_cfg_*` functions. The author's rationale: (1) testability is unchanged since callbacks are reachable via the returned dict; (2) module-level globals (_graph_config, _active_port) are captured either way; (3) lazy imports (graph.config_io, autostart) inside the function avoid boot-time cost on every `import server`; (4) most callbacks are 2–10 lines, so extraction adds noise. Do not flag PLR0915 on this function.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

@mabry1985 mabry1985 merged commit 3c25c41 into main Apr 23, 2026
1 check passed
