Add code-api provisioning server, client bootstrap, and smoke tests by PenguinzTech · Pull Request #3 · penguintechinc/penguincode

PenguinzTech · 2026-02-10T23:22:08Z

Summary

37 new smoke tests covering REST API endpoints, admin CRUD, and full server→client integration pipeline
CI pipeline fully fixed: corrected stale paths (penguincode/ → penguincode_cli/), resolved all 600 ruff lint errors, fixed broken test imports, added Windows temp dir compatibility
K8s alpha smoke tests: added Docker build step, fixed namespace management, microk8s permissions, concurrency control
Helm chart fixes: added ServiceAccount template, REST port 8080 to Service/Deployment, fixed test pod to use project image, removed conflicting namespace template
Dockerfile.server fixes: corrected package path, module path, added __main__.py for python -m support

Test Results

228 tests pass, 7 skipped (deprecated APIs), 0 failures
Tests cover: provision endpoint, health check, GPU filtering, tier gating, JWT auth, all 7 CRUD entity types, config_writer pipeline, cache fallback
All 6 platform/version CI matrix combinations pass (ubuntu/macos/windows × py3.12/3.13)
K8s alpha smoke tests pass (Helm lint → Docker build → deploy → gRPC/REST/provision verification)

Test plan

pytest tests/ -v — 228 passed, 7 skipped
ruff check penguincode_cli/ tests/ — all checks passed
CI: Lint & Test Extension — pass
CI: Test CLI (all 6 platform combos) — pass
CI: alpha-smoke (K8s) — pass
CI: Socket Security — pass
Beta deployment to dal2 cluster — gRPC, REST health, provision all verified

🤖 Generated with Claude Code

Implements the managed AI coding platform architecture with three-tier design: code-api (central server), code-client (OpenCode wrapper), and code-webui (admin dashboard, placeholder). Phase 1 - code-api: - SQLite config store with CRUD for models, agents, MCP servers, plugins, skills, tools, GitHub orgs, instructions, and permissions - POST /api/v1/provision endpoint with license tier gating (community/professional/enterprise) and GPU-aware model filtering - Admin CRUD REST endpoints (JWT-protected) for all entity types - Quart REST app co-hosted with gRPC server in same async event loop - Default seed data for 9 agents, 5 models, 2 MCP servers, 6 skills Phase 2 - code-client: - Bootstrap module: provisions from code-api, writes OpenCode config files (opencode.json, AGENTS.md, agents/*.md, skills/*/SKILL.md) - Offline fallback to cached config when code-api is unreachable - Ollama model manager with required (blocking) and optional (background) pulls - GitHub org repo manager for cloning/refreshing pre-configured repos - 'penguincode launch' CLI command for full bootstrap + OpenCode exec - 'penguincode serve' updated to use real gRPC + REST server Also includes: - 8 agent prompt markdown files ported from existing Python prompts - 6 skill markdown files for common development workflows - Updated docker-compose.yml for 3-tier architecture - Test suite for config store, provisioning, tier gating, GPU filtering, and config writer Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Comprehensive test coverage (37 new tests) for the code-api provisioning server and code-client bootstrap pipeline: - test_rest_api.py: Health endpoint, provisioning responses, community tier gating, GPU filtering, response structure validation (11 tests) - test_admin_api.py: JWT auth enforcement (401/403/expired), CRUD cycles for all 7 entity types, instructions, permissions, error handling (15 tests) - test_integration.py: Full provision→config_writer pipeline, admin changes propagation, offline cache fallback, tier gating, env-var overrides, MCP server/plugin/org passthrough (12 tests) - conftest.py: Shared fixtures for ConfigStore, Quart test client, JWT tokens Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

sourcery-ai · 2026-02-10T23:22:14Z

Reviewer's Guide

Introduces a new SQLite-backed configuration store and Quart-based REST API (code-api) alongside the existing gRPC server, plus a client bootstrap pipeline that provisions config from the REST API into OpenCode-compatible files, auto-manages Ollama models and GitHub org repos, and adds comprehensive smoke/integration tests around provisioning, admin CRUD, tier gating, and GPU-aware model filtering.

Sequence diagram for client bootstrap and provisioning flow

sequenceDiagram
    actor User
    participant CLI as penguincode_launch
    participant Bootstrap as bootstrap.py
    participant REST as REST_API_/api/v1/provision
    participant License as LicenseValidator
    participant Store as ConfigStore
    participant Ollama as Ollama_Server
    participant GitHub as GitHub_Repos
    participant OpenCode as OpenCode_App

    User->>CLI: run penguincode launch
    CLI->>Bootstrap: bootstrap(api_url, license_key,...)

    Bootstrap->>Bootstrap: provision(api_url, license_key)
    Bootstrap->>REST: POST /api/v1/provision
    REST->>License: validate(license_key)
    License-->>REST: license_info(tier,features)
    REST->>Store: build_provision_response(license_info, ollama_url)
    Store-->>REST: provision_config
    REST-->>Bootstrap: 200 OK + provision_config
    Bootstrap->>Bootstrap: cache_config()

    Bootstrap->>Bootstrap: write_opencode_json()
    Bootstrap->>Bootstrap: write_agents_md()
    Bootstrap->>Bootstrap: write_agent_prompts()
    Bootstrap->>Bootstrap: write_skills()

    alt skip_models == false
        Bootstrap->>Ollama: GET /api/tags (ollama_has_model)
        Ollama-->>Bootstrap: existing models
        Bootstrap->>Ollama: pull required models (subprocess ollama pull)
        par optional models
            Bootstrap->>Ollama: background pull optional models
        end
    end

    alt skip_orgs == false
        Bootstrap->>GitHub: clone/refresh org repos
    end

    alt exec_opencode == true
        Bootstrap->>OpenCode: os.execvp("opencode")
        Bootstrap-->>CLI: process replaced
    else exec_opencode == false
        Bootstrap-->>CLI: return provision_config
        CLI-->>User: print tier, agents
    end

Entity relationship diagram for ConfigStore SQLite schema

erDiagram
    models {
        TEXT name PK
        TEXT data
    }
    agents {
        TEXT name PK
        TEXT data
    }
    mcp_servers {
        TEXT name PK
        TEXT data
    }
    plugins {
        TEXT name PK
        TEXT data
    }
    skills {
        TEXT name PK
        TEXT data
    }
    tools {
        TEXT name PK
        TEXT data
    }
    github_orgs {
        TEXT org PK
        TEXT data
    }
    instructions {
        TEXT path PK
    }
    permissions {
        TEXT pattern PK
        TEXT policy
    }
    kv {
        TEXT key PK
        TEXT value
    }

    models ||--o{ agents : "referenced_by_model_name"
    models ||--o{ skills : "used_in_skill_config"
    models ||--o{ tools : "used_in_tool_config"
    mcp_servers ||--o{ tools : "mcp_server_field"
    github_orgs ||--o{ kv : "org_specific_settings_optional"

Class diagram for ConfigStore and REST service modules

classDiagram
    class OllamaModelDef {
        +str name
        +str role
        +bool required
        +int vram_estimate_mb
    }

    class AgentDef {
        +str name
        +str model
        +str mode
        +str prompt_file
        +str description
        +list~str~ tools_disabled
        +str escalation_model
    }

    class MCPServerDef {
        +str name
        +list~str~ command
        +dict~str,str~ env
        +bool enabled
    }

    class PluginDef {
        +str name
        +str source
        +str path
        +dict~str,Any~ config
    }

    class SkillDef {
        +str name
        +str description
        +str content_md
        +list~str~ permissions
        +str agent_binding
    }

    class CustomToolDef {
        +str name
        +str description
        +str mcp_server
        +str command
        +list~str~ args
        +dict~str,str~ env
    }

    class GitHubOrgDef {
        +str org
        +str token_env
        +list~str~ default_repos
    }

    class ConfigStore {
        -str _db_path
        -aiosqlite.Connection _db
        +ConfigStore(db_path: str)
        +open() void
        +close() void
        +seed_defaults() void
        +list_models() list~dict~
        +get_model(name: str) dict
        +upsert_model(data: dict) void
        +delete_model(name: str) bool
        +list_agents() list~dict~
        +get_agent(name: str) dict
        +upsert_agent(data: dict) void
        +delete_agent(name: str) bool
        +list_mcp_servers() list~dict~
        +get_mcp_server(name: str) dict
        +upsert_mcp_server(data: dict) void
        +delete_mcp_server(name: str) bool
        +list_plugins() list~dict~
        +get_plugin(name: str) dict
        +upsert_plugin(data: dict) void
        +delete_plugin(name: str) bool
        +list_skills() list~dict~
        +get_skill(name: str) dict
        +upsert_skill(data: dict) void
        +delete_skill(name: str) bool
        +list_tools() list~dict~
        +get_tool(name: str) dict
        +upsert_tool(data: dict) void
        +delete_tool(name: str) bool
        +list_github_orgs() list~dict~
        +get_github_org(org: str) dict
        +upsert_github_org(data: dict) void
        +delete_github_org(org: str) bool
        +list_instructions() list~str~
        +add_instruction(path: str) void
        +remove_instruction(path: str) bool
        +list_permissions() dict~str,str~
        +set_permission(pattern: str, policy: str) void
        +remove_permission(pattern: str) bool
        +kv_get(key: str) str
        +kv_set(key: str, value: str) void
        +build_provision_response(license_info: dict, ollama_api_url: str) dict
        -_upsert(table: str, key: str, data: dict) void
        -_get_one(table: str, key: str) dict
        -_get_all(table: str) list~dict~
        -_delete(table: str, key: str) bool
    }

    class ProvisionModule {
        <<module>>
        +init_provision(store: ConfigStore, license_validator: Any) void
        +provision(request) Response
        +health(request) Response
        -_validate_license(license_key: str) dict
        -_filter_by_tier(provision: dict, tier: str) dict
        -_filter_models_by_gpu(models: list~dict~, vram_mb: int) list~dict~
    }

    class AdminModule {
        <<module>>
        +init_admin(store: ConfigStore, jwt_secret: str) void
        +require_admin(fn) function
        +list_models() Response
        +upsert_models() Response
        +get_models(key: str) Response
        +delete_models(key: str) Response
        +list_agents() Response
        +upsert_agents() Response
        +list_mcp_servers() Response
        +list_plugins() Response
        +list_skills() Response
        +list_tools() Response
        +list_github_orgs() Response
        +list_instructions() Response
        +add_instruction() Response
        +remove_instruction(path: str) Response
        +list_permissions() Response
        +set_permission() Response
    }

    class RestAppFactory {
        +create_rest_app(config_store: ConfigStore, jwt_secret: str, license_validator: Any) Quart
    }

    class BootstrapClient {
        <<module>>
        +provision(api_url: str, license_key: str) dict
        +bootstrap(api_url: str, license_key: str, skip_models: bool, skip_orgs: bool, exec_opencode: bool) dict
        +keepalive(api_url: str, license_key: str, interval: int) void
    }

    class ModelManager {
        <<module>>
        +ollama_has_model(name: str, api_url: str) bool
        +ensure_models(models: list~dict~, api_url: str) void
    }

    class OrgManager {
        <<module>>
        +setup_github_orgs(orgs: list~dict~) void
    }

    class ConfigWriter {
        <<module>>
        +write_opencode_json(config: dict) Path
        +write_agents_md(config: dict) Path
        +write_agent_prompts(config: dict) Path
        +write_skills(config: dict) Path
    }

    ConfigStore --> OllamaModelDef : uses_defaults
    ConfigStore --> AgentDef : uses_defaults
    ConfigStore --> MCPServerDef : uses_defaults
    ConfigStore --> PluginDef : uses_defaults
    ConfigStore --> SkillDef : uses_defaults
    ConfigStore --> CustomToolDef : uses_defaults
    ConfigStore --> GitHubOrgDef : uses_defaults

    RestAppFactory --> ConfigStore : injects
    RestAppFactory --> ProvisionModule : init_provision
    RestAppFactory --> AdminModule : init_admin

    ProvisionModule --> ConfigStore : uses
    AdminModule --> ConfigStore : uses

    BootstrapClient --> ConfigWriter : calls
    BootstrapClient --> ModelManager : calls
    BootstrapClient --> OrgManager : calls

File-Level Changes

Change	Details	Files
Add SQLite-backed ConfigStore with default entities and provisioning response builder.	Introduce dataclasses for models, agents, MCP servers, plugins, skills, tools, GitHub orgs, and related schema tables. Implement async CRUD methods for all entity types, including instructions, permissions, and a small kv store. Seed the store with default models, agents, MCP servers, skills, instruction paths, and bash permissions on startup. Provide a build_provision_response helper that assembles the full provisioning payload (license, Ollama models, agents, skills, tools, orgs, permissions, instructions).	`penguincode_cli/server/models/config_store.py` `penguincode_cli/server/models/__init__.py`
Add Quart REST app exposing provisioning and JWT-protected admin CRUD APIs, wired into the existing server lifecycle.	Create a factory to build the Quart app, initialize shared ConfigStore and JWT/license dependencies, and register blueprints. Implement POST /api/v1/provision with license-tier feature gating, GPU-aware model filtering, and a health endpoint. Add generic admin CRUD routes for all entity types plus instructions and permissions, guarded by an admin-scope JWT decorator. Integrate ConfigStore and REST app startup/shutdown into PenguinCodeServer alongside the gRPC server, including Hypercorn-based serving and CLI/compose wiring.	`penguincode_cli/server/rest_app.py` `penguincode_cli/server/services/provision.py` `penguincode_cli/server/services/admin.py` `penguincode_cli/server/main.py` `penguincode_cli/main.py` `docker-compose.yml` `pyproject.toml` `tests/conftest.py` `tests/test_rest_api.py` `tests/test_admin_api.py` `tests/test_provision.py`
Add client bootstrap flow that provisions from code-api, writes OpenCode config files, manages Ollama models, and sets up GitHub org repos.	Implement bootstrap() and provision() helpers that call the REST API with license/GPU metadata, fall back to a cached config on failure, and optionally exec into the opencode binary. Generate opencode.json, AGENTS.md, per-agent prompt files, and skill SKILL.md files from a provisioning response with env-var based model overrides and tool gating for non-writing agents. Add an Ollama model manager that checks for local models via the Ollama HTTP API and pulls missing required/optional models using the CLI, with optional models pulled in the background. Add a GitHub org manager that clones or refreshes configured org repos into a local ~/.penguincode/repos directory using gh or git.	`penguincode_cli/client/bootstrap.py` `penguincode_cli/client/config_writer.py` `penguincode_cli/client/model_manager.py` `penguincode_cli/client/org_manager.py` `penguincode_cli/main.py` `tests/test_config_writer.py` `tests/test_integration.py`
Seed default agent and skill prompt content used by the provisioning and client config pipeline.	Add markdown prompt templates for multiple agents (foreman, executor, planner, reviewer, explorer, debugger, tester, researcher) describing roles, tools, workflows, and output rules. Add markdown skill descriptions for core workflows (brainstorming, executing plans, TDD, systematic debugging, code review, verification-before-completion). Wire defaults into ConfigStore seeding so they appear in provisioning responses and downstream config output.	`penguincode_cli/defaults/agents/foreman.md` `penguincode_cli/defaults/agents/executor.md` `penguincode_cli/defaults/agents/planner.md` `penguincode_cli/defaults/agents/reviewer.md` `penguincode_cli/defaults/agents/explorer.md` `penguincode_cli/defaults/agents/debugger.md` `penguincode_cli/defaults/agents/tester.md` `penguincode_cli/defaults/agents/researcher.md` `penguincode_cli/defaults/skills/brainstorming.md` `penguincode_cli/defaults/skills/executing-plans.md` `penguincode_cli/defaults/skills/systematic-debugging.md` `penguincode_cli/defaults/skills/test-driven-development.md` `penguincode_cli/defaults/skills/verification-before-completion.md` `penguincode_cli/defaults/skills/code-review.md`

Tips and commands

Interacting with Sourcery

Trigger a new review: Comment @sourcery-ai review on the pull request.
Continue discussions: Reply directly to Sourcery's review comments.
Generate a GitHub issue from a review comment: Ask Sourcery to create an
issue from a review comment by replying to it. You can also reply to a
review comment with @sourcery-ai issue to create an issue from it.
Generate a pull request title: Write @sourcery-ai anywhere in the pull
request title to generate a title at any time. You can also comment
@sourcery-ai title on the pull request to (re-)generate the title at any time.
Generate a pull request summary: Write @sourcery-ai summary anywhere in
the pull request body to generate a PR summary at any time exactly where you
want it. You can also comment @sourcery-ai summary on the pull request to
(re-)generate the summary at any time.
Generate reviewer's guide: Comment @sourcery-ai guide on the pull
request to (re-)generate the reviewer's guide at any time.
Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
pull request to resolve all Sourcery comments. Useful if you've already
addressed all the comments and don't want to see them anymore.
Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
request to dismiss all existing Sourcery reviews. Especially useful if you
want to start fresh with a new review - don't forget to comment
@sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

Enable or disable review features such as the Sourcery-generated pull request
summary, the reviewer's guide, and others.
Change the review language.
Add, remove or edit custom review instructions.
Adjust other review settings.

Getting Help

Contact our support team for questions or feedback.
Visit our documentation for detailed guides and information.
Keep in touch with the Sourcery team by following us on X/Twitter, LinkedIn or GitHub.

sourcery-ai

Hey - I've found 6 security issues, 7 other issues, and left some high level feedback:

Security issues:

Detected subprocess function 'run' without a static string. If this data can be controlled by a malicious actor, it may be an instance of command injection. Audit the use of this call to ensure it is not controllable by an external resource. You may consider using 'shlex.escape()'. (link)
Detected subprocess function 'run' without a static string. If this data can be controlled by a malicious actor, it may be an instance of command injection. Audit the use of this call to ensure it is not controllable by an external resource. You may consider using 'shlex.escape()'. (link)
Detected subprocess function 'run' without a static string. If this data can be controlled by a malicious actor, it may be an instance of command injection. Audit the use of this call to ensure it is not controllable by an external resource. You may consider using 'shlex.escape()'. (link)
Detected subprocess function 'run' without a static string. If this data can be controlled by a malicious actor, it may be an instance of command injection. Audit the use of this call to ensure it is not controllable by an external resource. You may consider using 'shlex.escape()'. (link)
Avoiding SQL string concatenation: untrusted input concatenated with raw SQL query can result in SQL Injection. In order to execute raw query safely, prepared statement should be used. SQLAlchemy provides TextualSQL to easily used prepared statement with named parameters. For complex SQL composition, use SQL Expression Language or Schema Definition Language. In most cases, SQLAlchemy ORM will be a better option. (link)
Avoiding SQL string concatenation: untrusted input concatenated with raw SQL query can result in SQL Injection. In order to execute raw query safely, prepared statement should be used. SQLAlchemy provides TextualSQL to easily used prepared statement with named parameters. For complex SQL composition, use SQL Expression Language or Schema Definition Language. In most cases, SQLAlchemy ORM will be a better option. (link)

General comments:

The REST blueprints rely on module-level globals (_config_store, _jwt_secret, _license_validator) set via init functions; consider passing these as app config or using app context to avoid hidden state and make reuse/testing in different processes or with multiple apps safer.
In ConfigStore._get_one and related helpers you use execute_fetchall and then index the first row; switching to execute_fetchone (or cursor.fetchone) would better express the intent and avoid unnecessary list allocation.
The license validation and GPU detection paths currently call potentially blocking operations (_license_validator.validate, subprocess.run via nvidia-smi) directly in async flows; consider offloading these to run_in_executor to prevent event loop stalls under load.

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- The REST blueprints rely on module-level globals (`_config_store`, `_jwt_secret`, `_license_validator`) set via init functions; consider passing these as app config or using app context to avoid hidden state and make reuse/testing in different processes or with multiple apps safer.
- In `ConfigStore._get_one` and related helpers you use `execute_fetchall` and then index the first row; switching to `execute_fetchone` (or `cursor.fetchone`) would better express the intent and avoid unnecessary list allocation.
- The license validation and GPU detection paths currently call potentially blocking operations (`_license_validator.validate`, `subprocess.run` via `nvidia-smi`) directly in async flows; consider offloading these to `run_in_executor` to prevent event loop stalls under load.

## Individual Comments

### Comment 1
<location> `penguincode_cli/server/models/config_store.py:360-361` </location>
<code_context>
+            await self._upsert("agents", agent.name, asdict(agent))
+        for mcp in _default_mcp_servers():
+            await self._upsert("mcp_servers", mcp.name, asdict(mcp))
+        for skill in _default_skills():
+            await self._upsert("skills", skill.name, asdict(skill))
+        for path in _default_instructions():
+            await self._db.execute(
</code_context>

<issue_to_address>
**suggestion (bug_risk):** Default skills are seeded without loading their markdown content from the defaults directory.

`_default_skills` currently sets `content_md` to `""` and `seed_defaults` never reads from the new `penguincode_cli/defaults/skills/*.md` files, so seeded skills end up with empty content and `write_skills` falls back to placeholders. To actually ship the curated prompts, `seed_defaults` (or equivalent) should load the corresponding `penguincode_cli/defaults/skills/<name>.md` file for each skill and assign it to `content_md` before upserting.

Suggested implementation:

```python
import logging
from pathlib import Path

```

```python
from dataclasses import asdict, replace

```

```python
        for mcp in _default_mcp_servers():
            await self._upsert("mcp_servers", mcp.name, asdict(mcp))

        # Load curated markdown content for default skills, if available.
        skills_defaults_dir = (
            Path(__file__).resolve().parent.parent.parent / "defaults" / "skills"
        )

        for skill in _default_skills():
            content_md = ""

            skill_md_path = skills_defaults_dir / f"{skill.name}.md"
            if skill_md_path.is_file():
                try:
                    content_md = skill_md_path.read_text(encoding="utf-8")
                except OSError:
                    logger.warning(
                        "Failed to read default skill markdown for %s from %s",
                        skill.name,
                        skill_md_path,
                    )

            # If we successfully loaded content, override the empty content_md
            if content_md:
                skill = replace(skill, content_md=content_md)

            await self._upsert("skills", skill.name, asdict(skill))

        for path in _default_instructions():

```

- If `config_store.py` does not already import `logging` or `from dataclasses import asdict` exactly as shown, adjust the import search/replace blocks to match the existing import style.
- Confirm that `penguincode_cli/defaults/skills/<name>.md` filenames exactly match `skill.name`; if they differ (e.g. kebab-case vs snake_case), insert the appropriate name-to-filename mapping in the `skill_md_path` construction.
</issue_to_address>

### Comment 2
<location> `penguincode_cli/server/models/config_store.py:601-607` </location>
<code_context>
+        instructions = await self.list_instructions()
+        permissions = await self.list_permissions()
+
+        # Build agents dict (name -> config) for the response
+        agents_dict = {}
+        for a in agents_raw:
+            agents_dict[a["name"]] = {
+                "model": a["model"],
+                "mode": a.get("mode", "subagent"),
</code_context>

<issue_to_address>
**suggestion:** Provisioned agent configs drop useful fields like description, prompt_file, and escalation_model.

`AgentDef` exposes richer metadata (`description`, `prompt_file`, `tools_disabled`, `escalation_model`) that isn’t surfaced in the provision response, where we only return `model` and `mode`. Exposing at least `description` (and possibly prompt path / tooling constraints) in the `agents` block would let clients (e.g., `config_writer.write_agent_prompts` and related tools) derive tier-specific behavior without hardcoding agent semantics.

```suggestion
        # Build agents dict (name -> config) for the response
        agents_dict = {}
        for a in agents_raw:
            agents_dict[a["name"]] = {
                "model": a["model"],
                "mode": a.get("mode", "subagent"),
                "description": a.get("description"),
                "prompt_file": a.get("prompt_file"),
                "tools_disabled": a.get("tools_disabled"),
                "escalation_model": a.get("escalation_model"),
            }
```
</issue_to_address>

### Comment 3
<location> `penguincode_cli/server/services/provision.py:111-127` </location>
<code_context>
+    return provision
+
+
+def _filter_models_by_gpu(models: list[dict], vram_mb: int) -> list[dict]:
+    """Adjust model requirements based on client GPU VRAM."""
+    if vram_mb <= 0:
+        return models
+
+    filtered = []
+    for m in models:
+        entry = dict(m)
+        # If client VRAM is small, mark large models as not required
+        if vram_mb < 4096 and "13b" in m["name"]:
+            entry["required"] = False
+        elif vram_mb < 8192 and "34b" in m["name"]:
+            entry["required"] = False
+        elif vram_mb < 16384 and "70b" in m["name"]:
+            entry["required"] = False
+        filtered.append(entry)
+    return filtered
+
</code_context>

<issue_to_address>
**suggestion:** GPU-based model filtering relies on name substrings and ignores the vram_estimate metadata you store.

Since `OllamaModelDef` already exposes `vram_estimate_mb`, `_filter_models_by_gpu` can rely on that instead of hardcoded name patterns like `"13b"` / `"34b"` / `"70b"`. Using `m.get("vram_estimate_mb")` (with a reasonable fallback) and comparing to `vram_mb` would make the requirement tiering metadata-driven, resilient to naming changes, and easier to extend to new models.

```suggestion
def _filter_models_by_gpu(models: list[dict], vram_mb: int) -> list[dict]:
    """Adjust model requirements based on client GPU VRAM.

    Uses the model's `vram_estimate_mb` metadata (if present) instead of
    hardcoded name patterns to decide whether a model should be required.
    """
    if vram_mb <= 0:
        return models

    filtered: list[dict] = []
    for m in models:
        entry = dict(m)
        vram_estimate = entry.get("vram_estimate_mb")

        # If we have a numeric VRAM estimate and the client's VRAM is below it,
        # mark the model as not required. If no estimate is present, leave as-is.
        if isinstance(vram_estimate, (int, float)) and vram_estimate > 0:
            if vram_mb < vram_estimate:
                entry["required"] = False

        filtered.append(entry)

    return filtered
```
</issue_to_address>

### Comment 4
<location> `penguincode_cli/client/model_manager.py:55-64` </location>
<code_context>
+async def ensure_models(
</code_context>

<issue_to_address>
**issue (bug_risk):** Model pulls use the local `ollama` CLI and ignore the configured Ollama API URL.

In `ensure_models`, model existence is checked via the HTTP API (`api_url`), but pulls use `_ollama_pull_sync`, which always calls the local `ollama` CLI and ignores `api_url`. This will fail for remote/containerised Ollama instances without a local CLI. Please either perform pulls via the HTTP API (e.g. `POST /api/pull`) or make the CLI call honor a configured endpoint (e.g. env var or `api_url`), so behavior is consistent in non-local setups.
</issue_to_address>

### Comment 5
<location> `penguincode_cli/server/main.py:118` </location>
<code_context>
         logger.info("PenguinCode gRPC Server started")

+        # --- REST API -------------------------------------------------------
+        jwt_secret = self.settings.auth.jwt_secret or secrets.token_hex(32)
+
+        # Try to load penguin-licensing if available
</code_context>

<issue_to_address>
**question (bug_risk):** Randomly generating a JWT secret when none is configured makes the REST admin API effectively unusable across restarts.

Because a new random secret is generated whenever `settings.auth.jwt_secret` is empty, admin JWTs become invalid on every restart and external clients cannot construct valid tokens unless they can read the in-memory secret. If the goal is to disable admin APIs when no secret is configured, consider failing fast or skipping admin route registration instead of silently randomizing. Otherwise, derive the secret from config/env so it remains stable across restarts.
</issue_to_address>

### Comment 6
<location> `penguincode_cli/client/org_manager.py:17-26` </location>
<code_context>
+def setup_github_orgs(orgs: list[dict]) -> None:
</code_context>

<issue_to_address>
**suggestion (bug_risk):** GitHub repo clone/refresh ignores subprocess return codes and drops stderr, making failures hard to diagnose.

The clone and refresh paths should check `subprocess.run`’s `returncode` and, on failure, log a trimmed stderr (and optionally stdout). Right now failures are silent, which makes it hard to detect auth/network issues and explains missing or stale repos without clear error signals.

Suggested implementation:

```python
import os
import shutil
import subprocess
from pathlib import Path

logger = logging.getLogger(__name__)

_REPOS_DIR = Path.home() / ".penguincode" / "repos"


def _run_command(cmd: list[str], cwd: Path | None = None) -> bool:
    """Run a subprocess command and log stderr/stdout on failure.

    Returns True on success (exit code 0), False otherwise.
    """
    result = subprocess.run(
        cmd,
        cwd=cwd,
        capture_output=True,
        text=True,
    )

    if result.returncode != 0:
        stderr = (result.stderr or "").strip()
        stdout = (result.stdout or "").strip()

        msg_lines = [
            f"Command failed with exit code {result.returncode}: {' '.join(cmd)}",
        ]
        if stderr:
            msg_lines.append(f"stderr: {stderr}")
        if stdout:
            msg_lines.append(f"stdout: {stdout}")

        logger.error("\n".join(msg_lines))
        return False

    return True

```

To fully implement the comment in the clone/refresh paths in `setup_github_orgs`, update every `subprocess.run(...)` call in this function (and any helpers it uses for cloning/updating repos) to:

1. Replace direct `subprocess.run([...])` (or variants) with `_run_command([...], cwd=...)`.
2. Check the returned boolean; on `False`, either:
   - `logger.warning` or `logger.error` with org/repo context and `continue` to the next repo, or
   - propagate/raise if failures should be fatal.
3. Remove any existing manual `returncode` checks that become redundant, ensuring no paths silently ignore non‑zero exit codes or drop stderr/stdout.

For example, a previous pattern like:
```python
subprocess.run(["gh", "repo", "clone", full_name, str(target_dir)])
```
should become:
```python
if not _run_command(["gh", "repo", "clone", full_name, str(target_dir)]):
    logger.warning("Failed to clone repo %s into %s", full_name, target_dir)
    return  # or `continue` depending on the surrounding loop/logic
```

Similarly, any `git fetch`, `git pull`, or equivalent refresh commands should use `_run_command` so that auth/network issues and other failures are logged with trimmed stderr/stdout instead of failing silently.
</issue_to_address>

### Comment 7
<location> `tests/test_integration.py:209` </location>
<code_context>
+# ---------------------------------------------------------------------------
+
+
+class TestCacheFallback:
+    async def test_offline_cache_fallback(self, api_client, tmp_path, monkeypatch):
+        """Provision once → cache stored → load from cache works."""
</code_context>

<issue_to_address>
**suggestion (testing):** Consider also testing the cache-miss path that should raise a RuntimeError

Right now this only asserts the cached “happy path.” Please also add a test where both the network call and cache fail: monkeypatch `httpx.AsyncClient.post` in `bootstrap.provision` to raise, ensure `_CACHE_FILE` is absent, and assert that a `RuntimeError` with the expected message is raised.

Suggested implementation:

```python
class TestCacheFallback:
    async def test_offline_cache_fallback(self, api_client, tmp_path, monkeypatch):
        """Provision once → cache stored → load from cache works."""
        from penguincode_cli.client import bootstrap
        import httpx

        cache_dir = tmp_path / ".penguincode"
        cache_file = cache_dir / "config.cache"
        monkeypatch.setattr(bootstrap, "_CACHE_DIR", cache_dir)
        monkeypatch.setattr(bootstrap, "_CACHE_FILE", cache_file)

        # First provision via REST to get real data and populate the cache
        online_config = await _provision(api_client)
        assert cache_file.exists(), "Provisioning should write the cache file"

        # Now simulate being offline: network calls fail, but cache should be used
        async def _raise_http_error(*args, **kwargs):
            raise httpx.HTTPError("simulated offline error")

        monkeypatch.setattr(bootstrap.httpx.AsyncClient, "post", _raise_http_error)

        offline_config = await _provision(api_client)
        assert offline_config == online_config

    async def test_offline_cache_miss_raises_runtime_error(
        self, api_client, tmp_path, monkeypatch
    ):
        """When both network and cache fail, provision should raise RuntimeError."""
        from penguincode_cli.client import bootstrap
        import httpx
        import pytest

        cache_dir = tmp_path / ".penguincode"
        cache_file = cache_dir / "config.cache"
        monkeypatch.setattr(bootstrap, "_CACHE_DIR", cache_dir)
        monkeypatch.setattr(bootstrap, "_CACHE_FILE", cache_file)

        # Ensure there's no cache present
        if cache_file.exists():
            cache_file.unlink()
        if cache_dir.exists():
            # remove empty cache dir so we cover the "no cache at all" branch
            try:
                cache_dir.rmdir()
            except OSError:
                # directory not empty — tests may adjust this as needed
                pass

        # Simulate network failure in bootstrap.provision
        async def _raise_http_error(*args, **kwargs):
            raise httpx.HTTPError("simulated offline error")

        monkeypatch.setattr(bootstrap.httpx.AsyncClient, "post", _raise_http_error)

        with pytest.raises(RuntimeError) as excinfo:
            await _provision(api_client)

        # Ensure we hit the "no cache + offline" error path
        msg = str(excinfo.value)
        assert msg, "RuntimeError message should not be empty"
        assert "cache" in msg.lower() or "offline" in msg.lower()

```

1. If `httpx` or `pytest` are already imported at the top of `tests/test_integration.py`, you can remove the inline imports inside the test methods and rely on the module-level imports instead.
2. Adjust the final assertion on the error message in `test_offline_cache_miss_raises_runtime_error` to match the exact error text raised by `bootstrap.provision`, if your codebase specifies a particular message.
3. If `_provision(api_client)` is not the correct helper to trigger `bootstrap.provision` (e.g., if it requires additional arguments), update the calls in both tests accordingly.
</issue_to_address>

### Comment 8
<location> `penguincode_cli/client/bootstrap.py:60-65` </location>
<code_context>
            result = subprocess.run(
                [nvidia_smi, "--query-gpu=memory.total,name", "--format=csv,noheader,nounits"],
                capture_output=True,
                text=True,
                timeout=5,
            )
</code_context>

<issue_to_address>
**security (python.lang.security.audit.dangerous-subprocess-use-audit):** Detected subprocess function 'run' without a static string. If this data can be controlled by a malicious actor, it may be an instance of command injection. Audit the use of this call to ensure it is not controllable by an external resource. You may consider using 'shlex.escape()'.

*Source: opengrep*
</issue_to_address>

### Comment 9
<location> `penguincode_cli/client/model_manager.py:38-43` </location>
<code_context>
        result = subprocess.run(
            [ollama, "pull", name],
            capture_output=True,
            text=True,
            timeout=600,
        )
</code_context>

<issue_to_address>
**security (python.lang.security.audit.dangerous-subprocess-use-audit):** Detected subprocess function 'run' without a static string. If this data can be controlled by a malicious actor, it may be an instance of command injection. Audit the use of this call to ensure it is not controllable by an external resource. You may consider using 'shlex.escape()'.

*Source: opengrep*
</issue_to_address>

### Comment 10
<location> `penguincode_cli/client/org_manager.py:47-52` </location>
<code_context>
                    subprocess.run(
                        [git or "git", "pull", "--ff-only"],
                        cwd=str(repo_dir),
                        capture_output=True,
                        timeout=60,
                    )
</code_context>

<issue_to_address>
**security (python.lang.security.audit.dangerous-subprocess-use-audit):** Detected subprocess function 'run' without a static string. If this data can be controlled by a malicious actor, it may be an instance of command injection. Audit the use of this call to ensure it is not controllable by an external resource. You may consider using 'shlex.escape()'.

*Source: opengrep*
</issue_to_address>

### Comment 11
<location> `penguincode_cli/client/org_manager.py:65` </location>
<code_context>
                    subprocess.run(cmd, capture_output=True, timeout=120)
</code_context>

<issue_to_address>
**security (python.lang.security.audit.dangerous-subprocess-use-audit):** Detected subprocess function 'run' without a static string. If this data can be controlled by a malicious actor, it may be an instance of command injection. Audit the use of this call to ensure it is not controllable by an external resource. You may consider using 'shlex.escape()'.

*Source: opengrep*
</issue_to_address>

### Comment 12
<location> `penguincode_cli/server/models/config_store.py:382-385` </location>
<code_context>
        await self._db.execute(
            f"INSERT OR REPLACE INTO {table} ({key_col}, data) VALUES (?, ?)",
            (key, json.dumps(data)),
        )
</code_context>

<issue_to_address>
**security (python.sqlalchemy.security.sqlalchemy-execute-raw-query):** Avoiding SQL string concatenation: untrusted input concatenated with raw SQL query can result in SQL Injection. In order to execute raw query safely, prepared statement should be used. SQLAlchemy provides TextualSQL to easily used prepared statement with named parameters. For complex SQL composition, use SQL Expression Language or Schema Definition Language. In most cases, SQLAlchemy ORM will be a better option.

*Source: opengrep*
</issue_to_address>

### Comment 13
<location> `penguincode_cli/server/models/config_store.py:405-407` </location>
<code_context>
        cursor = await self._db.execute(
            f"DELETE FROM {table} WHERE {key_col} = ?", (key,),
        )
</code_context>

<issue_to_address>
**security (python.sqlalchemy.security.sqlalchemy-execute-raw-query):** Avoiding SQL string concatenation: untrusted input concatenated with raw SQL query can result in SQL Injection. In order to execute raw query safely, prepared statement should be used. SQLAlchemy provides TextualSQL to easily used prepared statement with named parameters. For complex SQL composition, use SQL Expression Language or Schema Definition Language. In most cases, SQLAlchemy ORM will be a better option.

*Source: opengrep*
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

sourcery-ai · 2026-02-10T23:23:53Z

penguincode_cli/server/models/config_store.py

+        for skill in _default_skills():
+            await self._upsert("skills", skill.name, asdict(skill))


suggestion (bug_risk): Default skills are seeded without loading their markdown content from the defaults directory.

_default_skills currently sets content_md to "" and seed_defaults never reads from the new penguincode_cli/defaults/skills/*.md files, so seeded skills end up with empty content and write_skills falls back to placeholders. To actually ship the curated prompts, seed_defaults (or equivalent) should load the corresponding penguincode_cli/defaults/skills/<name>.md file for each skill and assign it to content_md before upserting.

Suggested implementation:

import logging from pathlib import Path

from dataclasses import asdict, replace

for mcp in _default_mcp_servers(): await self._upsert("mcp_servers", mcp.name, asdict(mcp)) # Load curated markdown content for default skills, if available. skills_defaults_dir = ( Path(__file__).resolve().parent.parent.parent / "defaults" / "skills" ) for skill in _default_skills(): content_md = "" skill_md_path = skills_defaults_dir / f"{skill.name}.md" if skill_md_path.is_file(): try: content_md = skill_md_path.read_text(encoding="utf-8") except OSError: logger.warning( "Failed to read default skill markdown for %s from %s", skill.name, skill_md_path, ) # If we successfully loaded content, override the empty content_md if content_md: skill = replace(skill, content_md=content_md) await self._upsert("skills", skill.name, asdict(skill)) for path in _default_instructions():

If config_store.py does not already import logging or from dataclasses import asdict exactly as shown, adjust the import search/replace blocks to match the existing import style.

Confirm that penguincode_cli/defaults/skills/<name>.md filenames exactly match skill.name; if they differ (e.g. kebab-case vs snake_case), insert the appropriate name-to-filename mapping in the skill_md_path construction.

sourcery-ai · 2026-02-10T23:23:53Z

penguincode_cli/server/models/config_store.py

+        # Build agents dict (name -> config) for the response
+        agents_dict = {}
+        for a in agents_raw:
+            agents_dict[a["name"]] = {
+                "model": a["model"],
+                "mode": a.get("mode", "subagent"),
+            }


suggestion: Provisioned agent configs drop useful fields like description, prompt_file, and escalation_model.

AgentDef exposes richer metadata (description, prompt_file, tools_disabled, escalation_model) that isn’t surfaced in the provision response, where we only return model and mode. Exposing at least description (and possibly prompt path / tooling constraints) in the agents block would let clients (e.g., config_writer.write_agent_prompts and related tools) derive tier-specific behavior without hardcoding agent semantics.

Suggested change

# Build agents dict (name -> config) for the response

agents_dict = {}

for a in agents_raw:

agents_dict[a["name"]] = {

"model": a["model"],

"mode": a.get("mode", "subagent"),

}

# Build agents dict (name -> config) for the response

agents_dict = {}

for a in agents_raw:

agents_dict[a["name"]] = {

"model": a["model"],

"mode": a.get("mode", "subagent"),

"description": a.get("description"),

"prompt_file": a.get("prompt_file"),

"tools_disabled": a.get("tools_disabled"),

"escalation_model": a.get("escalation_model"),

}

sourcery-ai · 2026-02-10T23:23:53Z

penguincode_cli/server/services/provision.py

+def _filter_models_by_gpu(models: list[dict], vram_mb: int) -> list[dict]:
+    """Adjust model requirements based on client GPU VRAM."""
+    if vram_mb <= 0:
+        return models
+
+    filtered = []
+    for m in models:
+        entry = dict(m)
+        # If client VRAM is small, mark large models as not required
+        if vram_mb < 4096 and "13b" in m["name"]:
+            entry["required"] = False
+        elif vram_mb < 8192 and "34b" in m["name"]:
+            entry["required"] = False
+        elif vram_mb < 16384 and "70b" in m["name"]:
+            entry["required"] = False
+        filtered.append(entry)
+    return filtered


suggestion: GPU-based model filtering relies on name substrings and ignores the vram_estimate metadata you store.

Since OllamaModelDef already exposes vram_estimate_mb, _filter_models_by_gpu can rely on that instead of hardcoded name patterns like "13b" / "34b" / "70b". Using m.get("vram_estimate_mb") (with a reasonable fallback) and comparing to vram_mb would make the requirement tiering metadata-driven, resilient to naming changes, and easier to extend to new models.

Suggested change

def _filter_models_by_gpu(models: list[dict], vram_mb: int) -> list[dict]:

"""Adjust model requirements based on client GPU VRAM."""

if vram_mb <= 0:

return models

filtered = []

for m in models:

entry = dict(m)

# If client VRAM is small, mark large models as not required

if vram_mb < 4096 and "13b" in m["name"]:

entry["required"] = False

elif vram_mb < 8192 and "34b" in m["name"]:

entry["required"] = False

elif vram_mb < 16384 and "70b" in m["name"]:

entry["required"] = False

filtered.append(entry)

return filtered

def _filter_models_by_gpu(models: list[dict], vram_mb: int) -> list[dict]:

"""Adjust model requirements based on client GPU VRAM.

Uses the model's `vram_estimate_mb` metadata (if present) instead of

hardcoded name patterns to decide whether a model should be required.

"""

if vram_mb <= 0:

return models

filtered: list[dict] = []

for m in models:

entry = dict(m)

vram_estimate = entry.get("vram_estimate_mb")

# If we have a numeric VRAM estimate and the client's VRAM is below it,

# mark the model as not required. If no estimate is present, leave as-is.

if isinstance(vram_estimate, (int, float)) and vram_estimate > 0:

if vram_mb < vram_estimate:

entry["required"] = False

filtered.append(entry)

return filtered

sourcery-ai · 2026-02-10T23:23:53Z

penguincode_cli/client/model_manager.py

+async def ensure_models(
+    models: list[dict[str, Any]],
+    api_url: str = "http://localhost:11434",
+) -> None:
+    """Ensure all listed models are available in Ollama.
+
+    Required models are pulled synchronously (blocking).
+    Optional models are pulled in background tasks.
+    """
+    required = [m for m in models if m.get("required", True)]


issue (bug_risk): Model pulls use the local ollama CLI and ignore the configured Ollama API URL.

In ensure_models, model existence is checked via the HTTP API (api_url), but pulls use _ollama_pull_sync, which always calls the local ollama CLI and ignores api_url. This will fail for remote/containerised Ollama instances without a local CLI. Please either perform pulls via the HTTP API (e.g. POST /api/pull) or make the CLI call honor a configured endpoint (e.g. env var or api_url), so behavior is consistent in non-local setups.

sourcery-ai · 2026-02-10T23:23:53Z

penguincode_cli/server/main.py

        logger.info("PenguinCode gRPC Server started")

+        # --- REST API -------------------------------------------------------
+        jwt_secret = self.settings.auth.jwt_secret or secrets.token_hex(32)


question (bug_risk): Randomly generating a JWT secret when none is configured makes the REST admin API effectively unusable across restarts.

Because a new random secret is generated whenever settings.auth.jwt_secret is empty, admin JWTs become invalid on every restart and external clients cannot construct valid tokens unless they can read the in-memory secret. If the goal is to disable admin APIs when no secret is configured, consider failing fast or skipping admin route registration instead of silently randomizing. Otherwise, derive the secret from config/env so it remains stable across restarts.

sourcery-ai · 2026-02-10T23:23:54Z

penguincode_cli/client/model_manager.py

+        result = subprocess.run(
+            [ollama, "pull", name],
+            capture_output=True,
+            text=True,
+            timeout=600,
+        )


security (python.lang.security.audit.dangerous-subprocess-use-audit): Detected subprocess function 'run' without a static string. If this data can be controlled by a malicious actor, it may be an instance of command injection. Audit the use of this call to ensure it is not controllable by an external resource. You may consider using 'shlex.escape()'.

Source: opengrep

sourcery-ai · 2026-02-10T23:23:54Z

penguincode_cli/client/org_manager.py

+                    subprocess.run(
+                        [git or "git", "pull", "--ff-only"],
+                        cwd=str(repo_dir),
+                        capture_output=True,
+                        timeout=60,
+                    )


security (python.lang.security.audit.dangerous-subprocess-use-audit): Detected subprocess function 'run' without a static string. If this data can be controlled by a malicious actor, it may be an instance of command injection. Audit the use of this call to ensure it is not controllable by an external resource. You may consider using 'shlex.escape()'.

Source: opengrep

sourcery-ai · 2026-02-10T23:23:54Z

penguincode_cli/client/org_manager.py

+                        cmd = [gh, "repo", "clone", f"{org}/{repo}", str(repo_dir)]
+                    else:
+                        cmd = [git or "git", "clone", clone_url, str(repo_dir)]
+                    subprocess.run(cmd, capture_output=True, timeout=120)


security (python.lang.security.audit.dangerous-subprocess-use-audit): Detected subprocess function 'run' without a static string. If this data can be controlled by a malicious actor, it may be an instance of command injection. Audit the use of this call to ensure it is not controllable by an external resource. You may consider using 'shlex.escape()'.

Source: opengrep

sourcery-ai · 2026-02-10T23:23:54Z

penguincode_cli/server/models/config_store.py

+        await self._db.execute(
+            f"INSERT OR REPLACE INTO {table} ({key_col}, data) VALUES (?, ?)",
+            (key, json.dumps(data)),
+        )


security (python.sqlalchemy.security.sqlalchemy-execute-raw-query): Avoiding SQL string concatenation: untrusted input concatenated with raw SQL query can result in SQL Injection. In order to execute raw query safely, prepared statement should be used. SQLAlchemy provides TextualSQL to easily used prepared statement with named parameters. For complex SQL composition, use SQL Expression Language or Schema Definition Language. In most cases, SQLAlchemy ORM will be a better option.

Source: opengrep

sourcery-ai · 2026-02-10T23:23:54Z

penguincode_cli/server/models/config_store.py

+        cursor = await self._db.execute(
+            f"DELETE FROM {table} WHERE {key_col} = ?", (key,),
+        )


security (python.sqlalchemy.security.sqlalchemy-execute-raw-query): Avoiding SQL string concatenation: untrusted input concatenated with raw SQL query can result in SQL Injection. In order to execute raw query safely, prepared statement should be used. SQLAlchemy provides TextualSQL to easily used prepared statement with named parameters. For complex SQL composition, use SQL Expression Language or Schema Definition Language. In most cases, SQLAlchemy ORM will be a better option.

Source: opengrep

socket-security · 2026-02-10T23:24:57Z

No dependency changes detected. Learn more about Socket for GitHub.

👍 No dependency changes detected in pull request

…oke tests - Dockerfile.server: Fix package path (penguincode → penguincode_cli), fix CMD module path, expose REST port 8080 - Add server/__main__.py for python -m penguincode_cli.server - Add missing ServiceAccount template to Helm chart - Add REST API port (8080) to Service and Deployment specs - Update Helm test to use project image (avoids Docker Hub pulls in air-gapped clusters), add REST health + provision smoke tests - Include Helm chart, Makefile, k8s smoke test scripts, CI workflow Verified on dal2-beta: gRPC connection, REST health, provision endpoint all passing. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…x test imports - CI workflow: update ruff/mypy/pytest paths from penguincode/ to penguincode_cli/ - CI workflow: make vsix-extension eslint and mypy continue-on-error (pre-existing debt) - K8s smoke test: export microk8s kubeconfig for GH Actions runner permissions - Alpha smoke script + Makefile: use kubectl instead of microk8s kubectl - Ruff config: add targeted ignores for N802/N806/SIM105/SIM117, per-file-ignores - Auto-fix 568 ruff errors (UP006/UP045/UP035/I001/F401/F541/UP004/UP015) - Manual-fix 31 ruff errors (B904/F841/E741/SIM102/SIM118/B007/B017/C416) - Fix all test imports from old penguincode.* to penguincode_cli.* - Add pytest.skip guards for tests referencing removed APIs (agents, config, memory, ollama) - All 212 tests pass, 8 gracefully skipped, 0 failures Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…ld step - Alpha smoke: remove manual namespace creation, use --create-namespace - Alpha smoke: add Docker build step to build and import image into microk8s - Fix Windows CI: use tempfile.gettempdir() instead of hardcoded /tmp/ in debug.py Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Remove namespace.yaml template (conflicts with --create-namespace) - Let Helm manage namespace creation via --create-namespace flag - Add concurrency group to prevent duplicate runs from racing Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…tches - Add from_dict() classmethods to Message, GenerateResponse, ChatResponse in ollama/types.py (were expected by tests but never implemented) - Extract _ensure_client() method in ollama/client.py for proper validation - Tighten test_ollama.py guard to catch only ImportError (was silently skipping all 16 tests due to overly broad exception handling) - Add missing intent patterns ("tell me about", "difference between", "compare") to agents/intent.py for researcher routing - Fix config.yaml model names to match installed Ollama models (llama3.2:3b → llama3.2:latest, deepseek-coder → llama3.1:8b) - Add live integration tests against Ollama (15 tests, 3 scenarios) - Update README quickstart with both client options (native chat vs OpenCode bootstrap) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Avoids global pip install issues across Linux, macOS, and Windows. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The wrapper script now auto-creates the venv and installs deps on first run, removing the need for manual setup. README Quick Start simplified. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

mem0's OllamaEmbedding imports the ollama package but doesn't declare it as a hard dependency, instead attempting a runtime auto-install that fails on PEP 668 systems (externally-managed-environment). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Use sys.executable -m pip instead of bare pip to ensure dependency installation targets the active venv, not the PEP 668 protected system Python. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Detects when setup runs outside a virtual environment and skips the pip install step to avoid PEP 668 errors on modern Debian/Ubuntu. Shows a message directing the user to ./penguincode instead. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Loop detection and consecutive error guards already handle runaways, so the low iteration caps were just cutting off legitimate multi-step work. Executor/debugger: 50, tester/refactor: 40, others: 30. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Every AgentResult now includes a categorized summary (completed/errors) built from the tool call log. The foreman uses this to decide whether to spawn a continuation agent or report results. Replaces the max-iterations-only hack with a universal summary in BaseAgent. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…er upgrade Route explicit "plan" requests (e.g., "create a plan for...") to the planner agent before researcher/executor patterns can match. Persist plans to ~/.config/penguincode/plans/ as human-readable .plan files so state survives crashes. Auto-upgrade complex executor tasks to planner to protect smaller models from being overwhelmed by multi-step work. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The old loop detector only caught "AAA" patterns (3 identical consecutive calls). The agent was stuck in a "bash(run), read(file), bash(run), read(file)" cycle that never triggered detection because consecutive calls differed. Now detects repeating cycles of length 2-4 (ABAB, ABCABC). Also expand complexity patterns to classify "website", "web app", "finish my", "build my" as complex — these multi-file tasks should route through the planner instead of overwhelming small executor models. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

When the planner LLM doesn't output explicit PARALLEL_GROUPS (common with smaller models), the fallback was putting every step in its own group — making execution fully sequential. Now uses topological level assignment: steps with no dependencies run together in group 1, steps depending only on group 1 go in group 2, etc. Maximizes parallelism while respecting dependency ordering. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Phase 1: Skills can specify a preferred LLM model in frontmatter (model: field). ChatAgent saves/restores its model on skill activate/deactivate. Phase 2: /config command for viewing and modifying runtime settings with dot-path traversal, auto-type-casting, save/reset support. Phase 3: 38 new PenguinCode-specific skills covering git, testing, Docker, Kubernetes, CI/CD, code quality, infrastructure, and workflow operations. Expanded suggest_skill() keyword matching. Updated config_store defaults to include all 51 skills. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Comprehensive test coverage for all skill system components: - SkillLoader: discovery (51 skills), frontmatter parsing, model override - ChatAgent: model save/restore across activate/deactivate cycles - Intent: suggest_skill() keyword matching for all 51 skills - Config: get/set/save utilities with type casting - ConfigStore: default skills sync with discovered skills - Cross-references: chain resolution, deduplication, broken ref detection Also fixes keyword matching order in suggest_skill() to prevent false positives (cherry-pick vs commit, ci vs audit, design vs api). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add three major capabilities: 1. **MCP Tool Discovery & Injection** — MCPToolManager lazily discovers tools from configured MCP servers and injects them into all agents (Explorer, Executor, Researcher) via MCPToolWrapper(BaseTool). Tools are namespaced as mcp_{server}_{tool} to avoid collisions. Graceful degradation per-server; MCP initialize handshake added. 2. **Organizational Config Pull** — OrgConfigClient fetches MCP servers, skills, and model configs from a management API at startup. Local config takes priority on name collision when merging. 3. **Shared-Key Authentication** — Teams set PENGUINCODE_SHARED_KEY on both server and client; the client exchanges it for a JWT automatically. No API key distribution needed. New files: wrapper.py, manager.py, org_config.py Tests: 33 new tests in test_mcp_tools.py (0 regressions) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add Helm values-alpha/beta, Kustomize overlays (alpha/beta), manifests, and deploy-beta.sh script for consistent k8s deployment across all repos. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Clean up unnecessary README, quick-reference, and summary files from k8s/ directories. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…localhost.local Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

5 YAML form templates (bug, feature, chore, docs, security) with required labels, priority/component dropdowns, and acceptance criteria. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@sha256

- Docker FROM lines: add @sha256 digests for all external base images - GitHub Actions: pin uses: to commit SHAs (not mutable version tags) - Trivy: standardize to trivy-action@v0.35.0 with trivy-version=v0.69.3 - package.json: remove ^ and ~ version prefixes (exact versions) - requirements.txt: flag files needing pip-compile --generate-hashes migration - README/docs: update Trivy version references and supply chain notes Follows updated immutable dependency standards in .claude/rules/ Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

PenguinzTech and others added 2 commits February 10, 2026 12:27

sourcery-ai bot reviewed Feb 10, 2026

View reviewed changes

PenguinzTech and others added 25 commits February 11, 2026 15:16

Add venv instructions to README quickstart

5d7fe71

Avoids global pip install issues across Linux, macOS, and Windows. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Auto-bootstrap venv in penguincode shell script

ea66f33

The wrapper script now auto-creates the venv and installs deps on first run, removing the need for manual setup. README Quick Start simplified. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Fix setup command to use venv pip instead of system pip

ad24192

Use sys.executable -m pip instead of bare pip to ensure dependency installation targets the active venv, not the PEP 668 protected system Python. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add standardized k8s deployment configuration

fa494b7

Add Helm values-alpha/beta, Kustomize overlays (alpha/beta), manifests, and deploy-beta.sh script for consistent k8s deployment across all repos. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Updating documentation to new base standard

2e4c367

Updating documentation to new base standard

2f2f21e

Remove extra k8s documentation files

ca28345

Clean up unnecessary README, quick-reference, and summary files from k8s/ directories. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Updating documentation to new base standard

8ad9f3c

fix(k8s): standardize ingress TLDs - beta=.penguintech.cloud, alpha=.…

b49e206

…localhost.local Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Remove app.md template (replaced by {subject}.local.md convention)

399e8e5

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

PenguinzTech and others added 16 commits February 19, 2026 09:03

Updating documentation to new base standard

30f2a68

Updating documentation to new base standard

7558d35

Updating documentation to new base standard

7f2e192

Updating documentation to new base standard

8948601

Updating documentation to new base standard

71ec0a5

Add standardized GitHub issue templates

eba393d

5 YAML form templates (bug, feature, chore, docs, security) with required labels, priority/component dropdowns, and acceptance criteria. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Updating documentation to new base standard

c5ef59f

Updating documentation to new base standard

44ca48c

Updating documentation to new base standard

ad0234d

Updating documentation to new base standard

9440990

Updating documentation to new base standard

efc9c28

Updating documentation to new base standard

30a6a52

Updating documentation to new base standard

c020797

periodic save

f582c79

Updating documentation to new base standard

64aa020

		for skill in _default_skills():
		await self._upsert("skills", skill.name, asdict(skill))

Conversation

PenguinzTech commented Feb 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test Results

Test plan

Uh oh!

sourcery-ai bot commented Feb 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviewer's Guide

Sequence diagram for client bootstrap and provisioning flow

Entity relationship diagram for ConfigStore SQLite schema

Class diagram for ConfigStore and REST service modules

File-Level Changes

Interacting with Sourcery

Customizing Your Experience

Getting Help

Uh oh!

sourcery-ai bot left a comment

Choose a reason for hiding this comment

Uh oh!

sourcery-ai bot Feb 10, 2026

Choose a reason for hiding this comment

Uh oh!

sourcery-ai bot Feb 10, 2026

Choose a reason for hiding this comment

Uh oh!

sourcery-ai bot Feb 10, 2026

Choose a reason for hiding this comment

Uh oh!

sourcery-ai bot Feb 10, 2026

Choose a reason for hiding this comment

Uh oh!

sourcery-ai bot Feb 10, 2026

Choose a reason for hiding this comment

Uh oh!

sourcery-ai bot Feb 10, 2026

Choose a reason for hiding this comment

Uh oh!

sourcery-ai bot Feb 10, 2026

Choose a reason for hiding this comment

Uh oh!

sourcery-ai bot Feb 10, 2026

Choose a reason for hiding this comment

Uh oh!

sourcery-ai bot Feb 10, 2026

Choose a reason for hiding this comment

Uh oh!

sourcery-ai bot Feb 10, 2026

Choose a reason for hiding this comment

Uh oh!

socket-security bot commented Feb 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

PenguinzTech commented Feb 10, 2026 •

edited

Loading

sourcery-ai bot commented Feb 10, 2026 •

edited

Loading

socket-security bot commented Feb 10, 2026 •

edited

Loading