Skip to content

System Architecture

SpdrByte edited this page Mar 4, 2026 · 1 revision

System Architecture

Gemma CLI is designed with a Modular Library Architecture. By separating the "Brain" (API logic), the "Body" (System tools), and the "Voice" (UI rendering), the workstation remains stable, fast, and easy to extend.


🏗️ Core Architectural Components

The workstation is divided into the main entry point and four specialized libraries.

1. GemmaCLI.ps1 (The Orchestrator)

The heartbeat of the system. It manages the main interactive loop, handles user input, parses model responses, and coordinates between all other modules. It is responsible for:

  • Global state management (API keys, settings, history).
  • Routing /commands to their respective logic.
  • The "Tool Execution Loop" (Parsing XML -> Requesting Permission -> Executing).

2. lib/Api.ps1 (The Reasoning Layer)

Manages all communication with the Google Gemini API.

  • Asynchronous Jobs: API calls run in background jobs so the UI (spinner) remains responsive.
  • Rate Limiting (RPM): Implements independent quota buckets for Gemma and Gemini backends.
  • Dual-Agent Pipelines: Contains the logic for bigBrother and littleSister chaining.
  • Automatic Retries: Gracefully handles 429 Resource Exhausted errors with exponential backoff.

3. lib/ToolLoader.ps1 (The Capability Layer)

Dynamically expands Gemma's brain by discovering tools on disk.

  • Auto-Registration: Scans the tools/ folder and populates the $script:TOOLS registry.
  • Tiered Guidance Engine: Injects either ToolUseGuidanceMajor or ToolUseGuidanceMinor into the system prompt based on the active model's intelligence tier.

4. lib/History.ps1 (The Memory Layer)

Manages the conversation's context window to prevent crashes and high token costs.

  • Smart Trim: Uses semantic embeddings (gemini-embedding-001) to score history turns. It keeps only the most relevant context for your current query.
  • Role Alternation: Ensures the history always follows the strict User -> Model pattern required by the API.

5. lib/UI.ps1 (The Rendering Layer)

Handles the "look and feel" of the workstation.

  • Custom Rendering: Provides the Draw-Box and Show-ArrowMenu functions.
  • Status Bar: A real-time tracker for token usage, model type, and context pressure.
  • Async Spinner: A thread-safe loading indicator that doesn't stutter during API calls.

🔄 Data Flow: A Single Turn

  1. User Input: You type a request in the terminal.
  2. Prompt Assembly: GemmaCLI.ps1 gathers the history and uses ToolLoader to build the list of available tools.
  3. API Call: Api.ps1 sends the payload to Google.
  4. Response Parsing: GemmaCLI.ps1 detects if the model wants to use a tool (via <tool_call> tags).
  5. Permission: The UI asks you to Allow/Deny.
  6. Execution: The tool runs (either in a job or main session for GUI tasks like clipboard).
  7. Synthesis: The tool result is fed back to the model for a final human-friendly response.

🔒 Security Architecture: DPAPI

Gemma CLI uses Windows Data Protection API (DPAPI) to store your API key.

  • The key is encrypted using your Windows User SID as the primary key.
  • Result: Even if a hacker steals your apikey.xml file, they cannot decrypt it on another machine or under a different user account.

Next Steps: Dive into the Tool Development Guide to start building your own capabilities.

Clone this wiki locally