feat: Enhance tool calling, classification, and progress reporting#55

Merged
veerareddyvishal144 merged 1 commit into Fast-Editor:feature/model-router from MichaelAnders:feature/model-router
Feb 23, 2026

Conversation

@MichaelAnders
Contributor

CRITICAL: All test scripts now require the NODE_ENV=test environment variable. This ensures tests run in test mode, properly isolated from production code paths (live API calls are disabled; mocking and test fixtures are enabled). See the package.json test:* scripts.

Apply comprehensive improvements to Lynkr's tool execution pipeline, including per-model tool parsers (based on vLLM), LLM-based classification, real-time progress monitoring, and advanced agent routing with dual-provider support for cloud-based tool execution.

Key Changes

Test Infrastructure

  • NODE_ENV=test: Added to ALL test scripts (test:unit, test:memory, etc.)
    • Ensures isolated test environment without production side effects
    • Enables test fixtures and mocking frameworks
    • Prevents accidental API calls during testing
    • IMPORTANT: This is a breaking change if tests are run without this var
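
To illustrate the isolation point above, a test helper might refuse to start without the variable set. This is a hypothetical sketch (the function name and message are not from the PR), showing the kind of guard that keeps live API calls out of test runs:

```javascript
// Hypothetical guard: refuse to run unless NODE_ENV=test is set,
// so live API calls and production code paths stay disabled.
function assertTestEnv(env = process.env) {
  if (env.NODE_ENV !== "test") {
    throw new Error(
      "Tests must run with NODE_ENV=test (see package.json test:* scripts)"
    );
  }
  return true;
}
```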

New Major Features

Tool Calling & Parsing (vLLM-Inspired)

  • Per-model tool parsers for GLM-4.7, Qwen3, and generic models
    • Implementation follows vLLM's ToolParser hierarchy (Apache 2.0 license)
    • GLM-4.7 parser: Handles native XML format + fallback patterns
    • Qwen3 parser: Markdown extraction with robust error handling
    • Generic parser: Extensible base for any model format
  • Ollama fallback handling for malformed responses
  • Tool call deduplication and cleaning
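
The parser hierarchy above can be sketched roughly as follows. Class names, the `<tool_call>` tag format, and the registry keys are illustrative assumptions, not the PR's actual code; the real files live under src/parsers/:

```javascript
// Minimal sketch of a vLLM-style per-model parser hierarchy.
class BaseToolParser {
  // Subclasses return an array of { name, arguments } tool calls.
  parse(text) {
    return [];
  }
}

class Glm47ToolParser extends BaseToolParser {
  // Assumed XML-ish native format; malformed JSON inside the tags
  // is skipped rather than thrown, as a fallback pattern.
  parse(text) {
    const calls = [];
    const re = /<tool_call>\s*(\{[\s\S]*?\})\s*<\/tool_call>/g;
    let m;
    while ((m = re.exec(text)) !== null) {
      try {
        calls.push(JSON.parse(m[1]));
      } catch {
        // malformed call body: skip it, let the caller retry
      }
    }
    return calls;
  }
}

class GenericToolParser extends BaseToolParser {
  // Generic fallback: a bare JSON object with a "name" field.
  parse(text) {
    try {
      const obj = JSON.parse(text);
      return obj && obj.name ? [obj] : [];
    } catch {
      return [];
    }
  }
}

// Registry keyed by model family, in the spirit of src/parsers/index.js.
const PARSERS = {
  "glm-4.7": new Glm47ToolParser(),
  default: new GenericToolParser(),
};

function parserFor(model) {
  const key = Object.keys(PARSERS).find(
    (k) => k !== "default" && model.toLowerCase().startsWith(k)
  );
  return PARSERS[key || "default"];
}
```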

Dual-Provider Tool Execution

  • TOOL_EXECUTION_PROVIDER: Route tool calls to specialized providers
    • Enables using cheap/fast/local models for chat while routing tool calls to a more reliable model (e.g., Claude Sonnet)
    • Reduces token usage and improves tool accuracy
  • TOOL_EXECUTION_COMPARE_MODE: Compare tool calls from both providers
  • OLLAMA_CLOUD_ENDPOINT: Support for cloud-based Ollama models
    • Enables "cloud-only" setups without local Ollama
    • Automatic routing: cloud models use cloud endpoint
    • Hybrid support: mix local and cloud models in same session
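
The routing described above could be sketched like this. The selection logic and the "-cloud" naming convention are plausible readings of the env vars, not the PR's exact implementation:

```javascript
// Sketch of dual-provider routing using this PR's env vars.
function selectProvider(request, env) {
  // Tool-calling turns go to a dedicated provider/model when
  // configured; plain chat stays on the default (local) provider.
  if (request.needsTools && env.TOOL_EXECUTION_PROVIDER) {
    return {
      provider: env.TOOL_EXECUTION_PROVIDER,
      model: env.TOOL_EXECUTION_MODEL || request.model,
    };
  }
  return { provider: "ollama", model: request.model };
}

function ollamaEndpoint(model, env) {
  // Assumed convention: models tagged "-cloud" use the cloud endpoint
  // when one is configured; everything else stays local.
  if (env.OLLAMA_CLOUD_ENDPOINT && /-cloud$/.test(model)) {
    return env.OLLAMA_CLOUD_ENDPOINT;
  }
  return env.OLLAMA_ENDPOINT || "http://localhost:11434";
}
```

Hybrid sessions fall out naturally: each request is routed per-model, so local and cloud models can coexist.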

Tool Classification

  • LLM-based tool-needs classification (whitelist first, LLM fallback)
  • Per-model classification accuracy with pattern matching
  • Tool execution provider routing based on classification
  • Workspace access permission system for external file operations
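
The whitelist-then-LLM flow might look like the sketch below. The patterns and the classifier callback are placeholders; the PR ships real patterns in config/tool-whitelist-*.json:

```javascript
// Sketch: cheap whitelist check first, LLM verdict only as a fallback.
async function needsTools(prompt, whitelist, llmClassify) {
  // 1. Pattern whitelist decides without spending an LLM call.
  for (const pattern of whitelist) {
    if (new RegExp(pattern, "i").test(prompt)) return true;
  }
  // 2. Fallback: ask a small LLM for a yes/no verdict.
  const verdict = await llmClassify(prompt);
  return /^yes/i.test(verdict.trim());
}
```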

Progress Reporting (Real-Time Monitoring)

  • WebSocket server (port 8765) broadcasting execution events
  • Progress events: agent loop, model invocation, tool execution
  • Built-in Python listener (tools/progress-listener.py)
    • Color-coded output with timestamps
    • Agent hierarchy tracking (parent/child relationships)
    • Token and duration metrics
    • Remote monitoring support
  • Event tracking for debugging and observability
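
The event shape and broadcast loop can be sketched without a real WebSocket dependency. The actual server (src/progress/server.js) presumably listens on port 8765; here a "socket" is anything with a send() method, and the field names are illustrative:

```javascript
// Sketch of a progress event payload and fan-out broadcast.
function makeEvent(type, data) {
  return JSON.stringify({
    type, // e.g. "agent_loop", "model_invocation", "tool_execution"
    timestamp: new Date().toISOString(),
    ...data,
  });
}

function broadcast(sockets, type, data) {
  const payload = makeEvent(type, data);
  for (const sock of sockets) sock.send(payload);
  return payload;
}
```

A listener such as tools/progress-listener.py then only needs to parse each JSON frame and render it with colors and timestamps.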

Configuration Enhancements

New environment variables (see .env.example for defaults):

  • OLLAMA_CLOUD_ENDPOINT: Cloud Ollama instance URL
  • OLLAMA_API_KEY: Cloud Ollama API authentication
  • TOOL_EXECUTION_PROVIDER: Provider for tool calling decisions
  • TOOL_EXECUTION_MODEL: Model override for tool execution
  • TOOL_EXECUTION_COMPARE_MODE: Enable provider comparison
  • POLICY_MAX_DURATION_MS: Single agent loop turn timeout
  • POLICY_TOOL_LOOP_THRESHOLD: Max tool results before termination
  • POLICY_MAX_TOOL_CALLS_PER_REQUEST: Parallel tool call limit
  • TOOL_NEEDS_CLASSIFICATION_*: Classification whitelist and LLM config
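
Reading the POLICY_* knobs with fallbacks might look like this. The default values here are illustrative only; the real defaults live in .env.example:

```javascript
// Sketch: parse POLICY_* env vars as integers with assumed fallbacks.
function loadPolicy(env) {
  const int = (name, fallback) => {
    const n = Number.parseInt(env[name], 10);
    return Number.isFinite(n) ? n : fallback;
  };
  return {
    maxDurationMs: int("POLICY_MAX_DURATION_MS", 120000),
    toolLoopThreshold: int("POLICY_TOOL_LOOP_THRESHOLD", 25),
    maxToolCallsPerRequest: int("POLICY_MAX_TOOL_CALLS_PER_REQUEST", 8),
  };
}
```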

Files Changed

61 files modified (9,061 insertions, 432 deletions):

Test Infrastructure

  • package.json: NODE_ENV=test on all test:* scripts

Core Parser System (vLLM-Based)

  • src/parsers/base-tool-parser.js: Base class hierarchy
  • src/parsers/glm47-tool-parser.js: GLM-4.7 tool parsing
  • src/parsers/generic-tool-parser.js: Extensible generic parser
  • src/parsers/index.js: Parser registry and selection

Tool Execution & Classification

  • src/tools/tool-call-cleaner.js: Response cleanup and deduplication
  • src/tools/tool-classification-*.js: Classification system
  • src/agents/tool-agent-mapper.js: Tool-agent relationship mapping

Provider & Routing

  • src/clients/ollama-utils.js: Dual endpoint support (local + cloud)
  • src/api/router.js: Provider routing and conversion
  • src/providers/context-window.js: NEW - Context detection

Progress & Observability

  • src/progress/server.js: NEW - WebSocket server
  • src/progress/emitter.js: NEW - Event broadcasting
  • src/progress/client.js: NEW - Client monitoring
  • tools/progress-listener.py: NEW - Python listener tool

Configuration & Documentation

  • .env.example: OLLAMA_CLOUD_ENDPOINT, TOOL_EXECUTION_*, POLICY_*
  • config/tool-whitelist-*.json: Classification patterns

Tests (14 new files, 490/490 passing)

  • Tool parser tests (GLM, Qwen3, generic)
  • Tool classification and accuracy tests
  • Dual endpoint and cloud Ollama tests
  • Tool execution provider tests with comparison mode
  • Subagent auto-spawning tests
  • Progress reporting integration tests

Attribution

  • Per-model tool parsers: Based on vLLM's tool calling implementation (Apache License 2.0, https://github.com/vllm-project/vllm)
  • Progress reporting: Real-time WebSocket event system
  • Agent routing: Dual-provider architecture for cost optimization


Co-Authored-By: Claude Haiku 4.5, Sonnet 4.6, Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: GLM-4.7-cloud <noreply@z.ai> on Ollama
@MichaelAnders
Contributor Author

Two things:

  1. vLLM tool calling implementations are added; please adjust the code if needed so it reflects this properly (I mentioned vLLM under "Attribution").
  2. With these changes, GLM-4.7 can be used for some code analysis and corrections. It sometimes gets stuck (responding "Let me do XYZ..." without acting), which can be overcome by replying "do XYZ". That is a WIP I'll look into (I have several ideas), but this has to be merged first. Then other models will be enabled using the vLLM parser implementations, which are really good!

@veerareddyvishal144

Thanks @MichaelAnders for your contribution; I am merging it.

@veerareddyvishal144 veerareddyvishal144 merged commit e260fb2 into Fast-Editor:feature/model-router Feb 23, 2026
1 check passed
