@ybdarrenwang
Collaborator

Description

Introduces the ToolSimulator framework, which simulates realistic tool responses during agent evaluation without calling production APIs. It enables systematic testing of agents that use API-based, Python function-based, and MCP-based tools, with LLM-powered dynamic simulation generating the responses.

Key capabilities:

  • Three simulation modes: dynamic (LLM-generated responses), static (predefined responses), and mock (custom functions)
  • Shared state management across multiple tools via share_state_id for stateful testing scenarios
  • Decorator-based registration for function tools (@ToolSimulator.function_tool), MCP tools (@ToolSimulator.mcp_tool), and API tools (@ToolSimulator.api_tool)
  • Integration with the Strands Evals workflow, including Experiment, Case, and evaluators
  • Multi-agent support with tool simulation across sub-agents (agent-as-tool pattern)
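
To make the three simulation modes concrete, here is a minimal, self-contained sketch of how decorator-based registration could dispatch between them. It does not use the ToolSimulator API from this PR: `ToySimulator`, its `mode`, `static_response`, and `mock` parameters, and the `fake_llm` callable are all hypothetical names chosen for illustration, and the real signatures may differ.

```python
from typing import Any, Callable, Dict, Optional

VALID_MODES = ("dynamic", "static", "mock")


class ToySimulator:
    """Illustrative stand-in only; NOT the ToolSimulator class from this PR."""

    def __init__(self, generate: Callable[[str, dict], Any]):
        # `generate` stands in for the LLM call that dynamic mode would make.
        self._generate = generate
        self.tools: Dict[str, Callable[..., Any]] = {}

    def function_tool(
        self,
        mode: str = "dynamic",
        static_response: Any = None,
        mock: Optional[Callable[..., Any]] = None,
    ):
        """Register a function by name and swap its body for a simulated one."""
        if mode not in VALID_MODES:
            raise ValueError(f"unknown simulation mode: {mode!r}")
        if mode == "mock" and mock is None:
            raise ValueError("mock mode requires a `mock` callable")

        def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
            def simulated(**kwargs: Any) -> Any:
                if mode == "static":
                    # Static: always return the predefined response.
                    return static_response
                if mode == "mock":
                    # Mock: delegate to the caller-supplied function.
                    return mock(**kwargs)
                # Dynamic: ask the (stand-in) LLM to produce a response.
                return self._generate(func.__name__, kwargs)

            self.tools[func.__name__] = simulated
            return simulated

        return decorator


def fake_llm(tool_name: str, arguments: dict) -> dict:
    """Deterministic placeholder for an LLM-generated response."""
    return {"tool": tool_name, "simulated": True, "echo": arguments}


sim = ToySimulator(generate=fake_llm)


@sim.function_tool(mode="static", static_response={"status": "ok"})
def check_health():
    """The real implementation is never called during simulation."""


@sim.function_tool(mode="mock", mock=lambda city: {"city": city, "temp_c": 21})
def get_weather(city: str):
    """The real implementation is never called during simulation."""


@sim.function_tool()  # defaults to dynamic mode
def search_orders(customer_id: str):
    """The real implementation is never called during simulation."""
```

In this sketch the decorated function's body is never executed; only its name is used for registration and for the dynamic-mode prompt. That mirrors the stated goal of evaluating agents without touching production APIs.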

Design principles:

  • Centralized registry for tool management and state tracking
  • Context-aware response generation using initial state descriptions and conversation history
  • Seamless integration with existing Strands tool decorator patterns
  • Comprehensive unit test coverage for all simulation modes
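
The first two principles can be illustrated with a small sketch of a registry that tracks state per `share_state_id` and assembles context for response generation. This is an assumption about one possible shape, not the implementation in this PR: `StateRegistry`, `SharedState`, `record_call`, and `build_context` are hypothetical names, and only `share_state_id` comes from the description above.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class SharedState:
    """State shared by every tool registered under the same share_state_id."""

    initial_description: str
    history: List[Dict[str, Any]] = field(default_factory=list)


class StateRegistry:
    """Hypothetical central registry; NOT the class shipped in this PR."""

    def __init__(self) -> None:
        self._states: Dict[str, SharedState] = {}

    def get_or_create(self, share_state_id: str, initial_description: str = "") -> SharedState:
        # Tools that pass the same id receive the same SharedState object.
        if share_state_id not in self._states:
            self._states[share_state_id] = SharedState(initial_description)
        return self._states[share_state_id]

    def record_call(self, share_state_id: str, tool_name: str,
                    arguments: Dict[str, Any], response: Any) -> None:
        # Append the call so later simulated responses can stay consistent with it.
        self.get_or_create(share_state_id).history.append(
            {"tool": tool_name, "arguments": arguments, "response": response}
        )

    def build_context(self, share_state_id: str) -> str:
        # Combine the initial state description with prior calls into prompt text.
        state = self.get_or_create(share_state_id)
        lines = [f"Initial state: {state.initial_description}"]
        for call in state.history:
            lines.append(f"{call['tool']}({call['arguments']}) -> {call['response']}")
        return "\n".join(lines)


registry = StateRegistry()
registry.get_or_create("inventory", "Warehouse holds 3 units of SKU-1")
registry.record_call(
    "inventory", "reserve_item", {"sku": "SKU-1", "qty": 2}, {"reserved": True}
)
context = registry.build_context("inventory")
```

Here a later call to a different tool under the same `"inventory"` id would see the earlier reservation in its context, which is the kind of cross-tool consistency that stateful test scenarios need.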

Related Issues

#93

Documentation PR

strands-agents/docs#500

Type of Change

New feature

Testing

  • I ran hatch run prepare

Checklist

  • I have read the CONTRIBUTING document
  • I have added any necessary tests that prove my fix is effective or my feature works
  • I have updated the documentation accordingly
  • I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
