
Conversation

@LucaButBoring (Contributor) commented Sep 25, 2025

This PR implements the required changes for modelcontextprotocol/modelcontextprotocol#1391, which adds asynchronous tool execution.

This is a large PR, and if the associated SEP is accepted, we may want to break it into several smaller PRs for SDK reviewers. I generally kept separate commits for each step of the implementation to make it easier to review in its current form.

Motivation and Context

Today, most applications integrate with tools in a straightforward but naive manner, having agents invoke tools synchronously within the conversation instead of allowing agents to multitask where possible. We believe there are a few reasons for this, including the lack of clarity around tool interfaces (single tool or multiple for job tracking), model failures when manually polling on operations, and the lack of a way to retrieve results with a well-defined TTL, among other problems (described in more detail in the linked issue). Here, we introduce an alternative API that establishes a clear integration path for async job-style use cases that typically run on the order of minutes to hours.

The ultra high-level overview is as follows:

  • Tools now support synchronous or asynchronous invocation modes
  • A single tool only advertises itself as either sync or async to a given client, controlled by protocol version
  • Sync tools behave just like they always did
  • Async tools are split into start/poll/retrieve stages:
    • Starting a call:
      • tools/call begins an async tool call
      • The result is a CallToolResult containing an operation token, which is used to interact with the async tool call across multiple RPC calls
    • Polling:
      • The operation token is used to call tools/async/status, which returns the current operation status
      • The client should poll this method until the status reaches a terminal value
    • Result retrieval:
      • The operation token is used to call tools/async/result, which has the final tool output

Whether a tool is sync, async, or both (on old/new protocol versions) is defined by tool implementors. This lets remote server operators decide based on how long each tool is expected to take to execute, rather than potentially serving HTTP requests with widely varying execution times on the same endpoint. It also makes the "time contract" of a tool much clearer to client applications, so fast tools can still be executed synchronously while long-running tools are immediately backgrounded.
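
For illustration, here is a rough sketch of the three stages as JSON-RPC payloads, written as Python dicts. The method names (tools/call, tools/async/status, tools/async/result) come from this PR; the exact field names and status values are my reading of the draft and may not match the final schema.

# Hypothetical wire-level sketch; field names are illustrative, not normative.
start_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "data_processing_tool", "arguments": {"dataset": "customer_data.csv"}},
}
# The immediate CallToolResult carries an operation token instead of the final output:
start_result = {
    "content": [{"type": "text", "text": "Starting data processing..."}],
    "operation": {"token": "op-123", "keepAlive": 3600},
}

# Poll until the status reaches a terminal value (e.g. "completed", "failed", "canceled"):
status_request = {"jsonrpc": "2.0", "id": 2, "method": "tools/async/status", "params": {"token": "op-123"}}

# Then fetch the final tool output, which is only retained for the keepAlive window:
result_request = {"jsonrpc": "2.0", "id": 3, "method": "tools/async/result", "params": {"token": "op-123"}}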

Usage

Defining an async-compatible tool is just a matter of adjusting the @mcp.tool() decorator to include an invocation_modes parameter, a list containing "sync" and/or "async":

import asyncio

from mcp.server.fastmcp import Context, FastMCP

mcp = FastMCP("Demo")  # server name is illustrative


@mcp.tool(invocation_modes=["async", "sync"])
async def data_processing_tool(dataset: str, operations: list[str], ctx: Context) -> dict[str, str]:
    await ctx.info(f"Starting data processing pipeline for {dataset}")

    results: dict[str, str] = {}
    total_ops = len(operations)

    for i, operation in enumerate(operations):
        await ctx.debug(f"Executing operation: {operation}")
        await asyncio.sleep(0.5 + (i * 0.2))  # Simulate processing time
        progress = (i + 1) / total_ops  # Report progress
        await ctx.report_progress(progress, 1.0, f"Completed {operation}")
        results[operation] = f"Result of {operation} on {dataset}"  # Store result

    await ctx.info("Data processing pipeline complete!")
    return results

If invocation_modes contains "async", the tool is async-compatible and clients on new protocol versions will only call it in async mode; if it contains "sync", the tool is sync-compatible and will be called in sync mode when async mode is not supported. A client never gets to choose between the two modes itself.
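
For example (a sketch based on the decorator parameter introduced in this PR; the tool names and bodies are hypothetical, and the default mode is assumed to be sync-only):

@mcp.tool(invocation_modes=["async"])  # async-only: hidden from clients that don't support async tools
async def deep_analysis(corpus: str, ctx: Context) -> str:
    ...


@mcp.tool()  # assumed to default to sync-only, behaving exactly as before
def quick_calc(expression: str) -> str:
    ...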

Behind the scenes, the SDK branches to run the tool either synchronously (as today) or asynchronously (immediate return with job tracking), depending on whether the client's protocol version supports async tools.

To control how long results remain retrievable via tools/async/result, we can use the keep_alive parameter:

@mcp.tool(invocation_modes=["async", "sync"], keep_alive=30)  # retain result for 30s following completion

We can also customize the content returned in the immediate CallToolResult with the immediate_result parameter:

import mcp.types as types


async def immediate_feedback(operation: str) -> list[types.ContentBlock]:
    return [types.TextContent(type="text", text=f"Starting {operation}... This may take a moment.")]

@mcp.tool(invocation_modes=["async", "sync"], immediate_result=immediate_feedback)
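# (Sketch only: the decorated function was omitted above; this hypothetical
# body is just to show how the pieces fit together.)
async def long_export(operation: str, ctx: Context) -> str:
    await ctx.info(f"Running {operation}...")
    return f"{operation} finished"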

On the client side, we just add the polling and result retrieval like so:

import asyncio

from mcp import ClientSession


async def demonstrate_data_processing(session: ClientSession):
    """Demonstrate data processing pipeline."""
    print("\n=== Data Processing Pipeline Demo ===")

    # Just like before
    operations = ["validate", "clean", "transform", "analyze", "export"]
    result = await session.call_tool(
        "data_processing_tool", arguments={"dataset": "customer_data.csv", "operations": operations}
    )

    # We could choose to send the immediate result content to an agent from here before continuing

    # New parts
    if result.operation:
        token = result.operation.token
        print(f"Data processing started with token: {token}")

        # Poll for completion
        while True:
            status = await session.get_operation_status(token)
            print(f"Status: {status.status}")

            if status.status == "completed":
                final_result = await session.get_operation_result(token)

                # Show structured result if available
                if final_result.result.structuredContent:
                    print("Processing results:")
                    for op, result_text in final_result.result.structuredContent.items():
                        print(f"  {op}: {result_text}")
                break
            elif status.status == "failed":
                print(f"Processing failed: {status.error}")
                break
            elif status.status in ("canceled", "unknown"):
                print(f"Processing ended with status: {status.status}")
                break

            await asyncio.sleep(0.8)

How Has This Been Tested?

Unit tests, integration tests, and new example snippets.

Breaking Changes

Existing users will not need to update their applications to continue using synchronous tool calls. Asynchronous tool calls will require minor code changes that will be documented.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update

Checklist

  • I have read the MCP Documentation
  • My code follows the repository's style guidelines
  • New and existing tests pass locally
  • I have added appropriate error handling
  • I have added or updated documentation as needed

Additional context

There were several implementation decisions we may want to discuss further, some due to ambiguity in the proposal (which will be revised again) and some due to fitting the design into the existing SDK.

  • I added a faux-version called next to deal with the requirement that sessions on the current-latest version always advertise as sync-only. The tests and examples explicitly set the advertised client protocol version to next when calling async-only tools.
  • Reusing the existing tools/call method creates some ambiguities in how outputSchema should be handled, as the immediate tool call result (communicating an accepted state) no longer has meaningful structuredContent. The output that should be validated is actually the result of GetOperationPayloadResult, so for now I'm skipping validation of the immediate CallToolResult (only in async execution) and only validating GetOperationPayloadResult (sync tool executions are always validated, just like before).
  • keepAlive should have a sentinel value representing "no expiration," and I'm leaning towards None. However, in SDK implementations, that becomes somewhat ambiguous with sync tool calls, which also implicitly have a keepAlive of None already. For now, I default it to 1 hour if not specified/None, but this should probably be changed before this is merged.
  • In streamable HTTP (sHTTP), the SDK sends tool-related server messages on the same SSE stream the server used to respond to the client's CallToolRequest, by attaching a related_request_id to the stream for fast lookups and session resumption. To support sampling and elicitation, we keep a map of operation tokens to their original request IDs so related calls reuse the same event store entry.
  • The client session needs to cache a mapping of in-flight operation tokens to tool names for validating structuredContent in async tool calls, as it otherwise has no way to look up the cached outputSchema. We could consider including a toolName in GetOperationPayloadResult to avoid the inconvenience, but in this draft I'm using a cache expiry based on keepAlive to avoid holding that mapping forever.
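
As a rough illustration of that last point, the client-side cache could be little more than a token-to-(tool name, deadline) map that is pruned on access. The class and method names below are hypothetical, not the ones used in this PR:

import time


class OperationToolCache:
    """Sketch: remembers which tool produced an operation token so the client can
    validate structuredContent against the cached outputSchema later."""

    def __init__(self) -> None:
        self._entries: dict[str, tuple[str, float]] = {}  # token -> (tool_name, expires_at)

    def remember(self, token: str, tool_name: str, keep_alive_seconds: float) -> None:
        self._entries[token] = (tool_name, time.monotonic() + keep_alive_seconds)

    def lookup(self, token: str) -> str | None:
        self._prune()
        entry = self._entries.get(token)
        return entry[0] if entry is not None else None

    def _prune(self) -> None:
        now = time.monotonic()
        self._entries = {t: e for t, e in self._entries.items() if e[1] > now}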

@LucaButBoring (Contributor, author)

Latest commit implements working stream binding to support elicitation and sampling. There's definitely room for refactoring that one 😔

@felixweinberger added labels "pending publish" (Draft PRs need to be published for team to review) and "pending SEP approval" (When a PR is attached as an implementation detail to a SEP, we mark it as such for triage) on Sep 26, 2025
@LucaButBoring (Contributor, author)

Added support for configuring the immediate result and made some changes to avoid smuggling parameters through _meta (now we do it through unserialized fields).

@LucaButBoring marked this pull request as ready for review on September 29, 2025
@LucaButBoring requested review from a team and ochafik on September 29, 2025
@Kludex requested a review from Copilot on September 30, 2025
@Copilot (Copilot AI) left a comment

Pull Request Overview

This PR implements asynchronous tool execution for the MCP (Model Context Protocol) Python SDK, enabling long-running operations that execute in the background while clients poll for status and results. The implementation introduces operation tokens for tracking execution state and supports configurable keep-alive durations for result availability.

Key changes include:

  • Added async operation management system with token-based tracking
  • Extended MCP types to support async operation parameters and results
  • Implemented immediate feedback capability for async tools
  • Added client-side polling mechanisms for operation status and results

Reviewed Changes

Copilot reviewed 35 out of 35 changed files in this pull request and generated 3 comments.

File summary:
  • src/mcp/types.py: Extended protocol types with async operation support and operation tokens
  • src/mcp/shared/async_operations.py: Core async operation management classes for client and server
  • src/mcp/client/session.py: Client-side async operation tracking and polling methods
  • src/mcp/server/fastmcp/server.py: FastMCP integration with async tools and invocation mode filtering
  • src/mcp/server/fastmcp/tools/base.py: Tool base class extensions for async modes and immediate results
  • src/mcp/server/lowlevel/server.py: Low-level server async operation handlers and execution logic
  • tests/server/fastmcp/test_server.py: Comprehensive test coverage for async tool functionality
  • examples/snippets/servers/async_tool_*.py: Example implementations demonstrating async tool patterns
Comments suppressed due to low confidence (2)

src/mcp/server/fastmcp/tools/base.py:162

  • Function _is_async_callable is referenced on line 120 but not defined in this file. Either add the function definition or import it properly.
def _is_async_callable(obj: Any) -> bool:

tests/shared/test_progress_notifications.py:1

  • [nitpick] The operation_token=None parameter addition maintains backward compatibility but consider adding a comment explaining when this would be non-None for clarity in test scenarios.
from typing import Any, cast


logger.exception(f"Async execution failed for {tool_name}")
self.async_operations.fail_operation(operation.token, str(e))

asyncio.create_task(execute_async())
Member:

Let's stop using asyncio in this codebase. Use anyio, please.

@LucaButBoring (author):

Fixed here; still need to fix AsyncOperationManager. Still trying to figure out how to do that properly with cancellation scopes.
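
(For reference, one common anyio pattern for running fire-and-forget work with per-operation cancellation is sketched below; this is hypothetical and not the code in this PR.)

from collections.abc import Awaitable, Callable

import anyio
from anyio.abc import TaskGroup


class OperationRunner:
    """Sketch: run background operations under one task group, cancellable by token."""

    def __init__(self) -> None:
        self._scopes: dict[str, anyio.CancelScope] = {}

    def start(self, tg: TaskGroup, token: str, fn: Callable[[], Awaitable[None]]) -> None:
        tg.start_soon(self._run, token, fn)

    async def _run(self, token: str, fn: Callable[[], Awaitable[None]]) -> None:
        with anyio.CancelScope() as scope:
            self._scopes[token] = scope
            try:
                await fn()
            finally:
                self._scopes.pop(token, None)

    def cancel(self, token: str) -> None:
        scope = self._scopes.get(token)
        if scope is not None:
            scope.cancel()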

@Kludex (Member) commented Oct 2, 2025:

I can help if you need help.

@Kludex (Member) commented Sep 30, 2025

Why would a tool have 2 flavors (async & sync)?

Is there a hint to the client about when it should try to retrieve the result, like what Retry-After is for HTTP?

@LucaButBoring (Contributor, author)

> Is there a hint to the client about when it should try to retrieve the result, like what Retry-After is for HTTP?

As in for how often it should poll? There is not, but that's a good callout - I'll amend the proposal to include that.

@LucaButBoring (Contributor, author)

> Why would a tool have 2 flavors (async & sync)?

In the case where a tool is being migrated from sync to async and may serve clients that don't yet support the latest (or next, in this case) protocol version, it's useful to support both modes on the same tool - clients that don't support the latest version get a single long request, while clients that do support it use async tool call semantics.

There will be some use cases where that's desirable, and some where it's not, so it's optional behavior.

@Kludex (Member) commented Oct 2, 2025

If the choice of using async/sync is made by the client, why should we add flavors when defining a tool on the server side?

@Kludex (Member) commented Oct 2, 2025

> If the choice of using async/sync is made by the client, why should we add flavors when defining a tool on the server side?

Answering my own question... You need to signal in the tool definition which flavor it supports.

@Kludex (Member) commented Oct 2, 2025

Okay. I don't think we should be adding multiple flavors to a tool; I think it makes things more complicated. Also, the spec supports the invocationMode keyword, not invocationModes, which I think makes sense.

I think a better API is to have a new method e.g. async_tool (although "async" is not the best keyword for this in Python). Is it still possible to find a synonym to "async"? "long-run"? 😅


token: str
"""Server-generated token to use for checking status and retrieving results."""
keepAlive: int
Member:

Suggested change
keepAlive: int
keep_alive: int = Field(alias="keepAlive")


name: str
arguments: dict[str, Any] | None = None
operation_params: AsyncRequestProperties | None = Field(serialization_alias="operation", default=None)
Member:

What's the problem with the real name "operation"?

@LucaButBoring (author):

The base RequestParams type also has an _operation field used for association metadata, which needs to be aliased to operation so it isn't treated as a private/protected field by pyright wherever it gets used.

error: "_operation" is protected and used outside of the class in which it is declared (reportPrivateUsage)

I didn't want to ignore pyright just because of a naming conflict, but now that I'm looking at this again there are only 3 places where we'd need to do so.
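
(For readers following along, the aliasing workaround mentioned above boils down to the usual pydantic pattern of a public Python name serialized under the wire name; a minimal sketch, not the SDK's actual model:)

from typing import Any

from pydantic import BaseModel, Field


class ExampleParams(BaseModel):
    """Minimal sketch of the aliasing workaround, not the SDK's actual model."""

    name: str
    arguments: dict[str, Any] | None = None
    # Public Python attribute, written to the wire as "operation" to avoid a
    # leading-underscore field that pyright would flag as private usage.
    operation_params: Any | None = Field(default=None, serialization_alias="operation")


print(ExampleParams(name="tool", operation_params={"keepAlive": 60}).model_dump(by_alias=True, exclude_none=True))
# -> {'name': 'tool', 'operation': {'keepAlive': 60}}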

self.async_operations.fail_operation(operation.token, str(e))

async with anyio.create_task_group() as tg:
    tg.start_soon(execute_async)
Member:

I think we need to make this implementation a bit more abstract.

What if I want to execute my tasks on another machine/node/process than the MCP server? The world also seems to like durable execution products like Temporal a lot.

I think we need something like what this design proposes: https://github.com/pydantic/fasta2a/tree/main?tab=readme-ov-file#design

@LucaButBoring (author):

I made AsyncOperationManager a Server parameter with the intention that consumers would implement something like that for this use case, but that's not enough on its own since it's missing a broker to enable that pattern, yeah. Will adjust this accordingly.
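
(For context, a pluggable design in the spirit of the fasta2a link above might split the responsibilities roughly like the Protocols below; the names are entirely hypothetical and not part of this PR.)

from typing import Any, Protocol


class OperationStorage(Protocol):
    """Persists operation state and results so another process can serve status/result calls."""

    async def save_status(self, token: str, status: str) -> None: ...
    async def save_result(self, token: str, result: Any, keep_alive: int) -> None: ...
    async def load_status(self, token: str) -> str: ...
    async def load_result(self, token: str) -> Any: ...


class OperationBroker(Protocol):
    """Hands tool executions to workers (in-process task group, a queue, Temporal, ...)."""

    async def enqueue(self, token: str, tool_name: str, arguments: dict[str, Any]) -> None: ...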

@LucaButBoring (Contributor, author)

> I think a better API is to have a new method e.g. async_tool (although "async" is not the best keyword for this in Python). Is it still possible to find a synonym to "async"? "long-run"? 😅

On the SEP, we're going to be moving to referring to these uniformly as long-running operations, so an @mcp.long_running would work.

Do we want to force tool implementors to make a clean break from existing tools, though? Or are you suggesting that in the case where both modes are desired on a tool in an interim state, they would add both @mcp.tool and @mcp.long_running to the same function?

If so, that then requires the tool manager to support adding the same tool more than once, and updating the parameters if long_running is present and a tool is already registered (since the decorators could be applied in either order).

@Kludex (Member) commented Oct 2, 2025

> Do we want to force tool implementors to make a clean break from existing tools, though?

From what I understood, since invocationMode is a string, you can only have a tool that is either "long-running" or "short-running". If that's the case, then a short-running tool is different from a long-running tool. So... I'm not sure if I'm missing something, or if the spec reflects something different.

> Or are you suggesting that in the case where both modes are desired on a tool in an interim state, they would add both @mcp.tool and @mcp.long_running to the same function?

I wouldn't like to add two decorators. 🤔


If a tool can be both short- and long-running, why isn't invocationMode a boolean, e.g. supportsAsync?

@LucaButBoring (Contributor, author) commented Oct 2, 2025

> From what I understood, since invocationMode is a string, you can only have a tool that is either "long-running" or "short-running". If that's the case, then a short-running tool is different from a long-running tool. So... I'm not sure if I'm missing something, or if the spec reflects something different.

A tool is exactly one or the other from the perspective of a single client. ListTools must show exactly one execution mode. In addition, version negotiation is used to hide long-running tools from clients that don't yet support them.

However, we can still support hybrid tools for backwards-compatibility purposes if the server operator allows it. Essentially, because the core tool implementation is the same @mcp.tool, we can wrap a tool in the functionality it needs to behave in either way within the server SDK, but only allow a client to use one or the other depending on its version.

There are a few sections of the SEP that discuss this; here's one I think is useful:

Old Clients (pre-async support):
// tools/list response (filtered)
{
  tools: [
    // only supports sync execution
    { name: "search_web", description: "Search the web" },
    // only supports sync execution
    { name: "quick_calc", description: "Fast calculation" },
    // supports both sync and async - invocationMode field is hidden from old clients
    { name: "get_weather", description: "Get weather" }
    // async-capable tools hidden from old clients
  ]
}

New Clients (async support):
// tools/list response (complete)
{
  tools: [
    // explicitly only supports sync execution
    { name: "search_web", description: "Search the web", invocationMode: "sync" },
    // implicitly only supports sync execution
    { name: "quick_calc", description: "Fast calculation" },
    // new clients see the async invocation mode and should assume async-only execution
    { name: "get_weather", description: "Get weather", invocationMode: "async" },
    // tool is async-only, and is not shown to old clients
    { name: "deep_analysis", description: "Complex analysis", invocationMode: "async" }
    // All tools visible, async capabilities declared
  ]
}

Note that some tools are present in the "old clients" list despite presenting as LRO-only in the "new clients" list. When an old client invokes such a tool, it won't get the immediate_result response, or even an operation token - the server simply won't return a response until the tool implementation completes.

> If a tool can be both short- and long-running, why isn't invocationMode a boolean, e.g. supportsAsync?

This was the original idea, actually, but for futureproofing it's been changed to an enum.
