Skip to content

Conversation

@AlexDaines
Copy link
Contributor

@AlexDaines AlexDaines commented Nov 7, 2025

Add ChatOptions.ResponseFormat support for Bedrock MEAI

Description

Implements support for ChatOptions.ResponseFormat in the AWSSDK.Extensions.Bedrock.MEAI implementation of IChatClient. When ResponseFormat is set to Json or ForJsonSchema, the client now uses Bedrock's tool mechanism to enforce structured JSON responses from models.

Implementation approach:

  • Creates a synthetic tool called "generate_response" with the provided JSON schema
  • Forces model to use this tool via toolChoice
  • Extracts JSON response from the tool use output
  • Converts Bedrock Document objects to standard JSON

Why synthetic tool?:

Bedrock lacks a native responseFormat API; all AWS SDKs (boto3, Java, now .NET) use tool calling as the official mechanism for structured output—we inject a synthetic tool with the JSON schema to implement ChatOptions.ResponseFormat transparently.

Key behavior:

  • ResponseFormat.Json: Requests JSON with generic object schema
  • ResponseFormat.ForJsonSchema: Requests JSON conforming to custom schema
  • ResponseFormat.Text: No changes to request (default behavior)
  • Throws ArgumentException if ResponseFormat is used with user-provided tools (mutual exclusivity)
  • Throws NotSupportedException for streaming requests (Bedrock limitation)

Motivation and Context

Closes #3911

Users need consistent behavior when using IChatClient across different AI providers. Currently, the Bedrock implementation ignores ChatOptions.ResponseFormat, making it impossible to request structured responses through the standardized Microsoft.Extensions.AI interface. This prevents Bedrock from being a drop-in replacement for other providers in structured data workflows.

Testing

Dryrun:

.NET v4 Build: DRY_RUN-9edf05db-56d5-4398-902c-826d8573804d

  • Added 2 core unit tests covering request creation with schemas and response extraction
  • Created local sample application demonstrating the feature with real Bedrock API calls
  • Verified JSON responses are correctly structured and parsed
  • Confirmed error handling for invalid configurations (tools + ResponseFormat, streaming)

Test coverage:

  • ResponseFormat_Json_WithSchema_CreatesSyntheticToolWithCorrectSchema: Validates synthetic tool creation with custom schema
  • ResponseFormat_Json_ModelReturnsToolUse_ExtractsJsonCorrectly: Validates JSON extraction from tool use responses

Types of changes

  • New feature (non-breaking change which adds functionality)

Checklist

  • My code follows the code style of this project
  • My change requires a change to the documentation
  • I have updated the documentation accordingly
  • I have read the README document
  • I have added tests to cover my changes
  • All new and existing tests passed

License

  • I confirm that this pull request can be released under the Apache 2 license

@GarrettBeatty GarrettBeatty requested a review from Copilot November 7, 2025 18:01
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds ResponseFormat support to the AWS Bedrock ChatClient for Microsoft.Extensions.AI, enabling structured JSON output from Bedrock models. The implementation uses Bedrock's tool mechanism with a synthetic tool to enforce structured responses, requiring models with ToolChoice support (Claude 3+ and Mistral Large).

Key changes:

  • Implemented ResponseFormat handling via synthetic tool creation that forces models to return structured JSON
  • Added error handling for unsupported models and missing structured responses
  • Added Document-to-JSON conversion utilities for extracting structured content from tool use responses

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
extensions/src/AWSSDK.Extensions.Bedrock.MEAI/BedrockChatClient.cs Core implementation of ResponseFormat support including synthetic tool creation, error handling for unsupported models, Document-to-JSON conversion, and validation that ResponseFormat conflicts with user-provided tools
extensions/test/BedrockMEAITests/BedrockChatClientTests.cs Added MockBedrockRuntime test infrastructure and two tests validating schema conversion and JSON extraction from tool use responses

Comment on lines 1048 to 1054
// Check depth to prevent stack overflow from deeply nested or circular structures
if (depth > MaxDocumentNestingDepth)
{
throw new InvalidOperationException(
$"Document nesting depth exceeds maximum of {MaxDocumentNestingDepth}. " +
$"This may indicate a circular reference or excessively nested data structure.");
}
Copy link

Copilot AI Nov 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The depth limit check for preventing stack overflow in recursive JSON conversion only protects against excessive nesting but doesn't protect against actual circular references in the Document structure. If a Document contains a circular reference (e.g., a dictionary that references itself), this will still cause infinite recursion until the depth limit is reached.

Consider tracking visited Document instances to detect actual circular references earlier, or document that Document structures are expected to be acyclic.

Copilot uses AI. Check for mistakes.
Comment on lines 98 to 100
(ex.Message.IndexOf("toolChoice", StringComparison.OrdinalIgnoreCase) >= 0 ||
ex.Message.IndexOf("tool_choice", StringComparison.OrdinalIgnoreCase) >= 0 ||
ex.Message.IndexOf("ToolChoice", StringComparison.OrdinalIgnoreCase) >= 0);
Copy link

Copilot AI Nov 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error detection logic uses multiple string checks for variations of "toolChoice", but this approach is fragile and could produce false positives. For example, if an error message contains "toolChoice" in a different context (e.g., "Invalid parameter value, not related to toolChoice functionality"), it would incorrectly match.

Consider either:

  1. Using a more specific error code if one exists for this scenario
  2. Using a regular expression with word boundaries to ensure "toolChoice" is a distinct term
  3. Checking for more specific error message patterns that uniquely identify this error

Copilot uses AI. Check for mistakes.
Copy link
Contributor

@GarrettBeatty GarrettBeatty left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you explain more a little more in the PR why we need make this synthetic tool and what not

}

// Check if ResponseFormat is set - not supported for streaming yet
if (options?.ResponseFormat is ChatResponseFormatJson)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just wondering why this is

}

/// <summary>Converts a <see cref="Document"/> to a JSON string.</summary>
private static string DocumentToJsonString(Document document)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all of this stuff i would be surprised if it doesnt exist already in a jsonutils or utils file. either way it shouldnt be in this class

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with garret, it should live in a utils class at the least

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't looked at the entire PR, but we've had requests in the past to make the Document class interop better with JSON.

It's something we should do, but we have to be aware the document type is meant to be agnostic (the service could start returning CBOR tomorrow for example). See this comment from Norm: #3915 (comment)

It'd probably make more sense to include this functionality in Core, but now I'm even wondering if it's better to do that first (and separately) from this PR.

{
response = await _runtime.ConverseAsync(request, cancellationToken).ConfigureAwait(false);
}
catch (AmazonBedrockRuntimeException ex) when (options?.ResponseFormat is ChatResponseFormatJson)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are we checking bedrockruntimeexception here but down below i see we are throwing InvalidOperationException when it fails?

Copy link
Contributor

@peterrsongg peterrsongg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My long comment on testing different json responses returned by the service might not make sense depending on what this feature is supposed to do. So if this option is set, that tells Bedrock to return the response in a certain way?

I still think we shouldn't have a MockBedrockRuntime that implements IAmazonBedrockRuntime. We will have to update this class every time a new operation is released.

namespace Amazon.BedrockRuntime;

// Mock implementation to capture requests and control responses
internal sealed class MockBedrockRuntime : IAmazonBedrockRuntime
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So anytime BedrockRuntime adds a new operation, we have to remember to go here and update this mock class so that the operation throws NotImplementedException right? I think we should go with a different approach to mocking the service, because if that's the case then this isn't very scalable.

Also the types of responses you can test here are severely limited. The ConverseAsync just returns a default response, so these test cases are only testing the happy path. My suggestion is to mock the httpLayer so that you can test out edge cases and different responses returned by Converse

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is my suggestion (sorry it is a lot). This is what we did for fuzz testing

  1. Create a pipeline customizer
namespace NetSdkMock
{
    public class PipelineCustomizer : IRuntimePipelineCustomizer
    {
        public string UniqueName => "MockPipeline";

        public void Customize(Type type, RuntimePipeline pipeline)
        {
            pipeline.ReplaceHandler<HttpHandler<System.Net.Http.HttpContent>>(new HttpHandler<HttpContent>(new MockHttpRequestMessageFactory(), new object()));
        }
    }

    public class MockHttpRequestMessageFactory : IHttpRequestFactory<HttpContent>
    {
        public IHttpRequest<HttpContent> CreateHttpRequest(Uri requestUri)
        {
            return new MockHttpRequest(new HttpClient(), requestUri, null);
        }

        public void Dispose()
        {
            throw new NotImplementedException();
        }
    }

    public class MockHttpRequest : HttpWebRequestMessage, IHttpRequest<HttpContent>
    {
        private IWebResponseData _webResponseData;

        public MockHttpRequest(HttpClient httpClient, Uri requestUri, IClientConfig config) : base(httpClient, requestUri, config)
        {
        }
        public new IWebResponseData GetResponse()
        {
            return this.GetResponseAsync(CancellationToken.None).Result;
        }

        public new void ConfigureRequest(IRequestContext requestContext)
        {
            _webResponseData = (IWebResponseData)((IAmazonWebServiceRequest)requestContext.OriginalRequest).RequestState["response"];
        }

        public new Task<IWebResponseData> GetResponseAsync(CancellationToken cancellationToken)
        {
            return Task.FromResult(_webResponseData);
        }
    }
}
  1. Stub the web response data so that you can control the type of responses that you get
using Amazon.Runtime.Internal.Transform;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Text;
using System.Threading.Tasks;

namespace NetSdkMock
{
    public class StubWebResponseData : IWebResponseData
    {
        public StubWebResponseData(string jsonResponse, Dictionary<string, string> headers)
        {
            this.StatusCode = HttpStatusCode.OK;
            this.IsSuccessStatusCode = true;
            JsonResponse = jsonResponse;
            this.Headers = headers;

            _httpResponseBody = new HttpResponseBody(jsonResponse);
        }
        public Dictionary<string, string> Headers { get; set; }

        public string JsonResponse { get; }
        private IHttpResponseBody _httpResponseBody;
        public long ContentLength { get; set; }

        public string ContentType { get; set; }

        public HttpStatusCode StatusCode { get; set; }

        public bool IsSuccessStatusCode { get; set; }

        public IHttpResponseBody ResponseBody
        {
            get
            {
                return _httpResponseBody;
            }
        }

        public string[] GetHeaderNames()
        {
            return Headers.Keys.ToArray();
        }

        public bool IsHeaderPresent(string headerName)
        {
            return this.Headers.ContainsKey(headerName);
        }

        public string GetHeaderValue(string headerName)
        {
            if (this.Headers.ContainsKey(headerName))
                return this.Headers[headerName];
            else
                return null;
        }
    }
    public class HttpResponseBody : IHttpResponseBody
    {
        private readonly string _jsonResponse;
        private Stream stream;
        public HttpResponseBody(string jsonResponse)
        {
            _jsonResponse = jsonResponse;
        }

        public void Dispose()
        {
            stream.Dispose();
        }

        public Stream OpenResponse()
        {
            stream = new MemoryStream(UTF8Encoding.UTF8.GetBytes(_jsonResponse));
            return stream;
        }

        public Task<Stream> OpenResponseAsync()
        {
            return Task.FromResult(OpenResponse());
        }
    }
}

Now in your test you can call it as normal and pass in different types of responses:

using Amazon.BedrockRuntime;
using Amazon.BedrockRuntime.Model;
using Amazon.Runtime;
using Amazon.Runtime.Internal;
using Microsoft.Extensions.AI;
using Moq;
using System.Text.Json;
namespace NetSdkMock
{
    public class MockTests : IClassFixture<ClientFixture>
    {
        private readonly ClientFixture _fixture;
        public MockTests()
        {
            _fixture = new ClientFixture();
        }
        [Fact]
        public async Task Test1()
        {
            var messages = new[] { new ChatMessage(ChatRole.User, "Test") };
            // try to test different schemas too. 
            var schemaJson = """
            {
                "type": "object",
                "properties": {
                    "name": { "type": "string" },
                    "age": { "type": "number" }
                },
                "required": ["name"]
            }
            """;
            var schemaElement = JsonDocument.Parse(schemaJson).RootElement;
            var options = new ChatOptions
            {
                ResponseFormat = ChatResponseFormat.ForJsonSchema(schemaElement,
                    schemaName: "PersonSchema",
                    schemaDescription: "A person object")
            };
            var chatClient = _fixture.BedrockRuntimeClient.AsIChatClient("claude-3");
            ConverseRequest request = new ConverseRequest();
            var interfaceType = typeof(IAmazonWebServiceRequest);
            var requestStatePropertyInfo = interfaceType.GetProperty("RequestState");
            var requestState = (Dictionary<string, object>)requestStatePropertyInfo.GetValue(request);
            //var schemaElement = JsonDocument.Parse(schemaJson).RootElement;
            //  now you can test out all different types of json responses
            var jsonResponse = """
            {
                "name":"Bob",
                "age" : 15
            }
            """;

            ChatOptions options = new ChatOptions();
            options.RawRepresentationFactory = chatClient => request;

            var webResponseData = new StubWebResponseData(jsonResponse, new Dictionary<string, string>());
            // this is where we are injecting the stubbed web response data. 
            requestState["response"] = webResponseData;
            var response = await chatClient.GetResponseAsync(messages, options).ConfigureAwait(false);

        }
    }
    public class ClientFixture: IDisposable
    {
        public ClientFixture()
        {
            RuntimePipelineCustomizerRegistry.Instance.Register(new PipelineCustomizer());
            BedrockRuntimeClient = new AmazonBedrockRuntimeClient();
        }
        public IAmazonBedrockRuntime BedrockRuntimeClient { get; private set; }

        public void Dispose()
        {
            // Cleanup after all tests in this class
            BedrockRuntimeClient.Dispose();
        }
    }
}

I spent a few hours creating a test project to make sure this code will work and it does, so I think you can use this to create many more test cases and not just test the happy path.

}

/// <summary>Converts a <see cref="Document"/> to a JSON string.</summary>
private static string DocumentToJsonString(Document document)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with garret, it should live in a utils class at the least

// Check if this is a ToolChoice validation error (model doesn't support it)
bool isToolChoiceNotSupported =
ex.ErrorCode == "ValidationException" &&
(ex.Message.IndexOf("toolChoice", StringComparison.OrdinalIgnoreCase) >= 0 ||
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is checking the error message the only way to achieve this? error messages aren't gauranteed to stay the same.

}

// Assert
var tool = mock.CapturedRequest.ToolConfig.Tools[0];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm a bit confused, and maybe it is because i don't understand who is supposed to return the response in the provided schema (bedrock or us), but this test case just seems to be asserting that the tool has the correct schema set on it. Is there no way to test the actual functionality?

AlexDaines added a commit that referenced this pull request Nov 19, 2025
@AlexDaines AlexDaines force-pushed the adaines/support-chatoptions-responseformat branch from a459b62 to b7bd419 Compare November 19, 2025 23:02
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (1)

extensions/test/BedrockMEAITests/BedrockMEAITests.NetFramework.csproj:37

  • Missing corresponding NetStandard test project. The repository follows a pattern where extension tests have both NetFramework and NetStandard project files to ensure platform compatibility (see CloudFront.SignersTests and EC2.DecryptPasswordTests as examples). A NetStandard test project targeting netcoreapp3.1;net8.0 is needed to ensure tests run on .NET Core 3.1 and .NET 8.0, as required by the contributing guidelines.
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <TargetFrameworks>net472</TargetFrameworks>
    <DefineConstants>$(DefineConstants);BCL</DefineConstants>
    <AssemblyName>BedrockMEAITests</AssemblyName>
    <PackageId>BedrockMEAITests</PackageId>

    <GenerateAssemblyTitleAttribute>false</GenerateAssemblyTitleAttribute>
    <GenerateAssemblyConfigurationAttribute>false</GenerateAssemblyConfigurationAttribute>
    <GenerateAssemblyCompanyAttribute>false</GenerateAssemblyCompanyAttribute>
    <GenerateAssemblyProductAttribute>false</GenerateAssemblyProductAttribute>
    <GenerateAssemblyDescriptionAttribute>false</GenerateAssemblyDescriptionAttribute>
    <GenerateAssemblyCopyrightAttribute>false</GenerateAssemblyCopyrightAttribute>
    <GenerateAssemblyVersionAttribute>false</GenerateAssemblyVersionAttribute>
    <GenerateAssemblyFileVersionAttribute>false</GenerateAssemblyFileVersionAttribute>

    <TreatWarningsAsErrors>true</TreatWarningsAsErrors>
    <LangVersion>Latest</LangVersion>
  </PropertyGroup>

  <ItemGroup>
    <PackageReference Include="Microsoft.Extensions.AI.Abstractions" Version="9.9.1" />
    <PackageReference Include="Moq" Version="4.8.3" />
    <PackageReference Include="xunit" Version="2.9.2" />
    <PackageReference Include="xunit.runner.visualstudio" Version="2.8.2" />
  </ItemGroup>

  <ItemGroup>
    <ProjectReference Include="../../../sdk/src/Core/AWSSDK.Core.NetFramework.csproj" />
    <ProjectReference Include="../../src/AWSSDK.Extensions.Bedrock.MEAI/AWSSDK.Extensions.Bedrock.MEAI.NetFramework.csproj" />
    <ProjectReference Include="../../../sdk/test/UnitTests/Custom/AWSSDK.UnitTestUtilities.NetFramework.csproj" />
  </ItemGroup>

  <ItemGroup>
    <PackageReference Include="Microsoft.NET.Test.Sdk" Version="17.11.1" />
  </ItemGroup>
</Project>

AlexDaines and others added 3 commits November 20, 2025 12:44
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copy link
Contributor

@peterrsongg peterrsongg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just a few comments, but this looks much better with all the additional test cases i feel much more confident about this change. thanks for applying my earlier feedback


public void Customize(Type type, RuntimePipeline pipeline)
{
#if BCL
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can just use if #NETFRAMEWORK and else. we don't have to define our own constants anymore
source:
https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/preprocessor-directives

// Act & Assert
await Assert.ThrowsAsync<NotSupportedException>(async () =>
{
await foreach (var update in client.GetStreamingResponseAsync(messages, options))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get what you're trying to do here but i feel like this test in general isn't necessary. The reason I say this is because if we do add support for the streaming case, this test will fail even though we aren't making a backwards incompatible change.

{
// Detect unsupported model: ValidationException with specific tool support error messages
if (ex.ErrorCode == "ValidationException" &&
(ex.Message.IndexOf("toolChoice is not supported by this model", StringComparison.OrdinalIgnoreCase) >= 0 ||
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you share what the error messages look like? why do we need the ||? are there different cases and how have we determined those?

ex.Message.IndexOf("This model doesn't support tool use", StringComparison.OrdinalIgnoreCase) >= 0))
{
throw new NotSupportedException(
$"The model '{request.ModelId}' does not support ResponseFormat. " +
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can we use stringbuilder or something similar and not use +

foreach (var content in message.Content)
{
if (content.ToolUse is ToolUseBlock toolUse &&
toolUse.Name == ResponseFormatToolName &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it ever possible that a user could define their own tool with the same name and conflict with ours? how do other sdks handle it

return null;
}

foreach (var content in message.Content)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you explain what exactly this is doing? i understand its going through the content and seems to be finding one where the tool name matches our tool name. and somehow that tool's input is the json formatted content?

why is there multiple content and what happens if more than 1 content has the tool? main question is can you explain a bit how the content/what it is and whats in the tool in the comments

{
// Check for conflict with user-provided tools
if (toolConfig?.Tools?.Count > 0)
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this a limitation of only being able to have 1 tool in bedrock?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants