
Conversation

dbschmigelski
Member

Description

In LiteLLMModel we currently attempt to perform structured_output using the model's native capabilities. This is checked using the supports_response_schema method.

This has two edge cases that are causing customer failures.

  1. The first, and more obvious, is that if a model does not support structured output we fail hard. This means that if a customer changes the model id, code that previously worked will now fail with ValueError("Model does not support response_format").

  2. The second is what appears to be a bug in LiteLLM: supports_response_schema does not work with proxies. Looking at their code, they do not pass along proxy information, so there is no way to perform a runtime check for proxy-routed models.

To fix this, when supports_response_schema returns false we now fall back to the tool-based approach used by other model providers. A fix in LiteLLM will be explored, but in the immediate term we need to unblock customers.
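For illustration, a minimal sketch of the branching this introduces. It assumes litellm's supports_response_schema helper and wraps the decision in an illustrative standalone function; the real dispatch in LiteLLMModel is async and yields events, but the condition is the same.

import litellm

def pick_structured_output_path(model_id: str) -> str:
    """Return which structured output path would be used for the given model id (sketch only)."""
    try:
        native = litellm.supports_response_schema(model=model_id)
    except Exception:
        # Unknown or proxy-routed model ids may not be resolvable; treat as unsupported.
        native = False
    # Fall back to the tool-based approach instead of raising ValueError.
    return "response_schema" if native else "tool_fallback"

print(pick_structured_output_path("openai/gpt-4o"))          # likely "response_schema"
print(pick_structured_output_path("my-proxy/custom-model"))  # likely "tool_fallback"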

Related Issues

#862
#909

Type of Change

New feature

Testing

How have you tested the change? Verify that the changes do not break functionality or introduce warnings in consuming repositories: agents-docs, agents-tools, agents-cli

  • I ran hatch run prepare

Checklist

  • I have read the CONTRIBUTING document
  • I have added any necessary tests that prove my fix is effective or my feature works
  • I have updated the documentation accordingly
  • I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

if len(response.choices) > 1:
    raise ValueError("Multiple choices found in the response.")
if not response.choices or response.choices[0].finish_reason != "tool_calls":
Member Author

A good point was raised in #483 proposing that we can remove this check, since we are not even using the tool function arguments and are instead using the message.

But that change is left to be completed in #483 and is considered out of scope for this PR.

Contributor

Wouldn't this break the logic here? Are structured output responses from all models returned as tool_calls?

Member Author

According to #483, there are some models that don't return structured output responses with tool_calls. So the claim the contributor makes is that we are currently blocking the models that do not.

But that is something that is currently broken and not directly related to this PR.

This PR addresses the case where supports_response_schema is false. #483 covers the case where supports_response_schema is true but there are no tool_calls.
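For context only, a rough sketch of the relaxed check #483 is arguing for (not part of this PR; variable names follow the excerpt above):

if len(response.choices) > 1:
    raise ValueError("Multiple choices found in the response.")
if not response.choices:
    raise ValueError("No choices found in the response.")
# Rather than requiring finish_reason == "tool_calls", parse the message content
# directly, since the tool function arguments are not used anyway.
output_data = response.choices[0].message.content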



def test_structured_output_unsupported_model(model, nested_weather):
    # Mock supports_response_schema to return False to test fallback mechanism
Member

Is there a real model we can use so we can avoid mocking?

Member Author
dbschmigelski Oct 7, 2025

We could create a proxy, but I wanted to avoid the case where litellm updates and then our test fails.

Member

Why do we need a proxy? Can we just provide an API key to litellm and choose a model that would work with this? Otherwise, this is more of a unit test.

Member Author

It's still verifying that the tool extraction works.

Member Author

The models that we have access to are, I believe, all statically defined to support structured output.
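For reference, a minimal sketch of the mocked test being discussed; the patch target, fixture shapes, and event format are assumptions rather than the actual test code:

from unittest import mock

import pytest

@pytest.mark.asyncio
async def test_structured_output_unsupported_model(model, nested_weather):
    # Force the fallback path by pretending the model has no native response_format support.
    prompt = [{"role": "user", "content": [{"text": "What is the weather in Seattle?"}]}]
    with mock.patch("litellm.supports_response_schema", return_value=False):
        events = [event async for event in model.structured_output(nested_weather, prompt)]
    # The tool-based fallback should still produce a parsed output event at the end.
    assert "output" in events[-1]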


yield {"output": result}

async def _structured_output_using_response_schema(
Member

Can we add debug logging for the different paths, so we know which is used?
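A small sketch of what that could look like; the flag name and logger setup are assumptions:

import logging

logger = logging.getLogger(__name__)

# Inside structured_output, just before dispatching:
if supports_schema:
    logger.debug("using native response_format for structured output")
else:
    logger.debug("model does not support response_format, falling back to tool-based structured output")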




dbschmigelski changed the title from "feat(models): use tool for structured_output when supports_response_schema=false" to "feat(models): use tool for litellm structured_output when supports_response_schema=false" on Oct 8, 2025