Enable valid OpenAI response_format specification #1069
Open
liamcripwell wants to merge 1 commit into argilla-io:develop
Conversation
Contributor
Hi @liamcripwell, thanks for the PR! This bug was found and is already fixed in
Author
Hi @plaguss, great to hear it's already been fixed. Sorry, I didn't notice this change in
However, I still think the docstring for
When specifying the `response_format` within the current version of the OpenAI API, it expects this via an object containing a `"type"` attribute, e.g. `{"type": "<type>"}`. However, distilabel is enforcing a string representation for this, which leads to either an error or a silent failure.

E.g. when using a `TextGeneration` task under the existing codebase, as in the sketch below:
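A minimal sketch of the failing setup, assuming distilabel 1.x module paths; the model name and instruction are illustrative, not taken from the original report:

```python
from distilabel.llms import OpenAILLM
from distilabel.steps.tasks import TextGeneration

task = TextGeneration(
    llm=OpenAILLM(
        model="gpt-4o-mini",
        # Under the pre-fix behaviour, this string is forwarded to the
        # OpenAI client as-is instead of as {"type": "json_object"}.
        generation_kwargs={"response_format": "json"},
    ),
)
task.load()

# Requires OPENAI_API_KEY to be set; process() yields batches of outputs.
next(task.process([{"instruction": "Reply with a JSON object."}]))
```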
The OpenAI API will fail and yield `BadRequestError: Error code: 400 - {'error': {'message': "Invalid type for 'response_format': expected an object, but got a string instead.", 'type': 'invalid_request_error', 'param': 'response_format', 'code': 'invalid_type'}}`.

The same happens when directly calling generation from the `LLM`:
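For example, calling the async generation method directly (a sketch; the coroutine is driven here via `asyncio.run`, and the input messages are illustrative):

```python
import asyncio

from distilabel.llms import OpenAILLM

llm = OpenAILLM(model="gpt-4o-mini")
llm.load()

# response_format is accepted as a plain string and passed through
# unwrapped, triggering the same BadRequestError as above.
asyncio.run(
    llm.agenerate(
        input=[{"role": "user", "content": "Reply with a JSON object."}],
        response_format="json",
    )
)
```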
Presumably the same happens for requests to the batch API, which ultimately leads to `AssertionError: No output file ID was found in the batch.`.

This PR simply wraps the string representation of the specified `response_format` inside the object expected by OpenAI. I have also added the same value checking that is done in `agenerate()` to `offline_batch_generate()`.
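In spirit, the change amounts to something like the following (a sketch, not the PR's literal diff; the helper name and the accepted values are assumptions):

```python
def _wrap_response_format(response_format: str) -> dict:
    # Hypothetical helper: mirrors the value checking done in agenerate()
    # before wrapping the string in the object form the OpenAI API expects.
    if response_format not in ("text", "json"):
        raise ValueError(
            f"Invalid response format '{response_format}'. Must be 'text' or 'json'."
        )
    if response_format == "json":
        response_format = "json_object"
    return {"type": response_format}


# {"type": "json_object"} is the shape the OpenAI client accepts.
print(_wrap_response_format("json"))
```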