-
Notifications
You must be signed in to change notification settings - Fork 1.5k
Pass s3:// file URLs directly to API in BedrockConverseModel
#3663
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
6e6f667
9631781
fb584c4
87cea32
a77eeaa
076635e
4d8b111
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We should document this feature in
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @DouweM I have updated the doc. Added a paragraph for S3 + Looking at the doc, I am a bit confused. Given that we have updated Not sure if I question is clear enough. Basically, in this MR, we are passing the
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @mochow13 |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -11,6 +11,7 @@ | |
|
|
||
| import anyio.to_thread | ||
| from botocore.exceptions import ClientError | ||
| from mypy_boto3_bedrock_runtime.type_defs import DocumentSourceTypeDef | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Let's move this line into the |
||
| from typing_extensions import ParamSpec, assert_never | ||
|
|
||
| from pydantic_ai import ( | ||
|
|
@@ -733,20 +734,28 @@ async def _map_user_prompt( # noqa: C901 | |
| else: | ||
| raise NotImplementedError('Binary content is not supported yet.') | ||
| elif isinstance(item, ImageUrl | DocumentUrl | VideoUrl): | ||
| downloaded_item = await download_item(item, data_format='bytes', type_format='extension') | ||
| format = downloaded_item['data_type'] | ||
| source: dict[str, Any] | ||
| if item.url.startswith('s3://'): | ||
| s3_location: dict[str, str] = {'uri': item.url.split('?')[0]} | ||
| if '?bucketOwner=' in item.url: | ||
| s3_location['bucketOwner'] = item.url.split('?bucketOwner=')[1] | ||
| source = {'s3Location': s3_location} | ||
| else: | ||
| downloaded_item = await download_item(item, data_format='bytes', type_format='extension') | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It seems the existing code in What check do you mean for
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Same check raising an error saying that
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ah ok need to stop supporting download altogether. Updated.
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If it's not drastically more code, I'd prefer proper URL parsing |
||
| source = {'bytes': downloaded_item['data']} | ||
|
|
||
| if item.kind == 'image-url': | ||
| format = item.media_type.split('/')[1] | ||
| assert format in ('jpeg', 'png', 'gif', 'webp'), f'Unsupported image format: {format}' | ||
| image: ImageBlockTypeDef = {'format': format, 'source': {'bytes': downloaded_item['data']}} | ||
| image: ImageBlockTypeDef = {'format': format, 'source': cast(DocumentSourceTypeDef, source)} | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We shouldn't need to |
||
| content.append({'image': image}) | ||
|
|
||
| elif item.kind == 'document-url': | ||
| name = f'Document {next(document_count)}' | ||
| document: DocumentBlockTypeDef = { | ||
| 'name': name, | ||
| 'format': item.format, | ||
| 'source': {'bytes': downloaded_item['data']}, | ||
| 'source': cast(DocumentSourceTypeDef, source), | ||
| } | ||
| content.append({'document': document}) | ||
|
|
||
|
|
@@ -763,7 +772,7 @@ async def _map_user_prompt( # noqa: C901 | |
| 'wmv', | ||
| 'three_gp', | ||
| ), f'Unsupported video format: {format}' | ||
| video: VideoBlockTypeDef = {'format': format, 'source': {'bytes': downloaded_item['data']}} | ||
| video: VideoBlockTypeDef = {'format': format, 'source': cast(DocumentSourceTypeDef, source)} | ||
| content.append({'video': video}) | ||
| elif isinstance(item, AudioUrl): # pragma: no cover | ||
| raise NotImplementedError('Audio is not supported yet.') | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.