Description
When we initially implemented the signature validation, we checked 3 main cases:
- W3C_CHUNKED - requests that contain a payload that is chunked using purely HTTP chunking
- AWS_CHUNKED - requests that contain a payload that is chunked using AWS Chunking (determined by a Content-Encoding that contains aws-chunked, as well as potentially other encodings), where each chunk contains a signature (see SigV4 Streaming)
- AWS_CHUNKED_IN_W3C_CHUNKED - an AWS Chunked payload (i.e., with signatures), itself chunked in HTTP chunks (which do not contain signatures)
There is an additional type, signalled by an X-Amz-Content-SHA256 value of STREAMING-UNSIGNED-PAYLOAD-TRAILER (see the docs). The docs describe it as:
Use this when sending an unsigned payload over multiple chunks. In this case you also have a trailing header after the chunk is uploaded.
This type already existed when we implemented the first version of the proxy, but we never saw standard SDKs send such requests. Recent testing shows that aws-cli does send them.
Despite the aws-chunked content encoding, this payload looks exactly like a standard W3C chunked payload with no chunk signatures.
Update
My initial understanding of this behaviour was wrong; I have left the initial wording at the end of this note for reference.
The content is doubly chunked - what we can see on the screenshot is the content after mitmproxy stripped all the W3C Chunked encoding - meaning the chunking we see is in addition to the W3C chunks that were once present but aren't displayed by the tool.
So, while it is technically correct that this looks exactly like a W3C Chunked payload with no signatures, we cannot just treat it as being identical to a W3C Chunked payload - it is doubly chunked.
Jersey handles the first level of chunking (the "standard" W3C Chunking) on our behalf, so the input stream in the request has the first level of chunk headers removed. But the second level of chunking is for us to handle.
This is similar to AWS_CHUNKED_IN_W3C_CHUNKED, but with the difference that the inner chunked data has no signatures.
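As a rough illustration of how these cases could be told apart from the request headers alone, here is a minimal Python sketch. The function name `classify_payload`, the `NOT_CHUNKED` label, and the return shape are hypothetical (this is not the proxy's actual code), and the prefix check on `X-Amz-Content-SHA256` assumes every unsigned streaming value starts with `STREAMING-UNSIGNED-PAYLOAD-`.

```python
def classify_payload(headers: dict[str, str]) -> tuple[str, bool]:
    """Return (chunking kind, whether the inner AWS chunks carry signatures).

    Hypothetical helper for illustration; not the proxy's real code.
    """
    h = {name.lower(): value for name, value in headers.items()}
    content_encodings = [e.strip().lower() for e in h.get("content-encoding", "").split(",")]
    transfer_encodings = [e.strip().lower() for e in h.get("transfer-encoding", "").split(",")]

    aws_chunked = "aws-chunked" in content_encodings
    w3c_chunked = "chunked" in transfer_encodings
    # Assumption: unsigned streaming payloads are announced by this prefix.
    unsigned = h.get("x-amz-content-sha256", "").strip().startswith("STREAMING-UNSIGNED-PAYLOAD-")

    if aws_chunked and w3c_chunked:
        kind = "AWS_CHUNKED_IN_W3C_CHUNKED"
    elif aws_chunked:
        kind = "AWS_CHUNKED"
    elif w3c_chunked:
        kind = "W3C_CHUNKED"
    else:
        kind = "NOT_CHUNKED"
    # Only AWS-chunked payloads can have per-chunk signatures at all.
    return kind, aws_chunked and not unsigned
```

With `Transfer-Encoding: chunked`, `Content-Encoding: aws-chunked` and `X-Amz-Content-SHA256: STREAMING-UNSIGNED-PAYLOAD-TRAILER`, this returns `("AWS_CHUNKED_IN_W3C_CHUNKED", False)`: the existing doubly chunked case, but with the signature flag off.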
Experiment
I tested this theory by attempting to upload a 5 MiB object made up of the repeated string "hello there!!!" using Boto3, to an endpoint that simply saved the HTTP request to a file (nc -lk 0.0.0.0 9999 > output for instance).
Boto3 behaves differently depending on whether the endpoint is HTTPS or HTTP, so I fronted my netcat listener with nginx to handle HTTPS.
When using plaintext HTTP, it appears that Boto3 will not usually perform any aws-chunked or W3C chunked encoding. When using HTTPS, it will more proactively resort to chunking.
When hitting the HTTPS endpoint, I see the following:
PUT /some-bucket/foo HTTP/1.1
Host: local.gate0.net
Transfer-Encoding: chunked
Accept-Encoding: identity
x-amz-sdk-checksum-algorithm: CRC32
Content-Encoding: aws-chunked
X-Amz-Trailer: x-amz-checksum-crc32
X-Amz-Decoded-Content-Length: 5242880
X-Amz-Date: 20250923T110026Z
X-Amz-Content-SHA256:STREAMING-UNSIGNED-PAYLOAD-TRAILER
Authorization: ...
3ffa
100000
hello there!!! [...]
3ff8
hello there!!! [...]
[...]
0
x-amz-checksum-crc32:me7NsA==
0
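To make the two layers in this capture concrete, the Python sketch below builds a doubly chunked body by hand and then strips the outer layer again, which is the part Jersey does on our behalf. The helper names and the tiny chunk sizes are made up for illustration. The framing of the inner layer (hex size, CRLF, data, CRLF, then a zero-size chunk followed by the trailer lines) is an assumption based on the capture and the SigV4 streaming docs, not something taken from the proxy's code.

```python
def frame_chunks(data: bytes, chunk_size: int) -> bytes:
    """Frame data as '<hex size>CRLF<bytes>CRLF' chunks, without a terminator."""
    out = bytearray()
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        out += b"%x\r\n" % len(chunk) + chunk + b"\r\n"
    return bytes(out)


def aws_chunk_unsigned(payload: bytes, chunk_size: int, trailers: dict[str, str]) -> bytes:
    """Inner layer: unsigned aws-chunked data ending with trailer lines."""
    trailer_lines = "".join(f"{name}:{value}\r\n" for name, value in trailers.items())
    return frame_chunks(payload, chunk_size) + b"0\r\n" + trailer_lines.encode("ascii") + b"\r\n"


def w3c_chunk(body: bytes, chunk_size: int) -> bytes:
    """Outer layer: plain HTTP chunking with no trailers."""
    return frame_chunks(body, chunk_size) + b"0\r\n\r\n"


def strip_w3c_chunks(wire: bytes) -> bytes:
    """Remove the outer HTTP chunking, leaving the inner layer untouched."""
    out, pos = bytearray(), 0
    while True:
        eol = wire.index(b"\r\n", pos)
        size = int(wire[pos:eol], 16)
        if size == 0:
            return bytes(out)
        start = eol + 2
        out += wire[start:start + size]
        pos = start + size + 2  # skip the CRLF that closes the chunk


# The trailer value here is a placeholder, not a real checksum.
inner = aws_chunk_unsigned(b"hello there!!!", 8, {"x-amz-checksum-crc32": "AAAAAA=="})
wire = w3c_chunk(inner, 16)
after_outer_layer_removed = strip_w3c_chunks(wire)
```

After the outer layer is removed, the remaining bytes still begin with a chunk-size line (`8` in this toy example, `100000` in the capture), which is why the body cannot be handed on as if it were the raw payload.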
As such, an approach @thinaih and I have discussed is to:
- Let Jersey handle the outer chunking (just as it does now)
- Make the AWS Chunked handling compatible with unsigned chunks, for the inner chunks
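The second step could look roughly like the following. This is illustrative Python rather than the proxy's code: `decode_aws_chunked` and `crc32_trailer_matches` are hypothetical names, signatures are skipped rather than verified, and the trailer check assumes `x-amz-checksum-crc32` carries the base64 encoding of the big-endian CRC32 of the decoded payload.

```python
import base64
import io
import zlib
from typing import BinaryIO


def decode_aws_chunked(stream: BinaryIO) -> tuple[bytes, dict[str, str]]:
    """Decode aws-chunked data whose outer HTTP chunking was already removed.

    Chunk headers may or may not carry a ";chunk-signature=..." extension.
    Signatures are ignored here, not verified. Assumes a buffered stream,
    so that read(n) returns n bytes unless the stream ends early.
    """
    payload = bytearray()
    while True:
        header = stream.readline().strip()
        if not header:
            raise ValueError("missing chunk header")
        size = int(header.split(b";", 1)[0], 16)
        if size == 0:
            break  # final chunk; trailer lines (if any) follow
        chunk = stream.read(size)
        if len(chunk) != size:
            raise ValueError("truncated chunk")
        payload += chunk
        if stream.readline().strip():
            raise ValueError("expected CRLF after chunk data")

    trailers: dict[str, str] = {}
    while True:
        line = stream.readline().strip()
        if not line:
            break  # a blank line (or end of stream) ends the trailers
        name, _, value = line.decode("ascii").partition(":")
        trailers[name.strip().lower()] = value.strip()
    return bytes(payload), trailers


def crc32_trailer_matches(payload: bytes, trailers: dict[str, str]) -> bool:
    """Optional check of the payload against the x-amz-checksum-crc32 trailer."""
    expected = base64.b64encode(zlib.crc32(payload).to_bytes(4, "big")).decode("ascii")
    return trailers.get("x-amz-checksum-crc32") == expected


# Example: two unsigned chunks followed by a CRC32 trailer.
data = b"hello there!"
checksum = base64.b64encode(zlib.crc32(data).to_bytes(4, "big"))
body = b"5\r\nhello\r\n7\r\n there!\r\n0\r\nx-amz-checksum-crc32:" + checksum + b"\r\n\r\n"
decoded, trailers = decode_aws_chunked(io.BytesIO(body))
```

Parsing the size as everything before the first `;` lets one code path serve both signed and unsigned chunks; whether a signature is then required would be decided from the request headers, not from the chunk itself.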
Old content left for reference
This contains incorrect assumptions and is left here only for context.
I think we can ignore the trailer header, which should be safe as its value is not signed: since it is sent after the body, it cannot be included in the Authorization header (and thus is not a signed header). We should, however, check how Jersey handles trailer headers; given they are part of the HTTP spec, I'd hope it handles them correctly.
I believe we can handle these cases gracefully by simply treating any request that has the aws-chunked content encoding and a STREAMING-UNSIGNED-PAYLOAD-* hash header as W3C Chunked.