CLDSRV-744: add missing md5 checksum checks tests #5941
Conversation
Hello leif-scality, my role is to assist you with the merge of this pull request.
Status report is not available.
Incorrect fix version
Considering where you are trying to merge, I ignored possible hotfix versions and I expected to find different fix versions. Please check the fix version.
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## development/9.0 #5941 +/- ##
===================================================
+ Coverage 83.18% 83.22% +0.03%
===================================================
Files 188 189 +1
Lines 12105 12128 +23
===================================================
+ Hits 10070 10093 +23
Misses 2035 2035
Flags with carried forward coverage won't be shown.
*/
function validateChecksumsNoChunking(headers, body) {
    if (headers['content-md5']) {
        const md5 = crypto.createHash('md5').update(body, 'utf8').digest('base64');
https://github.com/nodejs/node/blob/main/doc/changelogs/CHANGELOG_V20.md#crypto-implement-cryptohash
https://nodejs.org/docs/latest-v24.x/api/crypto.html#cryptohashalgorithm-data-options
The .hash function is still marked as experimental, but since we are at RC it could be interesting to see whether we get better performance for small bodies.
A utility for creating one-shot hash digests of data. It can be faster than the object-based crypto.createHash() when hashing a smaller amount of data (<= 5MB) that's readily available. If the data can be big or if it is streamed, it's still recommended to use crypto.createHash() instead.
I could use .hash, but given that the impacted endpoints are configuration methods with low req/sec, I don't think we will see any difference.
PutObjectRetention is today (at least) called thousands of times per second by some solutions like Veeam. Even if this reduces later, it will remain high; I've seen platforms with 6,000 op/s on a 3-server architecture. The CPU impact could be significant, so I suggest we properly test it during the RC phase.
I suggest we properly test it during RC phase
What is the procedure to track this?
If the CPU cost is too elevated, the only alternative I see is to disable it. But AWS does perform the check, and looking at the body of the request, it is fairly small, so it should be OK:
https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutObjectRetention.html
Added a variable in the config to disable the check in PutObjectRetention if needed.
lib/api/bucketPutACL.js
Outdated
function bucketPutACL(authInfo, request, log, callback) {
    log.debug('processing request', { method: 'bucketPutACL' });

    const err = validateChecksumsNoChunking(request.headers, request.post);
Is this the right place to validate the checksum, or should we rather perform the other validations first, before doing CPU-intensive validation?
I can move it down, but I'd rather keep it on top like in the other methods. I don't see how calculating the checksums could be exploited by an attacker.
There's another compelling reason to do the checksum earlier: we could avoid transcoding the string back to a Buffer just for the purpose of checksumming. I don't know for sure, but I suspect the cost in CPU and memory of converting to a Buffer is not negligible, even compared to the cost of checksumming itself.
I am thinking we could precompute the MD5 checksum at the same time we aggregate the buffers, i.e. around line 252 in d8b9b99:
    post.push(chunk);
Another benefit could be that we could do generic validation against the headers for all request types, to reduce code duplication.
Where are we transcoding the String back to a Buffer? The crypto hash.update takes a string and does no transcoding.
Not all endpoints handle the checksum in the same way so I think we should handle them inside each endpoint and not in api.js
MD5Mismatch: 'md5 mismatch',
MissingChecksum: 'missing checksum',
Suggested change:
MD5Mismatch: 'MD5Mismatch',
MissingChecksum: 'MissingChecksum',
I'd prefer the same string for key and value
ok
lib/api/bucketPutACL.js
Outdated
    const err = validateChecksumsNoChunking(request.headers, request.post);
    if (err && err.error !== ChecksumError.MissingChecksum) {
        log.debug(err, { error: errors.BadDigest });
I think the debug can be simplified, since the BadDigest will show in the response already:
log.debug(err, { error: errors.BadDigest });
Suggested change:
log.debug(err.error, err.details);
Does the BadDigest get logged after we exit the handler? If not, how will we know that we sent a BadRequest to the client?
I'm not sure, but there might be some debug logs that print the error sent to the client.
Before my changes we logged the BadDigest, so I'd rather keep it.
removed the BadDigest
lib/api/multiObjectDelete.js
Outdated
    const err = validateChecksumsNoChunking(request.headers, request.post);
    if (err && err.error !== ChecksumError.MissingChecksum) {
        log.debug(err, { error: errors.BadDigest });
        return callback(errors.BadDigest);
Since we always have the same pattern, it would be better to just return the Arsenal error (or nothing) from the function, and call the callback only when there is an error. The debug log can be moved there too. It should probably be an error-level log, since it makes the whole API call fail?
Also, according to the doc:
Directory bucket - The Content-MD5 request header or an additional checksum request header (including x-amz-checksum-crc32, x-amz-checksum-crc32c, x-amz-checksum-sha1, or x-amz-checksum-sha256) is required for all Multi-Object Delete requests.
In the general case I think it's fine to ignore the MissingChecksum error, but here I believe we want to enforce it, so return an error if the header is not set?
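If we do enforce it for DeleteObjects, the header test could look like the following hedged sketch (the helper name is hypothetical):

```javascript
// Returns true when the request carries Content-MD5 or any
// x-amz-checksum-* header, per the AWS documentation quoted above.
// Node.js lowercases incoming header names, so a plain prefix check works.
function hasChecksumHeader(headers) {
    return Boolean(headers['content-md5'])
        || Object.keys(headers).some(h => h.startsWith('x-amz-checksum-'));
}
```

The multi-object-delete handler would then return an error when this check fails, while other APIs keep treating a missing checksum as a no-op.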
I see we have
it('should accept request when content-md5 header is missing', done => {
in the tests, so I guess that was expected, but it does not look standard. Given your implementation, it means there is just no need to return the MissingChecksum error; we can return nothing if it's fine not to validate anything when the header is not present. That should simplify the processing in the APIs...
We cannot require the checksum header yet because we don't support the other algorithms today, so this endpoint would break with the newer SDKs if we required it. After we implement the newer algorithms, we can return an error on MissingChecksum.
Can we add a ticket number as a TODO so we do not forget anything when this is done later? As this conflicts with what is in the PR message, it will help reviewers understand the context until this is fixed.
What is the conflict with the PR message?
I created ticket CLDSRV-747 to enable the requirement when we have the other checksums.
*/
function validateChecksumsNoChunking(headers, body) {
    if (headers['content-md5']) {
        const md5 = crypto.createHash('md5').update(body, 'utf8').digest('base64');
        return { error: ChecksumError.MD5Mismatch, details: { calculated: md5, expected: headers['content-md5'] } };
    }
We currently do nothing with this object when we handle errors. Should we just return the error and add a log in this function?
I think the caller should do the logging.
In the next iteration I will add SHA256Mismatch, CRC32Mismatch, etc. I don't see why we would want to squash all the possible errors into BadDigest.
lib/api/objectPutRetention.js
Outdated
    if (config.integrityChecks.PutObjectRetention)
    {
Suggested change:
    if (config.integrityChecks.PutObjectRetention) {
The crypto hash.update takes a string and does no transcoding.
Are you sure about this? I assumed it was always necessary to convert to a Buffer internally, since hashes operate on bytes, not characters (optimizations are possible in theory if the encoding used for hashing matches the native storage format of JS strings, but I'm not sure this is done in practice, and internal formats may not be utf-8 anyway; e.g. Python uses 2 bytes per character). I also asked Gemini; it seems to confirm my assumption. Here's the first part of its answer:
When passing data to the Node.js crypto module's hash.update() function, using a raw Buffer is generally faster and more performant than using a string. 🚀 This performance difference stems from how Node.js handles these two data types.
Key Performance Differences
String Encoding Overhead: When you pass a string to hash.update(), Node.js must first convert that string into a sequence of bytes, or a Buffer. This conversion process requires additional CPU cycles and memory allocation to handle character encodings (like UTF-8, which is the default). For every string you pass, this encoding operation adds a small, but cumulatively significant, overhead.
Direct Binary Data: A Buffer, on the other hand, is already a direct representation of raw binary data. It's essentially a block of pre-allocated memory outside the V8 JavaScript engine's heap. This means when you pass a Buffer to hash.update(), the function can directly access the underlying bytes without any conversion step. It's already in the format the cryptographic hash function needs, leading to more efficient processing.
[...]
Not all endpoints handle the checksum in the same way so I think we should handle them inside each endpoint and not in api.js
Do you know what the differences are in practice? I don't know about them, but maybe we could detect the exceptions in some way in the generic place? Just trying to see if we can do better; if it's indeed a pain to do it generically, no problem for me with doing it per call (but the performance point above may still apply).
I added a commit that moves the check to api.js, before the body gets transformed into a string.
Today the only non-standard endpoint is DeleteObjects: it is the only method that requires a checksum (the requirement won't be added in this PR because we are missing the other algorithms).
After searching, Node.js does indeed decode into a new buffer here; I thought it did it on the fly.
The POC LGTM; I'll review the final PR when completed. Thanks for doing the change!
lib/Config.js
Outdated
};

if (config.integrityChecks) {
    if ('PutObjectRetention' in config.integrityChecks) {
I'm thinking it could be even simpler to configure a map of integrity checks to disable for all possible calls, and provide this possibility for calls other than putObjectRetention in case we need it. We would check for the presence of a custom configuration with config.integrityChecks[apiCall] in the generic handler instead, and enable the checks by default.
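A sketch of that configuration shape (the names follow the discussion, not the actual Config.js): only APIs explicitly set to false skip the check, and everything else defaults to enabled:

```javascript
// Hypothetical config excerpt: a per-API map of integrity checks
const config = {
    integrityChecks: {
        objectPutRetention: false, // disable only where needed
    },
};

// Generic handler check: enabled unless explicitly disabled
function integrityCheckEnabled(apiCall) {
    const checks = config.integrityChecks || {};
    return checks[apiCall] !== false;
}
```

Defaulting to enabled keeps the safe behavior for every API the operator did not think about, while still giving an escape hatch for hot paths like PutObjectRetention.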
done
@@ -1,5 +1,4 @@
const async = require('async');
can you cleanup the added/removed lines that were left over after the latest refactor?
done
tests/unit/api/api.js
Outdated
});

api.callApiMethod(method, requestWithBody, response, log, err => {
    assert(!err, `Unexpected error for ${method} with good checksum: ${err}`);
Suggested change:
assert.ifError(err, `Unexpected error for ${method} with good checksum: ${err}`);
done
ping
ping
Request integration branches
Waiting for integration branch creation to be requested by the user. To request integration branches, please comment on this pull request with the following command:
/create_integration_branches
Conflict
A conflict has been raised during the creation of w/9.1/feature/CLDSRV-744-check-content-md5-2. I have not created the integration branch. Here are the steps to resolve this conflict:
git fetch
git checkout -B w/9.1/feature/CLDSRV-744-check-content-md5-2 origin/development/9.1
git merge origin/feature/CLDSRV-744-check-content-md5-2
# <intense conflict resolution>
git commit
git push -u origin w/9.1/feature/CLDSRV-744-check-content-md5-2
The following options are set: create_integration_branches
/create_integration_branches
Waiting for approval
The following approvals are needed before I can proceed with the merge:
The following options are set: create_integration_branches
/approve
ping
In the queue
The changeset has received all authorizations and has been added to the queue. The changeset will be merged in:
The following branches will NOT be impacted:
There is no action required on your side. You will be notified here once the merge is complete.
IMPORTANT: Please do not attempt to modify this pull request.
If you need this pull request to be removed from the queue, please contact a maintainer.
The following options are set: approve, create_integration_branches
Queue build failed
The corresponding build for the queue failed:
Remove the pull request from the queue
I have successfully merged the changeset of this pull request.
The following branches have NOT changed:
Please check the status of the associated issue CLDSRV-744. Goodbye leif-scality.
CLDSRV-744: add missing md5 checksum checks
validateChecksumsNoChunking returns a MissingChecksum error because some S3 methods (e.g. DeleteObjects) may require a checksum. The DeleteObjects documentation only mentions that it requires an MD5, but in practice it requires either an MD5 or a checksum using one of the new algorithms; the newer SDKs send CRC64NVME only. The requirement will be added after the new algorithms are implemented.
Support for the newer algorithms (CRC-64/NVME, CRC-32, CRC-32C, SHA-1, SHA-256) will be added inside validateChecksumsNoChunking in another PR.