-
Notifications
You must be signed in to change notification settings - Fork 12.2k
Add RLP library #5680
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Add RLP library #5680
Conversation
Co-authored-by: Hadrien Croubois <hadrien.croubois@gmail.com>
Co-authored-by: Arr00 <13561405+arr00@users.noreply.github.com>
🦋 Changeset detectedLatest commit: 5561d52 The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
We should decide weither we are ok with the following feature:
Overall, we should also discuss naming of the functions in RLP.sol. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Nitpick comments (9)
.changeset/lovely-cooks-add.md (1)
5-5
: Call out new public utils surfaced by this PR.This PR introduces public APIs in Bytes (Accumulator) and Memory (Slice). Consider mentioning them so downstream users see the new utilities in release notes.
Apply this diff to amend the body:
-`RLP`: Add library for Ethereum's Recursive Length Prefix encoding/decoding. +`RLP`: Add library for Ethereum's Recursive Length Prefix encoding/decoding. + +Also introduces `Bytes.Accumulator` (a memory‑only bytes builder) and `Memory.Slice` +utilities used by `RLP` and available for general use.test/utils/Bytes.t.sol (2)
10-10
: Narrow the using directive to avoid global pollution.Limit extension scope to the types you actually use.
Apply:
- using Bytes for *; + using Bytes for bytes; + using Bytes for bytes[];
227-237
: Great golden‑path test for Accumulator.Covers push/shift order and flatten result. Consider an explicit empty‑accumulator assertion as well.
Add alongside existing tests:
function testAccumulatorEmpty() public pure { Bytes.Accumulator memory acc = Bytes.accumulator(); assertEq(acc.flatten(), hex""); }contracts/utils/Bytes.sol (2)
271-299
: Return value on push/shift is unnecessary.Since memory params are passed by reference for internal calls, returning
Accumulator memory
is redundant and invites accidental copies.Apply:
- function push(Accumulator memory self, bytes memory data) internal pure returns (Accumulator memory) { + function push(Accumulator memory self, bytes memory data) internal pure { // ... - return self; } @@ - function shift(Accumulator memory self, bytes memory data) internal pure returns (Accumulator memory) { + function shift(Accumulator memory self, bytes memory data) internal pure { // ... - return self; }Note: update call sites only if they captured the return (current tests don’t).
141-161
: concat() is clear and safe; optional micro‑opt.You could preallocate and copy in one Yul loop, but current version favors readability and is fine.
contracts/utils/Memory.sol (1)
93-94
: Clarify the out-of-bounds behavior in documentation.The comment mentions that part of the return value will be out of bounds but should be ignored. This could lead to misuse where callers accidentally process invalid data.
Consider either:
- Adding a bounds check to prevent reading beyond the slice
- Updating the documentation to explicitly warn about undefined behavior
/** * @dev Read a bytes32 buffer from a given Slice at a specific offset * - * Note:If offset > length(slice) - 32, part of the return value will be out of bound and should be ignored. + * WARNING: If offset > length(slice) - 32, the return value will contain undefined data from memory + * beyond the slice bounds. Callers MUST ensure offset + 32 <= length(slice) or handle partial reads appropriately. */contracts/utils/RLP.sol (3)
216-217
: Consider removing or documenting the use case forreadRawBytes
.The TODO comment questions the usefulness of this function. If there's no clear use case, consider removing it to reduce the API surface area.
Either document a specific use case or remove this function if it's not needed.
348-349
: Replace TODO comments with proper custom errors.Multiple locations use generic
require
statements with TODO comments indicating custom errors should be added for sanity checks.Add specific custom errors for these validation cases:
+/// @dev Invalid RLP encoding: single byte string must be >= 0x80 +error RLPInvalidSingleByteString(); + +/// @dev Invalid RLP encoding: length has leading zeros +error RLPInvalidLeadingZeros(); // Line 348-349 -require(bytes1(item.load(1)) >= bytes1(SHORT_OFFSET)); // TODO: custom error for sanity checks +if (bytes1(item.load(1)) < bytes1(SHORT_OFFSET)) revert RLPInvalidSingleByteString(); // Line 356 -require(bytes1(item.load(0)) != 0x00); // TODO: custom error for sanity checks +if (bytes1(item.load(0)) == 0x00) revert RLPInvalidLeadingZeros(); // Line 376 -require(bytes1(item.load(0)) != 0x00); +if (bytes1(item.load(0)) == 0x00) revert RLPInvalidLeadingZeros();Also applies to: 356-357, 376-376
271-292
: Add validation for currentOffset to prevent potential issues.After the loop at line 285, there's no validation that
currentOffset
equalsitemLength
. This could hide malformed RLP data.Add a validation check after the loop:
uint256 itemCount; for (uint256 currentOffset = listOffset; currentOffset < itemLength; ++itemCount) { (uint256 elementOffset, uint256 elementLength, ) = _decodeLength(item.slice(currentOffset)); list[itemCount] = item.slice(currentOffset, elementLength + elementOffset); currentOffset += elementOffset + elementLength; } + +// Ensure we consumed exactly the expected bytes +require(currentOffset == itemLength, RLPContentLengthMismatch(itemLength, currentOffset)); // Decrease the array size to match the actual item count.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
.changeset/lovely-cooks-add.md
(1 hunks)contracts/mocks/Stateless.sol
(1 hunks)contracts/token/ERC20/extensions/ERC4626.sol
(1 hunks)contracts/utils/Bytes.sol
(2 hunks)contracts/utils/Memory.sol
(2 hunks)contracts/utils/README.adoc
(2 hunks)contracts/utils/RLP.sol
(1 hunks)test/utils/Bytes.t.sol
(2 hunks)test/utils/Memory.t.sol
(2 hunks)test/utils/RLP.t.sol
(1 hunks)test/utils/RLP.test.js
(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
test/utils/RLP.test.js (1)
test/helpers/random.js (1)
generators
(3-10)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: halmos
- GitHub Check: coverage
- GitHub Check: slither
- GitHub Check: tests-upgradeable
- GitHub Check: tests
- GitHub Check: tests-foundry
🔇 Additional comments (16)
contracts/utils/README.adoc (2)
41-41
: Docs entry for RLP — LGTM.
142-143
: Nav token for RLP — ensure referenced page/anchor exists
LGTM; nav token present at contracts/utils/README.adoc:142 ({{RLP}}), but no corresponding RLP page/heading was found in docs/ or contracts/ — add or update the RLP page/anchor to avoid a broken docs reference.contracts/mocks/Stateless.sol (1)
47-47
: Importing RLP to exercise build graph — LGTM.If any linter complains about unused imports in mocks, consider referencing one symbol in Dummy1234 to silence it; otherwise fine.
test/utils/Bytes.t.sol (1)
239-249
: Property tests look good.These fuzz the concat equivalence both directions (push/shift). Nice coverage.
test/utils/Memory.t.sol (3)
6-6
: Bytes import for helpers — LGTM.
10-10
: Scoped using directives — LGTM.
24-37
: Solid invariants for Slice/asSlice/toBytes.Covers clamping and equivalence with Bytes.slice; good edge handling.
contracts/utils/Bytes.sol (1)
266-269
: Sentinel semantics — LGTM.Using 0x00 as an empty sentinel for head/tail is fine given EVM memory layout; just keep it documented.
.changeset/lovely-cooks-add.md (1)
2-2
: Confirm changeset package nameRepo contains packages "@openzeppelin/contracts" and "openzeppelin-solidity". .changeset/lovely-cooks-add.md targets "openzeppelin-solidity" — confirm you intended to bump that package; if you meant "@openzeppelin/contracts", rename the changeset header to match so the changeset is applied correctly.
contracts/token/ERC20/extensions/ERC4626.sol (1)
4-4
: Pragma bump to ^0.8.24 — not safe without repo-wide alignmentScan shows mixed Solidity pragmas across the repo (many files at ^0.8.20, some at ^0.8.26 and ^0.8.27) while the reviewed file is ^0.8.24.
- Action: either normalize all contracts to a single ^0.8.24+ pragma, or configure the build/CI to compile with multiple solc versions and verify CI.
- Examples from the scan: contracts/access/AccessControl.sol → ^0.8.20; contracts/account/extensions/draft-AccountERC7579.sol → ^0.8.26; contracts/utils/RLP.sol → ^0.8.27; contracts/token/ERC20/extensions/ERC4626.sol → ^0.8.24.
contracts/utils/Memory.sol (2)
47-47
: LGTM! Well-designed memory slice abstraction.The
Slice
type provides an efficient 256-bit packed representation combining length (upper 128 bits) and pointer (lower 128 bits), enabling zero-copy slicing operations.
3-3
: Resolved — pragma bump to ^0.8.24 is safe for local importersAll local files that import contracts/utils/Memory.sol declare pragma ^0.8.24 or newer — no change required.
Importers found:
- contracts/utils/Bytes.sol — pragma ^0.8.24
- contracts/utils/RLP.sol — pragma ^0.8.27
- contracts/token/ERC20/extensions/ERC4626.sol — pragma ^0.8.24
- contracts/mocks/Stateless.sol — pragma ^0.8.26
Note: test/utils/Memory.t.sol and test/utils/RLP.t.sol import @openzeppelin/contracts/utils/Memory.sol and remain pragma ^0.8.20 (unrelated to the local file).
test/utils/RLP.test.js (1)
10-18
: LGTM! Clean resolution of function overloads.The approach to resolve overloaded
encode
functions follows the established pattern from Math.test.js, providing clear typed variants for testing.test/utils/RLP.t.sol (1)
39-41
: LGTM! Good defensive check for list length.The assumption correctly limits the fuzz input to prevent excessive memory allocation during testing.
contracts/utils/RLP.sol (2)
288-290
: LGTM! Efficient array resizing using assembly.The inline assembly efficiently resizes the array to the actual item count without reallocation, avoiding memory waste.
119-126
: LGTM! Clever and efficient boolean encoding.The assembly implementation efficiently encodes booleans using bitwise operations:
false
→ 0x80 (empty item),true
→ 0x01, which aligns with the RLP specification treating booleans as 0/1 integers.
assembly ("memory-safe") { | ||
result := mload(0x40) | ||
let ptr := add(result, 0x20) | ||
for { | ||
let it := mload(self) | ||
} iszero(iszero(it)) { | ||
it := mload(it) | ||
} { | ||
let buffer := mload(add(it, 0x20)) | ||
let length := mload(buffer) | ||
mcopy(ptr, add(buffer, 0x20), length) | ||
ptr := add(ptr, length) | ||
} | ||
mstore(result, sub(ptr, add(result, 0x20))) | ||
mstore(0x40, ptr) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: free memory pointer not 32‑byte aligned in flatten().
FMP must be rounded up to the next 32‑byte boundary. Writing mstore(0x40, ptr)
leaves it potentially unaligned and can corrupt subsequent allocations.
Apply this diff:
function flatten(Accumulator memory self) internal pure returns (bytes memory result) {
assembly ("memory-safe") {
result := mload(0x40)
- let ptr := add(result, 0x20)
+ let dataStart := add(result, 0x20)
+ let ptr := dataStart
for {
let it := mload(self)
} iszero(iszero(it)) {
it := mload(it)
} {
let buffer := mload(add(it, 0x20))
let length := mload(buffer)
mcopy(ptr, add(buffer, 0x20), length)
ptr := add(ptr, length)
}
- mstore(result, sub(ptr, add(result, 0x20)))
- mstore(0x40, ptr)
+ let total := sub(ptr, dataStart)
+ mstore(result, total)
+ // advance FMP to the next 32-byte boundary after the copied data
+ let newFmp := and(add(ptr, 31), not(31))
+ mstore(0x40, newFmp)
}
}
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
assembly ("memory-safe") { | |
result := mload(0x40) | |
let ptr := add(result, 0x20) | |
for { | |
let it := mload(self) | |
} iszero(iszero(it)) { | |
it := mload(it) | |
} { | |
let buffer := mload(add(it, 0x20)) | |
let length := mload(buffer) | |
mcopy(ptr, add(buffer, 0x20), length) | |
ptr := add(ptr, length) | |
} | |
mstore(result, sub(ptr, add(result, 0x20))) | |
mstore(0x40, ptr) | |
} | |
function flatten(Accumulator memory self) internal pure returns (bytes memory result) { | |
assembly ("memory-safe") { | |
result := mload(0x40) | |
let dataStart := add(result, 0x20) | |
let ptr := dataStart | |
for { let it := mload(self) } iszero(iszero(it)) { it := mload(it) } { | |
let buffer := mload(add(it, 0x20)) | |
let length := mload(buffer) | |
mcopy(ptr, add(buffer, 0x20), length) | |
ptr := add(ptr, length) | |
} | |
let total := sub(ptr, dataStart) | |
mstore(result, total) | |
// advance FMP to the next 32-byte boundary after the copied data | |
let newFmp := and(add(ptr, 31), not(31)) | |
mstore(0x40, newFmp) | |
} | |
} |
🤖 Prompt for AI Agents
In contracts/utils/Bytes.sol around lines 302 to 317 the free memory pointer
stored with mstore(0x40, ptr) may be left unaligned after writing flattened
bytes; update the final free-memory store to round ptr up to the next 32-byte
boundary before storing it (i.e., compute alignedPtr = (ptr + 31) & ~31 and
mstore(0x40, alignedPtr)) so the FMP remains 32-byte aligned and subsequent
allocations are safe.
// TODO: Fail if len or ptr is larger than type(uint128).max ? | ||
assembly ("memory-safe") { | ||
result := or(shl(128, len), ptr) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add bounds validation for length and pointer values.
The TODO comment correctly identifies a potential issue. Values exceeding type(uint128).max
would cause silent truncation when packed into the 128-bit fields.
Apply this fix to add validation:
function _asSlice(uint256 len, Memory.Pointer ptr) private pure returns (Slice result) {
- // TODO: Fail if len or ptr is larger than type(uint128).max ?
+ require(len <= type(uint128).max, "Length exceeds uint128 max");
+ require(uint256(asBytes32(ptr)) <= type(uint128).max, "Pointer exceeds uint128 max");
assembly ("memory-safe") {
result := or(shl(128, len), ptr)
}
}
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In contracts/utils/Memory.sol around lines 58-62, add explicit bounds checks
before the assembly packing so neither len nor ptr exceed type(uint128).max; if
either value is > type(uint128).max revert with a clear error (or custom error)
to avoid silent truncation, then safely cast to uint128 (or mask) and perform
the existing assembly pack. Ensure both checks run on the original uint256
inputs and use descriptive revert messages or a custom error name.
// const invalidTests = [ | ||
// { name: 'short string with invalid length', input: '0x8100' }, | ||
// { name: 'long string with invalid length prefix', input: '0xb800' }, | ||
// { name: 'list with invalid length', input: '0xc100' }, | ||
// { name: 'truncated long string', input: '0xb838' }, | ||
// { name: 'invalid single byte encoding (non-minimal)', input: '0x8100' }, | ||
// ]; | ||
|
||
// invalidTests.forEach(({ name, input }) => { | ||
// it(`encodes ${name} into invalid RLP`, async function () { | ||
// const item = await this.mock.$toItem(input); | ||
// await expect(this.mock.$decodeBytes_bytes(item)).to.be.reverted; | ||
// }); | ||
// }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Consider uncommenting and implementing the invalid RLP tests.
The commented test cases for invalid RLP encoding scenarios would provide valuable coverage for error handling paths. These edge cases are important for security.
Would you like me to help implement these invalid RLP test cases? They would strengthen the test suite by validating proper error handling for malformed inputs.
🤖 Prompt for AI Agents
In test/utils/RLP.test.js around lines 136 to 149, the invalid RLP test cases
are commented out; re-enable and implement them so malformed encodings assert
that decoding reverts. Restore the invalidTests array and its forEach block,
convert each hex string input into an item via this.mock.$toItem(input) inside
each it() and assert await
expect(this.mock.$decodeBytes_bytes(item)).to.be.reverted; ensure each it() has
a descriptive name using the test case name and run within the same async
function context as the existing tests.
Requires:
Memory
utility library #5189reverseBits
operations to Bytes.sol #5724clz(bytes)
andclz(uint256)
functions #5725PR Checklist
npx changeset add
)