Skip to content

BFT snapshots#3

Open
lploom wants to merge 37 commits intomainfrom
bft-snapshots
Open

BFT snapshots#3
lploom wants to merge 37 commits intomainfrom
bft-snapshots

Conversation

@lploom
Copy link
Copy Markdown
Contributor

@lploom lploom commented Mar 26, 2026

Added BFT snapshotting as defined in the Yellowpaper.

New block structure
Removed "minerAddress" (20 bytes) field and added "payloadRoot" field (32 bytes), so the block headers grew from 100 bytes to 112 bytes. In addition, the block now contains variable length payload of hash of miner reward token (32 bytes) and UTB_cbor bytes (variable size depending on number of BFT nodes etc). The payloadRoot is calculated as H(reward_token_hash || UTB_hash).

Reward tokens
Replaced miner addresses with reward tokens. Every time a block is mined, a reward token is generated and stored locally, the hash of the reward token is appended to the block payload. By default the reward tokens are stored to:
$ cat /tmp/node1/reward_tokens.csv

Height,BlockHash,TokenID
1,010298102a9a88e9c483c2c9952532f433548ae0792780366b2ab00c72f44568,5196fd2c158dcd030f4a148c62590d6c80f610fff9230fd41c1b3ba3484b7093
2,38787a604501b567744d42056a7e2d6508a90e0357ddad3127de828aa1182f0b,b877190e6f7509eacf0756bd9db4c33ada2a8878bad0fe9da57d9e534b9e5093

BFT layer integration
If a block is mined then the miner queries BFT layer for new UTBs, if a new UTB is found then it is also appended to block payload. If new UTBs are not found then nothing is appended to the block and zero hash is used to calculate payloadRoot. The BFT node is configurable via --bftaddr flag which defaults to "http://127.0.0.1:25866" in testnet and mainnet. In REGTEST mode the BFT integration is disabled and can be enabled by explicitly setting the --bftaddr flag.

@lploom lploom requested a review from MastaP March 26, 2026 15:21
Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request updates the blockchain protocol to a headers-only structure with a fixed 112-byte header and a variable-length payload, integrating BFT trust base verification and a new miner reward token system. My review identified a potential issue with the specified git tag for cpp-httplib in CMakeLists.txt and an outdated seed corpus in the fuzzing build script that needs to be updated to match the new header size.

Comment thread CMakeLists.txt
FetchContent_Declare(
httplib
GIT_REPOSITORY https://github.com/yhirose/cpp-httplib.git
GIT_TAG v0.38.0
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The specified git tag v0.38.0 for cpp-httplib does not appear to exist in the official yhirose/cpp-httplib repository. The latest version seems to follow a v0.major.minor scheme (e.g., v0.15.3). This could be a typo and might cause build failures for users who do not have this specific commit cached. Please verify the tag and use a valid, existing tag to ensure build reproducibility.

Comment thread fuzz/oss-fuzz/build.sh Outdated
@lploom lploom linked an issue Mar 29, 2026 that may be closed by this pull request
Copy link
Copy Markdown
Member

@MastaP MastaP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review: BFT Snapshots

24 inline comments below plus 3 cross-cutting notes. Summary by severity:

  • Critical (4): Broken submitheader/submitblock RPCs, no max payload size, quorum_threshold=0 bypass, getrandom() fails on macOS
  • High (7): Wire format incompatibility, persistence migration, total_stake overflow, HTTP response injection, unbounded CBOR, mining thread crash, unvalidated trust base loading from disk
  • Medium (6): Hot-path heap allocation, CBlockIndex memory bloat, httplib in public header, secp256k1 context leak, state file permissions, fragile initializer list
  • Low (7): Debug string typo, dead typedef, unused include, duplicate code, unnecessary deep copy, redundant member, inconsistent error handling

Cross-cutting issues (not pinned to a specific line):

25. Python functional tests are broken on this branch. consensus_asert_difficulty.py and consensus_timestamp_bounds.py still build 100-byte headers with 20-byte miner field. These will fail.

26. E2E integration tests do not exercise the new payload validation. All tests in test/integration-chain/ use SetBypassPOWValidation(true), which skips CheckBlockHeader entirely. No test verifies the full payload validation pipeline end-to-end.

27. Missing negative tests for payload validation. There is no test that verifies blocks with a mismatched payloadRoot (vs actual payload content) are correctly rejected by CheckBlockHeader.

Comment thread src/network/rpc_server.cpp Outdated
if (hex.size() != 200) {
return util::JsonError("Invalid header length (expect 200 hex chars)");
// Expect exactly 224 hex chars (112 bytes)
if (hex.size() != 224) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Critical: submitheader and submitblock RPCs are broken — external mining is impossible

Both RPCs enforce exactly 224 hex chars (112 bytes = bare header with no payload). However, CheckBlockHeader now requires vPayload.size() >= 32. Any block submitted via these RPCs will have an empty vPayload and fail validation with "bad-payload-size".

Additionally, getblocktemplate does not return payload_root, reward_token_id, or UTB data, so external miners have no way to construct valid blocks even if the size check were fixed.

Suggestion: Accept variable-length hex input (minimum 224 chars for the header + at least 64 chars for the 32-byte token ID hash):

if (hex.size() < 288) {  // 112-byte header + 32-byte minimum payload
    return util::JsonError("Invalid block data (minimum 288 hex chars: 112-byte header + 32-byte payload)");
}

And update getblocktemplate to return the payload fields.

Comment thread src/chain/validation.cpp Outdated
Comment thread src/chain/trust_base.cpp Outdated
Comment thread src/chain/token_generator.cpp Outdated
Comment thread src/network/message.cpp
Comment thread src/chain/trust_base.cpp Outdated
Comment thread src/chain/trust_base.cpp Outdated

std::vector<uint8_t> data = nlohmann::json::to_cbor(j);

// Prepend tag 1013 (d9 03f5)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Low: Duplicate CBOR tag prepending in SigBytes() and ToCBOR()

The 0xd9 0x03 0xf5 tag-prepending logic is duplicated between here (lines 139-145) and ToCBOR() (lines 155-161). Extract a helper:

static std::vector<uint8_t> PrependCBORTag1013(std::vector<uint8_t> data) {
    data.insert(data.begin(), {0xd9, 0x03, 0xf5});
    return data;
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ahtotruu has changed the tags, need to check what's the new value is

Comment thread src/chain/trust_base.cpp Outdated
Comment thread include/chain/trust_base_manager.hpp Outdated
Comment thread src/chain/trust_base_manager.cpp Outdated
Copy link
Copy Markdown
Member

@MastaP MastaP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-review after fixes (20 commits since initial review)

Fixed correctly (no further action needed)

# Finding Status
F2 MAX_PAYLOAD_SIZE Fixed — 4096 byte limit with tests
F3 quorum_threshold==0 Fixed — rejects 0 threshold and empty root_nodes, with overflow checks
F4 getrandom on macOS Fixed — replaced with cross-platform std::random_device
F7 total_stake overflow Fixed — overflow check before accumulation
F8 HTTP response body injection Fixed — truncated to 4096 bytes
F9 Unbounded CBOR Fixed — MAX_BFT_RESPONSE_SIZE (1 MB) check
F12 Serialize heap alloc Fixed — new SerializeHeader() returns std::array
F14 httplib in public header Fixed — pimpl with forward declaration
F15 secp256k1 context leak Fixed — uses secp256k1_context_static
F16 State file permissions Fixed — writes with 0600
F17 Fragile initializer list Fixed — uses parameter directly
F18 "miner=" in ToString Fixed
F19 Dead HeaderBytes Fixed — now used by SerializeHeader()
F20 Unused iostream Fixed
F22 SigBytes deep copy Fixed — serializes directly
F23 Redundant latest_trust_base_ Fixed — uses GetLatestLocked() from map
F24 Inconsistent error handling Fixed — SyncTrustBases no longer swallows exceptions

Still open (3 from original + Python tests)

# Finding Severity
F1 submitheader still broken (submitblock was fixed) Critical
F6 bft_epoch required field breaks backward compat High
F10 Mining thread has no top-level exception handler High
F21 Duplicate CBOR tag code Low
F25 Python functional tests still broken (submitheader + 100-byte headers) High

F11 (trust base disk loading without verification), F13 (vPayload memory bloat) — acknowledged as by-design / future work.

New issues (4 inline comments)

See inline comments below.

Comment thread src/network/rpc_server.cpp Outdated
Comment thread src/chain/block_manager.cpp
Comment thread src/chain/miner.cpp
Comment thread src/network/rpc_server.cpp Outdated
Comment thread src/network/rpc_server.cpp
Comment thread src/network/rpc_server.cpp Outdated
Comment thread src/chain/trust_base.cpp Outdated

std::vector<uint8_t> data = nlohmann::json::to_cbor(j);

// Prepend tag 1013 (d9 03f5)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Low (from F21, still open): CBOR tag 1013 prepend logic still duplicated

Lines 133-139 (SigBytes) and 149-155 (ToCBOR) contain identical tag-prepending code. Consider extracting a helper:

static std::vector<uint8_t> PrependCBORTag1013(std::vector<uint8_t> data) {
    data.insert(data.begin(), {0xd9, 0x03, 0xf5});
    return data;
}

Comment thread src/chain/block_manager.cpp
@lploom lploom requested a review from MastaP April 8, 2026 06:10
Copy link
Copy Markdown
Member

@MastaP MastaP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additional review findings:

  1. [P0] submitheader still rejects payload-bearing blocks. HandleSubmitHeader accepts inputs longer than 112 bytes, but then rejects any decoded blob whose size is not exactly CBlockHeader::HEADER_SIZE, so header+payload submissions still fail before validation.

  2. [P1] submitheader is unusable on testnet. The RPC unconditionally calls TestSetSkipPoWChecks(skip_pow), and that helper throws on every non-regtest network even when skip_pow is false.

  3. [P1] The loader still rejects pre-upgrade chainstate files. required_fields now requires both payload_root and bft_epoch, so old persisted headers are still rejected as CORRUPTED instead of migrating or defaulting.

  4. [P1] The HEADERS wire-format change is not version-gated. The protocol version remains unchanged even though HEADERS changed from fixed-size header entries to length-prefixed full block blobs, so incompatible peers can still handshake and only fail later during parsing.

  5. [P1] The cpp-httplib pin does not exist upstream. FetchContent still uses v0.38.0, which is not present on the upstream tags page, so clean builds are not reproducible.

  6. [P2] The submitblock negative tests no longer exercise the parser. Several functional/integration tests now omit the required rewardtokenid argument and pass on the usage error path before deserialization or validation runs.

I also verified that several older unresolved threads are fixed in the current branch and will resolve those separately.

@lploom
Copy link
Copy Markdown
Contributor Author

lploom commented Apr 8, 2026

Additional review findings:

  1. [P0] submitheader still rejects payload-bearing blocks. HandleSubmitHeader accepts inputs longer than 112 bytes, but then rejects any decoded blob whose size is not exactly CBlockHeader::HEADER_SIZE, so header+payload submissions still fail before validation.
  2. [P1] submitheader is unusable on testnet. The RPC unconditionally calls TestSetSkipPoWChecks(skip_pow), and that helper throws on every non-regtest network even when skip_pow is false.
  3. [P1] The loader still rejects pre-upgrade chainstate files. required_fields now requires both payload_root and bft_epoch, so old persisted headers are still rejected as CORRUPTED instead of migrating or defaulting.
  4. [P1] The HEADERS wire-format change is not version-gated. The protocol version remains unchanged even though HEADERS changed from fixed-size header entries to length-prefixed full block blobs, so incompatible peers can still handshake and only fail later during parsing.
  5. [P1] The cpp-httplib pin does not exist upstream. FetchContent still uses v0.38.0, which is not present on the upstream tags page, so clean builds are not reproducible.
  6. [P2] The submitblock negative tests no longer exercise the parser. Several functional/integration tests now omit the required rewardtokenid argument and pass on the usage error path before deserialization or validation runs.

I also verified that several older unresolved threads are fixed in the current branch and will resolve those separately.

3 and 4 are about backwards compatibility, so not relevant.
5 is fake news (https://github.com/yhirose/cpp-httplib/releases/tag/v0.38.0)
Updated 1, 2 and 6.

@lploom lploom requested a review from MastaP April 8, 2026 14:09
@MastaP
Copy link
Copy Markdown
Member

MastaP commented Apr 10, 2026

@lploom lploom requested a review from MastaP April 14, 2026 16:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[TGE] BFT state snapshots to PoW chain

3 participants