
HPFS sync reliability: block requests sent to random peer cause hash mismatch when nodes have divergent state #412

@rippleitinnz

Description

Summary

When a new node joins a running cluster where index.js (or any large state file)
has been modified after the initial deployment, HPFS sync fails repeatedly with:

Hpfs cont sync: Skipping mismatched block response from [xxxxxxxx] for block_id:0 
(len:390541) of /state/index.js

The new node never syncs successfully and cannot join consensus.

Root Cause

In hpfs_sync.cpp, process_candidate_responses() correctly requests the file
hashmap from a peer, then generates block requests based on the received block hashes.
However, in request_state_from_peer(), those block requests are sent to a random
peer via send_message_to_random_peer():

p2p::send_message_to_random_peer(fbuf, target_pubkey); 
// todo: send to a node that hold the expected hash to improve 
// reliability of retrieving hpfs state.

If the random peer has a different version of the file than the peer that provided
the hashmap, the received block data won't match the expected hash — causing
validate_file_block_hash() to reject it and log the mismatch.

This is the existing TODO in the codebase. We hit this in practice when:

  1. A 3-node cluster is running with a modified index.js
  2. A new 4th node joins and requests /state/index.js
  3. Node A provides the hashmap (based on its version of index.js)
  4. The block request goes to Node B (random) which has a slightly different version
  5. Block hash mismatch → sync fails indefinitely
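The failure mechanism can be sketched in isolation: a mismatch is inevitable whenever the hashmap provider and the block provider hold different file contents. Below is a minimal, self-contained illustration (std::hash stands in for the real HPFS block digest, and the peer names and contents are hypothetical, not from the codebase):

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <string>

// Stand-in for the real block hash (HPFS uses fixed-size digests;
// std::hash is only for illustration).
static std::size_t block_hash(const std::string &block)
{
    return std::hash<std::string>{}(block);
}

// Hypothetical peers holding divergent copies of block 0 of /state/index.js.
static const std::map<std::string, std::string> peer_block0 = {
    {"node_a", "console.log('v2');"},  // provided the hashmap
    {"node_b", "console.log('v1');"}}; // received the block request

// The new node records the expected hash from node_a's hashmap...
static std::size_t expected_hash()
{
    return block_hash(peer_block0.at("node_a"));
}

// ...then checks whichever block response arrives, as
// validate_file_block_hash() does.
static bool validate_block(const std::string &from_peer)
{
    return block_hash(peer_block0.at(from_peer)) == expected_hash();
}
```

With divergent contents, validating node_b's response fails and the block is skipped; directing the request back to node_a, the hashmap's source, makes it pass.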

Reproduction

  1. Create a 3-node cluster
  2. Modify index.js on all 3 running nodes
  3. Acquire a 4th node and deploy the contract bundle to it
  4. Watch the 4th node's hp.log — it will repeatedly log Skipping mismatched block response for /state/index.js and never join consensus

Suggested Fix

Add a source_peer field to sync_item so block requests are directed to the
same peer that provided the hashmap, rather than a random peer.

1. Add source_peer to sync_item in hpfs_sync.hpp

struct sync_item
{
    SYNC_ITEM_TYPE type = SYNC_ITEM_TYPE::DIR;
    std::string vpath;
    int32_t block_id = -1;
    util::h32 expected_hash;
    bool high_priority = false;
    std::string source_peer; // Preferred peer for block requests (empty = random)
    uint32_t waiting_time = 0;
    // ...
};

2. Add send_message_to_peer to p2p.hpp and p2p.cpp

// p2p.hpp
void send_message_to_peer(const flatbuffers::FlatBufferBuilder &fbuf, 
                          const std::string &preferred_pubkey, 
                          std::string &target_pubkey);

// p2p.cpp
void send_message_to_peer(const flatbuffers::FlatBufferBuilder &fbuf, 
                          const std::string &preferred_pubkey, 
                          std::string &target_pubkey)
{
    std::scoped_lock<std::mutex> lock(ctx.peer_connections_mutex);

    if (!preferred_pubkey.empty())
    {
        const auto it = ctx.peer_connections.find(preferred_pubkey);
        if (it != ctx.peer_connections.end())
        {
            it->second->send(msg::fbuf::builder_to_string_view(fbuf));
            target_pubkey = it->second->uniqueid;
            return;
        }
        LOG_DEBUG << "Preferred peer " << preferred_pubkey.substr(2, 8) 
                  << " not found. Falling back to random peer.";
    }

    // Fall back to random peer.
    const size_t connected_peers = ctx.peer_connections.size();
    if (connected_peers == 0)
    {
        LOG_DEBUG << "No peers to send.";
        return;
    }

    auto it = ctx.peer_connections.begin();
    std::advance(it, rand() % connected_peers);
    it->second->send(msg::fbuf::builder_to_string_view(fbuf));
    target_pubkey = it->second->uniqueid;
}
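The preferred-with-fallback policy above can be unit-tested in isolation. A minimal sketch of just the selection logic, with a plain std::map standing in for ctx.peer_connections (names here are hypothetical, not the real p2p API):

```cpp
#include <cstdlib>
#include <iterator>
#include <map>
#include <string>

// Returns the pubkey of the peer a message would go to: the preferred peer
// when connected, otherwise a uniformly random connected peer, or "" when
// no peers are connected. Mirrors the fallback in send_message_to_peer().
static std::string pick_peer(const std::map<std::string, int> &connections,
                             const std::string &preferred)
{
    if (!preferred.empty())
    {
        const auto it = connections.find(preferred);
        if (it != connections.end())
            return it->first;
        // Preferred peer not connected: fall through to random selection.
    }

    if (connections.empty())
        return "";

    auto it = connections.begin();
    std::advance(it, std::rand() % connections.size());
    return it->first;
}
```

Because the preferred branch returns early, the stable case (preferred peer connected) is deterministic, and the random path only runs on fallback, matching the behaviour described in the Impact section.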

3. Update request_state_from_peer in hpfs_sync.cpp

void hpfs_sync::request_state_from_peer(const std::string &path, const bool is_file, 
                                        const int32_t block_id,
                                        const util::h32 expected_hash, 
                                        std::string &target_pubkey,
                                        const std::string &preferred_peer) // default "" goes on the declaration in hpfs_sync.hpp
{
    // ... existing code ...
    
    // Use preferred peer if specified, otherwise random.
    if (!preferred_peer.empty())
        p2p::send_message_to_peer(fbuf, preferred_peer, target_pubkey);
    else
        p2p::send_message_to_random_peer(fbuf, target_pubkey);
}

4. Update submit_request to pass source_peer

request_state_from_peer(request.vpath, is_file, request.block_id, 
                        request.expected_hash, target_pubkey,
                        request.source_peer); // Pass preferred peer

5. Update handle_file_hashmap_response to tag block requests with source peer

int hpfs_sync::handle_file_hashmap_response(std::string_view vpath, 
                                            const mode_t file_mode,
                                            const util::h32 *hashes, 
                                            const size_t hash_count,
                                            const std::set<uint32_t> &responded_block_ids,
                                            const uint64_t file_length,
                                            const std::string &from_peer) // NEW
{
    // ... existing code ...
    for (int32_t block_id = 0; block_id <= max_block_id; block_id++)
    {
        sync_item item{SYNC_ITEM_TYPE::BLOCK, std::string(vpath), block_id, hashes[block_id]};
        item.source_peer = from_peer; // Tag with hashmap source peer
        pending_requests.emplace(item);
    }
}
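To sanity-check step 5, the tagging can be exercised with a stripped-down sync_item: every block request generated from one hashmap response should carry the pubkey of the peer that supplied it. The types below are simplified stand-ins, not the real hpfs_sync API:

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Stripped-down stand-in for sync_item in hpfs_sync.hpp.
struct mini_sync_item
{
    std::string vpath;
    int32_t block_id = -1;
    std::size_t expected_hash = 0;
    std::string source_peer; // Preferred peer for the block request.
};

// Builds one block request per hash, tagged with the hashmap's source peer,
// as handle_file_hashmap_response() would after the suggested change.
static std::vector<mini_sync_item> build_block_requests(
    const std::string &vpath,
    const std::vector<std::size_t> &hashes,
    const std::string &from_peer)
{
    std::vector<mini_sync_item> items;
    for (int32_t block_id = 0; block_id < (int32_t)hashes.size(); block_id++)
        items.push_back({vpath, block_id, hashes[(std::size_t)block_id], from_peer});
    return items;
}
```

The invariant worth asserting is that all items from one response share the same source_peer, so their block requests can never be answered by a peer with a different file version.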

6. Pass response.first (full pubkey) to handle_file_hashmap_response

// In process_candidate_responses():
handle_file_hashmap_response(vpath, file_resp.file_mode(), block_hashes, 
                             block_hash_count, responded_block_ids, 
                             file_resp.file_length(),
                             response.first); // Pass full sender pubkey

Impact

This change addresses the existing TODO comment and improves HPFS sync reliability
in scenarios where nodes have divergent state — particularly when contract files are
updated on a running cluster. In the stable case (all nodes identical) behaviour is
unchanged since the preferred peer will always respond correctly.

Testing

Tested on Evernode mainnet with a 5-node cluster running HotPocket 0.6.4. The
Skipping mismatched block response error occurs reproducibly when adding a new
node after modifying index.js on the running cluster.
