Add Stratum V2 (SV2) protocol support #1553
warioishere wants to merge 3 commits into bitaxeorg:master
Force-pushed from 8882a06 to 9551b7e
|
Please add some screenshots. I've had a cursory look at the code and will do a more in-depth review later. I'm not opposed to agentic development, but only if the author is familiar with the codebase and does first-line code reviews as well. This codebase looks decent, but I did see some areas where it could benefit from more separation of concerns. For example: would it be cleaner to have a dedicated
If you have Claude work on it, please make it aware of #901, and have it keep in mind that if we're splitting off work production protocols, this is something that I might like to add as well in the future.
Hey, I'll refactor this using a protocol coordinator pattern. I'll extract V1 into its own task (stratum_v1_task) and remove the cross-protocol dependencies from sv2_task. I'm keeping create_jobs_task unified since it already works fine, and I don't think the global_state union or vtable stuff is worth the complexity. I'll remove the old stratum_task.c once everything tests out.
I read #901 already, but I thought about using SV2's capability to connect to a separate jd-client that can connect to Bitcoin Core via Sjors' TP. That would also give us full control over templates and decentralize things further. But yeah, I can look into the option of getting templates directly from Core, as it's far simpler than using a jd-client. We can look at this in further development, okay?
Force-pushed from 9551b7e to 0470129
|
Refactoring is done, everything squashed into a single commit. Here's what changed since last time:
V1 and V2 now live in their own task files (stratum_v1_task.c and stratum_v2_task.c), with a protocol coordinator handling all the fallback and recovery logic. The old stratum_task.c is gone.
Fixed the sv2_api.h naming mismatch; it's consistently sv2_protocol now.
Both primary and fallback pools support SV2 selection, and the UI only shows relevant options per protocol.
Heartbeat probing works for both V1 and V2 primaries, so liveness checks are covered regardless of protocol combo.
Also fixed a bunch of stability issues with protocol switching: the old implementation would crash or restart the device when switching between V1 and V2 at runtime. That's sorted now.
Extended channel support and GetBlockTemplate mining are left for follow-up PRs as discussed. Ready for review 🚀
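To make the fallback and recovery behaviour described in this comment easier to picture, here is a minimal sketch of a coordinator that switches between a primary and a fallback pool. It is written in Python for brevity, not in the firmware's C, and every name in it (`Coordinator`, `Event`, `Pool`) is hypothetical rather than taken from the PR. It only models two of the rules stated in the thread: heartbeat-driven recovery to the primary, and no auto-recovery when the user selected the fallback.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Event(Enum):
    PRIMARY_DOWN = auto()   # the active primary task exited or timed out
    HEARTBEAT_OK = auto()   # a probe reached the primary while on fallback


@dataclass
class Pool:
    name: str
    protocol: str  # "v1" or "v2"; any of the four combinations is allowed


class Coordinator:
    """Hypothetical model of the failover rules, not the PR's actual code."""

    def __init__(self, primary: Pool, fallback: Pool,
                 user_selected_fallback: bool = False):
        self.primary = primary
        self.fallback = fallback
        # A user-selected fallback disables auto-recovery to the primary.
        self.user_selected_fallback = user_selected_fallback
        self.active = fallback if user_selected_fallback else primary

    def handle(self, event: Event) -> Pool:
        if event is Event.PRIMARY_DOWN and self.active is self.primary:
            self.active = self.fallback
        elif (event is Event.HEARTBEAT_OK
              and self.active is self.fallback
              and not self.user_selected_fallback):
            self.active = self.primary
        return self.active
```

The sketch leaves out everything the real coordinator has to do around a switch (clearing queues, resetting share stats, waiting for the old task to exit).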
|
No need to do squash and force-push. This makes it harder to review incremental changes. |
|
I made a backup of the branch before squashing, should I revert? |
No, it's fine for this PR. Sometimes a squash and/or force push is necessary to fix merges, so no problem. Just for future reference, the code will be squashed into a single commit on merge anyway. |
|
Can you see if you can update your branch, it looks like some changes that were added to master have been added here as well. These might drop out if you update. |
Force-pushed from 0470129 to b3d3a46
Updated the branch, rebased onto latest master. The 3 duplicate commits dropped out. Should be a cleaner diff now. |
Add full Stratum V2 mining protocol support to the bitaxe, enabling encrypted communication with SV2 pools via Noise_NX handshake (secp256k1+EllSwift, ChaChaPoly, SHA256). Includes a robust protocol coordinator for clean failover between any combination of V1/V2 primary and fallback pools without device restarts.

Protocol implementation:
- SV2 binary protocol (components/stratum_v2/): SetupConnection, OpenStandardMiningChannel, NewMiningJob, SetNewPrevHash, SetTarget, SubmitShares with proper frame encoding/decoding
- Noise encryption (sv2_noise.c): Full NX handshake with optional authority public key verification (TOFU mode when unconfigured)
- libsecp256k1 v0.6.0 as git submodule for elliptic curve operations

Protocol coordinator and fallback:
- Non-blocking event-driven coordinator manages protocol task lifecycle
- Supports all 4 failover combinations: V2->V1, V2->V2, V1->V2, V1->V1
- Timer-based heartbeat probes primary pool during fallback operation
- User-selected fallback (dashboard toggle) disables auto-recovery
- Clean state transitions: queue clear, share stats reset, proper task shutdown with event synchronization

Key reliability fixes:
- Heap-allocate sv2_conn to prevent dangling pointer after task exit
- Dynamic protocol check in create_jobs_task (was cached at startup, causing memory corruption on protocol switch)
- Single event per task exit (was double-signaling coordinator)
- Remove esp_restart() from V1 task, notify coordinator instead
- Fix V1 transport handle leak (destroy after close)
- Remove close_connection race from asic_result_task

Frontend and configuration:
- NVS settings for SV2 authority pubkey and fallback pool protocol
- Pool settings UI: protocol selector and SV2 pubkey for both pools, V1-only options hidden when SV2 selected
- Display: hide block height and scriptsig in SV2 mode (not available in standard channel), show protocol indicator instead
- OpenAPI spec updated with new SV2 configuration fields
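As an illustration of the "frame encoding/decoding" mentioned in the commit message, the sketch below packs and unpacks the 6-byte Stratum V2 frame header (little-endian `extension_type` u16, `msg_type` u8, `msg_length` u24) around a payload. It is Python rather than the firmware's C, the function names are hypothetical, and it covers only plaintext framing; the Noise-encrypted transport used by the PR is not modelled here.

```python
import struct

SV2_HEADER_LEN = 6  # extension_type (u16) + msg_type (u8) + msg_length (u24)
MSG_SETUP_CONNECTION = 0x00  # message type of SetupConnection


def encode_frame(extension_type: int, msg_type: int, payload: bytes) -> bytes:
    """Prefix a payload with a little-endian SV2 frame header."""
    if len(payload) >= 1 << 24:
        raise ValueError("payload does not fit in a u24 length field")
    header = struct.pack("<HB", extension_type, msg_type)
    header += len(payload).to_bytes(3, "little")
    return header + payload


def decode_frame(buf: bytes):
    """Return (extension_type, msg_type, payload) for one complete frame."""
    if len(buf) < SV2_HEADER_LEN:
        raise ValueError("buffer shorter than the frame header")
    extension_type, msg_type = struct.unpack_from("<HB", buf, 0)
    length = int.from_bytes(buf[3:6], "little")
    if len(buf) < SV2_HEADER_LEN + length:
        raise ValueError("buffer shorter than the declared payload")
    return extension_type, msg_type, buf[SV2_HEADER_LEN:SV2_HEADER_LEN + length]
```

A receiver on a stream socket would typically read the 6 header bytes first, then read exactly `msg_length` more bytes before calling something like `decode_frame`.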
Force-pushed from b3d3a46 to 51787d9
coordinator_state_t and coordinator_event_t are only used inside protocol_coordinator.c, no need to expose them in the header.
|
Which parts of SV2 protocol does this implement, is it Mining protocol or JD as well? Does it include both extended and standard channels support? |
|
Currently no extended channels; that will be part of a next PR. And no JD-client, but I am building an easy-to-set-up full-stack deployment which you could run on a Raspberry Pi, or even on the node itself. It contains a JD-Client, Sjors' TP, and a Bitcoin Core setup built with --enable-multiprocess so that the IPC Unix socket connection method can be used:
|
Just confirmed the implementation also works against the original SV2 reference server:
|
Do you have a Max (BM1397) device to test against? The others are functionally equivalent with respect to job construction/asic comms. |
ntime rolling on a bitaxe seems quite backwards? |
|
Elaborating on the comment above: IIUC the BM1370 doesn't support version rolling, which is forcing you to roll ntime so you can stick with a standard channel, right?
"Allowed", but definitely not encouraged... rolling ntime can have bad consequences at the consensus level.
As a rule of thumb, ntime should only be increased when we want to reset the search space and we know we've been hashing for longer than 1s, and it should happen like this:
or at least some variation that respects ntime as something that progresses together with real time, and not something that rolls indefinitely into the past or future (with undesired consequences).
Overall, I think it's a very good idea to support Sv2 Standard Channels as a "first class citizen" on AxeOS, but for edge cases where ASICs cannot do version rolling, I would go with Extended Channels and not try to reinvent the wheel.
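The snippet the commenter referred to is not preserved on this page. As one possible variation on the rule of thumb (ntime progresses with real time and never runs ahead of it), here is a small Python sketch; the function and parameter names are hypothetical and are not taken from the PR or from the comment.

```python
def next_ntime(job_ntime: int, job_received_at: float, now: float) -> int:
    """Advance ntime only by whole seconds that have really elapsed.

    job_ntime:        ntime carried by the job as received from the pool
    job_received_at:  local monotonic timestamp (seconds) when the job arrived
    now:              current local monotonic timestamp (seconds)
    """
    elapsed = int(now - job_received_at)  # whole seconds only
    # Never move backwards, even if the local clock misbehaves.
    return job_ntime + max(elapsed, 0)
```

With this shape, sending a new job to the ASIC every 500ms does not change ntime until a full second has passed, and ntime can never drift further into the future than the time actually spent hashing on that job.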
Can you elaborate on this? From what I see in the codebase, the BM1370 does do hardware version rolling:
These are latest-gen Antminer S21 chips, so it would be surprising if they dropped version rolling support. The BM1397 is the only one where
That said, you're right that the ntime rolling approach is wrong regardless. Even if the ASIC does version roll, bumping ntime by +1 on every 500ms job send is not how ntime should be used. I'll fix that: either remove the offset entirely (since version rolling gives enough search space) or clamp it to real elapsed wall-clock time. What would you suggest as the right approach here for standard channels?
The BM1397 is the only ASIC where set_version_mask is a no-op: it doesn't do hardware version rolling. That means SV2 standard channels won't give it enough search space, even with just ntime rolling, which isn't a good solution (see discussion with plebhash above). I think there are two options here:
What do you think makes more sense? |
|
Excluding a chip from a protocol is a bit of a nasty dependency. Maybe adding Extended channels also puts it more in line with how SV1 currently works. I can't really gauge how much more work that is to support, though.
|
Yeah, I agree excluding a chip entirely from a protocol isn't great. I think the clean approach is:
The channel type decision is just a runtime check on the ASIC ID — use For this PR I'd scope it to standard channels only, which means BM1397 stays on SV1 for now. Extended channel support in a follow-up PR would unlock SV2 for the Max as well. |
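A sketch of the runtime check described above, in Python for illustration only. The names are hypothetical; the only facts taken from the thread are that the BM1397 has no hardware version rolling and therefore cannot use standard channels, and that without extended channel support it stays on SV1.

```python
# Per the thread: the BM1397 is the only chip without hardware version rolling.
ASICS_WITHOUT_VERSION_ROLLING = {"BM1397"}


def select_protocol_and_channel(asic_model: str, extended_supported: bool):
    """Return (protocol, channel_type) for a pool the user configured as SV2."""
    if asic_model not in ASICS_WITHOUT_VERSION_ROLLING:
        return ("sv2", "standard")
    if extended_supported:
        # Extended channels let the miner roll the extranonce instead.
        return ("sv2", "extended")
    # Scope of this PR as proposed: standard channels only, so stay on SV1.
    return ("sv1", None)
```

Keeping the decision in one function means a follow-up PR that adds extended channels only has to flip `extended_supported`, not touch the callers.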
|
I looked into the effort for Extended Channel support. The good news is most of the heavy lifting already exists — One option would be to drop Standard Channels entirely and go straight to Extended Channels. That way:
The tradeoff is that Extended Channels put more work on the miner (coinbase + merkle computation), but the ESP32 already does this for SV1 without issues. What would you suggest: Extended-only, or keep both with a runtime ASIC check? That's about a week more effort or so.
I'm not claiming this is true and I don't know whether BM1370 can do version rolling or not. I just got this understanding from the PR description.
|
@plebhash It's the next cheapest thing, aside from nonce in time
If you know what you're doing, yeah, it's possible. I wouldn't be surprised if there were other industry examples aside from BZM2.
But rolling ntime has plenty of consensus-related footguns while rolling in both directions, so I don't think it's a good idea to embed it as a first-class citizen feature in a FOSS project, mainly due to the unnecessary/avoidable added complexity and maintenance burden, especially given there are viable alternatives.
Assuming version rolling is available, isn't that equally cheap?
Anyway, since you mentioned 256foundation/mujina#28 (comment), I'll tag @johnny9 to get his own take on this.
|
ntime rolling is not bad and will be increasingly necessary, as it's clear to me that future chips will be doing it as well. Every added bit doubles the nonce space. We're talking about a few seconds' difference in the timestamp. This is not a problem for consensus. It can be configured, of course, but in the BZM2's case it will be needed to give the MCU room to process, as it will need the additional space to hit the hard real-time demands of the work generation. I would try to avoid rolling ntime on the MCU itself and only let the chips do it when necessary. We should have support for whatever SV2 needs to let us roll ntime.
Sv2 is mostly agnostic about that.
Jobs are broadcast with a
If clients break consensus by rolling too far into the future, their shares SHOULD be rejected (but this is just me saying, the spec doesn't even mention that AFAIR).
…s detection
- Add NVS config for SV2 channel type (extended/standard) per pool
- Expose sv2ChannelType and fallbackSv2ChannelType in /info endpoint
- Add channel type selector in pool settings with extended/standard radio buttons
- Disable standard channel option for BM1397 (requires extended channels)
- Add Mode label in dashboard Pool card showing active protocol
- Use hasCoinbaseVisibility() helper for conditional dashboard elements
- Fix extranonce_size interpretation: it is the miner's rollable portion, not total
- Add testnet/regtest network auto-detection from user address prefix
- Support parametric bech32 HRP (bc/tb/bcrt) and base58 version bytes
- Remove ntime rolling for SV2 standard channels (version rolling only)
- Remove info icons from SV2-related texts in pool configuration
- Add calculate_coinbase_tx_hash_bin() for binary coinbase hashing
- Add SV2 extended channel message types and parsing functions
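The "network auto-detection from user address prefix" item can be sketched roughly as follows. This is a Python illustration with hypothetical names, not the PR's C code, and it only inspects the prefix: it does not verify bech32 or base58 checksums. It relies on the standard prefixes (bech32 HRPs `bc`, `tb`, `bcrt`; base58 addresses starting with `1`/`3` on mainnet and `m`/`n`/`2` on testnet).

```python
BECH32_HRPS = {"bc": "mainnet", "tb": "testnet", "bcrt": "regtest"}


def detect_network(address: str) -> str:
    """Guess the Bitcoin network from an address prefix (no checksum check)."""
    lowered = address.lower()  # bech32 may be written all-uppercase
    for hrp, network in BECH32_HRPS.items():
        if lowered.startswith(hrp + "1"):
            return network
    # Base58 is case-sensitive, so inspect the original string here.
    if address[:1] in ("1", "3"):
        return "mainnet"
    if address[:1] in ("m", "n", "2"):
        # Regtest shares testnet's base58 version bytes, so a base58
        # address alone cannot distinguish the two.
        return "testnet"
    raise ValueError("unrecognised address prefix")
```

Because regtest and testnet base58 addresses look identical, only a bech32 `bcrt1...` address identifies regtest unambiguously in this sketch.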
OK, after some reflection I guess I can entertain that, thanks for clarifying.
We have 2h = 7200s to roll into the future before breaking consensus, and 2^12 = 4096, so if we limit ntime rolling to the 12 LSBs, we're risk-free on breaking consensus.
Interestingly, the Sv2 spec states that Standard Jobs are limited to 280 TH/s max before rolling ntime. If we lift restrictions on ntime rolling, we multiply that ceiling by 4096.
So OK, I retract what I said above.
@johnny9 how many ntime bits are people rolling in the industry?
The BZM2 does 7 bits max. |
Interesting, thanks. So assuming BIP320 version rolling and 7 bits of ntime rolling, Sv2 Standard Jobs have a ceiling of 280T * 128 = 35.8 PH/s.
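The arithmetic in the last few comments can be checked with a short script. It assumes the usual reading of the 280 TH/s figure: a standard job offers 32 nonce bits plus the 16 version bits reserved by BIP320, i.e. 2^48 distinct headers per ntime value, and one ntime value per second. The function name and the treatment of rolled ntime bits as a plain multiplier follow the thread's reasoning and are not taken from the spec.

```python
NONCE_BITS = 32
VERSION_ROLLING_BITS = 16   # BIP320 reserves 16 version bits for rolling
MAX_FUTURE_SECONDS = 2 * 60 * 60  # 7200 s forward limit on block time


def standard_job_ceiling(ntime_bits: int = 0) -> int:
    """Headers per second one standard job can supply (thread's reasoning)."""
    return 2 ** (NONCE_BITS + VERSION_ROLLING_BITS + ntime_bits)


def max_forward_offset(ntime_bits: int) -> int:
    """Largest forward ntime offset, assuming offsets are added to the job's ntime."""
    return 2 ** ntime_bits - 1


if __name__ == "__main__":
    print(standard_job_ceiling() / 1e12)    # TH/s with no ntime rolling
    print(standard_job_ceiling(7) / 1e15)   # PH/s with 7 ntime bits
    print(max_forward_offset(12) < MAX_FUTURE_SECONDS)
```

The exact value of 2^48 is about 281.5 TH/s, which the thread rounds to 280 TH/s; multiplying the rounded figure by 128 gives the 35.8 PH/s quoted above, while 2^55 is about 36.0 PH/s.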
|
For those unfamiliar, the network constraint for ntime is [median past time, consensus time + 7200s].
Interestingly, it's a growing constraint, so if we have no block for 1 hour our forward constraint grows to 3 hours.
sources
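The window described in this comment can be written as a simple predicate. The sketch below is Python with hypothetical names; it encodes the Bitcoin consensus rule that a block timestamp must be strictly greater than the median time past and no more than 7200 seconds ahead of network time. The `forward_room` helper is one interpretation of the "growing constraint" remark, measuring the forward bound relative to the previous block's timestamp.

```python
MAX_FUTURE_BLOCK_TIME = 2 * 60 * 60  # 7200 seconds


def ntime_in_consensus_window(ntime: int, median_time_past: int,
                              network_time: int) -> bool:
    """True if ntime is above MTP and at most 2 hours ahead of network time."""
    return median_time_past < ntime <= network_time + MAX_FUTURE_BLOCK_TIME


def forward_room(last_block_time: int, network_time: int) -> int:
    """Seconds between the last block's timestamp and the forward limit."""
    return network_time + MAX_FUTURE_BLOCK_TIME - last_block_time
```

For example, if the last block was found an hour ago, `forward_room` returns 3 hours' worth of seconds, which matches the figure in the comment.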







Summary
Adds Stratum V2 binary protocol support alongside the existing V1 JSON-RPC implementation. Tested on a BM1370 bitaxe against a local SRI server and the SRI reference pool — full 1.3 TH/s hashrate with shares accepted.
What's included:
What works:
Open decisions
Based on review feedback, the following items are being discussed before this PR is finalized:
1. ntime rolling removal — The current implementation rolls ntime to generate unique ASIC work. This is unnecessary since BM1366/BM1368/BM1370 all do hardware version rolling, which provides sufficient search space. ntime should track real time, not be used as a work diversifier. Will be removed.
2. Standard vs Extended Channels — Currently only standard channels are implemented. The BM1397 (Bitaxe Max) doesn't support hardware version rolling, so standard channels alone won't work for it. Options under discussion:
Test plan
Test servers
Public SV2 test server:
- Host: blitzpool-test.yourdevice.ch
- Port: 3333
- Authority pubkey: 9bCoFxTszKCuffyywH5uS5o6WcU4vsjTH2axxc7wE86y2HhvULU

SRI reference pool (confirmed working):
- Host: 75.119.150.111
- Port: 3333
- Authority pubkey: 9auqWEzQDVyd2oe1JVGFLMLHZtCo2FFqZwtKA5gd9xbuEu7PH72

For transparency: most of this implementation was done with the help of Claude (Opus 4.6). I hope that doesn't detract from the goal of bringing SV2 support to bitaxe.