@hoffmang9 hoffmang9 commented Feb 5, 2026

Summary

  • Adds macOS Apple Silicon (arm64) support for building the hardware VDF client binaries (hw_vdf_client, hw_test, plus emu variants).
  • Extends the HW build workflow to run on macos-13-arm64, downloading FTDI’s LibFT4222 package and wiring headers + dylibs into src/hw/libft4222.
  • Updates the build system so HW binaries can locate the FTDI dylibs at runtime via an @executable_path rpath.
  • Documents macOS setup steps in README_ASIC.md.

Notes

  • This PR uses LibFT4222-mac-v1.4.4.190 because it ships universal dylibs including arm64.
  • Includes small macOS compatibility fixes in hw_vdf_client (socket address initialization + portable uint64_t formatting).
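
The diff for these two compatibility fixes is not shown in this conversation. As a minimal sketch of what such fixes commonly look like, assuming POSIX socket headers: `format_iters` and `make_loopback_addr` are hypothetical names for illustration, not functions from hw_vdf_client.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>
#include <netinet/in.h>
#include <arpa/inet.h>

// Portable uint64_t formatting: "%lu" and "%llu" disagree across platforms
// (uint64_t is unsigned long on Linux x86-64 but unsigned long long on macOS),
// so PRIu64 selects the right conversion for the target.
std::string format_iters(std::uint64_t iters) {
    char buf[32];
    int n = std::snprintf(buf, sizeof buf, "%" PRIu64, iters);
    assert(n > 0 && static_cast<std::size_t>(n) < sizeof buf);
    return buf;
}

// Socket address initialization: value-initialize the struct so every member
// is zeroed, including BSD-only members such as sin_len that Linux lacks.
sockaddr_in make_loopback_addr(std::uint16_t port) {
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    return addr;
}
```

Value-initialization avoids listing platform-specific members by name, which is why it tends to port cleanly between Linux and macOS.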

Test plan

  • CI: verify the HW Build (macOS arm64) job builds and uploads artifacts.
  • Local (macOS arm64): download/mount LibFT4222, copy into src/hw/libft4222, then:
./scripts/get-libft4222.sh install
cd src
make -f Makefile.vdf-client hw_test hw_vdf_client emu_hw_test emu_hw_vdf_client
./hw_vdf_client --list

Note

Medium Risk
Moderate risk due to broad build-system and low-level portability changes (CMake/Makefile, asm generation, allocator and Windows ABI/socket handling) that could affect cross-platform builds and runtime behavior even though core algorithms are largely unchanged.

Overview
Adds cross-platform CI builds for the hardware VDF client: new GitHub Actions jobs build hw_vdf_client/hw_test (and emu variants) on macOS arm64 and Windows, upload them as artifacts, and run smoke tests. The main test workflow now also runs on Windows (with sanitizer exclusions) using a CMake/Ninja + clang-cl build.

Introduces helper scripts (get-libft4222.sh and get-libft4222.ps1) to download/stage FT4222H driver headers/libs and updates docs/ignore rules accordingly. Build system changes expand src/CMakeLists.txt with toggles for building client/bench/tests/HW tools and add Windows/macOS compatibility (Boost/MPIR handling, rpaths, calling convention/asm tweaks, aligned alloc, logging/format fixes). Shared parsing streams become thread_local, and AVX flag detection moves to atomics with env-controlled behavior.
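
The thread_local stream and atomic AVX flag changes are only summarized here, so the following is a rough sketch of the general pattern rather than the PR's code. The function names, the `EXAMPLE_DISABLE_AVX` environment variable, and the `probe_avx_hw` stub are all assumptions made for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstdlib>
#include <sstream>
#include <string>

// One stream per thread: no locking and no cross-thread state corruption,
// at the cost of resetting the stream before each use.
long parse_long(const std::string& text) {
    thread_local std::istringstream in;
    in.clear();    // drop eof/fail bits left by the previous parse
    in.str(text);  // replace the buffer contents
    long value = 0;
    in >> value;
    return value;
}

// Hypothetical stand-in for a real CPUID-based probe.
bool probe_avx_hw() { return true; }

// -1 = not yet decided, 0 = disabled, 1 = enabled.
std::atomic<int> g_avx_state{-1};

bool avx_enabled() {
    int s = g_avx_state.load(std::memory_order_acquire);
    if (s < 0) {
        // Assumed override: EXAMPLE_DISABLE_AVX=1 forces the non-AVX path.
        const char* off = std::getenv("EXAMPLE_DISABLE_AVX");
        bool disabled = (off != nullptr && off[0] == '1');
        s = (!disabled && probe_avx_hw()) ? 1 : 0;
        assert(s == 0 || s == 1);
        g_avx_state.store(s, std::memory_order_release);
    }
    return s == 1;
}
```

Two threads may both run the detection on first use, but they compute the same value, so the race is benign and the atomic only has to make the cached result visible.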

Written by Cursor Bugbot for commit 41b1369. This will update automatically on new commits.

@hoffmang9 hoffmang9 changed the base branch from main to nudupl-arm64-ci February 5, 2026 07:58
@hoffmang9
Member Author

hoffmang@MacBook-Air-2025-07 src % ./hw_vdf_client --list
List of available devices:
Device 0: 'Chia VDF A', loc 17

hoffmang@MacBook-Air-2025-07 src % ./hw_test
2026-02-05T16:47:42.944 Setting frequency to 1100.000000 MHz
2026-02-05T16:47:43.066 Frequency is 1100.000000 MHz
2026-02-05T16:47:43.068 Board voltage is 0.849 V
2026-02-05T16:47:43.068 Setting voltage to 0.880 V
2026-02-05T16:47:43.069 Board voltage is now 0.875 V
2026-02-05T16:47:43.083 Board current is 0.590 A
2026-02-05T16:47:43.096 Board power is 0.516 W
2026-02-05T16:47:43.099 VDF 0: Allocating intermediate values, total 2 * 4096
2026-02-05T16:47:43.100 VDF 1: Allocating intermediate values, total 2 * 4096
2026-02-05T16:47:43.101 VDF 2: Allocating intermediate values, total 2 * 4096
2026-02-05T16:47:43.101 ASIC Temp = 27.08 C; Frequency = 1100.0 MHz; freq_idx = 7634
2026-02-05T16:47:44.500
2026-02-05T16:47:44.500 VDF 0: 1000000 HW iters done in 1s, HW speed: 714040 ips
2026-02-05T16:47:44.500 VDF 0: 152879 SW iters done in 0s, SW speed: 247521 ips
2026-02-05T16:47:44.500
2026-02-05T16:47:44.500
2026-02-05T16:47:44.500 VDF 1: 1000000 HW iters done in 1s, HW speed: 714364 ips
2026-02-05T16:47:44.500 VDF 1: 151803 SW iters done in 0s, SW speed: 244842 ips
2026-02-05T16:47:44.500
2026-02-05T16:47:44.500
2026-02-05T16:47:44.500 VDF 2: 1000000 HW iters done in 1s, HW speed: 714474 ips
2026-02-05T16:47:44.500 VDF 2: 150589 SW iters done in 0s, SW speed: 243553 ips
2026-02-05T16:47:44.500
2026-02-05T16:47:45.239 VDF 0: Proof done for iters=1000000, length=1000000 in 0.739s
2026-02-05T16:47:45.257 VDF 1: Proof done for iters=1000000, length=1000000 in 0.757s [checkpoint]
2026-02-05T16:47:45.269 VDF 2: Proof done for iters=1000000, length=1000000 in 0.769s [checkpoint]

@hoffmang9 hoffmang9 added the enhancement (New feature or request) label Feb 6, 2026
Base automatically changed from nudupl-arm64-ci to main February 7, 2026 01:54
@hoffmang9 hoffmang9 changed the title from "macOS arm64: build HW VDF client with FT4222H drivers (CI + docs)" to "macOS arm64 and Windows: build HW VDF client with FT4222H drivers (CI + docs)" Feb 8, 2026
res_string.resize(mpz_sizeinbase(impl, 10));
mpz_get_str(&(res_string[0]), 10, impl);
return res_string.c_str();
}

Buffer overflow in to_string_dec for negative integers

Medium Severity

to_string_dec() allocates mpz_sizeinbase(impl, 10) bytes, which per GMP docs returns the digit count ignoring the sign. For negative numbers (like discriminants), mpz_get_str writes a leading - sign plus digits plus a null terminator, overflowing the buffer by at least one byte. The to_string() method in the same file correctly adds + 2 extra bytes for this reason.
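
GMP documents the buffer needed by mpz_get_str as mpz_sizeinbase(op, base) + 2: one byte for a possible minus sign and one for the null terminator. The sizing rule can be sketched without GMP as follows. Here `long long` stands in for `mpz_t`, `digit_count` is a hypothetical stand-in for mpz_sizeinbase, and `snprintf` plays the role of mpz_get_str. This is an illustration of the rule, not the fix applied in the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Stand-in for mpz_sizeinbase(x, 10): decimal digits of |v|, sign ignored.
std::size_t digit_count(long long v) {
    unsigned long long mag = v < 0 ? 0ULL - static_cast<unsigned long long>(v)
                                   : static_cast<unsigned long long>(v);
    std::size_t n = 1;
    while (mag >= 10) {
        mag /= 10;
        ++n;
    }
    return n;
}

// Bytes a C-style writer needs: digits + 1 for '-' + 1 for '\0'.
std::size_t dec_buffer_size(long long v) { return digit_count(v) + 2; }

std::string to_string_dec_sketch(long long v) {
    std::string buf(dec_buffer_size(v), '\0');
    int written = std::snprintf(&buf[0], buf.size(), "%lld", v);
    assert(written > 0 && static_cast<std::size_t>(written) < buf.size());
    buf.resize(static_cast<std::size_t>(written));  // trim unused slack
    return buf;  // returned by value, so no pointer outlives the string
}
```

Trimming after the write matters because the digit count may overestimate by one, and the sign slot is unused for non-negative values.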

{
init_gmp();
fesetround(FE_TOWARDZERO);
}

Newly created vdf_base_hw.cpp is unreferenced dead code

Medium Severity

vdf_base_hw.cpp is a newly created lightweight file providing VdfBaseInit without pulling in heavy VDF headers. However, it's never referenced by the CMake or Makefile build systems. The CMake HW build at line 412 uses vdf_base.cpp (which pulls in verifier.h, prover_slow.h, etc.) instead. This file appears intended to replace vdf_base.cpp for HW targets but was never wired up, leaving it as dead code and forcing an unnecessarily heavy include chain.

reduce_form(qf2.a, qf2.b, qf2.c);
#else
reducer.reduce(qf2);
#endif

Redundant form reduction in Windows emulator squaring loop

Low Severity

On Windows, the emulator calls square(st->qf) which internally calls form::reduce() before returning the result. Then reduce_form(qf2.a, qf2.b, qf2.c) is called again on the already-reduced form. The second reduction is a no-op that adds unnecessary work on every iteration of the hot emulation loop. Only one of the two reduction steps is needed.
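
The claim that reducing an already-reduced form changes nothing can be checked on small numbers. Below is a textbook reduction of positive definite binary quadratic forms using `long long` coefficients. It is a sketch for illustration only: `Form`, `normalize`, and `reduce` here are assumed names and do not reproduce the repository's `reduce_form` or `reducer.reduce`.

```cpp
#include <cassert>
#include <tuple>

// a*x^2 + b*x*y + c*y^2 with a > 0 and b^2 - 4ac < 0.
struct Form { long long a, b, c; };

bool operator==(const Form& x, const Form& y) {
    return std::tie(x.a, x.b, x.c) == std::tie(y.a, y.b, y.c);
}

// Floor division for a positive divisor (C++ '/' truncates toward zero).
long long floor_div(long long n, long long d) {
    long long q = n / d;
    return (n % d != 0 && n < 0) ? q - 1 : q;
}

// Shift b into the range -a < b <= a; the discriminant is unchanged.
Form normalize(Form f) {
    long long r = floor_div(f.a - f.b, 2 * f.a);
    return {f.a, f.b + 2 * r * f.a, f.a * r * r + f.b * r + f.c};
}

bool is_reduced(const Form& f) {
    return -f.a < f.b && f.b <= f.a &&
           (f.a < f.c || (f.a == f.c && f.b >= 0));
}

Form reduce(Form f) {
    assert(f.a > 0);
    f = normalize(f);
    // Swap-and-shift until a <= c (with b >= 0 when a == c).
    while (f.a > f.c || (f.a == f.c && f.b < 0)) {
        long long s = floor_div(f.c + f.b, 2 * f.c);
        f = {f.c, -f.b + 2 * s * f.c, f.c * s * s - f.b * s + f.a};
    }
    return normalize(f);
}
```

For a reduced input the loop condition is false on entry and both `normalize` calls compute r = 0, so the second pass costs a few comparisons and divisions but returns the same form.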

qf2 = square(st->qf);
#else
nudupl_form(qf2, st->qf, st->d, st->l);
#endif

Windows emulator uses different squaring algorithm than production

Medium Severity

On Windows, the emulator substitutes square() for nudupl_form() in the VDF squaring loop. While mathematically equivalent, using a fundamentally different algorithm in the emulator than what the real HW verification path expects undermines the emulator's purpose of faithfully simulating hardware behavior. The extensive check_valid diagnostics and bad-form logging added around this call confirm the substitution is producing unexpected results.

Potentially revert the 2weso bug

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

reduce_form(qf2.a, qf2.b, qf2.c);
#else
reducer.reduce(qf2);
#endif

Windows emulator double-reduces forms after squaring

Low Severity

On Windows, the emulator calls square(st->qf) which internally calls res.reduce() before returning, then immediately calls reduce_form(qf2.a, qf2.b, qf2.c) again on the already-reduced result. The second reduction is redundant since reducing an already-reduced form is a no-op. The non-Windows path correctly calls nudupl_form (which does not reduce) followed by a single reducer.reduce(qf2).
