
Conversation

@otrho (Contributor) commented Nov 11, 2025

In memory the low (least significant) 32-bit limb is now loaded or stored from/to addr and the high 32-bit limb from/to addr+1. On the stack the limb order is still big-endian.

For testing, the `ToMidenRepr` array of felts is now little-endian. `FromMidenRepr` is still a big-endian array of felts, as per the stack.

For testing, values are now placed in memory explicitly in little-endian limb order and are limb-swapped upon return. Changes should be made to the miden-debug crate to also support little-endian memory.
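
For concreteness, a minimal Rust sketch of the layout described above (hypothetical helper, not the compiler's actual code): limbs little-endian in memory, big-endian on the stack.

```rust
/// Hypothetical illustration of the layout above, not compiler code:
/// a u64 splits into two 32-bit limbs, each held as a field element.
fn split_limbs(v: u64) -> (u32, u32) {
    let lo = (v & 0xFFFF_FFFF) as u32; // least significant limb
    let hi = (v >> 32) as u32;         // most significant limb
    (lo, hi)
}

fn main() {
    let (lo, hi) = split_limbs(0x1122_3344_5566_7788);

    // In memory (this PR): little-endian limb order.
    //   mem[addr]     = lo
    //   mem[addr + 1] = hi
    let memory = [lo, hi];

    // On the operand stack: still big-endian limb order (high limb on top).
    let stack_top_down = [hi, lo];

    assert_eq!(memory, [0x5566_7788, 0x1122_3344]);
    assert_eq!(stack_top_down, [0x1122_3344, 0x5566_7788]);
}
```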

@otrho requested review from bitwalker and greenhat on November 11, 2025 at 00:38
@otrho self-assigned this on Nov 11, 2025
@otrho (Contributor, Author) commented Nov 11, 2025

OK, it's failing on the byte-array round-trip test. (I don't know why I didn't notice this before pushing, is it new?)

This is also part of why this branch was so fiddly. To keep the tests pretty much as-is while supporting little-endian memory, I changed `ToMidenRepr` to be little-endian, as mentioned above: it expects arrays of felts in little-endian order.

But `FromMidenRepr` is unchanged, since values are big-endian on the stack. That asymmetry is what fails this round-trip test: you can see in the failure that the 4-byte groups are swapped, per the limb reversal.
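
Roughly, the failure mode looks like this (a hedged Rust model with made-up function names, not the real `ToMidenRepr`/`FromMidenRepr` code):

```rust
// Hypothetical model of the asymmetry, not the real trait implementations.

// What the modified ToMidenRepr does: emit limbs in little-endian order.
fn to_limbs_le(v: u64) -> [u32; 2] {
    [(v & 0xFFFF_FFFF) as u32, (v >> 32) as u32]
}

// What the unchanged FromMidenRepr does: read limbs as big-endian.
fn from_limbs_be(limbs: [u32; 2]) -> u64 {
    ((limbs[0] as u64) << 32) | limbs[1] as u64
}

fn main() {
    let v: u64 = 0x1122_3344_5566_7788;
    // The two 4-byte groups come back swapped, matching the test failure.
    assert_eq!(from_limbs_be(to_limbs_le(v)), 0x5566_7788_1122_3344);
}
```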

@otrho (Contributor, Author) commented Nov 11, 2025

And so the change in aeef988 fudges the test to acknowledge the asymmetry.

This smells a bit crap.

@otrho (Contributor, Author) commented Nov 12, 2025

OK, now after #681 the changes I've added to `ToMidenRepr` and `FromMidenRepr` need to go in the miden-debug crate. I have no idea at this stage whether they would break things there, and/or whether we should avoid using them for tests altogether.

@bitwalker (Collaborator) left a comment

I think we might be introducing a new problem while appearing to fix a different one here, see my other comment for details.

Also, apologies for the hassle, but you'll want to cherry-pick your commit onto a fresh pull of `next` - we had to do some git surgery that rewrote the history of `next`, so those commits no longer exist, and that's why your PR has those other commits of ours.

mem_load # [e1, addr]
# load first element
swap.1 mem_load # [e0, e1]
swap.1 drop # [addr]
Collaborator:

This appears to change the order of the bytes for all double-word loads. I think we should constrain this behavior to only loads of u64/i64, since only that type gets special treatment in terms of its operand stack layout vs heap layout. We assume pretty pervasively that the layout of any type on the operand stack is in little-endian order, and that any special-case types are handled when we emit loads/stores of that type.

AIUI, the current behavior of `std::math::u64` is: big-endian limb order on the operand stack, little-endian order on the heap; but our backend assumes that the big-endian order holds for both storage locations, when that isn't true. I think the right fix here is to specifically address that bug, rather than change the underlying `load_dw`/`store_dw` intrinsics.
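
To spell out the mismatch I mean, a sketch of the assumed behavior (not verified against `std::math::u64` itself):

```rust
// Sketch of the assumed mismatch, not verified against std::math::u64.
fn main() {
    let v: u64 = 0x1122_3344_5566_7788;
    let (lo, hi) = ((v & 0xFFFF_FFFF) as u32, (v >> 32) as u32);

    // Heap layout std::math::u64 reportedly expects: [lo, hi].
    let heap_le = [lo, hi];

    // Heap layout the backend has been assuming: [hi, lo].
    let heap_assumed_be = [hi, lo];

    // The conventions disagree, so a value written under one and read
    // under the other comes back with its limbs swapped.
    assert_ne!(heap_le, heap_assumed_be);
}
```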

Let me know if there is something I'm missing here, but I believe this PR would actually introduce a new bug, just one that we don't likely hit due to limited usage of these intrinsics currently.

Contributor Author:

So codegen for hir.load : i64 should still emit an unchanged/big-endian intrinsics::mem::load_dw but then swap the limbs afterwards?

Contributor Author:

Hrmm, except at the moment the only place load_dw is used in the compiler is to lower 64-bit hir.load.

Collaborator:

> So codegen for hir.load : i64 should still emit an unchanged/big-endian intrinsics::mem::load_dw but then swap the limbs afterwards?

Yes, basically. Or we could have an intrinsic for loading u64 values specifically, one that loads them onto the operand stack in big-endian limb order. The main thing is that I would prefer `load_dw` and friends to maintain the little-endian byte-ordering semantics they've had up until now, as that is consistent with the rest of our memory model.

The idea behind `load_dw` (and `load_qw`, etc.) is to facilitate loads in multiples of the 4-byte word (so a double-word here is an 8-byte load, and so on).

At the moment, we don't have loads of structs implemented in the backend, because our Rust compilation pipeline doesn't contain struct types (they've been converted to loads of some integral type). However, if one were to lower to HIR directly, we'd need to implement load_struct/store_struct in the backend using the load_* intrinsics as primitives. It's possible there are cases where we will need this even for Rust-compiled programs in the near future, but we've mostly been punting on this for now as it hasn't been needed.

In any case, the "correct" way for us to lay out std::math::u64 would be to just use load_dw to get the limbs on the operand stack in little-endian order. The problem with just using load_dw though, as you know, is that Wasm wants the limbs on the operand stack in big-endian order, so we either need to reverse the limbs after load_dw, or implement a u64-specific pair of load/store intrinsics.
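
Sketched out, the first option would look roughly like this (a Rust model with made-up names, just showing the limb movement, not actual codegen):

```rust
// Rough model of "load_dw stays little-endian, swap limbs for u64 loads".
// Names are hypothetical; this only models the limb movement.

// Models intrinsics::mem::load_dw with little-endian limb semantics.
fn load_dw_le(heap: &[u32], addr: usize) -> [u32; 2] {
    [heap[addr], heap[addr + 1]] // [lo, hi]
}

// Lowering of a 64-bit hir.load: call the intrinsic, then reverse the
// limbs so the operand stack sees big-endian limb order, as Wasm expects.
fn load_u64_be_for_stack(heap: &[u32], addr: usize) -> [u32; 2] {
    let [lo, hi] = load_dw_le(heap, addr);
    [hi, lo]
}

fn main() {
    // Heap holds 0x1122_3344_5566_7788 in little-endian limb order.
    let heap = [0x5566_7788u32, 0x1122_3344];
    assert_eq!(load_u64_be_for_stack(&heap, 0), [0x1122_3344, 0x5566_7788]);
}
```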

Contributor Author:

I still don't quite get what you mean here.

AFAICT all reads from memory have been little-endian byte-wise, and big-endian limb-wise, and any multi-limbed values have always been big-endian on the working stack, for Wasm and for MidenVM.

I've added a load_unaligned_u64() test to this PR now, which loads 64-bit words as little-endian byte-wise and limb-wise, with the result big-endian on the stack (MSB on the top).

Is this not the behaviour we want?

The reason I even need this change is that Wasm is loading a struct with two 32-bit members using a single 64-bit read, alluding to the load_struct stuff you mention.
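
For the record, the shape of what Wasm is doing is roughly this (hypothetical struct, just to illustrate the single 64-bit read, not the actual test case):

```rust
// Hypothetical struct illustrating the single 64-bit read.
#[repr(C)]
#[derive(Clone, Copy, PartialEq, Debug)]
struct Pair {
    a: u32,
    b: u32,
}

fn main() {
    let p = Pair { a: 0x5566_7788, b: 0x1122_3344 };

    // On a little-endian target, reading the struct as one 64-bit value
    // puts `a` in the low half and `b` in the high half, which is how the
    // Wasm frontend ends up issuing a single i64 load for it.
    let as_u64 = unsafe { std::mem::transmute::<Pair, u64>(p) };
    assert_eq!(as_u64, 0x1122_3344_5566_7788);

    // Round-tripping through the 64-bit view has to preserve both fields,
    // which is why the byte order of that 64-bit load matters.
    let back = unsafe { std::mem::transmute::<u64, Pair>(as_u64) };
    assert_eq!(back, p);
}
```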

@bitwalker (Collaborator):

By the way, feel free to open PRs against miden-debug with any changes there, and I can get those released ASAP

@otrho force-pushed the otrho/little-endian-mem branch 3 times, most recently from 1ba442a to 91d94b6 on November 26, 2025 at 03:53
@otrho force-pushed the otrho/little-endian-mem branch from 91d94b6 to 0b9f6f2 on December 11, 2025 at 03:13
…little-endian.

In memory the low (least significant) 32-bit limb is now loaded or
stored from/to `addr` and the high 32-bit limb from/to `addr+1`.  On the
stack the limb order is still big-endian.
@otrho force-pushed the otrho/little-endian-mem branch from 0b9f6f2 to 06f699f on December 11, 2025 at 04:24
@otrho (Contributor, Author) commented Dec 11, 2025

Ping @bitwalker.
