Change the memory representation of 64-bit, dual-limbed values to be little-endian. #765
Conversation
OK, it's failing on the byte-array round-trip test. (I don't know why I didn't notice this before pushing; is it new?) This is also part of why this branch was so fiddly: to keep the tests pretty much as-is while supporting little-endian memory, I changed the `ToMidenRepr` array of felts to be little-endian, but `FromMidenRepr` is still a big-endian array of felts, as per the stack.
And so the change in aeef988 fudges the test to acknowledge the asymmetry. This smells a bit crap.
OK, now after #681 the changes I've added to …
Force-pushed from aeef988 to 81666c1.
bitwalker left a comment:
I think we might be introducing a new problem while appearing to fix a different one here; see my other comment for details.

Also, apologies for the hassle, but you'll want to cherry-pick your commit onto a fresh pull of `next` - we had to do some git surgery that rewrote the history of `next`, so those commits no longer exist, and that's why your PR has those other commits of ours.
codegen/masm/intrinsics/mem.masm (Outdated)

    mem_load          # [e1, addr]
    # load first element
    swap.1 mem_load   # [e0, e1]
    swap.1 drop       # [addr]
This appears to change the order of the bytes for all double-word loads. I think we should constrain this behavior to only loads of u64/i64, since only that type gets special treatment in terms of its operand stack layout vs heap layout. We assume pretty pervasively that the layout of any type on the operand stack is in little-endian order, and that any special-case types are handled when we emit loads/stores of that type.

AIUI, the current behavior of `std::math::u64` is: big-endian limb order on the operand stack, little-endian order on the heap; but our backend assumes that the big-endian order holds for both storage locations, when that isn't true. I think the right fix here is to specifically address that bug, rather than change the underlying `load_dw`/`store_dw` intrinsics.

Let me know if there is something I'm missing here, but I believe this PR would actually introduce a new bug, just one that we likely don't hit due to limited usage of these intrinsics currently.
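For concreteness, a rough sketch (not the backend's actual emission) of what that constraint could look like at the i64/u64 lowering site, assuming `load_dw` keeps its current semantics and the limb order is reversed only for this one type:

```
# hypothetical lowering of a 64-bit hir.load, with load_dw left unchanged
exec.intrinsics::mem::load_dw   # limbs in load_dw's native order
swap.1                          # reverse limb order for the u64/i64 special case only
```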
So codegen for `hir.load : i64` should still emit the unchanged (big-endian) `intrinsics::mem::load_dw`, but then swap the limbs afterwards?
Hrmm, except at the moment the only place `load_dw` is used in the compiler is to lower 64-bit `hir.load`.
> So codegen for `hir.load : i64` should still emit the unchanged (big-endian) `intrinsics::mem::load_dw`, but then swap the limbs afterwards?
Yes, basically. Or we could have an intrinsic for loading u64 values specifically, one that loads them onto the operand stack in big-endian limb order - the main thing is that I would prefer `load_dw` and friends to maintain the little-endian byte-ordering semantics that they've had up until now, as that is consistent with the rest of our memory model.

The idea behind `load_dw` (and `load_qw`, etc.) is to facilitate loading 4-byte words (so a double-word here is an 8-byte load, and so on).

At the moment, we don't have loads of structs implemented in the backend, because our Rust compilation pipeline doesn't contain struct types (they've been converted to loads of some integral type). However, if one was to lower to HIR directly, we'd need to implement `load_struct`/`store_struct` in the backend using the `load_*` intrinsics as primitives. It's possible there are cases where we will need this even for Rust-compiled programs in the near future, but we've mostly been punting on this for now as it hasn't been needed.

In any case, the "correct" way for us to lay out `std::math::u64` would be to just use `load_dw` to get the limbs on the operand stack in little-endian order. The problem with just using `load_dw`, though, as you know, is that Wasm wants the limbs on the operand stack in big-endian order, so we either need to reverse the limbs after `load_dw`, or implement a u64-specific pair of load/store intrinsics.
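To illustrate the second option, here is a minimal sketch of what a u64-specific load intrinsic might look like, assuming the low limb sits at `addr` and the high limb at `addr+1`; the procedure name is hypothetical, and the stack comments show top-of-stack on the left:

```
# Hypothetical intrinsic: load a u64 onto the stack in big-endian limb order.
# [addr] => [hi, lo]
export.load_u64
    dup.0 add.1 mem_load  # load the high limb from addr+1 => [hi, addr]
    swap.1 mem_load       # load the low limb from addr    => [lo, hi]
    swap.1                # big-endian limb order for u64  => [hi, lo]
end
```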
I still don't quite get what you mean here.

AFAICT all reads from memory have been little-endian byte-wise and big-endian limb-wise, and any multi-limbed values have always been big-endian on the working stack, for Wasm and for MidenVM.

I've added a `load_unaligned_u64()` test to this PR now, which loads 64-bit words as little-endian byte-wise and limb-wise, with the result big-endian on the stack (MSB on the top).

Is this not the behaviour we want?

The reason I even need this change is that Wasm is loading a struct with two 32-bit members using a single 64-bit read, alluding to the `load_struct` stuff you mention.
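As a concrete illustration of that convention (the value, address, and limb contents here are invented for the example):

```
# Invented example: the u64 0xAABBCCDD11223344 under the layout described above.
push.0x11223344 mem_store.100   # low limb  -> addr   (each limb little-endian byte-wise)
push.0xAABBCCDD mem_store.101   # high limb -> addr+1

push.101 mem_load               # => [0xAABBCCDD]
push.100 mem_load               # => [0x11223344, 0xAABBCCDD]
swap.1                          # => [0xAABBCCDD, 0x11223344]  big-endian on the stack, MSB on top
```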
By the way, feel free to open PRs against …
Force-pushed from 1ba442a to 91d94b6.
Force-pushed from 91d94b6 to 0b9f6f2.
…little-endian. In memory the low (least significant) 32-bit limb is now loaded or stored from/to `addr` and the high 32-bit limb from/to `addr+1`. On the stack the limb order is still big-endian.
Force-pushed from 0b9f6f2 to 06f699f.
Ping @bitwalker.
In memory, the low (least significant) 32-bit limb is now loaded or stored from/to `addr` and the high 32-bit limb from/to `addr+1`. On the stack, the limb order is still big-endian.

For testing, the `ToMidenRepr` array of felts is now little-endian. The `FromMidenRepr` is still a big-endian array of felts, as per the stack. For testing, the values are now placed in memory explicitly as little-endian and are limb-swapped upon return. Changes should be made to the `miden-debug` crate to also support little-endian memory.
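For reference, a rough sketch of the store direction under this layout; the procedure name is hypothetical, and the assumed stack effect (`[addr, hi, lo]`, top-of-stack on the left) is an illustration rather than the PR's actual code:

```
# Hypothetical store of a u64 held big-endian on the stack: [addr, hi, lo] => []
export.store_u64
    swap.1 movdn.2   # => [addr, lo, hi]
    dup.0 movdn.3    # => [addr, lo, hi, addr]
    mem_store        # low limb  -> addr    => [hi, addr]
    swap.1 add.1     # => [addr+1, hi]
    mem_store        # high limb -> addr+1  => []
end
```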