Add support for C26 atomic reductions (without compiler mappings)#985

Open
ThomasHaas wants to merge 7 commits into development from atomic-modify-write

@ThomasHaas
Collaborator

@ThomasHaas ThomasHaas commented Feb 12, 2026

  • I added a header c26.h with the new atomic reduction operations of C26.
  • I implemented all of them. min/max are supported now, but only the signed versions!
  • There is also some support for C litmus style versions. @hernanponcedeleon added them and I don't know how well they work right now.

These atomics generate an RMW pair of events just like a standard fetch_op atomic, but add the Noreturn tag to both of them (naming follows LKMM's non-returning atomics).
There is no compilation scheme to hardware targets yet, so code has to be verified with --target=c11 (default).
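
As a rough illustration of the event shape described above, here is a minimal, self-contained sketch. The `SketchEvent` class, the tag strings, and `lowerReduction` are hypothetical names chosen for this example and are not the tool's actual API. It only shows that a reduction yields a load/store pair, that both events carry a NORETURN tag, and that only the store carries the memory order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified event model (assumption, not the real classes).
final class SketchEvent {
    final String kind;                              // "LOAD" or "STORE"
    final Set<String> tags = new LinkedHashSet<>();

    SketchEvent(String kind, String... tags) {
        this.kind = kind;
        Collections.addAll(this.tags, tags);
    }
}

final class ReductionSketch {
    // Lowers a non-returning reduction with memory order `mo` into an RMW pair.
    // Both events are tagged NORETURN; only the store carries `mo`.
    static List<SketchEvent> lowerReduction(String mo) {
        SketchEvent load = new SketchEvent("LOAD", "RMW", "ATOMIC", "NORETURN");
        SketchEvent store = new SketchEvent("STORE", "RMW", "ATOMIC", "NORETURN", mo);
        List<SketchEvent> events = new ArrayList<>();
        events.add(load);
        events.add(store);
        return events;
    }

    public static void main(String[] args) {
        for (SketchEvent e : lowerReduction("RLX")) {
            System.out.println(e.kind + " " + e.tags);
        }
    }
}
```

A memory model could then distinguish reductions from fetch_op atomics by checking for the NORETURN tag on either event of the pair.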

What needs to be done is to relax the memory models of interest: right now atomic_op and atomic_fetch_op provide the same synchronization semantics. EDIT: Although the memory models should probably be adapted, the fact that we currently model the load part of atomic_store_op as a plain load (not even relaxed) makes it weaker than an atomic_fetch_op in terms of ordering.

@hernanponcedeleon
Owner

@graymalkin this branch should have everything you need to play around with the model

@ThomasHaas
Collaborator Author

FYI, atomic_store_min/max are always the signed versions for now.

@graymalkin

Thanks, I'll check it out!

@ThomasHaas ThomasHaas changed the title [DRAFT] Add support for C26 atomic reductions Add support for C26 atomic reductions (without compilation) Feb 18, 2026
@ThomasHaas ThomasHaas changed the title Add support for C26 atomic reductions (without compilation) Add support for C26 atomic reductions (without compiler mappings) Feb 18, 2026
@hernanponcedeleon
Owner

Code-wise I think this one is ready to merge. I will wait a few days to see if @graymalkin or @gonzalobg have comments about the memory model part (especially if it makes sense to mark the read part of the reduction as atomic) or @mmalcomson reports any issues when trying the code.

@ThomasHaas
Collaborator Author

With #986 merged, we could in principle add compiler mappings for atomic reductions to armv8. At least the obvious ones, like store_add(... RLX) -> STADD and store_add(... REL) -> STADDL. For SC, it would not be so clear.
I cannot imagine that any real C memory model would require the mapping to be stronger than that.
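
The candidate mapping mentioned above can be sketched as a small lookup. This is an illustration only: `Armv8ReductionMapping`, `Mo`, and `storeAdd` are hypothetical names, and the SC case is deliberately left empty because the mapping is still open.

```java
import java.util.Optional;

final class Armv8ReductionMapping {
    // Memory orders considered for a non-returning reduction in this sketch.
    enum Mo { RLX, REL, SC }

    // Returns the candidate armv8 instruction for store_add, or empty
    // where no obvious mapping has been agreed on (SC).
    static Optional<String> storeAdd(Mo mo) {
        switch (mo) {
            case RLX: return Optional.of("STADD");
            case REL: return Optional.of("STADDL");
            default:  return Optional.empty(); // SC: undecided
        }
    }

    public static void main(String[] args) {
        for (Mo mo : Mo.values()) {
            System.out.println(mo + " -> " + storeAdd(mo).orElse("(undecided)"));
        }
    }
}
```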

Local localOp = newLocal(dummyReg, expressions.makeIntBinary(dummyReg, e.getOperator(), e.getOperand()));
RMWStore store = newRMWStoreWithMo(load, address, dummyReg, Tag.C11.storeMO(mo));

load.addTags(C11.ATOMIC, Tag.C11.NORETURN); // Note that the load has no mo, but is still atomic!

For consistency with visitAtomicFetchOp I would rather use

Load load = newRMWLoadWithMo(dummyReg, address, Tag.C11.loadMO(mo));

and rather than getting the expected ordering guarantees "by chance" as it currently happens for rc11,
let the model explicitly state if NORETURN events should provide order or not.

It also feels strange to have an atomic event with no memory order.

Collaborator Author

I don't like these consistency arguments... those are different operations. Tag.C11.loadMO(mo) will just be RLX or SC because you cannot specify ACQ/ACQ_REL in the first place.
I think the only really sensible options are: the load has no mo, simply because it shouldn't exist in the first place, or the load has the same mo/tags as the store and the WMM removes the tags.
Anything in between seems arbitrary to me.
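
To make the "just RLX or SC" point concrete, here is a small sketch. It assumes a hypothetical projection `loadMo` that drops release semantics from the load part (an assumption, not necessarily how Tag.C11.loadMO is implemented), and that a reduction only accepts RLX, REL, and SC.

```java
import java.util.EnumSet;
import java.util.Set;

final class LoadMoSketch {
    enum Mo { RLX, ACQ, REL, ACQ_REL, SC }

    // Hypothetical projection of an operation's mo onto its load part:
    // release semantics belong to the store, so the load sees RLX instead.
    static Mo loadMo(Mo mo) {
        switch (mo) {
            case SC:      return Mo.SC;
            case ACQ:
            case ACQ_REL: return Mo.ACQ;
            default:      return Mo.RLX; // RLX and REL
        }
    }

    // Collects the load orders reachable from the orders an operation accepts.
    static Set<Mo> reachableLoadOrders(Set<Mo> allowed) {
        Set<Mo> result = EnumSet.noneOf(Mo.class);
        for (Mo mo : allowed) {
            result.add(loadMo(mo));
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumption: reductions cannot be given ACQ or ACQ_REL.
        Set<Mo> reductionOrders = EnumSet.of(Mo.RLX, Mo.REL, Mo.SC);
        System.out.println(reachableLoadOrders(reductionOrders)); // prints [RLX, SC]
    }
}
```

Under these assumptions the projection never yields ACQ for a reduction, so it adds little beyond distinguishing SC from everything else.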

The current solution of hardcoding the atomic tag seems equally arbitrary.

I guess what you are proposing is to completely get rid of Tag.C11.loadMO/storeMO and simply use the mo from the parsing. This would require the memory model to do some "cleanup" as lkmm does, but then we can get rid of these loadMO/storeMO as we already did for lkmm in #893.

Collaborator Author

> The current solution of hardcoding the atomic tag seems equally arbitrary.

The atomic tag is not arbitrary, because the whole operation is an atomic one, even by name atomic_store_XYZ.
And if you look at what our compiler does:

boolean canRace = mo == null || mo.value().equals(C11.NONATOMIC);
e.addTags(canRace ? C11.NONATOMIC : C11.ATOMIC);

then every event must be tagged either way, and NONATOMIC is certainly more wrong than ATOMIC.

> I guess what you are proposing is to completely get rid of Tag.C11.loadMO/storeMO and simply use the mo from the parsing. This would require the memory model to do some "cleanup" as lkmm does, but then we can get rid of these loadMO/storeMO as we already did for lkmm in #893.

I proposed exactly that in #984 or rather suggested it as one possible way to go forward. I think rc11.cat might already adhere to that. That being said, for now, I just took the most natural solution given the current hardcoded one:

  • A load must be generated for data-flow modelling (no way around this)
  • The load must be ignored in data races. Marking it as atomic is natural as it is part of an atomic operation independent of its memory ordering.
  • The load should not provide any orderings -> both plain (no mo) and RLX seem reasonable. Plain is closer to capturing the idea of "the load should not exist" whereas RLX is closer to capturing the idea of "the load exists but it should not give orderings", which is (funnily enough) too much ordering :)

At the end of the day, I'm not the one who writes the C memory models and sets the expectation of what is assumed to happen implicitly and what is assumed to be done in the model.
