@shundroid shundroid commented Oct 18, 2025

To support automated buffering for the speculation pass, I modified the speculation compilation flow.

The main change is that speculation now has two passes: one executes before buffering and the other after it.

The pre-buffering speculation pass places "pre-buffer" speculator units to preserve the join semantics:

image

The speculator is decomposed into SpecPreBuffer1 and SpecPreBuffer2, based on the join semantics. At this stage, SCCommitCtrl does not exist. It is only connected after buffering. Therefore, the control for save-commit units is connected temporarily. The network for SCCommitCtrl is partially constructed using SCIsMisspec and the loop condition.

After buffering, the two pre-buffering spec units are coalesced into a single speculator. SCCommitCtrl is then connected, and the control for the save-commit units becomes final.

image

At this stage, we also place additional buffers to ensure high throughput and avoid deadlock.

I admit this method might not be ideal or optimal: because the speculator is decomposed, frequency regulation doesn't work. This is a quick workaround to make spec v1 comparable with the new speculation and to mitigate the non-deterministic circuit construction.

I also fixed some bugs in the speculator/save-commit implementations and in the speculation pass; these fixes are needed to make the speculation integration tests work.

@shundroid shundroid requested a review from murphe67 October 18, 2025 08:17
Collaborator Author

Oh, I didn't realize that #552 had been merged.
Before it, the latencies of the beta-backend units were fixed, and the speculation tests are benchmarked with those old latencies.

  if (TailEn = '1') and (HeadEn = '0') then
    -- if new tail index will reach head index
-   if ((Tail +1) mod {fifo_depth} = Head) then
+   if ((Tail + 2) mod {fifo_depth} = Head) then
Collaborator Author

This is the usual ring-buffer constraint: we cannot fill every slot, because one slot has to stay free to distinguish the empty state from the full state.

Full is latched, so we should compare using Tail + 2 instead of Tail + 1.
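
As a rough illustration of that off-by-one, here is a small cycle-level C++ model. It is a sketch, not the VHDL in this PR: `RingFifoModel`, `lookahead`, and `fillUntilFull` are hypothetical names, the model assumes that a read without a write clears Full, and it assumes a depth of at least 2. Full is updated on the clock edge from the pre-write Tail, so `lookahead = 2` asserts Full while one slot is still free, whereas `lookahead = 1` lets Tail wrap onto Head.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical cycle-level model of a ring buffer whose Full flag is a
// register (updated only on the clock edge). Not the PR's VHDL.
struct RingFifoModel {
  std::size_t depth;      // number of slots, assumed >= 2
  std::size_t lookahead;  // 1 = old comparison, 2 = new comparison
  std::size_t head = 0;   // next slot to read
  std::size_t tail = 0;   // next slot to write
  bool full = false;      // registered Full flag

  RingFifoModel(std::size_t d, std::size_t l) : depth(d), lookahead(l) {}

  // Empty is detected purely from the indices in this model.
  bool indicesEqual() const { return head == tail; }

  std::size_t occupancy() const { return (tail + depth - head) % depth; }

  // One clock edge: all next-state values are computed from the current
  // (pre-edge) head and tail, as signal assignments would be.
  void tick(bool tailEn, bool headEn) {
    bool nextFull = full;
    if (tailEn && !headEn) {
      // The comparison sees the tail index before this write is applied.
      if ((tail + lookahead) % depth == head)
        nextFull = true;
    } else if (headEn && !tailEn) {
      nextFull = false;  // assumption: a lone read clears Full
    }
    if (tailEn) tail = (tail + 1) % depth;
    if (headEn) head = (head + 1) % depth;
    full = nextFull;
  }
};

// Writes without reading until Full is asserted; returns the write count.
std::size_t fillUntilFull(RingFifoModel &fifo) {
  std::size_t writes = 0;
  while (!fifo.full) {
    fifo.tick(/*tailEn=*/true, /*headEn=*/false);
    ++writes;
  }
  return writes;
}
```

With a depth of 4, the `lookahead = 2` variant accepts three writes and leaves Head and Tail distinct, while the `lookahead = 1` variant accepts four writes and ends with Head equal to Tail, which an index-based empty check cannot tell apart from an empty buffer.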

Collaborator

Do you mean buffered? A latched value is usually accidental: https://vhdlwhiz.com/why-latches-are-bad/

  static inline bool canGoThroughOutsideBlocks(Operation *op) {
    return isa<handshake::ForkOp, handshake::ExtUIOp, handshake::ExtSIOp,
-              handshake::TruncIOp>(op);
+              handshake::TruncIOp, handshake::BufferOp>(op);
Collaborator Author

@shundroid shundroid Oct 18, 2025

To call isBackedge after buffering, this check has to include BufferOp.
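
A toy model of why the buffer matters, under the assumption that the helper is used while tracing an edge back to an operation that belongs to a basic block. This is not Dynamatic's implementation: `ToyOp`, `Kind`, `canGoThrough`, and `findSourceBB` are hypothetical names, and a block index of -1 stands for "no basic block attached".

```cpp
#include <cassert>

// Hypothetical operation kinds; only the pass-through distinction matters.
enum class Kind { Fork, ExtUI, ExtSI, TruncI, Buffer, Other };

// Hypothetical single-producer operation node.
struct ToyOp {
  Kind kind;
  int bb;                 // basic block index, or -1 if none is attached
  const ToyOp *producer;  // upstream operation, or nullptr
};

// Mirrors the shape of the check in the diff: with allowBuffers == false
// it models the old list, with allowBuffers == true the new one.
bool canGoThrough(const ToyOp &op, bool allowBuffers) {
  switch (op.kind) {
  case Kind::Fork:
  case Kind::ExtUI:
  case Kind::ExtSI:
  case Kind::TruncI:
    return true;
  case Kind::Buffer:
    return allowBuffers;
  default:
    return false;
  }
}

// Walks upstream through pass-through operations until it reaches an
// operation with a basic block. Returns -1 if the walk gets stuck.
int findSourceBB(const ToyOp *op, bool allowBuffers) {
  while (op != nullptr) {
    if (op->bb >= 0)
      return op->bb;
    if (!canGoThrough(*op, allowBuffers))
      return -1;
    op = op->producer;
  }
  return -1;
}
```

In this sketch, a chain of source (block 2), buffer, fork resolves to block 2 only when buffers are treated as pass-through; with the old list the walk stops at the buffer and the source block is unknown.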

Comment on lines +1147 to +1149
let summary = "";
let description = [{
}];
Collaborator Author

I'll add these once this design is approved

Comment on lines +851 to +864
for (auto branch : funcOp.getOps<handshake::SpeculatingBranchOp>()) {
  if (branch->getAttr("specv1_branchDiscardCondNonSpec")) {
    unsigned bb = getLogicBB(specOp1).value();
    ConditionalBranchOp controlBranch = findControlBranch(funcOp, bb);
    if (controlBranch == nullptr) {
      specOp1->emitError()
          << "Could not find backedge within speculation bb.\n";
      return signalPassFailure();
    }
    auto conditionOperand = controlBranch.getConditionOperand();
    branch->setOperand(0, conditionOperand);
    branch->setOperand(1, conditionOperand);
  }
}
Collaborator Author

@shundroid shundroid Oct 18, 2025

The operands of this SpeculatingBranch didn't actually follow this figure.
The loop condition is not necessarily produced by the speculator.
It only becomes definite after the speculator and save-commit units are placed (because the loop condition may be generated by either of them), so I set these operands at the end of the pass.

image

@murphe67
Collaborator

Thanks for this Shun!

Could you put this into the GoogleTest integration tests instead of bringing the Python script back?

Could you also explain why we can't just run buffering after the previous spec pass? What is the problem with the save-commit control signal?

Could you also double-check your explanations? They don't quite make sense currently, e.g.:

At this stage, SCCommitCtrl does not exist. It is only connected after buffering. Therefore, the control for save-commit units is connected temporarily.

I don't know what this means. Is there an incorrect word in it, maybe?

@murphe67
Collaborator

also, what does

At this stage, we also place additional buffers to ensure high throughput and avoid deadlock.
mean?

You are still manually buffering the circuit?
