
Conversation

Contributor

@mattac21 mattac21 commented Nov 27, 2025

Description

Blocks EVMExperimentalMempool.Select until legacypool.Reset (the reorg loop) has been run for the height that Select is being called on. We do this to ensure that all txs in the pending pool are valid at Select height H. If we know that the reorg loop has been run at height H, then we know that the anteHandlers for all txs in the pending pool have passed at height H, so they can be safely selected for a proposal.
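
Roughly, the gating added to Select looks like the sketch below. This is illustrative only: the exact signature may differ from this PR, and selectFromPending is a hypothetical helper; only the WaitForReorgHeight call is what this PR introduces.

    // Sketch of the Select-side gating (simplified; selectFromPending is a
    // hypothetical helper, not code from this PR).
    func (m *EVMExperimentalMempool) Select(ctx sdk.Context, _ [][]byte) mempool.Iterator {
        // Block until legacypool.Reset (the reorg loop) has run for this height.
        // After that, every tx in the pending pool has passed its anteHandler at
        // height H and can safely be included in a proposal.
        m.legacyTxPool.WaitForReorgHeight(ctx, ctx.BlockHeight())
        return m.selectFromPending(ctx)
    }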

Closes: #XXXX


Author Checklist

All items are required. Please add a note to an item if it is not applicable, and
add links to any relevant follow-up issues.

I have...

  • tackled an existing issue or discussed with a team member
  • left instructions on how to review the changes
  • targeted the main branch

@mattac21 mattac21 changed the title feat(mempool): Block EVM mempool Select on legacypool reorg at height feat(mempool): Block EVM mempool Select on legacypool reorg Nov 27, 2025
@mattac21 mattac21 changed the base branch from main to feat/krakatoa November 28, 2025 13:31
}
queuedEvents[addr].Put(tx)

case <-pool.reorgSubscriptionCh:
Contributor Author

Added the ability for arbitrary subscribers to be notified when nextDone closes (i.e. when the "next" run of the reorg loop completes). This works by a user pushing a request to subscribe onto this channel; they must then immediately listen on pool.ReorgDoneCh. This loop will push nextDone onto that channel. The user can then wait on the closure of nextDone, which essentially broadcasts to all holders of this channel (subscribers to the next run of the reorg loop) that it has completed.
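
For reference, the loop-side handling amounts to roughly the following. This is a heavily simplified sketch of geth's scheduleReorgLoop showing only the new subscription case and the pieces needed to see when nextDone closes; the real loop also handles reset/promote requests and queued tx events.

    // Heavily simplified sketch; not the full scheduleReorgLoop.
    var (
        launchNextRun bool
        nextDone      = make(chan struct{})
        curDone       chan struct{} // non-nil while a runReorg is in flight
    )
    for {
        // Launch the next background reorg if one was requested and none is running.
        if curDone == nil && launchNextRun {
            go pool.runReorg(nextDone /* , ... */) // runReorg closes its done channel when it finishes
            curDone, nextDone = nextDone, make(chan struct{})
            launchNextRun = false
        }
        select {
        case <-pool.reorgSubscriptionCh:
            // A caller wants to observe the next reorg completion. Hand it the
            // current nextDone; note we do NOT set launchNextRun, so subscribing
            // never triggers an extra run by itself.
            pool.reorgDoneCh <- nextDone
        case <-curDone:
            curDone = nil
        case <-pool.reorgShutdownCh:
            return
        }
    }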

Comment on lines +1994 to +1995
case <-ctx.Done():
return
Contributor Author

There is likely a better way to handle this, but mempool.Select doesn't return an error, so if the context is cancelled during this call then we potentially allow invalid txs to be selected, which will just be invalidated later. If this happens we likely hit timeout_propose. I'm not 100% sure what it means for the context to be cancelled here (is the app shutting down?); depending on what's actually happening, panicking may be better. This is likely a follow-up, probably not important right now.

for pool.latestReorgHeight.Load() < height {
// reorg loop has not run at the target height, subscribe to the
// outcome of the next reorg loop iteration to know when to check again
sub, err := pool.SubscribeToNextReorg()
Contributor Author

If we have not run the reorg loop for the target height yet, we wait for the outcome of the next iteration of the loop. We are explicitly not telling the reorg loop to run here, since that would simply run it again without advancing it to a new height (we would need to pass it the latest headers for that to happen). Also, if we kick off a new run here without incrementing latestReorgHeight, we will continuously kick off new reorg loops until the txpool sees a new block and the reorg runs on it, essentially doing lots of wasted work.
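
Put together, the wait amounts to roughly this sketch, assuming latestReorgHeight is an atomic int64 that the reorg loop bumps after each Reset completes (details may differ from the PR):

    // Sketch of the full wait loop used by Select (simplified).
    func (pool *LegacyPool) WaitForReorgHeight(ctx context.Context, height int64) {
        for pool.latestReorgHeight.Load() < height {
            // The reorg loop has not yet run at the target height; wait for its
            // next iteration to finish, then re-check.
            sub, err := pool.SubscribeToNextReorg()
            if err != nil {
                // The pool is shutting down; nothing left to wait for.
                return
            }
            select {
            case <-sub:
                // Next reorg iteration completed; loop and re-check the height.
            case <-ctx.Done():
                // Select cannot return an error, so give up waiting here (see
                // the discussion on the ctx.Done() case above).
                return
            }
        }
    }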


// SubscribeToNextReorg returns a channel that will close when the next reorg
// loop completes. An error is returned if the loop is shutting down.
func (pool *LegacyPool) SubscribeToNextReorg() (chan struct{}, error) {
Member

I find this logic a bit confusing because we operate with three different channels here, but I assume this is due to the code being ported from geth

Contributor Author

Yeah, this is geth code I am adding onto, and I agree the channel/subscription logic of the reorg loop is super confusing; this just adds another path to it, which I don't love. After staring at the pattern for a bit I actually don't mind it though, despite the initial learning curve.

For my own and others' future reference, since this wasn't immediately obvious to me... At a high level, they are creating a way to broadcast to multiple subscribers that the reorg loop finished. You can't easily do this by sending a message on a channel saying that something finished, since only a single subscriber will pull the message off. You could count the number of subscribers and send that many messages, but if one of those subscribers dies then you have a hanging message in the channel. To get around this, when someone subscribes the reorg loop immediately sends the subscriber a channel on the reorgDoneCh (assuming the subscriber then immediately waits to receive that channel); this received channel is now their subscription. The reorg loop also keeps a reference to this channel and will close it when its next iteration finishes. Closing a channel causes all readers of it to receive the zero value, so all subscribers treat that as "something finished".
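
As a standalone illustration of that close-to-broadcast pattern (not code from this PR; the reorg loop funnels subscriptions through a channel instead of using a mutex, but the broadcast mechanism is the same):

    package main

    import (
        "fmt"
        "sync"
        "time"
    )

    // broadcaster hands every subscriber the same "done" channel and closes it
    // once when the event fires; the close wakes all readers at the same time.
    type broadcaster struct {
        mu   sync.Mutex
        done chan struct{}
    }

    func newBroadcaster() *broadcaster {
        return &broadcaster{done: make(chan struct{})}
    }

    // subscribe returns the channel that will be closed on the next finish.
    func (b *broadcaster) subscribe() <-chan struct{} {
        b.mu.Lock()
        defer b.mu.Unlock()
        return b.done
    }

    // finish closes the current channel (waking every subscriber) and arms a new one.
    func (b *broadcaster) finish() {
        b.mu.Lock()
        defer b.mu.Unlock()
        close(b.done)
        b.done = make(chan struct{})
    }

    func main() {
        b := newBroadcaster()
        var wg sync.WaitGroup
        for i := 0; i < 3; i++ {
            wg.Add(1)
            sub := b.subscribe()
            go func(id int) {
                defer wg.Done()
                <-sub // returns immediately once the channel is closed
                fmt.Printf("subscriber %d: reorg finished\n", id)
            }(i)
        }
        time.Sleep(50 * time.Millisecond) // let subscribers start waiting
        b.finish()                        // a single close wakes all three subscribers
        wg.Wait()
    }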

Contributor Author

Previously, the only way to get this channel from reorgDoneCh was to request that the reorg loop be run, by passing it a new block header or telling it to promote txs. Here we don't actually want to do that; we simply want to wait on the result of the next run, whenever that happens (since we need to wait for a run to happen for a new block, and we can't simply pass it a new block here). So I created reorgSubscriptionCh, which users can push an empty struct onto to tell the reorg loop that we want to subscribe to the next run's result but don't actually want to start a new run.
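
The entry point then looks roughly like the sketch below (simplified; the error value is a placeholder and the real error name in the PR may differ):

    // Sketch of SubscribeToNextReorg. Pushing onto reorgSubscriptionCh does not
    // start a reorg run; it only asks the loop for the channel it will close
    // when the next run finishes.
    func (pool *LegacyPool) SubscribeToNextReorg() (chan struct{}, error) {
        select {
        case pool.reorgSubscriptionCh <- struct{}{}:
            // The loop replies on reorgDoneCh with its current nextDone channel.
            return <-pool.reorgDoneCh, nil
        case <-pool.reorgShutdownCh:
            return nil, errors.New("reorg loop is shutting down") // placeholder error
        }
    }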


// Wait for the legacypool to Reset at >= blockHeight (this may have
// already happened), to ensure all txs in pending pool are valid.
m.legacyTxPool.WaitForReorgHeight(ctx, ctx.BlockHeight())
Contributor

Is the idea to put a timeout into the PrepareProposalHandler such that an issue with the mempool won't block consensus? Or did you have some other way you anticipated handling long waits here?

Contributor Author

I wasn't planning on adding any app-side timeout to prepare proposal, or on having a specific way of handling a stall here, for now. My thought was just that if we have some issue here and the mempool isn't advancing, then with our new design the mempool needs to block consensus, since we can't be sure we won't propose a bad block if this condition isn't met. Instead we should just let Comet's timeout_propose run out and have a new proposer try.

We could maintain some kind of cursor as we validate txs in the pending pool via ante handlers, plus a timeout: if we are in the middle of validating when the timeout fires, we only allow the txs that have already been validated to be returned via Select.

@swift1337 brought up a good idea during our sync this morning: we should add a limit to how many txs we validate during the reorg loop, since if we have a massive backlog in the pending/queued pool, we may reach a point where we cannot validate fast enough within a single block time. We thought about adding a timeout here, but if we add a timeout without some way of then clearing txs out of the mempool, we will just continue to time out whenever we try to select. So I think the better solution is some sort of maxBytesValidated or maxGasValidated, similar to creating blocks (though we may also need the cursor kind of solution to know where validation stopped).
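
Purely to illustrate the idea (hypothetical names; none of this is in the PR), a gas-bounded validation pass with a resumable cursor might look something like:

    // Hypothetical sketch only: maxGasValidated, the cursor helpers and
    // validateWithAnteHandler do not exist in this PR.
    func (pool *LegacyPool) validatePendingBounded(maxGasValidated uint64) {
        var gasUsed uint64
        for _, tx := range pool.pendingFromCursor() { // resume where the last pass stopped
            if gasUsed+tx.Gas() > maxGasValidated {
                pool.saveCursor(tx) // remember the stop point for the next reorg iteration
                return
            }
            gasUsed += tx.Gas()
            pool.validateWithAnteHandler(tx) // drop the tx on failure (omitted)
        }
        pool.clearCursor() // validated everything; start from the top next time
    }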

Either way, the solution is non-trivial IMO and we should probably just address it in a new ticket later (I'm assuming this won't be a blocker for getting some initial numbers on how this performs).

Contributor

Yeah, makes sense. The failure case is probably the same as what we currently have, where if max bytes/gas is sufficiently high you'll sit in Select for minutes even when the chain has already timed out.

Member

It makes sense to benchmark some worst-case scenarios for the reorg loop, based on the number of transactions and how many need to be moved/reordered, and recommend some mempool parameters based on that. Do we know, for example, how long validating 20k transactions would take?

