feat: add HyperSync provider for faster event indexing #381
HyperSync now requests Hash and ParentHash alongside Number and Timestamp. The provider uses this cached block data to skip the eth_getBlockByNumber RPC call entirely for preloaded blocks.
Remove the native Rust NAPI dependency and use the HyperSync HTTP JSON API directly. No platform-specific binaries needed.
Resolve HyperSync chainId lazily on first API call instead of at init time. This keeps init() synchronous and avoids changes to BaseIndexer, StarknetIndexer, and the Container constructor.
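For reviewers, a minimal sketch of the lazy-resolution idea, assuming a memoized promise. `LazyValue`, `fetchChainId` and `ExampleProvider` are hypothetical names for illustration and are not taken from this PR; the real lookup is replaced by a stub.

```typescript
// Hypothetical helper: memoizes an async lookup so it runs at most once.
class LazyValue<T> {
  private pending?: Promise<T>;

  constructor(private readonly load: () => Promise<T>) {}

  get(): Promise<T> {
    if (!this.pending) {
      // Cache the promise itself so concurrent callers share one lookup.
      // Drop it on failure so a later call can retry.
      this.pending = this.load().catch(err => {
        this.pending = undefined;
        throw err;
      });
    }
    return this.pending;
  }
}

// Stub standing in for the real chain ID lookup (assumption: a network call).
let lookups = 0;
async function fetchChainId(): Promise<number> {
  lookups += 1;
  return 42161; // Arbitrum One
}

class ExampleProvider {
  private readonly chainId = new LazyValue(fetchChainId);

  // Stays synchronous: nothing is fetched here.
  init(): void {}

  // The first call triggers the lookup; later calls reuse the cached promise.
  async preload(): Promise<string> {
    const chainId = await this.chainId.get();
    return `querying chain ${chainId}`;
  }
}
```

Caching the promise rather than the resolved value means two preload calls that start at the same time still share a single lookup.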
HyperSync is now a Preloader, not a BlockFetcher. The provider always uses RpcBlockFetcher for live blocks. HypersyncPreloader is only used during getCheckpointsRange for bulk historical event fetching with block data caching.
…vider HyperSync only needs to replace preloading. Remove BlockFetcher interface and RpcBlockFetcher — restore the original viem PublicClient on EvmProvider for all live block operations. Only the Preloader interface remains for optional HyperSync bulk fetching.
Provider always delegates preloading to a Preloader instance. RpcPreloader wraps the existing getLogs logic, HypersyncPreloader uses the HyperSync HTTP API. No conditional in getCheckpointsRange.
This reverts commit 574d6ff.
- Use chunk() utility instead of manual chunking loops
- Derive checkpoints centrally in provider instead of preloader
- Replace spread/concat with safe loops to avoid stack overflow
- Remove unnecessary type re-exports
- Self-manage blockCache within getCheckpointsRange
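A small sketch of the two patterns named in this commit, under stated assumptions. `appendAll` is a hypothetical helper, and `chunk` here is a local stand-in for the utility mentioned in the commit, not its actual implementation. The background is that `target.push(...source)` passes every element as a call argument, which can exceed the engine's argument limit on very large arrays, while a plain loop does not.

```typescript
// Hypothetical helper: appends without spreading `source` into call arguments.
function appendAll<T>(target: T[], source: readonly T[]): void {
  for (const item of source) {
    target.push(item);
  }
}

// Local stand-in for a chunk() utility: splits `items` into slices of at most `size`.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new Error('size must be a positive integer');
  }
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}
```

For example, `chunk(['a', 'b', 'c'], 2)` yields two slices, and `appendAll(allLogs, pageOfLogs)` can be called once per page regardless of page size.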
Sekhmet left a comment
I think it would be better to just have two variants of indexers:
EvmProvider and HyperSyncEvmProvider (and HyperSyncEvmIndexer to match). The HyperSync config should not be part of the main Checkpoint config, since providers/indexers are meant to be independent. It can just be passed when calling `new evm.HyperSyncEvmIndexer(writers, { hyperSyncApiKey: "ASDF" })`.
HyperSyncEvmProvider could just extend EvmProvider. To avoid duplication, we could add two methods to EvmProvider, getBlock and scanEvents, that HyperSyncEvmProvider would just override.
How will you pass the API key from sx apps/api to checkpoint? Using an env var? The setup I am thinking of is to have 2 completely independent providers and let container.ts decide which one to use (one for live, the other for batch), so HyperSync can function 100% without an RPC provider.
This would basically become: `const arb1Indexer = new evm.HyperSyncEvmIndexer(createWriters(arb1Config), { hyperSyncApiKey: process.env.HYPER_SYNC_API_KEY });`
Create HyperSyncEvmProvider extending EvmProvider so HyperSync usage is an explicit provider choice rather than a config-driven internal toggle. Remove preloader abstraction layer and hypersync_api_token from shared config schema — API token is now a constructor argument on HyperSyncEvmIndexer.
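To make the override design concrete, here is a rough sketch of the shape discussed in the review. Only the method names getBlock and scanEvents and the hyperSyncApiKey option come from the thread; the class names, the `processRange` method, the data shapes, and the stubbed bodies are assumptions for illustration and do not reflect the actual implementation.

```typescript
interface Block {
  number: number;
  hash: string;
}

interface EventLog {
  blockNumber: number;
  address: string;
}

// Hypothetical base class; the RPC calls are replaced by stubs.
class ExampleEvmProvider {
  // Stub for a per-block RPC lookup such as eth_getBlockByNumber.
  async getBlock(blockNumber: number): Promise<Block> {
    return { number: blockNumber, hash: `rpc-${blockNumber}` };
  }

  // Stub for a log scan: pretends one event sits at each end of the range.
  async scanEvents(fromBlock: number, toBlock: number): Promise<EventLog[]> {
    return [
      { blockNumber: fromBlock, address: '0xsource' },
      { blockNumber: toBlock, address: '0xsource' }
    ];
  }

  // Shared flow: the subclass does not need to reimplement this.
  async processRange(fromBlock: number, toBlock: number): Promise<Block[]> {
    const events = await this.scanEvents(fromBlock, toBlock);
    const blocks: Block[] = [];
    for (const event of events) {
      blocks.push(await this.getBlock(event.blockNumber));
    }
    return blocks;
  }
}

class ExampleHyperSyncProvider extends ExampleEvmProvider {
  private readonly blockCache = new Map<number, Block>();

  constructor(options: { hyperSyncApiKey: string }) {
    super();
    // Assumption: treat a missing key as a configuration error.
    if (!options.hyperSyncApiKey) throw new Error('hyperSyncApiKey is required');
  }

  // Stub for a bulk query that returns block data together with the logs.
  async scanEvents(fromBlock: number, toBlock: number): Promise<EventLog[]> {
    const events = await super.scanEvents(fromBlock, toBlock);
    for (const event of events) {
      this.blockCache.set(event.blockNumber, {
        number: event.blockNumber,
        hash: `cached-${event.blockNumber}`
      });
    }
    return events;
  }

  // Serve preloaded blocks from the cache; otherwise use the base lookup.
  async getBlock(blockNumber: number): Promise<Block> {
    return this.blockCache.get(blockNumber) ?? super.getBlock(blockNumber);
  }
}
```

With this shape, the choice of provider is made once at construction time, and the shared range-processing flow stays in the base class.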
The chunk-of-20 pattern was inherited from the RPC provider where eth_getLogs limits the number of addresses per call. HyperSync has no such constraint, so all sources are now queried in a single request.
Core fields (number, timestamp, hash, address, etc.) are guaranteed by the HyperSync API when requested via field_selection. Removing defensive null checks so malformed responses surface as errors instead of being silently skipped.
HyperSync API returns undefined instead of empty arrays when there are no matching results, causing "response.data.blocks is not iterable".
HyperSync API returns `data` as an array of `{ blocks, logs }` chunks,
not a single object. The old code accessed `response.data.blocks` which
was undefined on the array, causing all events to be silently skipped.
Also allows HyperSync to preload the full block range in one call
instead of using the small adaptive step designed for RPC providers.
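A minimal sketch of handling the response shape described in the two commits above, assuming only what they state: `data` is an array of chunks, and `blocks` or `logs` may be undefined when nothing matches. The `Chunk` type and `flattenChunks` name are hypothetical, and the element types are left generic because the field layout is not shown here.

```typescript
// Only `data`, `blocks` and `logs` come from the commit messages above.
interface Chunk<B, L> {
  blocks?: B[];
  logs?: L[];
}

// Merges every chunk into flat arrays, treating missing arrays as empty.
function flattenChunks<B, L>(
  data: Chunk<B, L>[] | undefined
): { blocks: B[]; logs: L[] } {
  const blocks: B[] = [];
  const logs: L[] = [];
  for (const chunk of data ?? []) {
    // Plain loops instead of spread, so large chunks cannot overflow the stack.
    for (const block of chunk.blocks ?? []) blocks.push(block);
    for (const log of chunk.logs ?? []) logs.push(log);
  }
  return { blocks, logs };
}
```

Normalizing at the boundary like this keeps the "not iterable" failure mode out of the code that consumes the result.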
@@ -0,0 +1,44 @@
import { Logger } from '../../utils/logger';
import { BaseIndexer, Instance } from '../base';
import { HyperSyncEvmProvider } from './hyper-sync-provider';
The HyperSync filename should be a single word. It's like WalletConnect: we wouldn't do wallet-connect-provider.ts.
- import { HyperSyncEvmProvider } from './hyper-sync-provider';
+ import { HyperSyncEvmProvider } from './hypersync-provider';
Should we use Hypersync all lowercase too, to match?
this.log.info('new source added, clearing logs cache');
this.logsCache.clear();
}
Why do we need to change this file and the base provider? It would be easier to review if we didn't need to change EvmProvider.
To avoid code duplication, and for proper class inheritance.
Summary
Toward https://github.com/snapshot-labs/workflow/issues/787
This PR adds a new HyperSync EVM provider, which uses HyperSync (https://envio.dev/#hypersync) for event discovery.
HyperSync can return a paginated list of events for the whole chain, whereas the RPC flow had to search for events in ranges of 1000 blocks at a time.
This speeds up event discovery from a few weeks to near-instantaneous.
The remaining indexing time is now due to read/write speed on the database layer (psql).
You can create a free API token using the free plan. The free plan's limits are more than enough to index Arbitrum.
Test plan
- In your sx-monorepo app, link your checkpoint dependency to this PR, either by using `bun link` or by editing your package.json to point to your local copy using `file:/` instead of just the version number. Don't forget to run `yarn build` on checkpoint.
- In sx, in apps/api/src/evm/index.ts, edit the arb indexer to use `HyperSyncEvmIndexer`.
- Add your API token to your apps/api/.env file.
- Uncomment the Arbitrum contracts (keep GMX commented out due to an issue with timestamp parsing).
- Start the sx app with only Arbitrum.
- In the logs, you should see the "preloading" action only once between each round of event processing; the indexer does not have to find events by searching block by block.
- Start the sx app with eth.
- In the logs, you should see multiple "preloading" events back to back when events are not found.