Add index for cancellation timestamp #4018
base: main
Reminder: Please update the DB Readme and comment whether migrations are reversible (include rollback scripts if applicable).
Caused by: |
squadgazzz
left a comment
LGTM. If we still encounter issues with this query, we should probably start looking into restructuring the orders table.
/gemini review
@gemini-code-assist review
Code Review
This pull request addresses a performance issue by adding a new index on cancellation_timestamp for the orders table, which significantly speeds up a key query. The changes also include removing a now-redundant composite index and, importantly, updating the database README to better document existing indexes. The SQL migration is correct and the documentation improvements are very welcome. I've made one suggestion to further improve the documentation accuracy.
- order_owner: hash(`owner`)
- order_quoting_parameters: btree(`sell_token`, `buy_token`, `sell_amount`)
- order_sell_buy_tokens: btree(`sell_token`, `buy_token`)
- user_order_creation_timestamp: btree(`owner`, `creation_timestamp`)
Thanks for adding the missing indexes to the documentation! Just a small correction: this index is on creation_timestamp in descending order.
It also looks like order_valid_to and user_valid_to indexes are for the orders table but are currently documented under order_execution. It would be great to move them here as well to make the index list for orders complete.
Suggested change:
- - user_order_creation_timestamp: btree(`owner`, `creation_timestamp`)
+ - user_order_creation_timestamp: btree(`owner`, `creation_timestamp` DESC)
Code Review
This pull request introduces a valuable performance improvement by adding an index on cancellation_timestamp to speed up order fetching. The logic is sound and should resolve the performance issue described. I have two main points of feedback. First, and most critically, the database migration script should use CREATE INDEX CONCURRENTLY and DROP INDEX CONCURRENTLY to avoid locking the orders table in production, which could cause downtime. Second, there's a minor inconsistency in the README documentation update for an existing index. Overall, this is a great change once the migration safety is addressed.
CREATE INDEX order_cancellation_timestamp ON orders USING BTREE(cancellation_timestamp);

-- drop index over (creation_timestamp, cancellation timestamp)
-- since it can only be used optimally for queries on the creation_timestamp.
DROP INDEX IF EXISTS order_creation_cancellation;
Creating and dropping indexes on a large production table like orders without the CONCURRENTLY option can cause significant downtime. CREATE INDEX (without CONCURRENTLY) takes a SHARE lock on the table, which blocks all writes (but not reads) until the index is built. A plain DROP INDEX takes an ACCESS EXCLUSIVE lock on the table, which blocks both reads and writes until the drop completes.
To avoid this, you should use CREATE INDEX CONCURRENTLY and DROP INDEX CONCURRENTLY.
Note that CREATE INDEX CONCURRENTLY cannot be run inside a transaction block. You may need to configure Flyway to run this migration outside of a transaction.
CREATE INDEX CONCURRENTLY order_cancellation_timestamp ON orders USING BTREE(cancellation_timestamp);
-- drop index over (creation_timestamp, cancellation timestamp)
-- since it can only be used optimally for queries on the creation_timestamp.
DROP INDEX CONCURRENTLY IF EXISTS order_creation_cancellation;
- order_owner: hash(`owner`)
- order_quoting_parameters: btree(`sell_token`, `buy_token`, `sell_amount`)
- order_sell_buy_tokens: btree(`sell_token`, `buy_token`)
- user_order_creation_timestamp: btree(`owner`, `creation_timestamp`)
This index definition appears to be missing DESC for creation_timestamp. The user_orders query typically fetches orders in descending order of creation time, so a descending index is more efficient. The same index is also documented under the order_execution table (line 214) as btree(owner, creation_timestamp DESC), which is likely the correct definition. Please verify and correct this.
Additionally, it seems several indexes are now documented under both orders and order_execution. To avoid confusion, it would be beneficial to clean this up and remove the incorrect entries from the order_execution section in a follow-up.
Suggested change:
- - user_order_creation_timestamp: btree(`owner`, `creation_timestamp`)
+ - user_order_creation_timestamp: btree(`owner`, `creation_timestamp` DESC)
Description
Simply fetching the recently updated/added orders for updating the solvable orders cache is surprisingly slow. On mainnet this takes anywhere between 250ms and 3s (resolution of the time buckets is not ideal).
This query basically has the shape
SELECT fields FROM orders WHERE cancellation_timestamp > $1 OR creation_timestamp > $1 OR uid = ANY($2).
When running the query with EXPLAIN ANALYZE it became apparent that creation_timestamp > $1 is very fast but cancellation_timestamp > $1 is not. It turned out that only one index used cancellation_timestamp, and it was the composite index btree(creation_timestamp, cancellation_timestamp). Because creation_timestamp is the leading column, this index can be used to search orders by creation_timestamp very efficiently, but not by cancellation_timestamp alone.
When I added an index btree(cancellation_timestamp) the query time dropped from ~150ms to ~2ms. It's not completely clear how this translates to the prod environment, since my test DB ran the original query significantly faster than the prod replica, but a speed-up of 75x will certainly not hurt.
Changes
How to test
manual tests on a cloned prod db
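For reviewers without access to a cloned prod DB, the index behaviour from the description can be reproduced in miniature. The sketch below is only a rough stand-in: it uses Python's built-in sqlite3 module instead of PostgreSQL, a heavily simplified orders table whose column types are assumptions, and a hypothetical query_plan helper. Only the two index names come from this PR. SQLite's planner and plan output differ from PostgreSQL's EXPLAIN ANALYZE, so this illustrates the leading-column limitation of a composite index and says nothing about the timings reported above.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Hypothetical helper: return the detail strings of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable plan description is the last column of each row.
    return [row[-1] for row in rows]


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Simplified stand-in for the real orders table (column types assumed).
    CREATE TABLE orders (
        uid BLOB PRIMARY KEY,
        owner BLOB,
        creation_timestamp INTEGER,
        cancellation_timestamp INTEGER
    );
    -- The composite index that existed before this PR.
    CREATE INDEX order_creation_cancellation
        ON orders (creation_timestamp, cancellation_timestamp);
    """
)

by_creation = "SELECT * FROM orders WHERE creation_timestamp > ?"
by_cancellation = "SELECT * FROM orders WHERE cancellation_timestamp > ?"

# Filtering on the leading column can use the composite index.
plan_creation = query_plan(conn, by_creation, (0,))
# Filtering on the second column alone cannot search the composite index.
plan_before = query_plan(conn, by_cancellation, (0,))

# The index added by this PR.
conn.execute(
    "CREATE INDEX order_cancellation_timestamp ON orders (cancellation_timestamp)"
)
plan_after = query_plan(conn, by_cancellation, (0,))

print("creation filter:            ", plan_creation)
print("cancellation filter, before:", plan_before)
print("cancellation filter, after: ", plan_after)
```

On a real PostgreSQL instance the equivalent check is to run the original query under EXPLAIN ANALYZE before and after the migration, as was done for this PR.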