Skip to content

Conversation

@antons0l
Copy link
Collaborator

No description provided.

@jhheider jhheider requested a review from Copilot August 12, 2025 16:05
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR optimizes the database query for fetching project details by restructuring it to use a Common Table Expression (CTE) with materialized execution. The optimization aims to improve performance by computing expensive aggregations and subqueries once per project rather than multiple times.

Key changes:

  • Restructured the SQL query to use a materialized CTE that computes tea rank, package managers, and dependency counts once per canon
  • Moved the tea rank retrieval logic into subqueries within the CTE to get the latest rank more efficiently
  • Simplified the main query to select from the pre-computed base results

COALESCE(tr.rank,'0') AS "teaRank",
tr.created_at AS "teaRankCalculatedAt",
(
WITH base AS MATERIALIZED (
Copy link

Copilot AI Aug 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The MATERIALIZED keyword may not provide performance benefits for this single-row query (WHERE c.id = $1). For queries returning a single project, materialization overhead might outweigh benefits. Consider testing performance with and without MATERIALIZED.

Suggested change
WITH base AS MATERIALIZED (
WITH base AS (

Copilot uses AI. Check for mistakes.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the use of MATERIALIZED does provide some performance benefits, I have verified it.

Comment on lines 212 to 228
-- latest tea rank
COALESCE((
SELECT tr.rank
FROM tea_ranks tr
WHERE tr.canon_id = c.id
ORDER BY tr.created_at DESC
LIMIT 1
), '0') AS "teaRank",
(
SELECT tr.created_at
FROM tea_ranks tr
WHERE tr.canon_id = c.id
ORDER BY tr.created_at DESC
LIMIT 1
) AS "teaRankCalculatedAt",
-- same packageManagers subquery as before
(
Copy link

Copilot AI Aug 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tea rank subquery is executed twice (lines 213-219 and 220-226) with identical logic. This creates redundant database operations that could be optimized by using a single subquery in a lateral join or window function.

Suggested change
-- latest tea rank
COALESCE((
SELECT tr.rank
FROM tea_ranks tr
WHERE tr.canon_id = c.id
ORDER BY tr.created_at DESC
LIMIT 1
), '0') AS "teaRank",
(
SELECT tr.created_at
FROM tea_ranks tr
WHERE tr.canon_id = c.id
ORDER BY tr.created_at DESC
LIMIT 1
) AS "teaRankCalculatedAt",
-- same packageManagers subquery as before
(
-- latest tea rank and its timestamp via lateral join
COALESCE(tr_latest.rank, '0') AS "teaRank",
tr_latest.created_at AS "teaRankCalculatedAt",
-- same packageManagers subquery as before
FROM canons c
JOIN urls u_homepage ON c.url_id = u_homepage.id
LEFT JOIN LATERAL (
SELECT tr.rank, tr.created_at
FROM tea_ranks tr
WHERE tr.canon_id = c.id
ORDER BY tr.created_at DESC
LIMIT 1
) tr_latest ON TRUE
(

Copilot uses AI. Check for mistakes.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch!

Copy link
Member

@jhheider jhheider left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yay, voodoo!

@antons0l antons0l merged commit 819348c into main Aug 12, 2025
4 checks passed
@antons0l antons0l deleted the optimize-deps-query branch August 12, 2025 16:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants