SNOW-3347488: reduce describe query by reusing parent attributes for cte optimization on and reduce desc query on #4175

Draft
sfc-gh-aling wants to merge 13 commits into main from fix/infer-alias-types-from-parent

Conversation

@sfc-gh-aling
Contributor

@sfc-gh-aling sfc-gh-aling commented Apr 14, 2026

  1. Which Jira issue is this PR addressing? Make sure that there is an accompanying issue to your PR.

    Fixes SNOW-3347488

  2. Fill out the following pre-review checklist:

    • I am adding a new automated test(s) to verify correctness of my new code
      • If this test skips Local Testing mode, I'm requesting review from @snowflakedb/local-testing
    • I am adding new logging messages
    • I am adding a new telemetry message
    • I am adding new credentials
    • I am adding a new dependency
    • If this is a new feature/behavior, I'm adding the Local Testing parity changes.
    • I acknowledge that I have ensured my changes to be thread-safe. Follow the link for more information: Thread-safe Developer Guidelines
    • If adding any arguments to public Snowpark APIs or creating new public Snowpark APIs, I acknowledge that I have ensured my changes include AST support. Follow the link for more information: AST Support Guidelines
  3. Please describe how your code solves the related issue.

Change 1: General reduce-describe optimization
What: Enhanced _extract_inferable_attribute_names to accept from_attributes and to resolve the type of each Alias(Attribute(...)) expression by looking up the aliased column's name among the parent plan's known types. Also enhanced _extract_selectable_attributes to pass from_attributes downstream. This is a general improvement to metadata inference; it applies whether or not CTE optimization is enabled.
Files: metadata_utils.py only
Effect: When reduce_describe_query_enabled is ON, more projections can be resolved locally without a DESCRIBE, because Alias(Attribute) expressions now inherit types from the FROM clause.
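
The lookup described in Change 1 can be sketched roughly as follows. This is a minimal illustration, not the code in metadata_utils.py: the Attribute and Alias classes are simplified stand-ins for the Snowpark expression classes, data types are plain strings, and infer_projection_types is a hypothetical name. It ignores details such as quoted identifiers and nullability.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Attribute:
    """Stand-in for a column reference; datatype is None when unknown."""
    name: str
    datatype: Optional[str] = None


@dataclass(frozen=True)
class Alias:
    """Stand-in for `child AS name`."""
    child: object
    name: str


def infer_projection_types(
    projection: List[object], from_attributes: List[Attribute]
) -> Optional[List[Attribute]]:
    """Resolve output attributes locally, or return None if a DESCRIBE is needed.

    Hypothetical helper: only plain column references and
    Alias(Attribute(...)) expressions are handled.
    """
    # Types already known for the parent (FROM clause) plan, keyed by column name.
    known = {attr.name: attr.datatype for attr in from_attributes}
    resolved = []
    for expr in projection:
        if isinstance(expr, Attribute):
            source_name, output_name = expr.name, expr.name
        elif isinstance(expr, Alias) and isinstance(expr.child, Attribute):
            # The alias renames the column but keeps the parent's type.
            source_name, output_name = expr.child.name, expr.name
        else:
            return None  # anything else is not inferable in this sketch
        datatype = known.get(source_name)
        if datatype is None:
            return None  # parent type unknown, so fall back to DESCRIBE
        resolved.append(Attribute(output_name, datatype))
    return resolved


parent = [Attribute("A", "LongType"), Attribute("B", "StringType")]
projection = [Alias(Attribute("A"), "X"), Attribute("B")]
print(infer_projection_types(projection, parent))
```

Returning None for the whole projection as soon as one expression cannot be resolved keeps the fallback simple: the caller either has a complete attribute list or issues the DESCRIBE as before.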

Change 2: CTE-specific describe reduction
What:
    • Added Alias(Cast(...)) inference using Cast.to for scalar types (structured types are skipped to avoid Snowflake type promotion issues)
    • Added try_infer_attributes_from_flattened_projection and replaced the unconditional _attributes = None in SelectStatement.select() flattening with an attempt to re-derive the attributes from the flattened projection
    • Added unit tests (test_metadata_utils.py) and integ tests (test_reduce_describe_query.py)
Files: metadata_utils.py, select_statement.py, plus new test files
Effect: When CTE optimization is ON, the select() flattening no longer unconditionally discards cached attributes. Combined with Cast inference, this eliminates the extra describe query for self-joins on createDataFrame tables.
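
A rough sketch of the Change 2 idea, under stated assumptions: Cast, Alias, and Attribute below are simplified stand-ins rather than the Snowpark classes, types are plain strings, the STRUCTURED_TYPES set and the FlattenedSelect class are hypothetical, and try_infer_from_casts is an illustrative name, not the try_infer_attributes_from_flattened_projection added in this PR.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumption for this sketch: these type names count as "structured" and are
# never inferred locally, since the server may promote or reshape them.
STRUCTURED_TYPES = {"ArrayType", "MapType", "StructType"}


@dataclass(frozen=True)
class Attribute:
    name: str
    datatype: str


@dataclass(frozen=True)
class Cast:
    child: object
    to: str  # target type of the cast


@dataclass(frozen=True)
class Alias:
    child: object
    name: str


def try_infer_from_casts(projection: List[object]) -> Optional[List[Attribute]]:
    """Infer output attributes from Alias(Cast(...)) expressions, or return None."""
    resolved = []
    for expr in projection:
        if not (isinstance(expr, Alias) and isinstance(expr.child, Cast)):
            return None  # not a pattern this sketch understands
        if expr.child.to in STRUCTURED_TYPES:
            return None  # skip structured types and let DESCRIBE decide
        # The cast target is the output type; the alias gives the output name.
        resolved.append(Attribute(expr.name, expr.child.to))
    return resolved


class FlattenedSelect:
    """Hypothetical stand-in for a flattened select with an attribute cache."""

    def __init__(self, projection: List[object]) -> None:
        self.projection = projection
        # Instead of always resetting the cache to None, try to re-derive it.
        self._attributes = try_infer_from_casts(projection)

    @property
    def needs_describe(self) -> bool:
        return self._attributes is None


scalar = FlattenedSelect([Alias(Cast("col1", "LongType"), "A")])
structured = FlattenedSelect([Alias(Cast("col1", "ArrayType"), "A")])
print(scalar.needs_describe, structured.needs_describe)
```

The cache is only populated when every projected expression is inferable; otherwise it stays None and the existing DESCRIBE path is taken, which matches the previous behavior.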

The two changes are complementary: Change 1 handles the Alias(Attribute) case (column pass-through), and Change 2 handles the Alias(Cast) case (typed columns from createDataFrame with a schema). Together they cover the two most common projection patterns that trigger unnecessary describe queries.

@github-actions

Thank you for your submission, we really appreciate it. Like many open-source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution. You can sign the CLA by posting a pull request comment in the format below.


I have read the CLA Document and I hereby sign the CLA


You can retrigger this bot by commenting recheck in this Pull Request. Posted by the CLA Assistant Lite bot.

@codecov-commenter

codecov-commenter commented Apr 16, 2026

Codecov Report

❌ Patch coverage is 95.23810% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 92.27%. Comparing base (c5362e4) to head (d281d20).

Files with missing lines Patch % Lines
...lake/snowpark/_internal/analyzer/metadata_utils.py 94.87% 0 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4175      +/-   ##
==========================================
- Coverage   95.42%   92.27%   -3.16%     
==========================================
  Files         171      171              
  Lines       43615    43644      +29     
  Branches     7459     7470      +11     
==========================================
- Hits        41620    40271    -1349     
- Misses       1220     2504    +1284     
- Partials      775      869      +94     

@sfc-gh-aling sfc-gh-aling changed the title from "reuse attributes" to "SNOW-3347488: reduce describe query by reusing parent attributes for cte optimization on and reduce desc query on" Apr 17, 2026
@sfc-gh-aling sfc-gh-aling marked this pull request as ready for review April 17, 2026 22:16
@sfc-gh-aling sfc-gh-aling requested review from a team as code owners April 17, 2026 22:16
@sfc-gh-aling sfc-gh-aling marked this pull request as draft April 18, 2026 00:01