-
Notifications
You must be signed in to change notification settings - Fork 735
[YQ-4903] Watermarks: YQL: new watermark options #29627
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
⚪ ⚪ Ya make output | Test bloat | Test bloat
🟢
*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation |
|
⚪
🟢
*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation |
|
🟢 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR adds new watermark configuration options to YQL streaming queries, enabling finer control over watermark behavior. The implementation includes granularity, idle timeout, and late arrival policy settings that can be specified both at the PRAGMA level and per-table in the WITH clause.
- Added support for
WATERMARK_GRANULARITYandWATERMARK_IDLE_TIMEOUTtable-level settings - Added support for
WATERMARK_ADJUST_LATE_EVENTSandWATERMARK_DROP_LATE_EVENTSflags for controlling late event handling - Refactored settings processing to use
TCoNameValueTupleListinstead ofTExprListfor better type safety
Reviewed changes
Copilot reviewed 27 out of 27 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| watermarks_drop.sql, watermarks_as.sql, watermarks_adjust.sql, watermarks.sql | Test SQL files demonstrating new watermark options usage |
| yql_pq_topic_key_parser.h/cpp | Added getter methods and parsing logic for new watermark settings |
| yql_pq_dq_integration.cpp | Main implementation of watermark settings validation and conversion, including mutual exclusivity checks for late event policies |
| yql_pq_datasource.cpp | Updated to pass new watermark settings through the query pipeline |
| yql_pq_datasource_type_ann.cpp | Added type annotation validation for settings tuple |
| yql_pq_expr_nodes.json | Changed Settings field type from TExprList to TCoNameValueTupleList |
| yql_names.h | Added WatermarksLateEventsPolicySetting constant |
| ya.make | Reorganized PEERDIR dependencies (formatting change) |
| canondata/*.txt | Expected test output files for the new test cases |
| result.json | Test registry updates for new test cases |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| } | ||
| if (!watermarksLateEventsPolicy.Empty()) { | ||
| ctx.AddError(TIssue(ctx.GetPosition(pqReadTopic.Pos()), | ||
| TStringBuilder() << "Cannot adjust and " << *watermarksLateEventsPolicy << " late events at the same time")); |
Copilot
AI
Nov 27, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error message uses inconsistent verb forms. When watermarksLateEventsPolicy is "drop", the message reads "Cannot adjust and drop late events", but when it's "adjust", it reads "Cannot drop and adjust late events". Consider using a consistent format like "Cannot use both adjust and drop policies for late events" or "WATERMARK_ADJUST_LATE_EVENTS and WATERMARK_DROP_LATE_EVENTS are mutually exclusive".
| } | ||
| if (!watermarksLateEventsPolicy.Empty()) { | ||
| ctx.AddError(TIssue(ctx.GetPosition(pqReadTopic.Pos()), | ||
| TStringBuilder() << "Cannot drop and " << *watermarksLateEventsPolicy << " late events at the same time")); |
Copilot
AI
Nov 27, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error message uses inconsistent verb forms. When watermarksLateEventsPolicy is "adjust", the message reads "Cannot drop and adjust late events", but when it's "drop", it reads "Cannot adjust and drop late events". Consider using a consistent format like "Cannot use both drop and adjust policies for late events" or "WATERMARK_ADJUST_LATE_EVENTS and WATERMARK_DROP_LATE_EVENTS are mutually exclusive".
| .GetOrElse(TDqSettings::TDefault::WatermarksLateArrivalDelayMs)); | ||
| Add(props, WatermarksLateArrivalDelayUsSetting, ToString(lateArrivalDelay.MicroSeconds()), pos, ctx); | ||
|
|
||
| const auto lateEventsPolicy = watermarksLateEventsPolicy |
Copilot
AI
Nov 27, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default late events policy of "adjust" is implicit and undocumented. Consider adding a comment explaining why "adjust" is the default behavior, especially since this is a breaking change from the previous behavior where this setting didn't exist.
| const auto lateEventsPolicy = watermarksLateEventsPolicy | |
| const auto lateEventsPolicy = watermarksLateEventsPolicy | |
| // Default late events policy is set to "adjust" for backward compatibility. | |
| // This is a breaking change from previous behavior where this setting did not exist. | |
| // See relevant documentation or design notes for further details. |
Changelog entry
Add new watermark options: granularity, idle timeout, late arrival policy
Changelog category
Description for reviewers
...