-
Notifications
You must be signed in to change notification settings - Fork 1
feat(executor, gw): error propagation #299
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Summary of Changes
Hello @ardatan, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request re-introduces previously reverted changes to enhance GraphQL error propagation. It focuses on improving the accuracy and context of errors returned from federated queries by normalizing error paths and enriching error details from downstream services, ensuring a more robust and user-friendly error handling experience.
Highlights
- Re-introduction of Error Propagation: This pull request re-introduces and refines the error propagation logic within the query plan executor, specifically addressing how errors are handled and presented in federated GraphQL queries.
- Enhanced Error Path Normalization: A new
error_normalization
module has been added to accurately map internal_entities
error paths (used during federated entity fetches) to their corresponding user-facing GraphQL query paths, providing more meaningful error locations. - Improved Downstream Error Context: Errors originating from downstream subgraphs are now automatically enriched with
code: "DOWNSTREAM_SERVICE_ERROR"
andserviceName
extensions, offering clearer insights into the source of the error. - Refactored Data Traversal and Merging: The core
traverse_and_callback
utility function has been updated to track and provideVecDeque<usize>
(path indexes) during data traversal. This enables a more precise and direct merging of fetched data into the main result, replacing the previous index-based approach. - Comprehensive Testing: New test fixtures and an integration test have been added to thoroughly validate the end-to-end error propagation and normalization behavior in a federated GraphQL environment.
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command>
or @gemini-code-assist <command>
. Below is a summary of the supported commands.
Feature | Command | Description |
---|---|---|
Code Review | /gemini review |
Performs a code review for the current pull request in its current state. |
Pull Request Summary | /gemini summary |
Provides a summary of the current pull request in its current state. |
Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
Help | /gemini help |
Displays a list of available commands. |
Customization
To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/
folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments or fill out our survey to provide feedback.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes
-
Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request re-introduces error propagation logic by reverting a previous revert. The core of the changes is to correctly propagate error paths from subgraph responses (especially for _entities
queries) back to the client in the context of the original query. This is achieved by tracking index paths during data traversal and using this information to normalize error paths.
The implementation is mostly solid, but I've identified one critical bug that would lead to incorrect error paths, one high-severity risk of a panic, and a medium-severity opportunity for code simplification. Addressing these points will improve the robustness and maintainability of the new error propagation mechanism.
Once these issues are addressed, the changes will be in a good state to be merged.
let mut result = fetch_node | ||
.execute_for_projected_representations( | ||
execution_context, | ||
filtered_representations, | ||
BTreeSet::new(), | ||
vec![], | ||
) | ||
.await; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In FlattenNode::execute
, execute_for_projected_representations
is called with an empty vector (vec![]
) for indexes_in_paths
. This is a bug. These paths are essential for normalize_errors_for_representations
to correctly map error paths from the subgraph back to the original query's structure.
Without passing the collected paths, any errors occurring in this fetch will have incorrect or missing path information in the final GraphQL response. The indexes_in_paths
collected during the traverse_and_callback
must be passed to this function.
let paths_for_fetch: Vec<VecDeque<usize>> =
representations.iter().map(|(_, path)| path.clone()).collect();
let mut result = fetch_node
.execute_for_projected_representations(
execution_context,
filtered_representations,
paths_for_fetch,
)
.await;
for path_segment in path.iter() { | ||
match path_segment { | ||
FlattenNodePathSegment::List => { | ||
let index = indexes_in_path.pop_front().unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using .unwrap()
here poses a risk of panicking the gateway if the invariant (that indexes_in_path
has an element for each List
segment in the path) is ever broken. While this might be an unlikely scenario, a panic in the gateway is a serious issue.
To make the code more robust and to aid in debugging if this invariant is ever violated, it would be better to use .expect()
with a descriptive message. This clearly documents the assumption and provides more context in case of a panic.
let index = indexes_in_path.pop_front().unwrap(); | |
let index = indexes_in_path.pop_front().expect("BUG: Mismatched path segments and indexes"); |
match error.extensions { | ||
Some(ref mut extensions) => { | ||
if extensions.get("code").is_none() { | ||
extensions.insert( | ||
"code".to_string(), | ||
serde_json::Value::String( | ||
"DOWNSTREAM_SERVICE_ERROR".to_string(), | ||
), | ||
); | ||
} | ||
if extensions.get("serviceName").is_none() { | ||
extensions.insert( | ||
"serviceName".to_string(), | ||
serde_json::Value::String(subgraph_name.to_string()), | ||
); | ||
} | ||
} | ||
None => { | ||
error.extensions = Some(HashMap::from([ | ||
( | ||
"code".to_string(), | ||
serde_json::Value::String( | ||
"DOWNSTREAM_SERVICE_ERROR".to_string(), | ||
), | ||
), | ||
( | ||
"serviceName".to_string(), | ||
serde_json::Value::String(subgraph_name.to_string()), | ||
), | ||
])); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic for adding default extensions to GraphQL errors is a bit verbose and can be simplified using more idiomatic Rust. By using get_or_insert_with
and the entry
API on HashMap
, you can make the code more concise and easier to read, avoiding the explicit match
on error.extensions
.
let extensions = error.extensions.get_or_insert_with(HashMap::new);
extensions.entry("code".to_string()).or_insert(
serde_json::Value::String("DOWNSTREAM_SERVICE_ERROR".to_string()),
);
extensions
.entry("serviceName".to_string())
.or_insert(serde_json::Value::String(subgraph_name.to_string()));
✅
|
let normalized_path = flatten_node.path.as_slice(); | ||
filtered_representations.push(b'['); | ||
let mut number_of_indexes = 0; | ||
let normalized_path = flatten_node.path.as_slice(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let number_of_indexes = flatten_node.path.as_slice().iter().filter(|v| matches!(v, FlattenNodePathSegment::List)).count()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm not sure if this is optimal enough. With this usage, it will create a new set of path segments then count it, no? But I avoid to create a new set/vec and count them directly with a single iteration.
lib/query-plan-executor/src/lib.rs
Outdated
None, | ||
) | ||
} | ||
.unwrap_or(false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This unwrap_or
hides an error inside?
I mean, if this fail do we need some printing to the log? what's the scenario where an error is returned from here, and why the error is not relevant and dropped in favor of a default value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a warn!
to print the error message when it happens. It shouldn't happen since we write to a string buffer but just in case it fails.
index_of_entities += 1; | ||
ParallelJob::Flatten(mut result, path) => { | ||
if let Some(entities) = result.entities { | ||
'entity_loop: for (entity, indexes_in_path) in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
confirming this part for now, but with the new refactor i wonder if this section of code can be more readable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left comments why they are different.
let requires_nodes = fetch_node.requires.as_ref().unwrap(); | ||
filtered_representations.push(b'['); | ||
let mut first = true; | ||
let normalized_path = self.path.as_slice(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same suggestion as before, it can be simplified.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left comments for the one above.
Federation Audit Results177 tests - 12 177 ✅ - 12 3s ⏱️ ±0s For more details on these parsing errors, see this check. Results for commit 1287c74. ± Comparison against base commit f594210. This pull request removes 12 tests.
♻️ This comment has been updated with latest results. |
e3909e7
to
1287c74
Compare
Re-done in #339 |
Reverts #298