-
-
Notifications
You must be signed in to change notification settings - Fork 62
feat(eap): Allow arrays to be queried from EAP #7551
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
feat(eap): Allow arrays to be queried from EAP #7551
Conversation
| attributes_array = _process_arrays(arrays) | ||
| for key, value in attributes_array.items(): | ||
| add_attribute(k, v) | ||
|
|
||
| item = GetTraceResponse.Item( |
This comment was marked as outdated.
This comment was marked as outdated.
Sorry, something went wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a real bug, was this come across in tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it was due to a late refactor before I commit (rename k/v to key/value) and didn't run the tests in-between.
|
Sentry needed to have some compatibility work done: getsentry/sentry#104063 |
| def _transform_array_value(value: dict[str, str]) -> Any: | ||
| for t, v in value.items(): | ||
| if t == "Int": | ||
| return int(v) | ||
| if t == "Bool": | ||
| return bool(v) | ||
| if t == "Double": | ||
| return float(v) | ||
| if t == "String": | ||
| return v | ||
| raise BadSnubaRPCRequestException(f"array value type unknown: {type(v)}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Incorrect boolean parsing in _transform_array_value converts string "false" to True.
Severity: CRITICAL | Confidence: High
🔍 Detailed Analysis
The _transform_array_value function incorrectly parses boolean values. When v is a string like "false" or "0" (from ClickHouse's toJSONString output), bool(v) evaluates to True in Python. This leads to data corruption where False boolean attributes stored in arrays are incorrectly converted to True when retrieved via the GetTraceRequest endpoint, impacting logical correctness of trace data.
💡 Suggested Fix
Modify _transform_array_value to explicitly parse boolean strings. For example, return v.lower() in ("true", "1", "yes") should be used instead of bool(v) to correctly interpret string representations of boolean values.
🤖 Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.
Location: snuba/web/rpc/v1/endpoint_get_trace.py#L470-L480
Potential issue: The `_transform_array_value` function incorrectly parses boolean
values. When `v` is a string like "false" or "0" (from ClickHouse's `toJSONString`
output), `bool(v)` evaluates to `True` in Python. This leads to data corruption where
`False` boolean attributes stored in arrays are incorrectly converted to `True` when
retrieved via the `GetTraceRequest` endpoint, impacting logical correctness of trace
data.
Did we get this right? 👍 / 👎 to inform future reviews.
Reference ID: 4330143
| for row in data: | ||
| id = row.pop("id") | ||
| ts = row.pop("timestamp") | ||
| arrays = row.pop("attributes_array", "{}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Crash when attributes_array is None due to json.loads(None).
Severity: CRITICAL | Confidence: High
🔍 Detailed Analysis
A TypeError occurs when attributes_array is None. The row.pop("attributes_array", "{}") call at line 506 returns None if the key exists with a None value (e.g., from ClickHouse toJSONString(NULL)). This None is subsequently passed to json.loads() in _process_arrays, causing a TypeError: the JSON object must be str, bytes or bytearray, not NoneType. This crash is triggered when trace queries do not specify explicit attributes and ClickHouse returns NULL for attributes_array.
💡 Suggested Fix
Handle None values returned by row.pop() for attributes_array. Ensure that json.loads() receives a string. A possible fix is arrays = row.pop("attributes_array", None) or "{}" or explicitly checking for None before calling json.loads().
🤖 Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.
Location: snuba/web/rpc/v1/endpoint_get_trace.py#L506
Potential issue: A `TypeError` occurs when `attributes_array` is `None`. The
`row.pop("attributes_array", "{}")` call at line 506 returns `None` if the key exists
with a `None` value (e.g., from ClickHouse `toJSONString(NULL)`). This `None` is
subsequently passed to `json.loads()` in `_process_arrays`, causing a `TypeError: the
JSON object must be str, bytes or bytearray, not NoneType`. This crash is triggered when
trace queries do not specify explicit attributes and ClickHouse returns `NULL` for
`attributes_array`.
Did we get this right? 👍 / 👎 to inform future reviews.
Reference ID: 4330143
This will allow us to return array values stored in
attributes_arraywhen fetching a trace. Support to return arrays with containing different types were added tosentry-protos(getsentry/sentry-protos#153).One caveat right now is
clickhouse-driverdoesn't support theJSONorVariantcolumn types. The workaround for this right now is to encode as JSON and send the value back to Snuba as a string.There is a PR opened which adds support but there's not a lot of activity. Perhaps we should fork
clickhouse-driverand add support for theJSONtype for a mid term solution, then upsrtream the result.