Contributor

@phacops phacops commented Nov 24, 2025

This will allow us to return array values stored in attributes_array when fetching a trace. Support for returning arrays containing different types was added to sentry-protos (getsentry/sentry-protos#153).

One caveat: clickhouse-driver doesn't currently support the JSON or Variant column types. The workaround for now is to encode the value as JSON and send it back to Snuba as a string.
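To illustrate the workaround, here is a minimal sketch of decoding such a string on the Snuba side. The helper name `decode_attributes_array` and the payload shape (attribute name mapped to a list of single-key `{type: value}` objects) are assumptions for illustration, not taken from this PR.

```python
import json
from typing import Any


def decode_attributes_array(raw: str) -> dict[str, list[Any]]:
    """Decode a JSON-encoded column that arrived as a plain string.

    Hypothetical helper: the payload shape is assumed, not confirmed.
    """
    decoded = json.loads(raw)
    if not isinstance(decoded, dict):
        raise ValueError(f"expected a JSON object, got {type(decoded).__name__}")
    return decoded


# Assumed example payload: one attribute holding a mixed-type array.
raw = '{"tags": [{"String": "a"}, {"Int": "1"}]}'
arrays = decode_attributes_array(raw)
```

Once the driver supports JSON or Variant natively, a step like this would presumably no longer be needed.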

There is an open PR that adds support, but it hasn't seen much activity. Perhaps we should fork clickhouse-driver and add support for the JSON type as a mid-term solution, then upstream the result.

@phacops phacops requested review from a team as code owners November 24, 2025 21:12
Comment on lines 488 to 492
attributes_array = _process_arrays(arrays)
for key, value in attributes_array.items():
    add_attribute(k, v)

item = GetTraceResponse.Item(

This comment was marked as outdated.

Member

This is a real bug. Did this come up in tests?

Contributor Author

No, it was due to a late refactor before I committed (renaming k/v to key/value), and I didn't run the tests in between.

Contributor Author

phacops commented Nov 27, 2025

Sentry needed to have some compatibility work done: getsentry/sentry#104063

@phacops phacops requested a review from volokluev November 27, 2025 00:14
Comment on lines +470 to +480
def _transform_array_value(value: dict[str, str]) -> Any:
    for t, v in value.items():
        if t == "Int":
            return int(v)
        if t == "Bool":
            return bool(v)
        if t == "Double":
            return float(v)
        if t == "String":
            return v
    raise BadSnubaRPCRequestException(f"array value type unknown: {type(v)}")

Bug: Incorrect boolean parsing in _transform_array_value converts string "false" to True.
Severity: CRITICAL | Confidence: High

🔍 Detailed Analysis

The _transform_array_value function incorrectly parses boolean values. When v is a string like "false" or "0" (from ClickHouse's toJSONString output), bool(v) evaluates to True in Python. This leads to data corruption where False boolean attributes stored in arrays are incorrectly converted to True when retrieved via the GetTraceRequest endpoint, impacting logical correctness of trace data.

💡 Suggested Fix

Modify _transform_array_value to explicitly parse boolean strings. For example, return v.lower() in ("true", "1", "yes") should be used instead of bool(v) to correctly interpret string representations of boolean values.
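A minimal sketch of the explicit parsing suggested above. The helper name `parse_bool` and the accepted spellings are assumptions. Unlike the one-liner in the suggestion, this version rejects unrecognized input instead of silently mapping it to `False`.

```python
_TRUE_STRINGS = {"true", "1"}
_FALSE_STRINGS = {"false", "0"}


def parse_bool(v: str) -> bool:
    """Parse a boolean from its string form (hypothetical helper)."""
    normalized = v.strip().lower()
    if normalized in _TRUE_STRINGS:
        return True
    if normalized in _FALSE_STRINGS:
        return False
    raise ValueError(f"not a boolean string: {v!r}")


# bool() only checks for emptiness, so any non-empty string is truthy.
naive = bool("false")
parsed = parse_bool("false")
```

Here `naive` is `True` while `parsed` is `False`, which is the discrepancy the report describes.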

Location: snuba/web/rpc/v1/endpoint_get_trace.py#L470-L480

Potential issue: The `_transform_array_value` function incorrectly parses boolean
values. When `v` is a string like "false" or "0" (from ClickHouse's `toJSONString`
output), `bool(v)` evaluates to `True` in Python. This leads to data corruption where
`False` boolean attributes stored in arrays are incorrectly converted to `True` when
retrieved via the `GetTraceRequest` endpoint, impacting logical correctness of trace
data.

Did we get this right? 👍 / 👎 to inform future reviews.
Reference ID: 4330143

for row in data:
    id = row.pop("id")
    ts = row.pop("timestamp")
    arrays = row.pop("attributes_array", "{}")

Bug: Crash when attributes_array is None due to json.loads(None).
Severity: CRITICAL | Confidence: High

🔍 Detailed Analysis

A TypeError occurs when attributes_array is None. The row.pop("attributes_array", "{}") call at line 506 returns None if the key exists with a None value (e.g., from ClickHouse toJSONString(NULL)). This None is subsequently passed to json.loads() in _process_arrays, causing a TypeError: the JSON object must be str, bytes or bytearray, not NoneType. This crash is triggered when trace queries do not specify explicit attributes and ClickHouse returns NULL for attributes_array.

💡 Suggested Fix

Handle None values returned by row.pop() for attributes_array. Ensure that json.loads() receives a string. A possible fix is arrays = row.pop("attributes_array", None) or "{}" or explicitly checking for None before calling json.loads().
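A short sketch of the `or "{}"` variant, assuming rows arrive as plain dicts. The wrapper name `load_arrays` is hypothetical. The default passed to `dict.pop` only applies when the key is missing, so a key that is present with a `None` value needs the extra fallback.

```python
import json
from typing import Any


def load_arrays(row: dict[str, Any]) -> dict[str, Any]:
    """Pop and decode attributes_array, tolerating a missing or None value."""
    # `or "{}"` covers both a missing key and a key holding None.
    raw = row.pop("attributes_array", None) or "{}"
    return json.loads(raw)


# The pop default alone does not help when the key exists with None.
still_none = {"attributes_array": None}.pop("attributes_array", "{}")
safe = load_arrays({"attributes_array": None})
```

Note that `or` also replaces an empty string with `"{}"`. That seems harmless here, since `json.loads("")` would raise anyway.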

Location: snuba/web/rpc/v1/endpoint_get_trace.py#L506

Potential issue: A `TypeError` occurs when `attributes_array` is `None`. The
`row.pop("attributes_array", "{}")` call at line 506 returns `None` if the key exists
with a `None` value (e.g., from ClickHouse `toJSONString(NULL)`). This `None` is
subsequently passed to `json.loads()` in `_process_arrays`, causing a `TypeError: the
JSON object must be str, bytes or bytearray, not NoneType`. This crash is triggered when
trace queries do not specify explicit attributes and ClickHouse returns `NULL` for
`attributes_array`.

Did we get this right? 👍 / 👎 to inform future reviews.
Reference ID: 4330143
