Skip to content

Conversation

@omertuc
Copy link
Member

@omertuc omertuc commented Dec 22, 2025

In newer OCP versions we started seeing:

Error: etcd to utf-8

Caused by:
    invalid utf-8 sequence of 1 bytes from index 0
    : exit status 1"

This is due to k8s using a new format for etcd storage of CRs called CBOR:

kubernetes/enhancements#4222

recert was not aware of this format. This PR makes recert aware of it

Summary by CodeRabbit

  • New Features

    • Added CBOR (Concise Binary Object Representation) encoding support for data storage and retrieval.
    • Implemented encoding-aware data handling with automatic format detection and conversion.
  • Bug Fixes

    • Improved error reporting for etcd write operations with more descriptive error messages.
  • Chores

    • Added CBOR serialization library dependencies.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link

coderabbitai bot commented Dec 22, 2025

Walkthrough

This PR adds CBOR (Concise Binary Object Representation) encoding support to the system. It vendors the ciborium library and dependencies, introduces Kubernetes-style self-describing CBOR encoding with JSON conversion, updates the etcd encoding layer to track encoding types (Protobuf, CBOR, JSON), and persists encoding metadata in storage. Error context is also consistently added to etcd PUT operations across multiple modules.

Changes

Cohort / File(s) Summary
Dependency Addition
Cargo.toml
Added ciborium 0.2.2 dependency for CBOR serialization support.
CBOR Encoding Module
src/etcd_encoding/k8s_cbor.rs
New module implementing bidirectional conversion between Kubernetes self-describing CBOR and JSON; includes k8s_cbor_bytes_to_json and json_to_k8s_cbor_bytes functions with comprehensive type mapping and error handling.
Etcd Encoding Infrastructure
src/etcd_encoding.rs
Added internal Encoding enum (Protobuf, Cbor, Json); updated decode signature to return (Vec<u8>, Encoding); updated encode signature to accept Encoding parameter; added CBOR detection and conversion paths alongside existing protobuf/JSON logic.
In-Memory Etcd Storage
src/k8s_etcd.rs
Modified storage to track encoding metadata: etcd_keyvalue_hashmap now stores (Encoding, Vec<u8>) pairs; put method now returns Result<()> and only updates existing keys with preserved encoding; encoding is extracted and used during commit to etcd.
Error Context Improvements
src/cluster_crypto/cert_key_pair.rs, src/cluster_crypto/distributed_jwt.rs, src/cluster_crypto/distributed_private_key.rs, src/cluster_crypto/distributed_public_key.rs, src/ocp_postprocess/encryption_config/etcd_rename.rs, src/ocp_postprocess/hostname_rename/etcd_rename.rs
Added error context .context("putting in etcd")? to etcd PUT operations for enhanced error reporting during failures.
Ciborium IO Library
vendor/ciborium-io/*
Vendored minimal no_std/no_alloc compatible I/O traits crate; provides Read and Write traits with blanket implementations for std and alloc features; includes error types EndOfFile() and OutOfSpace().
Ciborium Low-Level Codec
vendor/ciborium-ll/src/dec.rs, vendor/ciborium-ll/src/enc.rs, vendor/ciborium-ll/src/hdr.rs, vendor/ciborium-ll/src/seg.rs, vendor/ciborium-ll/src/lib.rs
Vendored low-level CBOR parsing/encoding library; implements Decoder, Encoder, Header types with segmented CBOR support; includes major/minor type handling and float normalization.
Ciborium Serde Integration
vendor/ciborium/src/de/mod.rs, vendor/ciborium/src/de/error.rs, vendor/ciborium/src/ser/mod.rs, vendor/ciborium/src/ser/error.rs
Vendored Serde-based CBOR deserializer/serializer; provides from_reader, from_reader_with_buffer, from_reader_with_recursion_limit, into_writer entry points; handles integers, floats, strings, arrays, maps, tags, and custom error types.
Ciborium Value System
vendor/ciborium/src/value/mod.rs, vendor/ciborium/src/value/de.rs, vendor/ciborium/src/value/ser.rs, vendor/ciborium/src/value/canonical.rs, vendor/ciborium/src/value/integer.rs, vendor/ciborium/src/value/error.rs
Vendored dynamic CBOR Value type with comprehensive variant support (Integer, Bytes, Text, Array, Map, Tag, etc.); includes canonical ordering, big integer handling, and serde integration; provides type-checking and safe access methods.
Ciborium Tag Support
vendor/ciborium/src/tag.rs
Vendored CBOR tag handling with Captured<V>, Required<V, TAG>, and Accepted<V, TAG> wrapper types for tag-aware (de)serialization and constraint enforcement.
Ciborium Tests
vendor/ciborium/tests/*.rs
Vendored comprehensive test suite covering canonical ordering, codec round-trips, error handling, recursion limits, UTF-8 handling, no_std support, and CBOR tag behavior.
Ciborium Metadata
vendor/ciborium/Cargo.toml*, vendor/ciborium/*.md, vendor/ciborium/LICENSE, vendor/ciborium-io/Cargo.toml*, vendor/ciborium-io/*.md, vendor/ciborium-io/LICENSE, vendor/ciborium-ll/Cargo.toml*, vendor/ciborium-ll/*.md, vendor/ciborium-ll/LICENSE
Vendored crate metadata, documentation, and licensing for ciborium ecosystem.
Crunchy Vendor
vendor/crunchy/*
Vendored crunchy macro expansion crate; provides compile-time iteration over numeric ranges with configurable limits (0–2048).

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Encoding Layer as Encoding/Decode
    participant K8s CBOR as K8s CBOR Module
    participant Storage as In-Memory Storage
    participant Etcd

    Client->>Encoding Layer: encode(data, encoding=CBOR)
    Encoding Layer->>K8s CBOR: json_to_k8s_cbor_bytes(json)
    K8s CBOR->>K8s CBOR: Convert JSON to CBOR<br/>Apply self-describing tag
    K8s CBOR-->>Encoding Layer: CBOR bytes
    Encoding Layer-->>Client: Encoded bytes

    Client->>Encoding Layer: decode(cbor_bytes)
    Encoding Layer->>K8s CBOR: Detect CBOR tag<br/>k8s_cbor_bytes_to_json(bytes)
    K8s CBOR->>K8s CBOR: Parse CBOR structure<br/>Convert to JSON
    K8s CBOR-->>Encoding Layer: (json_bytes, Encoding::CBOR)
    Encoding Layer-->>Client: (decoded_data, encoding_type)

    Client->>Storage: put(key, value, encoding)
    Storage->>Storage: Store (Encoding, Vec<u8>) pair
    Storage-->>Client: Result<()>

    Client->>Storage: commit_hashmap()
    Storage->>Encoding Layer: encode(value, stored_encoding)
    Encoding Layer-->>Storage: Encoded bytes
    Storage->>Etcd: PUT key with encoded bytes
    Etcd-->>Storage: Success/Error
    Storage-->>Client: Result
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Suggested reviewers

  • mresvanis
🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Add CBOR support' directly and clearly summarizes the main change: introducing CBOR encoding/decoding support to handle Kubernetes CustomResources stored in CBOR format in etcd.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci openshift-ci bot requested review from danmanor and tsorya December 22, 2025 20:53
@openshift-ci
Copy link

openshift-ci bot commented Dec 22, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: omertuc

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@omertuc omertuc marked this pull request as draft December 22, 2025 20:56
@omertuc
Copy link
Member Author

omertuc commented Dec 22, 2025

Still testing

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

🧹 Nitpick comments (4)
src/etcd_encoding/k8s_cbor.rs (1)

42-43: Consider consistent error handling for unsupported types.

Lines 42-43 use unimplemented! which will panic. For consistency with the rest of the file's error handling, consider using bail! to return an error instead, especially since these could be triggered by malformed input data.

Suggested fix
-        CborValue::Tag(value, _tag) => unimplemented!("Unsupported CBOR tag {:?}", value),
-        _ => unimplemented!("Unsupported CBOR type {:?}", cbor),
+        CborValue::Tag(tag_num, _contents) => bail!("Unsupported CBOR tag {}", tag_num),
+        other => bail!("Unsupported CBOR type {:?}", other),
src/etcd_encoding.rs (3)

68-73: Consider deriving additional traits for Encoding.

The enum is simple with no associated data. Deriving Debug, PartialEq, Eq, and Copy would improve ergonomics for debugging and comparisons without overhead.

Proposed change
-#[derive(Clone)]
+#[derive(Clone, Copy, Debug, PartialEq, Eq)]
 pub(crate) enum Encoding {
     Protobuf,
     Cbor,
     Json,
 }

75-86: Unnecessary async marker on decode function.

This function performs no async operations—all work is synchronous (CBOR parsing, JSON serialization, protobuf decoding). The async keyword adds overhead via the state machine transformation and requires callers to .await unnecessarily.

Proposed signature change
-pub(crate) async fn decode(data: &[u8]) -> Result<(Vec<u8>, Encoding)> {
+pub(crate) fn decode(data: &[u8]) -> Result<(Vec<u8>, Encoding)> {

Note: This requires updating all call sites to remove .await.


120-128: Unnecessary async and incomplete encoding parameter usage.

  1. Like decode, this function has no async operations.
  2. The TODO at line 128 notes the encoding parameter is ignored for non-CBOR cases. If Encoding::Json is passed but the kind is a known protobuf type, it will still be encoded as protobuf—potentially not preserving the original format.

If preserving the original encoding is the goal, consider using the encoding parameter to decide the output format:

Proposed approach
 pub(crate) async fn encode(data: &[u8], encoding: Encoding) -> Result<Vec<u8>> {
     let value: Value = serde_json::from_slice(data)?;
 
     if matches!(encoding, Encoding::Cbor) {
         return k8s_cbor::json_to_k8s_cbor_bytes(value).context("converting JSON to CBOR");
     }
 
-    // If kind is a known protobuf kind, write it back as protobuf, otherwise return raw JSON
-    // TODO: Just look at the new encoding param?
+    // If original encoding was JSON, preserve it regardless of kind
+    if matches!(encoding, Encoding::Json) {
+        return Ok(data.to_vec());
+    }
+
+    // Original was Protobuf - encode back to protobuf if kind is known
     let kind = value
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ba4373c and 8f7bc51.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (11)
  • .cargo/config.toml
  • Cargo.toml
  • src/cluster_crypto/cert_key_pair.rs
  • src/cluster_crypto/distributed_jwt.rs
  • src/cluster_crypto/distributed_private_key.rs
  • src/cluster_crypto/distributed_public_key.rs
  • src/etcd_encoding.rs
  • src/etcd_encoding/k8s_cbor.rs
  • src/k8s_etcd.rs
  • src/ocp_postprocess/encryption_config/etcd_rename.rs
  • src/ocp_postprocess/hostname_rename/etcd_rename.rs
💤 Files with no reviewable changes (1)
  • .cargo/config.toml
🧰 Additional context used
🧬 Code graph analysis (2)
src/etcd_encoding.rs (1)
src/etcd_encoding/k8s_cbor.rs (2)
  • k8s_cbor_bytes_to_json (75-92)
  • json_to_k8s_cbor_bytes (94-104)
src/k8s_etcd.rs (1)
src/etcd_encoding.rs (2)
  • encode (120-159)
  • decode (75-118)
🪛 GitHub Actions: Rust
src/cluster_crypto/distributed_jwt.rs

[error] 86-86: cargo fmt --check reported formatting diffs. Run 'cargo fmt' to fix code style issues.

src/ocp_postprocess/hostname_rename/etcd_rename.rs

[error] 227-227: cargo fmt --check reported formatting diffs. Run 'cargo fmt' to fix code style issues.

src/ocp_postprocess/encryption_config/etcd_rename.rs

[error] 177-177: cargo fmt --check reported formatting diffs. Run 'cargo fmt' to fix code style issues.

src/cluster_crypto/distributed_public_key.rs

[error] 129-129: cargo fmt --check reported formatting diffs. Run 'cargo fmt' to fix code style issues.

src/cluster_crypto/cert_key_pair.rs

[error] 357-357: cargo fmt --check reported formatting diffs. Run 'cargo fmt' to fix code style issues.

src/k8s_etcd.rs

[error] 346-346: cargo fmt --check reported formatting diffs. Run 'cargo fmt' to fix code style issues.

src/cluster_crypto/distributed_private_key.rs

[error] 96-96: cargo fmt --check reported formatting diffs. Run 'cargo fmt' to fix code style issues.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Red Hat Konflux / recert-4-22-on-pull-request
🔇 Additional comments (10)
src/etcd_encoding/k8s_cbor.rs (3)

47-72: Asymmetric Text/Bytes handling may affect round-trip fidelity.

cbor_to_json converts both CborValue::Text (line 11) and CborValue::Bytes (line 41) to JsonValue::String, but json_to_cbor always converts JsonValue::String to CborValue::Bytes (line 51). This means CBOR Text values will become Bytes after a round-trip.

If Kubernetes CBOR uses Bytes for string fields (as suggested by the map key handling), this may be intentional. Otherwise, you may want to preserve original types or document this behavior.


12-21: Integer precision loss for values > 2^53.

The fallback to f64 at line 19 may lose precision for integers outside the ±2^53 range. This is a known JSON limitation and is handled with an error context. For typical Kubernetes use cases, this should be fine since values rarely exceed i64 bounds.


94-104: LGTM!

The json_to_k8s_cbor_bytes function correctly wraps the CBOR value with the self-describing tag before serialization, maintaining the expected Kubernetes CBOR format.

Cargo.toml (1)

63-63: ciborium 0.2.2 is an appropriate dependency choice.

The crate is a suitable choice for CBOR encoding/decoding, currently at its latest stable version with no known security vulnerabilities.

src/etcd_encoding.rs (1)

99-117: LGTM!

The protobuf decoding path correctly preserves the per-kind deserialization logic and properly returns Encoding::Protobuf to track the original format.

src/k8s_etcd.rs (5)

26-26: LGTM!

The type change to store (Encoding, Vec<u8>) correctly tracks encoding metadata for round-trip preservation.


104-112: LGTM!

The commit path correctly extracts and propagates the encoding to preserve the original format when writing back to etcd.


186-192: LGTM!

The encoding is correctly discarded during reads since the cached value is already decoded JSON. The encoding is preserved for the commit path.


199-223: LGTM!

The encoding is correctly captured from both the encrypted and non-encrypted decode paths and stored in the cache for later use during commits.


232-244: Breaking change: put now requires key to exist in cache, but all existing callers are safe.

The method signature changed to return Result<()> and now validates that the key exists in the cache before updating it. All 100+ callers throughout the codebase follow a consistent pattern: they fetch the value via get_etcd_json() (which populates the cache), modify it, then call put_etcd_yaml() to persist changes. This ensures the cache invariant is always satisfied in practice.

Comment on lines +78 to +91
let (v, had_self_describing_tag) = match v {
CborValue::Tag(value, contents) => match value {
SELF_DESCRIBING_CBOR_TAG => {
// Self-describing CBOR tag, unwrap the contents
(*contents, true)
}
_ => panic!("Unsupported CBOR tag {}", value),
},
// We expected a self-describing CBOR tag at the root. Of course we could just proceed
// as is (since it's just raw CBOR) but it's a bit fishy, so just bail
_ => bail!("CBOR data that does not start with self-describing tag is not supported"),
};

cbor_to_json(v)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

Use bail! instead of panic! for consistency; remove unused variable.

Line 84 uses panic! while line 88 uses bail!. For consistency and safer error handling, prefer bail!. Also, had_self_describing_tag is set but never used.

Suggested fix
-    let (v, had_self_describing_tag) = match v {
+    let v = match v {
         CborValue::Tag(value, contents) => match value {
             SELF_DESCRIBING_CBOR_TAG => {
                 // Self-describing CBOR tag, unwrap the contents
-                (*contents, true)
+                *contents
             }
-            _ => panic!("Unsupported CBOR tag {}", value),
+            _ => bail!("Unsupported CBOR tag {}", value),
         },
         // We expected a self-describing CBOR tag at the root. Of course we could just proceed
         // as is (since it's just raw CBOR) but it's a bit fishy, so just bail
-        _ => bail!("CBOR data that does not start with self-describing tag is not supported"),
+        _ => bail!("CBOR data without self-describing tag is not supported"),
     };
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
let (v, had_self_describing_tag) = match v {
CborValue::Tag(value, contents) => match value {
SELF_DESCRIBING_CBOR_TAG => {
// Self-describing CBOR tag, unwrap the contents
(*contents, true)
}
_ => panic!("Unsupported CBOR tag {}", value),
},
// We expected a self-describing CBOR tag at the root. Of course we could just proceed
// as is (since it's just raw CBOR) but it's a bit fishy, so just bail
_ => bail!("CBOR data that does not start with self-describing tag is not supported"),
};
cbor_to_json(v)
let v = match v {
CborValue::Tag(value, contents) => match value {
SELF_DESCRIBING_CBOR_TAG => {
// Self-describing CBOR tag, unwrap the contents
*contents
}
_ => bail!("Unsupported CBOR tag {}", value),
},
// We expected a self-describing CBOR tag at the root. Of course we could just proceed
// as is (since it's just raw CBOR) but it's a bit fishy, so just bail
_ => bail!("CBOR data without self-describing tag is not supported"),
};
cbor_to_json(v)
🤖 Prompt for AI Agents
In src/etcd_encoding/k8s_cbor.rs around lines 78 to 91, replace the panic!
branch with a bail! to keep error handling consistent and remove the unused
had_self_describing_tag variable: change the match to return only the unwrapped
contents (no had_self_describing_tag binding) and in the default tag arm call
bail!(...) with a clear error including the tag value; keep the non-tag arm
as-is (bail for missing self-describing tag) and then pass the unwrapped
CborValue to cbor_to_json.

Comment on lines 346 to 351
pub(crate) async fn put_etcd_yaml(client: &InMemoryK8sEtcd, k8slocation: &K8sResourceLocation, value: Value) -> Result<()> {
client
.put(&k8slocation.as_etcd_key(), serde_json::to_string(&value)?.as_bytes().into())
.await;
.await.context("putting in etcd")?;
Ok(())
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Fix formatting to pass CI.

The pipeline reports a cargo fmt failure on this line. Run cargo fmt to fix.

Likely fix
     client
         .put(&k8slocation.as_etcd_key(), serde_json::to_string(&value)?.as_bytes().into())
-        .await.context("putting in etcd")?;
+        .await
+        .context("putting in etcd")?;
     Ok(())
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
pub(crate) async fn put_etcd_yaml(client: &InMemoryK8sEtcd, k8slocation: &K8sResourceLocation, value: Value) -> Result<()> {
client
.put(&k8slocation.as_etcd_key(), serde_json::to_string(&value)?.as_bytes().into())
.await;
.await.context("putting in etcd")?;
Ok(())
pub(crate) async fn put_etcd_yaml(client: &InMemoryK8sEtcd, k8slocation: &K8sResourceLocation, value: Value) -> Result<()> {
client
.put(&k8slocation.as_etcd_key(), serde_json::to_string(&value)?.as_bytes().into())
.await
.context("putting in etcd")?;
Ok(())
}
🧰 Tools
🪛 GitHub Actions: Rust

[error] 346-346: cargo fmt --check reported formatting diffs. Run 'cargo fmt' to fix code style issues.

🤖 Prompt for AI Agents
In src/k8s_etcd.rs around lines 346 to 350, the formatting of the put_etcd_yaml
function does not match rustfmt/cargo fmt expectations and is causing CI to
fail; run cargo fmt (or rustfmt) to reformat the file so the function signature,
chained call and await/context lines conform to project style, then commit the
formatted file so CI passes.

Up until now we would write CBOR values back as etcd, after this commit
we preserve the original encoding, so CBOR values are written back as
CBOR.

This required tracking the encoding type in the in-memory etcd cache,
and modifying the encode function to handle CBOR.
@omertuc omertuc marked this pull request as ready for review January 12, 2026 09:16
@openshift-ci openshift-ci bot requested a review from mresvanis January 12, 2026 09:17
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🤖 Fix all issues with AI agents
In @src/etcd_encoding/k8s_cbor.rs:
- Line 84: The match arm currently panics on unknown CBOR tags ("_ =>
panic!(\"Unsupported CBOR tag {}\", value)"); change this to return an error
instead: replace the panic with an Err containing a descriptive message that
includes value (e.g., Err(format!(...)) wrapped in your crate's error type or
anyhow::Error) and adjust the surrounding function's signature to return
Result<..., E> if needed (update callers). Ensure you use the existing error
type used elsewhere in this module (e.g., EtcdEncodingError / crate::Error or
anyhow) so the error propagates instead of aborting the process.

In @vendor/ciborium/src/tag.rs:
- Around line 111-123: In deserialize_any in impl for tag (the function named
deserialize_any), avoid the unconditional self.tag.unwrap() when matching state
>= 2; instead detect when self.tag is None and return a proper deserialization
error (using the crate's Self::Error/error constructor) or restructure the state
transitions so state never advances to >= 2 for an untagged value; update the
match arm for the default case to check self.tag and either call
visitor.visit_u64(tag) when Some(tag) or return Err(...) with a clear error
indicating an unexpected tuple element for an untagged value.

In @vendor/ciborium/src/value/canonical.rs:
- Around line 62-67: The doc comment for the CanonicalValue struct is truncated;
update the comment above pub struct CanonicalValue(Value) to finish the sentence
by explaining why the wrapper exists and what it guarantees — e.g., state that
since a regular Value does not implement Ord/Eq and CBOR key ordering must
follow RFC 7049/8949, CanonicalValue wraps Value to provide total ordering and
equality for canonical sorting of map keys per those RFC sections. Ensure the
comment references CanonicalValue and Value and briefly mentions the RFCs and
the sorting purpose.

In @vendor/ciborium/src/value/de.rs:
- Around line 161-164: The code currently uses assert_eq!( "@@TAGGED@@", name )
after calling acc.variant(), which will panic on unexpected input; replace this
panic with a proper deserialization error path: check if the returned name
equals "@@TAGGED@@", and if not return an Err using the crate's deserializer
error type (the same error type other deserializers use in this module) instead
of panicking; keep the successful path calling data.tuple_variant(2, Inner)
unchanged so the flow remains acc.variant()? -> validate name ->
data.tuple_variant(2, Inner).

In @vendor/ciborium/src/value/integer.rs:
- Around line 62-68: The i128 branch in canonical_len
(vendor/ciborium/src/value/integer.rs) wrongly returns x.to_be_bytes().len() +
1; instead replicate the serialization in ser/mod.rs: take x.to_be_bytes(),
strip leading zero bytes to form slice, compute the CBOR length-header size for
that slice length (1 byte for the initial additional-info plus the extra integer
byte count when length > 23), then return 1 (tag) + header_size + slice.len();
update the canonical_len path that handles i128/BigPos to use this calculation
so canonical length matches actual encoding.
🧹 Nitpick comments (6)
vendor/ciborium/tests/macro.rs (1)

58-59: Duplicate test cases in vendored code.

Lines 58-59 contain identical test cases. This pattern repeats throughout the file (e.g., lines 72-73, 84-85, 168-169, etc.). Since this is vendored upstream code, these duplicates should be left as-is to maintain compatibility with the original ciborium library, but be aware this adds redundant test execution.

src/etcd_encoding.rs (2)

68-73: Consider adding Debug and PartialEq derives to Encoding.

Adding these derives would improve debuggability and enable assertions in tests.

♻️ Suggested improvement
-#[derive(Clone)]
+#[derive(Clone, Debug, PartialEq)]
 pub(crate) enum Encoding {
     Protobuf,
     Cbor,
     Json,
 }

127-128: Address the TODO comment about simplifying encoding logic.

The comment suggests that the kind check could be replaced by using the encoding parameter directly. This would simplify the logic and make the code more maintainable.

Would you like me to help refactor this to use the encoding parameter consistently, or should this be tracked as a follow-up task?

src/etcd_encoding/k8s_cbor.rs (1)

51-51: Add explicit comment explaining K8s-specific byte string encoding for JSON strings.

Line 51 converts JsonValue::String to CborValue::Bytes (major type 2), while line 12 handles CborValue::Text (major type 3) during decoding. This asymmetry—writing strings as byte strings but accepting both text and byte strings when reading—is intentional for K8s CBOR compatibility. The existing TODO comment (lines 36-40) acknowledges the binary data limitation for secrets, but a more explicit comment at line 51 would clarify why strings are encoded as byte strings rather than CBOR text strings per RFC 8949.

vendor/ciborium/src/tag.rs (1)

74-83: Minor: Error message could be clearer for Accepted type.

Line 80 uses the message "required tag not found", but Accepted rejects a wrong tag (not a missing one). Consider a more precise message like "unexpected tag" or "wrong tag found".

📝 Suggested improvement
             Internal::Tagged(t, v) if t == TAG => Ok(Accepted(v)),
             Internal::Untagged(v) => Ok(Accepted(v)),
-            _ => Err(de::Error::custom("required tag not found")),
+            _ => Err(de::Error::custom("unexpected tag")),
vendor/ciborium/src/value/canonical.rs (1)

9-23: Silent error handling in serialized_canonical_cmp.

Lines 15 and 17 discard serialization errors with let _ = .... If serialization fails for both values, the function compares two empty vectors, returning Equal. While unlikely in practice, this could produce incorrect orderings for malformed values.

Consider at minimum documenting this behavior, or using expect/unwrap if serialization failure is truly unexpected for valid Value instances.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8f7bc51 and 8ba3eb8.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (57)
  • Cargo.toml
  • src/cluster_crypto/cert_key_pair.rs
  • src/cluster_crypto/distributed_jwt.rs
  • src/cluster_crypto/distributed_private_key.rs
  • src/cluster_crypto/distributed_public_key.rs
  • src/etcd_encoding.rs
  • src/etcd_encoding/k8s_cbor.rs
  • src/k8s_etcd.rs
  • src/ocp_postprocess/encryption_config/etcd_rename.rs
  • src/ocp_postprocess/hostname_rename/etcd_rename.rs
  • vendor/ciborium-io/.cargo-checksum.json
  • vendor/ciborium-io/.cargo_vcs_info.json
  • vendor/ciborium-io/Cargo.toml
  • vendor/ciborium-io/Cargo.toml.orig
  • vendor/ciborium-io/LICENSE
  • vendor/ciborium-io/README.md
  • vendor/ciborium-io/src/lib.rs
  • vendor/ciborium-ll/.cargo-checksum.json
  • vendor/ciborium-ll/.cargo_vcs_info.json
  • vendor/ciborium-ll/Cargo.toml
  • vendor/ciborium-ll/Cargo.toml.orig
  • vendor/ciborium-ll/LICENSE
  • vendor/ciborium-ll/README.md
  • vendor/ciborium-ll/src/dec.rs
  • vendor/ciborium-ll/src/enc.rs
  • vendor/ciborium-ll/src/hdr.rs
  • vendor/ciborium-ll/src/lib.rs
  • vendor/ciborium-ll/src/seg.rs
  • vendor/ciborium/.cargo-checksum.json
  • vendor/ciborium/.cargo_vcs_info.json
  • vendor/ciborium/Cargo.toml
  • vendor/ciborium/Cargo.toml.orig
  • vendor/ciborium/LICENSE
  • vendor/ciborium/README.md
  • vendor/ciborium/src/de/error.rs
  • vendor/ciborium/src/de/mod.rs
  • vendor/ciborium/src/lib.rs
  • vendor/ciborium/src/ser/error.rs
  • vendor/ciborium/src/ser/mod.rs
  • vendor/ciborium/src/tag.rs
  • vendor/ciborium/src/value/canonical.rs
  • vendor/ciborium/src/value/de.rs
  • vendor/ciborium/src/value/error.rs
  • vendor/ciborium/src/value/integer.rs
  • vendor/ciborium/src/value/mod.rs
  • vendor/ciborium/src/value/ser.rs
  • vendor/ciborium/tests/canonical.rs
  • vendor/ciborium/tests/codec.rs
  • vendor/ciborium/tests/error.rs
  • vendor/ciborium/tests/fuzz.rs
  • vendor/ciborium/tests/macro.rs
  • vendor/ciborium/tests/no_std.rs
  • vendor/ciborium/tests/recursion.rs
  • vendor/ciborium/tests/tag.rs
  • vendor/crunchy/.cargo-checksum.json
  • vendor/crunchy/Cargo.toml
  • vendor/crunchy/src/lib.rs
✅ Files skipped from review due to trivial changes (13)
  • vendor/ciborium/Cargo.toml.orig
  • vendor/ciborium-ll/.cargo-checksum.json
  • vendor/ciborium-io/.cargo-checksum.json
  • vendor/ciborium-ll/.cargo_vcs_info.json
  • vendor/ciborium/LICENSE
  • vendor/ciborium-io/LICENSE
  • vendor/ciborium-io/.cargo_vcs_info.json
  • vendor/ciborium-io/Cargo.toml
  • vendor/ciborium/README.md
  • vendor/ciborium-ll/Cargo.toml
  • vendor/crunchy/.cargo-checksum.json
  • vendor/ciborium/Cargo.toml
  • vendor/ciborium/.cargo_vcs_info.json
🚧 Files skipped from review as they are similar to previous changes (6)
  • src/ocp_postprocess/hostname_rename/etcd_rename.rs
  • src/ocp_postprocess/encryption_config/etcd_rename.rs
  • src/cluster_crypto/cert_key_pair.rs
  • src/cluster_crypto/distributed_private_key.rs
  • src/cluster_crypto/distributed_public_key.rs
  • src/cluster_crypto/distributed_jwt.rs
🧰 Additional context used
🧬 Code graph analysis (2)
src/k8s_etcd.rs (1)
src/etcd_encoding.rs (2)
  • encode (120-159)
  • decode (75-118)
src/etcd_encoding.rs (1)
src/etcd_encoding/k8s_cbor.rs (2)
  • k8s_cbor_bytes_to_json (75-92)
  • json_to_k8s_cbor_bytes (94-104)
🪛 GitHub Actions: Rust
src/etcd_encoding/k8s_cbor.rs

[error] 84-84: Clippy error: panic! should not be present in production code. Consider avoiding panics in production paths (panic! at src/etcd_encoding/k8s_cbor.rs:84).

🪛 LanguageTool
vendor/ciborium-io/README.md

[style] ~1-~1: Using many exclamation marks might seem excessive (in this case: 3 exclamation marks for a text that’s 1184 characters long)
Context: Workflow Status Average time to resolve an issue [![Percentage of issues still open](https:...

(EN_EXCESSIVE_EXCLAMATION)

vendor/ciborium-ll/README.md

[grammar] ~8-~8: Use a hyphen to join words.
Context: ...ped-brightgreen.svg) # ciborium-ll Low level CBOR parsing tools This crate con...

(QB_NEW_EN_HYPHEN)


[style] ~55-~55: Consider a more concise word here.
Context: ...s.ietf.org/html/rfc7049). ## Decoding In order to decode CBOR, you will create a `Decoder...

(IN_ORDER_TO_PREMIUM)

🔇 Additional comments (64)
vendor/ciborium/.cargo-checksum.json (1)

1-1: LGTM - Standard cargo checksum metadata.

This is auto-generated checksum metadata from cargo vendor ensuring integrity of the vendored ciborium crate files.

vendor/crunchy/Cargo.toml (1)

1-30: LGTM - Vendored crunchy crate manifest.

Standard vendored manifest for the crunchy crate (deterministic loop unroller). This appears to be a transitive dependency of the ciborium CBOR library.

vendor/ciborium-ll/LICENSE (1)

1-201: LGTM - Standard Apache 2.0 license file.

This is the standard Apache License 2.0 text for the vendored ciborium-ll crate, consistent with the license declaration in the crate manifests.

vendor/ciborium-io/Cargo.toml.orig (1)

1-28: LGTM - Vendored ciborium-io manifest snapshot.

Standard vendored manifest for ciborium-io providing simplified Read/Write traits for no_std CBOR operations. Version 0.2.2 is consistent with the main ciborium dependency.

Cargo.toml (1)

63-63: LGTM — ciborium dependency addition for CBOR support.

The ciborium library is appropriate for handling CBOR encoding/decoding. Version 0.2.2 is the latest available version with no known security advisories.

vendor/ciborium-io/README.md (1)

1-22: Vendored library documentation looks appropriate.

This README accurately documents the ciborium-io crate's Read and Write traits for no_std/no_alloc environments. The static analysis hint about exclamation marks is a false positive from badge URL syntax.

vendor/ciborium/tests/fuzz.rs (1)

1-64: Vendored fuzz test implementation is sound.

This is a standard fork-based fuzz test from the ciborium library that stress-tests the CBOR deserializer against random inputs. The IPC pattern correctly closes unused pipe ends in each process, and the parent properly waits for the child to exit.

vendor/ciborium/tests/no_std.rs (1)

1-39: Vendored no_std tests are well-structured.

These tests properly verify CBOR encoding/decoding functionality in a no_std environment with alloc. The conditional compilation (#![cfg(all(feature = "serde", not(feature = "std")))]) correctly targets the intended configuration, and the tests cover both success and error paths.

vendor/ciborium/src/ser/error.rs (1)

1-42: Vendored error type implementation is correct.

The Error<T> type properly implements serde's error traits for serialization. The generic design allows different I/O error types while providing a unified interface. Note: Line 13 has a minor typo ("reaturned" → "returned") in the upstream vendored code, but modifying vendored dependencies is generally avoided.

vendor/ciborium-ll/README.md (1)

1-131: Vendored library documentation is comprehensive.

This README provides excellent documentation for the low-level CBOR parsing crate, including a helpful ASCII diagram of CBOR item anatomy, clear explanations of the Decoder/Encoder workflow, and practical code examples. The static analysis hints are minor stylistic preferences that don't warrant changes to vendored documentation.

vendor/ciborium-ll/Cargo.toml.orig (1)

1-35: Vendored manifest looks correct.

This is a standard Cargo.toml manifest for the vendored ciborium-ll crate. The package metadata, dependencies, and feature configuration are appropriate for a no_std-compatible low-level CBOR codec library.

vendor/ciborium/tests/macro.rs (2)

13-30: LGTM - Helper macros are well-designed.

The map! and arr! macros provide clean abstractions for constructing CBOR Value representations using Value::serialized. The trailing comma handling in map! and the variadic pattern in arr! follow idiomatic Rust macro conventions.


356-358: Test function signature is appropriate.

The test function correctly accepts Value types for both parameters and uses a simple equality assertion for round-trip verification.

vendor/ciborium/src/value/error.rs (2)

12-16: Display delegates to Debug formatting.

The Display implementation outputs Custom("message") rather than just "message". This is acceptable for a vendored library's internal error type, though it results in less user-friendly error messages. Since this is vendored code, leave as-is.


20-31: Serde error trait implementations are correct.

Both serde::de::Error and serde::ser::Error implementations properly construct Custom variants from displayable messages using to_string().

vendor/ciborium/tests/canonical.rs (3)

23-43: LGTM - RFC 8949 canonical ordering test.

The test correctly verifies that sorting a shuffled array of CanonicalValue items produces the expected RFC 8949 canonical order. Using rand::thread_rng() for shuffling provides good test coverage for ordering logic.


45-64: LGTM - Map serialization test with exact byte verification.

The test verifies that a BTreeMap<CanonicalValue, Value> serializes to the exact expected CBOR byte sequence, confirming deterministic canonical map ordering.


94-111: LGTM - Tagged option round-trip test.

The test properly verifies round-trip integrity for both Some and None cases with tagged values using Required::<u64, 0xff>.

vendor/ciborium/tests/tag.rs (2)

11-22: LGTM - Comprehensive tag test matrix.

The test cases cover the three tag wrapper types (Captured, Required, Accepted) with appropriate success/failure expectations for tag presence and value matching scenarios.


42-56: LGTM - Decode verification handles both success and failure paths.

The match arms correctly handle all four cases: expected success with match, unexpected success (panic), expected success with actual error (propagate), and expected error (ignore). The same pattern is consistently applied to both byte and value decoding.

vendor/ciborium/tests/recursion.rs (4)

1-8: LGTM - Important security test documentation.

The module-level documentation clearly explains the purpose of these tests: preventing stack overflows from malicious deeply-nested CBOR input.


14-30: LGTM - Basic recursion limit tests.

Tests for array (0x9f indefinite array prefix) and map (0xbf indefinite map prefix) correctly expect RecursionLimitExceeded for deeply nested containers.


32-48: LGTM - Indefinite bytes/text tests.

Tests for bytes (0x5f) and text (0x7f) correctly expect Io errors rather than RecursionLimitExceeded, since indefinite-length byte/text strings don't cause recursion in the same way as containers.


50-92: LGTM - Parameterized recursion limit boundary tests.

The array_limit and map_limit tests thoroughly verify behavior at recursion boundaries using from_reader_with_recursion_limit. The three cases per limit (full buffer, limit+1, limit) correctly test exceeded, just-exceeded, and within-limit scenarios.

vendor/ciborium/tests/error.rs (3)

10-31: LGTM - Comprehensive error case test matrix.

The test cases cover important CBOR error scenarios including invalid values, indeterminate integers, type mismatches in streaming contexts, and invalid UTF-8 sequences. Each case specifies the expected error type and byte offset.


32-51: LGTM - Error type verification logic.

The error mapping approach using tuples ("syntax", Some(offset), None) enables precise verification of both error type and position. Using panic! for unexpected error variants is appropriate since the test cases only expect Syntax or Semantic errors.


53-59: LGTM - Long UTF-8 round-trip test.

This test verifies correct handling of long UTF-8 strings by using the 3-byte Katakana character 'ボ' repeated 2000 times (6000 bytes total). This is important coverage for ensuring CBOR string encoding/decoding handles multi-byte UTF-8 correctly without truncation or corruption.

vendor/ciborium/src/value/de.rs (2)

241-251: The unreachable!() is appropriate here.

The Integer type's internal representation guarantees that one of the try_from conversions will succeed since Integer stores an i128 that fits within the CBOR integer range. This defensive unreachable is acceptable.


614-619: LGTM!

The deserialized helper method provides a clean public API for converting a Value into any deserializable type.

vendor/ciborium/src/value/ser.rs (3)

34-58: Integer serialization correctly uses smallest representation.

The cascading try_from checks ensure integers are serialized in the smallest possible CBOR representation, which aligns with the library's design goals documented in lib.rs. The unreachable!() is safe since Integer guarantees one conversion succeeds.


25-32: Float precision preservation is correct.

The check (y as f64).to_bits() == x.to_bits() ensures lossless f32 encoding only when round-tripping preserves exact bit representation.


432-437: LGTM!

The serialized helper provides a clean public API mirroring the deserialized method in de.rs.

src/etcd_encoding.rs (2)

77-82: CBOR detection logic is correct.

The bytes [0xd9, 0xd9, 0xf7] correctly represent CBOR tag 55799 (self-describing CBOR) as per RFC 8949. The conversion to JSON for internal processing is a reasonable approach.


120-125: LGTM!

The CBOR encoding path correctly delegates to json_to_k8s_cbor_bytes with proper error context.

src/etcd_encoding/k8s_cbor.rs (1)

78-82: LGTM!

The unused variable had_self_describing_tag could be removed since it's always true when this code path is reached, but it documents the expected input format which aids readability.

vendor/ciborium/src/lib.rs (2)

95-110: LGTM!

Clean module organization and appropriate re-exports following serde conventions. The public API surface is well-documented.


129-223: LGTM!

The cbor! macro provides a convenient DSL for constructing CBOR values. The recursive macro rules handle nested structures correctly and the error handling via Result<Value, Error> is appropriate.

vendor/ciborium/src/ser/mod.rs (3)

69-75: Correct handling of signed integer encoding.

The negative integer encoding using XOR with !0 correctly implements CBOR's negative integer representation where the encoded value is -1 - n.


78-101: LGTM!

The i128 serialization correctly handles big integers by emitting a CBOR tag (BIGPOS/BIGNEG) followed by the minimal byte representation with leading zeros stripped.


488-499: LGTM!

The into_writer function provides a clean public API for CBOR serialization, properly creating the encoder and delegating to serde.

vendor/ciborium/src/value/integer.rs (2)

71-93: LGTM!

The canonical_cmp implementation correctly handles CBOR's deterministic encoding requirements:

  1. Shorter encodings sort before longer ones
  2. Negative numbers sort after positive numbers of the same length
  3. Negative numbers closer to zero sort lower (inverted comparison)

106-118: LGTM!

The TryFrom<i128> correctly validates that the value can be represented in CBOR's native integer range by checking if the magnitude fits in u64.

vendor/ciborium/src/de/error.rs (1)

1-73: LGTM - Well-structured deserialization error type.

The error enum provides appropriate variants for different failure modes (I/O, syntax, semantic errors, and recursion limit protection). The trait implementations enable proper interoperability with serde and the underlying ciborium_ll error types.

vendor/ciborium-ll/src/lib.rs (2)

1-160: LGTM - Well-structured low-level CBOR library.

The module organization is clean with proper no_std support, clear documentation of CBOR wire format anatomy, and correctly defined constants matching RFC 7049 specifications. The public API surface is appropriately minimal with internal types kept private.


218-486: Comprehensive test coverage for CBOR encoding/decoding.

The test suite covers both leaf values (integers, floats, simple values, tags) and structured nodes (arrays, maps with definite and indefinite lengths). The tests verify round-trip encoding/decoding correctness.

vendor/ciborium/src/tag.rs (1)

206-222: Intentional stub error type for tag extraction.

The Error type and its minimal implementation are intentionally designed to fail on unsupported operations. This is appropriate for a tag-extraction serializer that only needs to handle unsigned integers.

vendor/ciborium-io/src/lib.rs (2)

1-165: LGTM - Clean I/O abstraction for no_std environments.

The Read and Write traits provide a minimal, zero-cost abstraction that works in no_std and no_alloc contexts. Feature gating is properly implemented with blanket implementations for std::io types and specialized implementations for slices and Vec<u8>.


167-265: Good test coverage for I/O operations.

Tests cover all edge cases including EOF/OOS conditions, single and multiple byte operations, and feature-gated std/alloc variants.

vendor/ciborium-ll/src/enc.rs (2)

1-58: LGTM - Correct header encoding logic.

The push method correctly maps Header variants to wire format, writing the prefix byte (major type + minor encoding) followed by the affix bytes for extended values.


60-127: LGTM - Proper segmented encoding with UTF-8 boundary handling.

The bytes() and text() methods correctly implement CBOR segmented encoding. The text() method includes proper UTF-8 boundary detection (lines 112-114) to ensure each segment is valid UTF-8. The minimum segment sizes (1 for bytes, 4 for text) are enforced as documented.

vendor/ciborium-ll/src/hdr.rs (3)

1-73: LGTM - Comprehensive CBOR header representation.

The Header enum correctly models all CBOR major types per RFC 7049. The documentation clearly explains the negative integer encoding convention (bit-inverted) and the semantics of None for indefinite-length items.


75-116: Correct Title-to-Header conversion with proper error handling.

The conversion handles all major types and minor encodings correctly. Length values are properly converted to usize with overflow protection via try_from.


118-162: Smart float encoding with size optimization.

The Header to Title conversion for floats correctly selects the smallest representation (f16 → f32 → f64) that preserves the exact bit pattern. Using to_bits() comparison ensures correct handling of NaN and signed zero values.

vendor/ciborium/src/value/canonical.rs (2)

34-60: LGTM - Correct RFC 7049/8949 canonical ordering implementation.

The cmp_value function correctly implements canonical CBOR key ordering: shorter serializations sort first, then lexicographic comparison. The integer-specific handling with canonical_cmp ensures proper ordering where (e.g.) 10 < -1 due to serialization length.


69-124: LGTM - CanonicalValue wrapper with proper trait implementations.

The CanonicalValue wrapper correctly implements Ord, PartialOrd, Eq, and PartialEq based on canonical comparison, along with Serialize/Deserialize delegation to the inner Value.

vendor/ciborium/tests/codec.rs (1)

1-438: Vendored test file looks well-structured.

This is a comprehensive test suite for CBOR codec round-trip testing covering primitives, floats (including NaN handling), collections, tagged values, and custom types. The test infrastructure with rstest parameterization and the custom Float wrapper for NaN comparison is appropriate.

vendor/ciborium/src/de/mod.rs (1)

1-875: Vendored CBOR deserializer implementation is well-structured.

This is the ciborium library's Serde-based deserializer with proper handling for CBOR types including bigints, tags, segmented data, and comprehensive error reporting with byte offsets. The recursion limit (default 256) and scratch buffer approach are appropriate for preventing resource exhaustion.

vendor/ciborium-ll/src/dec.rs (1)

1-174: Vendored low-level CBOR decoder is properly implemented.

The decoder provides correct byte-level CBOR parsing with offset tracking for error reporting. The assertions on lines 53 and 99 enforce internal invariants about buffer state, which is appropriate for this low-level library code where callers are expected to follow the documented usage patterns.

src/k8s_etcd.rs (4)

346-351: Good addition of error context to put_etcd_yaml.

The .await.context("putting in etcd")? properly propagates errors with context, improving debuggability.


104-132: Commit flow correctly preserves encoding.

The commit logic properly extracts the encoding from the cached (encoding, value) tuple and uses it when encoding back to etcd storage. This ensures CBOR values are written back as CBOR and Protobuf values as Protobuf.


186-227: Get flow correctly tracks encoding through decode and decrypt paths.

The get() method properly handles the encoding through both the direct decode path (line 199) and the decrypt-then-decode path (line 213), storing the encoding alongside the value in the cache (line 223).


232-243: Behavioral change confirmed: put() now requires key to exist in cache to preserve encoding.

The concern is valid and the design is sound. Verification of all call sites confirms that the invariant consistently holds: get_etcd_json() is always called before put() or put_etcd_yaml(). This pattern is followed uniformly across cluster crypto operations and post-processing functions, ensuring the key is always populated in the cache before any write operation.

The design change is intentional and safe—callers properly initialize the cache by fetching the resource first, which preserves the original encoding through the modify-write cycle.

vendor/ciborium/src/value/mod.rs (1)

1-685: Vendored CBOR Value type is well-implemented.

This provides a comprehensive dynamic CBOR value representation with proper handling for all CBOR types including bigints (u128/i128 with tags 2/3). The accessor methods follow Rust conventions and the documentation includes helpful examples.

vendor/ciborium-ll/src/seg.rs (1)

1-216: Vendored CBOR segmentation handling is correctly implemented.

The segment parsing infrastructure properly handles:

  • UTF-8 text with partial character sequences across chunk boundaries (3-byte buffer is correct for max incomplete UTF-8)
  • Nested indefinite-length CBOR structures via the nested counter
  • Error reporting with byte offsets

The Parser trait design with saved() for preserving partial data across reads is well-suited for streaming CBOR parsing.

Comment on lines +42 to +43
CborValue::Tag(value, _tag) => unimplemented!("Unsupported CBOR tag {:?}", value),
_ => unimplemented!("Unsupported CBOR type {:?}", cbor),
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Replace unimplemented! macros with proper error handling.

These unimplemented!() calls will panic in production. They should return errors with context about the unsupported type.

🔧 Suggested fix
-        CborValue::Tag(value, _tag) => unimplemented!("Unsupported CBOR tag {:?}", value),
-        _ => unimplemented!("Unsupported CBOR type {:?}", cbor),
+        CborValue::Tag(value, _tag) => bail!("Unsupported CBOR tag {:?}", value),

Note: The catch-all _ => unimplemented!(...) on line 43 may be unreachable if all CborValue variants are handled above. If so, consider removing it or replacing with unreachable!() with a comment explaining why.

// Self-describing CBOR tag, unwrap the contents
(*contents, true)
}
_ => panic!("Unsupported CBOR tag {}", value),
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Replace panic! with proper error handling.

The pipeline flagged this panic! as inappropriate for production code. This should return an error instead.

🐛 Proposed fix
-            _ => panic!("Unsupported CBOR tag {}", value),
+            _ => bail!("Unsupported CBOR tag {}", value),
🧰 Tools
🪛 GitHub Actions: Rust

[error] 84-84: Clippy error: panic! should not be present in production code. Consider avoiding panics in production paths (panic! at src/etcd_encoding/k8s_cbor.rs:84).

🤖 Prompt for AI Agents
In @src/etcd_encoding/k8s_cbor.rs at line 84, The match arm currently panics on
unknown CBOR tags ("_ => panic!(\"Unsupported CBOR tag {}\", value)"); change
this to return an error instead: replace the panic with an Err containing a
descriptive message that includes value (e.g., Err(format!(...)) wrapped in your
crate's error type or anyhow::Error) and adjust the surrounding function's
signature to return Result<..., E> if needed (update callers). Ensure you use
the existing error type used elsewhere in this module (e.g., EtcdEncodingError /
crate::Error or anyhow) so the error propagates instead of aborting the process.

Comment on lines +111 to +123
#[inline]
fn deserialize_any<V: de::Visitor<'de>>(self, visitor: V) -> Result<V::Value, Self::Error> {
self.state += 1;

match self.state {
1 => visitor.visit_str(match self.tag {
Some(..) => "@@TAGGED@@",
None => "@@UNTAGGED@@",
}),

_ => visitor.visit_u64(self.tag.unwrap()),
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find TagAccess::new usage and initialization
rg -n -A10 "TagAccess::new" vendor/ciborium/

Repository: rh-ecosystem-edge/recert

Length of output: 3468


🏁 Script executed:

#!/bin/bash
# Find TagAccess struct definition and implementation
rg -n -B5 -A30 "struct TagAccess" vendor/ciborium/src/tag.rs

Repository: rh-ecosystem-edge/recert

Length of output: 1129


🏁 Script executed:

#!/bin/bash
# Check context around line 491 in value/de.rs
sed -n '475,510p' vendor/ciborium/src/value/de.rs

Repository: rh-ecosystem-edge/recert

Length of output: 1198


🏁 Script executed:

#!/bin/bash
# Check context around line 578 in de/mod.rs
sed -n '565,595p' vendor/ciborium/src/de/mod.rs

Repository: rh-ecosystem-edge/recert

Length of output: 1059


🏁 Script executed:

#!/bin/bash
# Find EnumAccess implementation for TagAccess
rg -n -A50 "impl.*EnumAccess.*for.*TagAccess" vendor/ciborium/src/tag.rs

Repository: rh-ecosystem-edge/recert

Length of output: 1612


🏁 Script executed:

#!/bin/bash
# Find SeqAccess implementation for TagAccess
rg -n -A30 "impl.*SeqAccess.*for.*TagAccess" vendor/ciborium/src/tag.rs

Repository: rh-ecosystem-edge/recert

Length of output: 917


Fix the unconditional unwrap at line 121 to handle None case.

When state >= 2, self.tag.unwrap() assumes the tag is present. However, tag can be None (untagged case) and the state can still reach 2 during deserialization of tuple variants with multiple elements. This causes a panic.

Consider replacing the unconditional unwrap with proper error handling, or restructure the state machine to prevent state >= 2 when tag is None.

🤖 Prompt for AI Agents
In @vendor/ciborium/src/tag.rs around lines 111 - 123, In deserialize_any in
impl for tag (the function named deserialize_any), avoid the unconditional
self.tag.unwrap() when matching state >= 2; instead detect when self.tag is None
and return a proper deserialization error (using the crate's Self::Error/error
constructor) or restructure the state transitions so state never advances to >=
2 for an untagged value; update the match arm for the default case to check
self.tag and either call visitor.visit_u64(tag) when Some(tag) or return
Err(...) with a clear error indicating an unexpected tuple element for an
untagged value.

Comment on lines +62 to +67
/// A CBOR Value that impl Ord and Eq to allow sorting of values as defined in both
/// RFC 7049 Section 3.9 (regarding key sorting) and RFC 8949 4.2.3 (as errata).
///
/// Since a regular [Value] can be
#[derive(Clone, Debug)]
pub struct CanonicalValue(Value);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Incomplete documentation comment.

The doc comment on line 65 appears truncated: "Since a regular [Value] can be" - the sentence is not finished.

📝 Suggested fix
 /// A CBOR Value that impl Ord and Eq to allow sorting of values as defined in both
 /// RFC 7049 Section 3.9 (regarding key sorting) and RFC 8949 4.2.3 (as errata).
 ///
-/// Since a regular [Value] can be
+/// Since a regular [Value] cannot implement [Ord] due to floating-point values,
+/// this wrapper provides canonical ordering for use as map keys.
 #[derive(Clone, Debug)]
 pub struct CanonicalValue(Value);
🤖 Prompt for AI Agents
In @vendor/ciborium/src/value/canonical.rs around lines 62 - 67, The doc comment
for the CanonicalValue struct is truncated; update the comment above pub struct
CanonicalValue(Value) to finish the sentence by explaining why the wrapper
exists and what it guarantees — e.g., state that since a regular Value does not
implement Ord/Eq and CBOR key ordering must follow RFC 7049/8949, CanonicalValue
wraps Value to provide total ordering and equality for canonical sorting of map
keys per those RFC sections. Ensure the comment references CanonicalValue and
Value and briefly mentions the RFCs and the sorting purpose.

Comment on lines +161 to +164
let (name, data): (String, _) = acc.variant()?;
assert_eq!("@@TAGGED@@", name);
data.tuple_variant(2, Inner)
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Production code uses assert_eq! which will panic on unexpected input.

The assert_eq! on line 162 will panic if the enum variant name is not "@@TAGGED@@". If this code path can be reached with untrusted CBOR input, this could cause a denial-of-service. Consider returning an error instead.

🔧 Suggested fix
-        let (name, data): (String, _) = acc.variant()?;
-        assert_eq!("@@TAGGED@@", name);
-        data.tuple_variant(2, Inner)
+        let (name, data): (String, _) = acc.variant()?;
+        if name != "@@TAGGED@@" {
+            return Err(de::Error::custom("expected @@TAGGED@@ variant"));
+        }
+        data.tuple_variant(2, Inner)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
let (name, data): (String, _) = acc.variant()?;
assert_eq!("@@TAGGED@@", name);
data.tuple_variant(2, Inner)
}
let (name, data): (String, _) = acc.variant()?;
if name != "@@TAGGED@@" {
return Err(de::Error::custom("expected @@TAGGED@@ variant"));
}
data.tuple_variant(2, Inner)
🤖 Prompt for AI Agents
In @vendor/ciborium/src/value/de.rs around lines 161 - 164, The code currently
uses assert_eq!( "@@TAGGED@@", name ) after calling acc.variant(), which will
panic on unexpected input; replace this panic with a proper deserialization
error path: check if the returned name equals "@@TAGGED@@", and if not return an
Err using the crate's deserializer error type (the same error type other
deserializers use in this module) instead of panicking; keep the successful path
calling data.tuple_variant(2, Inner) unchanged so the flow remains
acc.variant()? -> validate name -> data.tuple_variant(2, Inner).

Comment on lines +62 to +68
} else {
// Ciborium serializes u128/i128 as BigPos if they don't fit in 64 bits.
// In this special case we have to calculate the length.
// The Tag itself will always be 1 byte.
x.to_be_bytes().len() + 1
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

cat -n vendor/ciborium/src/value/integer.rs | head -100

Repository: rh-ecosystem-edge/recert

Length of output: 4278


🏁 Script executed:

# Look for the full context of canonical_len function
rg -B 20 -A 5 "fn canonical_len" vendor/ciborium/src/value/integer.rs

Repository: rh-ecosystem-edge/recert

Length of output: 952


🏁 Script executed:

# Search for how i128 values are actually serialized in ciborium
rg -n "to_be_bytes|BigPos|i128" vendor/ciborium/src/value/integer.rs

Repository: rh-ecosystem-edge/recert

Length of output: 360


🏁 Script executed:

# Check for serialization implementation
rg -rn "serialize.*BigPos\|BigPos.*serialize" vendor/ciborium/

Repository: rh-ecosystem-edge/recert

Length of output: 50


🏁 Script executed:

# Look for tests related to canonical_len or i128
rg -rn "canonical_len\|test.*i128" vendor/ciborium/src/

Repository: rh-ecosystem-edge/recert

Length of output: 50


🏁 Script executed:

rg -rn "BigPos|BigNeg" vendor/ciborium/src/ -A 3 -B 1

Repository: rh-ecosystem-edge/recert

Length of output: 509


🏁 Script executed:

# Check if there are any tests in integer.rs or related test files
rg -n "test" vendor/ciborium/src/value/integer.rs

Repository: rh-ecosystem-edge/recert

Length of output: 50


🏁 Script executed:

# Look for serialize implementations in the value module
rg -rn "impl.*Serialize\|fn serialize" vendor/ciborium/src/value/ -A 5

Repository: rh-ecosystem-edge/recert

Length of output: 50


🏁 Script executed:

# Check Ciborium source for how it handles big integers
find vendor/ciborium -name "*.rs" -type f | xargs rg -l "BigPos"

Repository: rh-ecosystem-edge/recert

Length of output: 104


🏁 Script executed:

cat -n vendor/ciborium/src/value.rs | head -150

Repository: rh-ecosystem-edge/recert

Length of output: 128


🏁 Script executed:

# Search for Value type and its serialization
rg -n "enum Value\|impl.*Value" vendor/ciborium/src/value.rs | head -20

Repository: rh-ecosystem-edge/recert

Length of output: 136


🏁 Script executed:

# Look for how integers are actually encoded to understand the BigInt representation
rg -rn "encode\|Encode" vendor/ciborium/src/ -t rust | grep -i "int\|value" | head -20

Repository: rh-ecosystem-edge/recert

Length of output: 50


🏁 Script executed:

# Check CBOR spec or comments about integer encoding
rg -rn "RFC\|CBOR\|8949" vendor/ciborium/src/value/integer.rs

Repository: rh-ecosystem-edge/recert

Length of output: 50


🏁 Script executed:

# Look for usage of canonical_len to see if the constant 17 is ever the issue
rg -rn "canonical_len" vendor/ciborium/src/

Repository: rh-ecosystem-edge/recert

Length of output: 209


🏁 Script executed:

rg -n "fn n\(" vendor/ciborium/src/value/integer.rs -B 2 -A 10

Repository: rh-ecosystem-edge/recert

Length of output: 50


🏁 Script executed:

# Get more context around the n() function
rg -n "fn n\|fn canonical" vendor/ciborium/src/value/integer.rs -B 1 -A 15

Repository: rh-ecosystem-edge/recert

Length of output: 50


🏁 Script executed:

# Check if canonical_len is actually used anywhere
git -C vendor/ciborium grep "canonical_len" || echo "Not found in git"

Repository: rh-ecosystem-edge/recert

Length of output: 213


🏁 Script executed:

# List all functions in the integer.rs file
rg -n "^\s*fn " vendor/ciborium/src/value/integer.rs

Repository: rh-ecosystem-edge/recert

Length of output: 472


🏁 Script executed:

ls -la vendor/ciborium/src/value/

Repository: rh-ecosystem-edge/recert

Length of output: 534


🏁 Script executed:

# Check if there's an encoder or serializer
find vendor/ciborium -name "*.rs" -type f | head -20

Repository: rh-ecosystem-edge/recert

Length of output: 710


🏁 Script executed:

# Look at what's in the value directory
rg -l "serialize\|encode" vendor/ciborium/src/ -t rust | head -10

Repository: rh-ecosystem-edge/recert

Length of output: 50


🏁 Script executed:

# Get full context of the canonical_cmp function and how canonical_len is used
cat -n vendor/ciborium/src/value/integer.rs | sed -n '70,95p'

Repository: rh-ecosystem-edge/recert

Length of output: 1411


🏁 Script executed:

# Check if canonical_len calculation might actually be correct by examining CBOR encoding
# Let's see what the actual comment says about serialization
cat -n vendor/ciborium/src/value/integer.rs | sed -n '60,70p'

Repository: rh-ecosystem-edge/recert

Length of output: 594


🏁 Script executed:

cat -n vendor/ciborium/src/value/canonical.rs

Repository: rh-ecosystem-edge/recert

Length of output: 4790


🏁 Script executed:

# Check the serialization logic for integers
cat -n vendor/ciborium/src/value/ser.rs | head -150

Repository: rh-ecosystem-edge/recert

Length of output: 5464


🏁 Script executed:

# Look for how BigPos/BigNeg are handled in serialization
rg -n "BigPos\|BigNeg\|to_be_bytes" vendor/ciborium/src/value/ser.rs -B 2 -A 5

Repository: rh-ecosystem-edge/recert

Length of output: 50


🏁 Script executed:

# Look for the actual CBOR big int encoding
find vendor/ciborium -name "*.rs" | xargs rg -l "tag\|Tag" | head -5

Repository: rh-ecosystem-edge/recert

Length of output: 50


🏁 Script executed:

# Check the tag definitions
cat -n vendor/ciborium/src/tag.rs

Repository: rh-ecosystem-edge/recert

Length of output: 17338


🏁 Script executed:

# Look for tests that verify canonical_len correctness
rg -rn "canonical_len\|canonical_cmp" vendor/ciborium/tests/ -A 5 -B 2

Repository: rh-ecosystem-edge/recert

Length of output: 50


🏁 Script executed:

# Check if there are any constants or documentation about CBOR encoding of big integers
rg -rn "i128\|u128" vendor/ciborium/src/ -B 2 -A 2 | grep -i "encode\|serial\|byte"

Repository: rh-ecosystem-edge/recert

Length of output: 50


🏁 Script executed:

# Look for CBOR big integer encoding or any RFC references
rg -rn "BigPos\|BigNeg\|RFC.*8949\|bigint" vendor/ciborium/ --max-count 10

Repository: rh-ecosystem-edge/recert

Length of output: 50


🏁 Script executed:

# Check if serde-cbor or similar is used, and look for big integer handling
grep -r "cbor\|Cbor" vendor/ciborium/Cargo.toml 2>/dev/null | head -20

Repository: rh-ecosystem-edge/recert

Length of output: 96


🏁 Script executed:

# Try to find any test that validates integer serialization length
find vendor/ciborium -name "*.rs" -type f -exec grep -l "serialize.*i128\|i128.*serialize" {} \;

Repository: rh-ecosystem-edge/recert

Length of output: 257


🏁 Script executed:

# Check if there's any documentation or examples showing how large integers are encoded
rg -rn "i128\|u128" vendor/ciborium/tests/ -B 2 -A 2

Repository: rh-ecosystem-edge/recert

Length of output: 50


🏁 Script executed:

# Look at the serialization module to understand CBOR encoding
cat -n vendor/ciborium/src/ser/mod.rs | head -200

Repository: rh-ecosystem-edge/recert

Length of output: 6764


🏁 Script executed:

# Search for any "major" type encoding or CBOR structure that might explain big integer handling
rg -n "major\|0x[0-9a-f]\|<<\|>>|byte" vendor/ciborium/src/ser/mod.rs | head -30

Repository: rh-ecosystem-edge/recert

Length of output: 440


🏁 Script executed:

# Look for explicit handling of large integers or special cases
rg -n "u64\|i64\|u128\|i128" vendor/ciborium/src/ser/mod.rs -B 1 -A 3 | head -60

Repository: rh-ecosystem-edge/recert

Length of output: 50


🏁 Script executed:

# Verify that canonical_len is used in canonical_cmp
cat -n vendor/ciborium/src/value/integer.rs | sed -n '70,92p'

Repository: rh-ecosystem-edge/recert

Length of output: 1375


🏁 Script executed:

# Understand the CBOR header encoding to calculate exact serialization size
# Look at what Header::Bytes encodes to
rg -n "Header::Bytes\|enum Header" vendor/ciborium/ -B 2 -A 5 | head -50

Repository: rh-ecosystem-edge/recert

Length of output: 50


🏁 Script executed:

# Find the Header enum definition to understand CBOR encoding format
rg -n "enum Header\|struct Header" vendor/ciborium/ -A 20 | head -80

Repository: rh-ecosystem-edge/recert

Length of output: 50


🏁 Script executed:

# Look for tag constants
rg -n "BIGPOS\|BIGNEG" vendor/ciborium/ -B 1 -A 1

Repository: rh-ecosystem-edge/recert

Length of output: 50


🏁 Script executed:

# Look for where ciborium_ll is defined or used
find vendor -name "*.rs" -type f | xargs grep -l "ciborium_ll\|Header::" | head -10

Repository: rh-ecosystem-edge/recert

Length of output: 445


🏁 Script executed:

# Check if there are any tests that might validate canonical_len correctness
rg -rn "canonical_len\|test.*integer" vendor/ciborium/tests/ -B 2 -A 5

Repository: rh-ecosystem-edge/recert

Length of output: 50


🏁 Script executed:

# Let's create a simple test to understand the actual encoded size
cat > /tmp/test_encoding.py << 'EOF'
# Understanding CBOR encoding for large integers
# CBOR major type 0 (unsigned int) and 1 (negative int) use variable-length encoding
# CBOR major type 6 (semantic tag)

# For values fitting in 64 bits, they use native encoding (1, 2, 3, 5, or 9 bytes)
# For larger values, they use tag + bytes

# If we encode a u128 with tag 2 (BIGPOS):
# Tag encoding: major type 6 (first 3 bits) + tag value
#   For tag 2: 0xC2 (1 byte)
# Then bytes: major type 2 (first 3 bits) + length
#   For length 1-23: 1 byte (0x40-0x57)
#   For length 24-255: 2 bytes (0x58 + length)
#   For length 256-65535: 3 bytes (0x59 + 2-byte length)
#   etc.

# So canonical_len = tag_bytes + header_bytes + content_bytes
# For i128 that doesn't fit in i64:
# - Tag: 1 byte
# - Header: depends on slice.len() after stripping leading zeros
# - Content: slice.len() bytes

# The current code does: x.to_be_bytes().len() + 1 = 16 + 1 = 17
# But this doesn't account for the variable-length header encoding!

# For example, if leading zeros strip it down to 9 bytes:
# - Tag: 1 byte
# - Header: 2 bytes (0x58 + length byte for 1-255)
# - Content: 9 bytes
# Total: 12 bytes, not 17

print("The review comment is CORRECT")
print("canonical_len calculation is wrong because:")
print("1. It assumes 16 bytes but leading zeros are stripped")
print("2. It doesn't account for variable-length CBOR header encoding")
EOF
python3 /tmp/test_encoding.py

Repository: rh-ecosystem-edge/recert

Length of output: 154


The canonical_len calculation for i128 values is incorrect.

The actual CBOR serialization in vendor/ciborium/src/ser/mod.rs:78-101 strips leading zeros from the 16-byte representation before encoding:

let bytes = raw.to_be_bytes();
// Skip leading zeros.
let mut slice = &bytes[..];
while !slice.is_empty() && slice[0] == 0 {
    slice = &slice[1..];
}

However, canonical_len at line 66 uses x.to_be_bytes().len() + 1, which always returns 17 regardless of how many leading zeros are stripped. The actual encoded length is: 1 byte (tag) + 1-2+ bytes (CBOR length header) + slice.len() bytes (content), which varies based on the value. This mismatch causes incorrect canonical comparison ordering, breaking RFC 8949 compliance for sorting operations.

🤖 Prompt for AI Agents
In @vendor/ciborium/src/value/integer.rs around lines 62 - 68, The i128 branch
in canonical_len (vendor/ciborium/src/value/integer.rs) wrongly returns
x.to_be_bytes().len() + 1; instead replicate the serialization in ser/mod.rs:
take x.to_be_bytes(), strip leading zero bytes to form slice, compute the CBOR
length-header size for that slice length (1 byte for the initial additional-info
plus the extra integer byte count when length > 23), then return 1 (tag) +
header_size + slice.len(); update the canonical_len path that handles
i128/BigPos to use this calculation so canonical length matches actual encoding.

@openshift-ci
Copy link

openshift-ci bot commented Jan 12, 2026

@omertuc: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/cargo-clippy 8ba3eb8 link true /test cargo-clippy
ci/prow/e2e-aws-ovn-single-node-recert-parallel 8ba3eb8 link true /test e2e-aws-ovn-single-node-recert-parallel
ci/prow/e2e-aws-ovn-single-node-recert-serial 8ba3eb8 link true /test e2e-aws-ovn-single-node-recert-serial
ci/prow/ibu-e2e-flow 8ba3eb8 link true /test ibu-e2e-flow
ci/prow/ibi-e2e-flow 8ba3eb8 link true /test ibi-e2e-flow

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant