Use structured logging + add traces & metrics #331
malcolmgreaves wants to merge 5 commits into main from
Conversation
📝 Walkthrough — Summary by CodeRabbit

Adds a telemetry subsystem (tracing + Prometheus metrics), replaces legacy logging initialization with tracing, and instruments many client APIs, repository functions, and server controllers with tracing spans, counters, and timing histograms. Also adds docs explaining tracing/metrics configuration.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 3 passed
Changes all of `oxen` to use structured logging, allowing all `log::` statements to be asynchronously written to a rotating log file as a sequence of newline-delimited JSON objects.

Configuration is via the `OXEN_LOG_FILE` env var: setting it to `1` or `true` writes logs under `~/.oxen/logs/`; setting it to a directory writes log files there; leaving it unset means all logging falls back to STDERR.

All `log` statements are now translated to `tracing` Events via the new `tracing-log` dependency. This means that if they're active in a trace, they will be associated with it.

`oxen-server` now collects and exposes a Prometheus-compatible metrics endpoint at `/metrics`. This runs on a _different_ port than the main server process. It defaults to `9090` and can be changed with `OXEN_METRICS_PORT`.

New trace spans have been added to every HTTP endpoint call in `oxen-rust` and to workspace upload functions.
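The `OXEN_LOG_FILE` precedence described above can be sketched as a small std-only function. This is an illustrative sketch, not the PR's actual API: `resolve_log_destination` and the `LogDestination` enum are hypothetical names introduced here to show the three-way resolution.

```rust
use std::path::PathBuf;

/// Where log output goes, per the OXEN_LOG_FILE rules described above.
/// (Illustrative type; not the PR's real API.)
#[derive(Debug, PartialEq)]
enum LogDestination {
    Stderr,       // variable unset: fall back to STDERR
    DefaultDir,   // "1" or "true": write under ~/.oxen/logs/
    Dir(PathBuf), // any other value: treat it as a directory path
}

fn resolve_log_destination(oxen_log_file: Option<&str>) -> LogDestination {
    match oxen_log_file {
        None => LogDestination::Stderr,
        Some("1") | Some("true") => LogDestination::DefaultDir,
        Some(dir) => LogDestination::Dir(PathBuf::from(dir)),
    }
}

fn main() {
    assert_eq!(resolve_log_destination(None), LogDestination::Stderr);
    assert_eq!(resolve_log_destination(Some("true")), LogDestination::DefaultDir);
    assert_eq!(
        resolve_log_destination(Some("/var/log/oxen")),
        LogDestination::Dir(PathBuf::from("/var/log/oxen"))
    );
}
```

In the real code this decision would feed the `tracing` subscriber setup (stderr writer vs. a non-blocking rolling-file writer).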
Force-pushed from 93448fa to 03ee9d1.
This is what the …

… of conflicts that were unrelated to the crates rename

Note: This is a heavily AI-generated PR.

Here's a list of all of the new places that have …
Actionable comments posted: 15
Note
Due to the large number of review comments, Critical and Major severity comments were prioritized as inline comments.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (21)
oxen-rust/crates/lib/src/repositories/size.rs (1)
108-112: ⚠️ Potential issue | 🟠 Major

Bound the retry path to avoid unbounded recursion on persistent I/O failures.

At Line 111, `get_size(repo)` recursively retries after `update_size(repo)?`. If the read keeps failing (e.g., permission/disk issues), this can recurse indefinitely and risk stack overflow. Use a bounded retry loop (or return the underlying I/O error after one retry).

Proposed fix (bounded retry):

```diff
 pub fn get_size(repo: &LocalRepository) -> Result<RepoSizeFile, OxenError> {
     metrics::counter!("oxen_repo_size_get_size_total").increment(1);
     let path = repo_size_path(repo);
-    let size = util::fs::read_from_path(&path);
-    match size {
-        Ok(size) => {
-            let size: RepoSizeFile = serde_json::from_str(&size)?;
-            Ok(size)
-        }
-        Err(e) => {
-            log::info!("Size file not found, creating it: {e}");
-            update_size(repo)?;
-            get_size(repo)
-        }
-    }
+    match util::fs::read_from_path(&path) {
+        Ok(size) => Ok(serde_json::from_str(&size)?),
+        Err(first_err) => {
+            log::info!("Size file not found, creating it: {first_err}");
+            update_size(repo)?;
+            match util::fs::read_from_path(&path) {
+                Ok(size) => Ok(serde_json::from_str(&size)?),
+                Err(second_err) => Err(OxenError::basic_str(format!(
+                    "Failed to read repo size file after update attempt: {second_err}"
+                ))),
+            }
+        }
+    }
 }
```

oxen-rust/crates/server/src/controllers/workspaces/data_frames/embeddings.rs (1)
17-23: ⚠️ Potential issue | 🟠 Major

Span fields `namespace` and `repo_name` are declared but never populated.

Lines 17, 44, and 149 use `#[tracing::instrument(skip_all, fields(namespace, repo_name))]`. These field declarations create empty span fields without populating them. Because `skip_all` is specified and no `Span::record()` calls follow, the namespace and repo_name values remain unset in traces, reducing observability.

Record these values immediately after extracting them. Fix for all three functions (get, neighbors, post):

```diff
 #[tracing::instrument(skip_all, fields(namespace, repo_name))]
 pub async fn get(req: HttpRequest) -> Result<HttpResponse, OxenHttpError> {
     metrics::counter!("oxen_server_workspaces_data_frames_embeddings_get_total").increment(1);
     let app_data = app_data(&req)?;
     let namespace = path_param(&req, "namespace")?;
     let repo_name = path_param(&req, "repo_name")?;
+    tracing::Span::current().record("namespace", &namespace);
+    tracing::Span::current().record("repo_name", &repo_name);
     let workspace_id = path_param(&req, "workspace_id")?;
```

Apply the same fix to `neighbors` (line 44) and `post` (line 149).

oxen-rust/crates/lib/src/api/client/version.rs (1)
18-21: ⚠️ Potential issue | 🔴 Critical

Replace unwrap with error handling for optional response field.

Line 20 uses `unwrap()` on the remote `oxen_version` field. A valid HTTP response with a missing `oxen_version` will panic instead of returning a typed error. While server responses always populate this field, the client should not assume upstream correctness.

Proposed fix:

```diff
-        Ok(val) => Ok(val.oxen_version.unwrap()),
+        Ok(val) => val.oxen_version.ok_or_else(|| {
+            OxenError::basic_str(format!(
+                "api::version::get_remote_version {url} missing `oxen_version` in response"
+            ))
+        }),
```

oxen-rust/crates/server/src/controllers/workspaces/changes.rs (4)
59-69: ⚠️ Potential issue | 🟡 Minor

Same issue: empty `namespace` and `repo_name` fields.

The same pattern issue applies here as in `list_root`. The fields declared in `fields(namespace, repo_name)` won't be populated, since the values are extracted inside the function body.
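The semantics these span-field findings rely on (a declared field stays empty until something explicitly records a value into it) can be modeled with a small std-only toy. This is a hypothetical stand-in for `tracing`'s span, not its real API; `ToySpan` exists only to illustrate the declare-then-record lifecycle.

```rust
use std::collections::HashMap;

/// Toy stand-in for a tracing span: fields are declared up front,
/// but hold no value until `record` is called. (Illustrative only.)
struct ToySpan {
    fields: HashMap<String, Option<String>>,
}

impl ToySpan {
    fn with_declared_fields(names: &[&str]) -> Self {
        let fields = names.iter().map(|n| (n.to_string(), None)).collect();
        ToySpan { fields }
    }

    fn record(&mut self, name: &str, value: &str) {
        if let Some(slot) = self.fields.get_mut(name) {
            *slot = Some(value.to_string());
        }
    }
}

fn main() {
    // `fields(namespace, repo_name)` declares the fields...
    let mut span = ToySpan::with_declared_fields(&["namespace", "repo_name"]);
    // ...but with `skip_all` and no record() call, they stay empty:
    assert_eq!(span.fields["namespace"], None);

    // Recording after extracting the path params populates them.
    span.record("namespace", "my-org");
    span.record("repo_name", "my-repo");
    assert_eq!(span.fields["namespace"].as_deref(), Some("my-org"));
    assert_eq!(span.fields["repo_name"].as_deref(), Some("my-repo"));
}
```

This is why every handler in these comments needs a `Span::current().record(...)` call after `path_param()` extraction: declaring the field alone emits nothing.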
134-166: ⚠️ Potential issue | 🟡 Minor

Same issue: empty `namespace` and `repo_name` fields.

The same pattern issue applies to the `unstage_many` function.
99-123: ⚠️ Potential issue | 🟡 Minor

Same issue: empty `namespace` and `repo_name` fields.

The same pattern issue applies to the `unstage` function.
20-30: ⚠️ Potential issue | 🟡 Minor

Fields `namespace` and `repo_name` will not be populated in the trace span.

The `fields(namespace, repo_name)` declaration in the `#[tracing::instrument]` attribute creates these fields as uninitialized, because the values are extracted inside the function body via `path_param()` and are not available at function entry when the span is created.

To fix this, either declare the fields as `field::Empty` in the attribute and populate them with `Span::current().record()` after extraction:

Option 1: Record fields after extraction

```diff
-#[tracing::instrument(skip_all, fields(namespace, repo_name))]
+#[tracing::instrument(skip_all, fields(namespace = field::Empty, repo_name = field::Empty))]
 pub async fn list_root(
     req: HttpRequest,
     query: web::Query<PageNumQuery>,
 ) -> actix_web::Result<HttpResponse, OxenHttpError> {
     metrics::counter!("oxen_server_workspaces_changes_list_root_total").increment(1);
     let app_data = app_data(&req)?;
     let namespace = path_param(&req, "namespace")?;
     let repo_name = path_param(&req, "repo_name")?;
+    let span = tracing::Span::current();
+    span.record("namespace", &namespace);
+    span.record("repo_name", &repo_name);
```

Or remove the empty field declarations if they're not needed:

Option 2: Simplify by removing empty fields

```diff
-#[tracing::instrument(skip_all, fields(namespace, repo_name))]
+#[tracing::instrument(skip_all)]
```

oxen-rust/crates/lib/src/test.rs (1)
76-97: ⚠️ Potential issue | 🔴 Critical

Tracing guard is dropped immediately, causing file logging to flush prematurely. Additionally, test-specific code paths are triggered via the `TEST` environment variable instead of explicit test-only initialization.

The `_tracing_guard` is created as a local variable and dropped when `init_test_env()` returns. According to the `init_tracing` documentation, the guard must be held for the lifetime of the application to keep the non-blocking file writer active. Dropping it immediately flushes the writer, breaking file logging for tests.

More broadly, setting `std::env::set_var("TEST", "true")` creates a security issue: the `TEST` env var is checked in `user_config.rs` and `auth_config.rs` to conditionally load test configuration. This pattern allows test-only code paths to be accidentally triggered in production if the environment variable is set outside of a test context. Use explicit test-only initialization instead of runtime environment checks.

Store the tracing guard as a static that persists for the application lifetime, and remove the `TEST` env var checks in favor of explicit initialization functions for test configuration.

oxen-rust/crates/server/src/controllers/diff.rs (1)
555-567: ⚠️ Potential issue | 🟠 Major

`update_df_diff` resolves the right-hand file from the wrong revision.

At Line 565, `node_2` is fetched with `commit_1` instead of `commit_2`. This can return incorrect comparisons or false not-found errors.

Fix:

```diff
-    let node_2 =
-        repositories::entries::get_file(&repository, &commit_1, &resource_2)?.ok_or_else(|| {
+    let node_2 =
+        repositories::entries::get_file(&repository, &commit_2, &resource_2)?.ok_or_else(|| {
             OxenError::ResourceNotFound(format!("{}@{}", resource_2.display(), commit_2).into())
         })?;
```

oxen-rust/crates/lib/src/api/client/workspaces/data_frames.rs (1)
74-99: ⚠️ Potential issue | 🟠 Major

Don't treat every 2xx response as downloadable bytes.

The server returns `200 application/json` with `is_indexed: false` for unindexed data frames. This client will happily write that JSON envelope into `opts.output` and return `Ok(())`.

🛠️ Minimal fix:

```diff
     let client = client::new_for_url(&url)?;
     let res = client.get(&url).send().await?;
+
+    if let Some(content_type) = res
+        .headers()
+        .get(reqwest::header::CONTENT_TYPE)
+        .and_then(|value| value.to_str().ok())
+    {
+        if content_type.starts_with("application/json") {
+            let body = client::parse_json_body(&url, res).await?;
+            let response: WorkspaceJsonDataFrameViewResponse = serde_json::from_str(&body)?;
+            if !response.is_indexed {
+                return Err(OxenError::basic_str("data frame is not indexed"));
+            }
+            return Err(OxenError::basic_str(format!(
+                "Unexpected JSON response from {url}"
+            )));
+        }
+    }
     if !res.status().is_success() {
         let status = res.status();
```

oxen-rust/crates/server/src/controllers/workspaces/data_frames.rs (1)
667-678: ⚠️ Potential issue | 🟠 Major

Check the rename target against staged workspace files too.

`repositories::entries::get_file(&repo, &workspace.commit, &new_path)` only looks at committed state. If `new_path` already exists in the workspace, this rename can clobber or conflict with uncommitted work.

🛠️ Minimal fix:

```diff
-    if repositories::entries::get_file(&repo, &workspace.commit, &new_path)?.is_some() {
+    if file_exists_in_workspace_or_commit(&workspace, &new_path)? {
         return Err(OxenHttpError::BadRequest("new_path already exists".into()));
     }
```

oxen-rust/crates/server/src/controllers/versions.rs (1)
37-54: ⚠️ Potential issue | 🟠 Major

Record `namespace` and `repo_name` values in trace spans.

The span fields declared in `#[tracing::instrument(skip_all, fields(namespace, repo_name))]` are never populated. These handlers extract `namespace` and `repo_name` from `req` but do not call `Span::record()` to populate the declared fields. Trace output will have empty namespace/repo_name values across the metadata, clean, download, and batch endpoints.

Extract the values and record them:

```rust
let span = tracing::Span::current();
span.record("namespace", &namespace);
span.record("repo_name", &repo_name);
```

Affects: Lines 37-54, 78-80, 96-122, 175-199, 488-515

oxen-rust/crates/lib/src/repositories/workspaces.rs (1)
317-337: ⚠️ Potential issue | 🟠 Major

Return deletion failures to the caller.

If `remove_dir_all` fails, this function logs the error and still returns `Ok(())`. The HTTP delete path and `TemporaryWorkspace::drop` will both treat the workspace as cleaned up even though it still exists on disk.

Suggested fix:

```diff
-    match util::fs::remove_dir_all(&workspace_dir) {
-        Ok(_) => log::debug!("workspace::delete removed workspace dir: {workspace_dir:?}"),
-        Err(e) => log::error!("workspace::delete error removing workspace dir: {e:?}"),
-    }
-
-    Ok(())
+    util::fs::remove_dir_all(&workspace_dir)?;
+    log::debug!("workspace::delete removed workspace dir: {workspace_dir:?}");
+    Ok(())
```

oxen-rust/crates/server/src/controllers/workspaces.rs (1)
52-63: ⚠️ Potential issue | 🟠 Major

Don't log full workspace request bodies.

These lines persist raw client payloads into server logs. In `commit`, that includes the author email and commit message; on parse failures it also stores arbitrary invalid input. With JSON file logging enabled, this is now a durable PII/data-retention leak. Log structured metadata only and drop the raw `body`.

Also applies to: 392-412

oxen-rust/crates/lib/src/repositories/entries.rs (1)
343-408: ⚠️ Potential issue | 🟠 Major

Preserve hashes, not just paths, when computing missing files.

This range builder collapses entries by `path`, so multiple revisions of the same file path lose older hashes. The `None` branch then checks only `head_commit`, which misses blobs that exist only in earlier commits. Together, those cases can report a sync as complete while older commits still reference missing version files.

oxen-rust/crates/server/src/controllers/entries.rs (1)
80-88: ⚠️ Potential issue | 🔴 Critical

Constrain `content_file` to the repo's version-store root before opening it.

`repo.path.join(content_file)` trusts client-supplied paths without validation. Rust's `PathBuf::join()` replaces the base path entirely when the second argument is absolute (e.g., `/etc/passwd`), and relative paths with `../` segments are preserved and resolve to parent directories. After the filename rewrite at lines 81-84, neither defense prevents reading files outside the repository. An attacker can supply absolute paths or traversal sequences as `content_file` to read arbitrary files from the system and receive them in the tar archive. Validate that the final resolved path is within the expected version-store directory using `canonicalize()` and `strip_prefix()`, or accept only hash-based identifiers instead of full paths.

oxen-rust/crates/lib/src/repositories/tree.rs (1)
1085-1091: ⚠️ Potential issue | 🔴 Critical

Harden tar extraction: current path handling allows traversal and panic-on-input.

`unpack_nodes` handles archive-controlled paths with `unwrap`/`expect` and joins entry paths directly into the destination. A crafted archive can trigger panics or write outside `.oxen` via `../` or absolute paths.

🔧 Proposed hardening:

```diff
+use std::path::Component;
@@
-    let path = file.path().unwrap();
+    let path = file
+        .path()
+        .map_err(|e| OxenError::basic_str(format!("Invalid archive path: {e}")))?;
+    if path.components().any(|c| {
+        matches!(c, Component::ParentDir | Component::RootDir | Component::Prefix(_))
+    }) {
+        return Err(OxenError::basic_str(format!(
+            "Unsafe archive path component: {path:?}"
+        )));
+    }
@@
-    if let Some(parent) = dst_path.parent() {
-        util::fs::create_dir_all(parent).expect("Could not create parent dir");
-    }
+    if let Some(parent) = dst_path.parent() {
+        util::fs::create_dir_all(parent)?;
+    }
@@
-        .map(|c| c.as_os_str().to_str().unwrap())
+        .map(|c| {
+            c.as_os_str().to_str().ok_or_else(|| {
+                OxenError::basic_str("Non-UTF8 node path in archive")
+            })
+        })
-        .collect::<Vec<&str>>()
+        .collect::<Result<Vec<&str>, OxenError>>()?
```

Also applies to: 1097-1110

oxen-rust/crates/server/src/controllers/tree.rs (1)
27-33: ⚠️ Potential issue | 🟡 Minor

Record `namespace`/`repo_name` span fields after parsing.

Declared span fields remain empty unless explicitly recorded. Add `Span::record()` calls after parsing these path parameters to populate the tracing span; this pattern appears across 10 handlers in this file.

♻️ Suggested pattern (apply to each handler):

```diff
 #[tracing::instrument(skip_all, fields(namespace, repo_name))]
 pub async fn get_node_by_id(req: HttpRequest) -> actix_web::Result<HttpResponse, OxenHttpError> {
     metrics::counter!("oxen_server_tree_get_node_by_id_total").increment(1);
     let app_data = app_data(&req)?;
     let namespace = path_param(&req, "namespace")?;
     let repo_name = path_param(&req, "repo_name")?;
+    let span = tracing::Span::current();
+    span.record("namespace", tracing::field::display(&namespace));
+    span.record("repo_name", tracing::field::display(&repo_name));
```

Also applies to: 42-47, 74-80, 114-119, 138-143, 169-174, 195-198, 209-214, 230-236, 320-323

oxen-rust/crates/lib/src/api/client/export.rs (1)
19-68:⚠️ Potential issue | 🟡 MinorDuration metrics not recorded on all error paths.
The timer is started at line 19, but duration is only recorded in two places (lines 60-61 and 64-65). Early returns at lines 34, 37-39, and 52-54 skip recording, causing incomplete duration metrics for failed requests.
Consider using a guard pattern or restructuring to ensure consistent recording:
🛠️ Suggested approach using a scope guard or explicit recording
pub async fn download_dir_as_zip( remote_repo: &RemoteRepository, revision: impl AsRef<str>, directory: impl AsRef<Path>, local_path: impl AsRef<Path>, ) -> Result<u64, OxenError> { metrics::counter!("oxen_client_export_download_dir_as_zip_total").increment(1); let timer = std::time::Instant::now(); + let record_duration = || { + metrics::histogram!("oxen_client_export_download_dir_as_zip_duration_seconds") + .record(timer.elapsed().as_secs_f64()); + }; let revision = revision.as_ref().to_string(); // ... rest of function if let Ok(res) = client.get(&url).send().await { let status = res.status(); if !status.is_success() { if status == reqwest::StatusCode::UNAUTHORIZED { let e = "Err: unauthorized request to download data".to_string(); log::error!("{e}"); + record_duration(); return Err(OxenError::authentication(e)); } + record_duration(); return Err(OxenError::basic_str(format!( "download_dir_as_zip failed with status {status} for {url}" ))); } // ... handle success let size = match tokio::io::copy(&mut reader, &mut file).await { Ok(s) => s, Err(e) => { let _ = tokio::fs::remove_file(local_path).await; + record_duration(); return Err(OxenError::basic_str(format!( "Failed to download ZIP to {local_path:?}: {e}" ))); } }; // ... }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@oxen-rust/crates/lib/src/api/client/export.rs` around lines 19 - 68, The duration metric started by timer is not recorded on several early-error returns; ensure the histogram "oxen_client_export_download_dir_as_zip_duration_seconds" is always recorded before any return. Fix by introducing a scope guard (or a small Drop-guard struct) that records timer.elapsed() into metrics::histogram!("oxen_client_export_download_dir_as_zip_duration_seconds") when dropped, or alternatively call that histogram.record(timer.elapsed().as_secs_f64()) immediately before every Err/early-return in the function (e.g., in the unauthorized branch where OxenError::authentication is returned, the status error branch returning OxenError::basic_str, and the request-send failure branch), referencing the existing timer local and the histogram metric name so all code paths record duration consistently.

oxen-rust/crates/lib/src/api/client/commits.rs (1)
849-858: ⚠️ Potential issue | 🔴 Critical

Propagate failed chunk uploads instead of returning success.
`upload_data_to_server_in_chunks_with_client` only logs the `Err` and keeps going, then returns `Ok(())`. That lets `post_data_to_server_with_client` report success even when one or more chunks never reached the server, leaving the remote with a partial tarball.

🛠️ Suggested fix
for (i, chunk) in chunks.iter().enumerate() { log::debug!( "upload_data_to_server_in_chunks uploading chunk {} of size {}", i, ByteSize::b(chunks.len() as u64) ); @@ - match upload_data_chunk_to_server_with_retry( - client, - remote_repo, - chunk, - &hash, - &params, - is_compressed, - filename, - ) - .await - { - Ok(_) => { - log::debug!("Success uploading chunk!") - } - Err(err) => { - log::error!("Err uploading chunk: {err}") - } - } + upload_data_chunk_to_server_with_retry( + client, + remote_repo, + chunk, + &hash, + &params, + is_compressed, + filename, + ) + .await + .map_err(|err| { + OxenError::basic_str(format!( + "Failed uploading chunk {}/{}: {err}", + i + 1, + chunks.len() + )) + })?; + log::debug!("Success uploading chunk!"); } Ok(()) }

Also applies to: 942-973
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@oxen-rust/crates/lib/src/api/client/commits.rs` around lines 849 - 858, The chunk upload currently swallows errors inside upload_data_to_server_in_chunks_with_client causing post_data_to_server_with_client to report success even when uploads fail; modify upload_data_to_server_in_chunks_with_client so it does not just log Err but returns Err (or the first encountered OxenError) when any chunk upload fails (e.g., exit the loop early or aggregate and return an Err), and update post_data_to_server_with_client to propagate that Result (do not convert errors to Ok(())); reference functions upload_data_to_server_in_chunks_with_client and post_data_to_server_with_client to locate the changes and ensure any caller checks the Result and handles failure paths.

oxen-rust/crates/lib/src/api/client/versions.rs (1)
502-535: ⚠️ Potential issue | 🟡 Minor

Duration metric not recorded on successful upload path.
The histogram recording at lines 529-530 is placed after the retry loop, but when the upload succeeds (lines 518-520), the function returns early without recording the duration. This means successful uploads won't have their duration tracked.
🐛 Proposed fix to record duration on all paths
#[tracing::instrument(skip(local_repo, remote_repo, chunk, client))] pub async fn multipart_batch_upload_with_retry( local_repo: &LocalRepository, remote_repo: &RemoteRepository, chunk: &Vec<Entry>, client: &reqwest::Client, ) -> Result<(), OxenError> { metrics::counter!("oxen_client_versions_multipart_batch_upload_with_retry_total").increment(1); let timer = std::time::Instant::now(); let max_retries = max_retries(); let mut files_to_retry: Vec<ErrorFileInfo> = vec![]; for attempt in 0..max_retries { files_to_retry = multipart_batch_upload(local_repo, remote_repo, chunk, client, files_to_retry).await?; if files_to_retry.is_empty() { + metrics::histogram!("oxen_client_versions_multipart_batch_upload_with_retry_duration_seconds") + .record(timer.elapsed().as_secs_f64()); return Ok(()); } // Don't sleep after the last attempt if attempt + 1 < max_retries { let wait_time = retry::exponential_backoff(BASE_WAIT_TIME as u64, attempt, MAX_WAIT_TIME as u64); sleep(Duration::from_millis(wait_time)).await; } } metrics::histogram!("oxen_client_versions_multipart_batch_upload_with_retry_duration_seconds") .record(timer.elapsed().as_secs_f64()); Err(OxenError::basic_str(format!( "Failed to upload files: {files_to_retry:#?}" ))) }

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@oxen-rust/crates/lib/src/api/client/versions.rs` around lines 502 - 535, In multipart_batch_upload_with_retry, the duration histogram is only recorded after the retry loop, so successful early returns miss metrics; before returning Ok(()) inside the loop (the success path after files_to_retry.is_empty()), record the existing timer with metrics::histogram!("oxen_client_versions_multipart_batch_upload_with_retry_duration_seconds").record(timer.elapsed().as_secs_f64()); to ensure the duration is captured for successful uploads (keep the existing histogram recording after the loop for the failure path).
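Several of the findings above recommend a scope guard so duration histograms are recorded on every exit path. A minimal, std-only sketch of that pattern is below; the `record` closure stands in for the PR's `metrics::histogram!(...).record(...)` call, which is an assumption here rather than a dependency of the example.

```rust
use std::time::Instant;

// A value that records elapsed seconds when dropped, so every exit path
// (early `return Err(...)`, `?`, or the normal return) is measured once.
struct DurationGuard<F: FnMut(f64)> {
    start: Instant,
    record: F,
}

impl<F: FnMut(f64)> Drop for DurationGuard<F> {
    fn drop(&mut self) {
        (self.record)(self.start.elapsed().as_secs_f64());
    }
}

// Hypothetical fallible operation: whether it fails or succeeds, the guard
// still pushes a duration sample on the way out.
fn fallible_download(fail: bool, samples: &mut Vec<f64>) -> Result<(), String> {
    let _guard = DurationGuard {
        start: Instant::now(),
        record: |secs| samples.push(secs),
    };
    if fail {
        return Err("unauthorized".to_string()); // guard still fires here
    }
    Ok(())
}
```

Because recording happens in `Drop`, adding a new early return later cannot silently skip the histogram, which is the failure mode these findings describe.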
🟡 Minor comments (7)
oxen-rust/crates/server/src/controllers/metadata.rs-16-16 (1)
16-16: ⚠️ Potential issue | 🟡 Minor

Populate span fields explicitly after parsing path params.
On Line 16 and Line 105, `fields(namespace, repo_name)` declares the fields, but they are never recorded with values, so these dimensions stay empty in traces. In tracing 0.1, `#[instrument(skip_all, fields(namespace, repo_name))]` does not populate declared fields from same-named local variables created inside the function body; `Span::current().record(...)` must be called explicitly.

Suggested patch
#[tracing::instrument(skip_all, fields(namespace, repo_name))] pub async fn file(req: HttpRequest) -> actix_web::Result<HttpResponse, OxenHttpError> { metrics::counter!("oxen_server_metadata_file_total").increment(1); let app_data = app_data(&req)?; let namespace = path_param(&req, "namespace")?; let repo_name = path_param(&req, "repo_name")?; + let span = tracing::Span::current(); + span.record("namespace", &tracing::field::display(&namespace)); + span.record("repo_name", &tracing::field::display(&repo_name)); let repo = get_repo(&app_data.path, namespace, &repo_name)?; @@ #[tracing::instrument(skip_all, fields(namespace, repo_name))] pub async fn update_metadata(req: HttpRequest) -> actix_web::Result<HttpResponse, OxenHttpError> { metrics::counter!("oxen_server_metadata_update_metadata_total").increment(1); let app_data = app_data(&req)?; let namespace = path_param(&req, "namespace")?; let repo_name = path_param(&req, "repo_name")?; + let span = tracing::Span::current(); + span.record("namespace", &tracing::field::display(&namespace)); + span.record("repo_name", &tracing::field::display(&repo_name)); let repo = get_repo(&app_data.path, &namespace, &repo_name)?;Also applies to: 105-105
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@oxen-rust/crates/server/src/controllers/metadata.rs` at line 16, The #[tracing::instrument(skip_all, fields(namespace, repo_name))] macro declares empty span fields but never assigns values; after you parse the path params in the metadata controller functions (the functions annotated with that macro in metadata.rs), explicitly record the values into the current span (e.g., call Span::current().record(...) or tracing::Span::current().record(...) for the "namespace" and "repo_name" fields) immediately after you have the parsed variables so traces contain the actual namespace and repo_name values. Ensure you reference the same variable names used in the function (namespace, repo_name) when recording.

oxen-rust/crates/server/src/controllers/export.rs-31-33 (1)
31-33: ⚠️ Potential issue | 🟡 Minor

Record ZIP download duration on error paths too.
Right now, `oxen_server_export_download_zip_duration_seconds` is skipped when the handler returns early with an error, so latency metrics underreport failing requests.

📉 Proposed fix (record duration regardless of outcome)
pub async fn download_zip(req: HttpRequest) -> Result<HttpResponse, OxenHttpError> { metrics::counter!("oxen_server_export_download_zip_total").increment(1); let timer = std::time::Instant::now(); - let app_data = app_data(&req)?; + let result: Result<HttpResponse, OxenHttpError> = async { + let app_data = app_data(&req)?; let namespace = path_param(&req, "namespace")?; let repo_name = path_param(&req, "repo_name")?; let repo = get_repo(&app_data.path, &namespace, &repo_name)?; @@ - let response = controllers::versions::stream_versions_zip(&repo, files_with_hash).await?; - - metrics::histogram!("oxen_server_export_download_zip_duration_seconds") - .record(timer.elapsed().as_secs_f64()); - Ok(response) + let response = controllers::versions::stream_versions_zip(&repo, files_with_hash).await?; + Ok(response) + } + .await; + + metrics::histogram!("oxen_server_export_download_zip_duration_seconds") + .record(timer.elapsed().as_secs_f64()); + result }

Also applies to: 45-46, 62-63, 79-83
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@oxen-rust/crates/server/src/controllers/export.rs` around lines 31 - 33, The handler currently starts a timer (let timer = std::time::Instant::now()) and increments the counter but only records oxen_server_export_download_zip_duration_seconds on success, so early returns (e.g., after app_data(&req)? and other error branches around lines 45-46, 62-63, 79-83) skip recording duration; fix by ensuring the duration is recorded on every exit path—either wrap the handler body with a scope guard or explicitly compute elapsed = timer.elapsed().as_secs_f64() and call metrics::histogram!("oxen_server_export_download_zip_duration_seconds", elapsed) right before every return/error propagation point (including where app_data(&req)? would return) or convert fallible calls to map_err/map to record the metric in the error branch; reference the existing timer variable and the metric name when adding these recordings.

oxen-rust/crates/lib/src/api/client/workspaces/files.rs-1241-1243 (1)
1241-1243: ⚠️ Potential issue | 🟡 Minor

`download` duration histogram currently excludes failures.

At Line 1276 and Line 1281 the function returns before Line 1286, so failed downloads are not reflected in `oxen_client_workspaces_files_download_duration_seconds`.
pub async fn download( @@ ) -> Result<(), OxenError> { metrics::counter!("oxen_client_workspaces_files_download_total").increment(1); let timer = std::time::Instant::now(); - let uri = if util::fs::has_tabular_extension(path) { + let result: Result<(), OxenError> = async { + let uri = if util::fs::has_tabular_extension(path) { format!("/workspaces/{workspace_id}/data_frames/download/{path}") } else { format!("/workspaces/{workspace_id}/files/{path}") @@ - return Err(OxenError::path_does_not_exist(path)); + return Err(OxenError::path_does_not_exist(path)); } @@ - return Err(OxenError::basic_str(format!( + return Err(OxenError::basic_str(format!( "Error: Could not download file {body:?}" ))); } - - metrics::histogram!("oxen_client_workspaces_files_download_duration_seconds") - .record(timer.elapsed().as_secs_f64()); - Ok(()) + Ok(()) + }.await; + metrics::histogram!("oxen_client_workspaces_files_download_duration_seconds") + .record(timer.elapsed().as_secs_f64()); + result }

Also applies to: 1276-1283, 1286-1288
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@oxen-rust/crates/lib/src/api/client/workspaces/files.rs` around lines 1241 - 1243, The download function records a start time in variable timer but returns early on error paths (e.g., when determining uri or on failures) before calling the histogram oxen_client_workspaces_files_download_duration_seconds.observe(...), so failures aren't measured; update download (or its error return paths) to always record elapsed time by calling oxen_client_workspaces_files_download_duration_seconds.observe(timer.elapsed().as_secs_f64()) just before every early return, or implement a small scope guard/RAII Drop helper that captures timer and records the duration on drop so all exits (success or failure) are automatically observed; refer to the download function, the timer variable, and the histogram name when making the change.

oxen-rust/config_for_tracing_metrics.md-67-71 (1)
67-71: ⚠️ Potential issue | 🟡 Minor

Add language identifiers to the sample fences.
These two unlabeled fenced blocks trip MD040 and lose syntax highlighting in rendered docs.
`text` is enough here.
🛠️ Minimal fix

-``` +```text 2026-03-09T12:00:01.234Z INFO liboxen::repositories::add: staging files path="images/" 2026-03-09T12:00:01.567Z DEBUG liboxen::core::v_latest::index::commit_merkle_tree: loading merkle tree commit_id="abc123" 2026-03-09T12:00:02.890Z INFO liboxen::repositories::commits: committed message="add training data"

-``` +```text # HELP oxen_server_commits_upload_total oxen_server_commits_upload_total # TYPE oxen_server_commits_upload_total counter oxen_server_commits_upload_total 42 ... oxen_errors_total{module="repositories::push",error="RemoteBranchNotFound"} 3

Also applies to: 183-199

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@oxen-rust/config_for_tracing_metrics.md` around lines 67 - 71, The unlabeled fenced code blocks containing the sample logs (the three-line block starting with "2026-03-09T12:00:01.234Z INFO liboxen::repositories::add: staging files path="images/"" and the Prometheus metrics block starting with "# HELP oxen_server_commits_upload_total") need language identifiers to avoid MD040 and enable syntax highlighting; update both fences (and the other similar block at the section covering lines 183-199) to use ```text instead of plain ``` so the sample log and metrics blocks are labeled consistently.

oxen-rust/crates/server/src/controllers/versions/chunks.rs-29-30 (1)

29-30: ⚠️ Potential issue | 🟡 Minor

Record chunk durations on every exit path.

`*_total` increments before validation/IO, but the histogram only records on the happy path and the invalid-body fallback. Bad requests and write failures never hit `_count`, so the latency series under-reports the exact failures you'll want to debug.

⏱️ Suggested structure

- // existing body with early returns - metrics::histogram!("oxen_server_versions_chunks_complete_duration_seconds") - .record(timer.elapsed().as_secs_f64()); - Ok(HttpResponse::Ok().json(StatusMessage::resource_found())) + let result = async { + // existing body with early returns converted to `return result;` + } + .await; + metrics::histogram!("oxen_server_versions_chunks_complete_duration_seconds") + .record(timer.elapsed().as_secs_f64()); + result

Also applies to: 67-69, 74-75, 87-118, 149-156

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@oxen-rust/crates/server/src/controllers/versions/chunks.rs` around lines 29 - 30, The counter metric is incremented and timer started, but the duration histogram is only recorded on the happy path; update the upload handler so the duration histogram (the same histogram used later in this file for chunk uploads) is recorded on every exit path — before every early return and in every error branch (e.g., validation failures, IO/write errors). Concretely, capture the elapsed time from the existing timer (std::time::Instant::now()) and call the same histogram record/observe function used elsewhere in this file (the chunk upload histogram) immediately before each return or error response so all paths (success, invalid-body, bad requests, write failures) report latency.

oxen-rust/crates/lib/src/api/client/import.rs-79-81 (1)

79-81: ⚠️ Potential issue | 🟡 Minor

Record duration once at function boundary instead of inside the retry closure.

Current placement misses terminal-failure latency and couples timing to the success parsing path.

⏱️ Suggested placement

- retry::with_retry(&config, |_attempt| { + let result = retry::with_retry(&config, |_attempt| { @@ - metrics::histogram!("oxen_client_import_upload_zip_duration_seconds") - .record(timer.elapsed().as_secs_f64()); Ok(response.commit) } }) - .await + .await; + metrics::histogram!("oxen_client_import_upload_zip_duration_seconds") + .record(timer.elapsed().as_secs_f64()); + result

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@oxen-rust/crates/lib/src/api/client/import.rs` around lines 79 - 81, The timing call metrics::histogram!("oxen_client_import_upload_zip_duration_seconds").record(timer.elapsed().as_secs_f64()) is inside the retry closure and only records on the success/parse path; move timing to the function boundary by creating the timer (e.g., let timer = Instant::now()) before entering the retry closure and calling metrics::histogram!(...).record(timer.elapsed().as_secs_f64()) once after the retry completes (both on success path that returns Ok(response.commit) and on the error return path) so terminal failures are measured; remove the in-closure record and ensure the timer symbol and the histogram call are used outside the closure surrounding the retry logic.

oxen-rust/crates/server/src/controllers/branches.rs-21-22 (1)

21-22: ⚠️ Potential issue | 🟡 Minor

Record the `namespace` and `repo_name` fields after extracting them.

With `skip_all, fields(namespace, repo_name)`, the span declares these fields but leaves them empty. They remain empty unless you call `Span::record()` after extracting the values via `path_param()`, making the spans impossible to filter by repo. The same issue occurs in the `show` handler and across other instrumented functions.

Example fix for the `index` handler:

pub async fn index(req: HttpRequest) -> actix_web::Result<HttpResponse, OxenHttpError> { metrics::counter!("oxen_server_branches_index_total").increment(1); let app_data = app_data(&req)?; let namespace = path_param(&req, "namespace")?; let name = path_param(&req, "repo_name")?; tracing::Span::current() .record("namespace", &namespace) .record("repo_name", &name); let repo = get_repo(&app_data.path, namespace, name)?; // ... rest of function }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@oxen-rust/crates/server/src/controllers/branches.rs` around lines 21 - 22, The tracing span declares fields namespace and repo_name via #[tracing::instrument(skip_all, fields(namespace, repo_name))] but never records them, leaving the fields empty; after extracting the values with path_param() in handlers like index and show, call tracing::Span::current().record("namespace", &namespace) and .record("repo_name", &name) (or the local variable names you used) immediately after extraction so the span contains the repo info for filtering; apply the same pattern to other instrumented functions that declare these fields.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: `d3ecc3d1-5332-4e19-b03f-18afb24e09f2`

📥 Commits

Reviewing files that changed from the base of the PR and between 3988b1a5d2431fabd05226860a0c67d56938ad77 and 506c67ec4e1d351902be565eb87251a4aaebff72.

⛔ Files ignored due to path filters (2)

* `oxen-python/Cargo.lock` is excluded by `!**/*.lock`
* `oxen-rust/Cargo.lock` is excluded by `!**/*.lock`

📒 Files selected for processing (95)

* `oxen-rust/Cargo.toml`
* `oxen-rust/config_for_tracing_metrics.md`
* `oxen-rust/crates/cli/Cargo.toml`
* `oxen-rust/crates/cli/src/main.rs`
* `oxen-rust/crates/lib/Cargo.toml`
* `oxen-rust/crates/lib/src/api/client/branches.rs`
* `oxen-rust/crates/lib/src/api/client/commits.rs`
* `oxen-rust/crates/lib/src/api/client/compare.rs`
* `oxen-rust/crates/lib/src/api/client/data_frames.rs`
* `oxen-rust/crates/lib/src/api/client/diff.rs`
* `oxen-rust/crates/lib/src/api/client/dir.rs`
* `oxen-rust/crates/lib/src/api/client/entries.rs`
* `oxen-rust/crates/lib/src/api/client/export.rs`
* `oxen-rust/crates/lib/src/api/client/file.rs`
* `oxen-rust/crates/lib/src/api/client/import.rs`
* `oxen-rust/crates/lib/src/api/client/merger.rs`
* `oxen-rust/crates/lib/src/api/client/metadata.rs`
* `oxen-rust/crates/lib/src/api/client/repositories.rs`
* `oxen-rust/crates/lib/src/api/client/schemas.rs`
* `oxen-rust/crates/lib/src/api/client/stats.rs`
* `oxen-rust/crates/lib/src/api/client/tree.rs`
* `oxen-rust/crates/lib/src/api/client/version.rs`
* `oxen-rust/crates/lib/src/api/client/versions.rs`
* `oxen-rust/crates/lib/src/api/client/workspaces.rs`
* `oxen-rust/crates/lib/src/api/client/workspaces/changes.rs`
* `oxen-rust/crates/lib/src/api/client/workspaces/commits.rs`
* `oxen-rust/crates/lib/src/api/client/workspaces/data_frames.rs`
* `oxen-rust/crates/lib/src/api/client/workspaces/data_frames/columns.rs`
* `oxen-rust/crates/lib/src/api/client/workspaces/data_frames/embeddings.rs`
* `oxen-rust/crates/lib/src/api/client/workspaces/data_frames/rows.rs`
* `oxen-rust/crates/lib/src/api/client/workspaces/files.rs`
* `oxen-rust/crates/lib/src/core/v_latest/index/commit_merkle_tree.rs`
* `oxen-rust/crates/lib/src/repositories/add.rs`
* `oxen-rust/crates/lib/src/repositories/branches.rs`
* `oxen-rust/crates/lib/src/repositories/checkout.rs`
* `oxen-rust/crates/lib/src/repositories/clone.rs`
* `oxen-rust/crates/lib/src/repositories/commits.rs`
* `oxen-rust/crates/lib/src/repositories/data_frames.rs`
* `oxen-rust/crates/lib/src/repositories/diffs.rs`
* `oxen-rust/crates/lib/src/repositories/download.rs`
* `oxen-rust/crates/lib/src/repositories/entries.rs`
* `oxen-rust/crates/lib/src/repositories/fetch.rs`
* `oxen-rust/crates/lib/src/repositories/fork.rs`
* `oxen-rust/crates/lib/src/repositories/init.rs`
* `oxen-rust/crates/lib/src/repositories/load.rs`
* `oxen-rust/crates/lib/src/repositories/merge.rs`
* `oxen-rust/crates/lib/src/repositories/metadata.rs`
* `oxen-rust/crates/lib/src/repositories/prune.rs`
* `oxen-rust/crates/lib/src/repositories/pull.rs`
* `oxen-rust/crates/lib/src/repositories/push.rs`
* `oxen-rust/crates/lib/src/repositories/restore.rs`
* `oxen-rust/crates/lib/src/repositories/rm.rs`
* `oxen-rust/crates/lib/src/repositories/save.rs`
* `oxen-rust/crates/lib/src/repositories/size.rs`
* `oxen-rust/crates/lib/src/repositories/stats.rs`
* `oxen-rust/crates/lib/src/repositories/status.rs`
* `oxen-rust/crates/lib/src/repositories/tree.rs`
* `oxen-rust/crates/lib/src/repositories/workspaces.rs`
* `oxen-rust/crates/lib/src/test.rs`
* `oxen-rust/crates/lib/src/util.rs`
* `oxen-rust/crates/lib/src/util/logging.rs`
* `oxen-rust/crates/lib/src/util/telemetry.rs`
* `oxen-rust/crates/server/Cargo.toml`
* `oxen-rust/crates/server/src/controllers/action.rs`
* `oxen-rust/crates/server/src/controllers/branches.rs`
* `oxen-rust/crates/server/src/controllers/commits.rs`
* `oxen-rust/crates/server/src/controllers/data_frames.rs`
* `oxen-rust/crates/server/src/controllers/diff.rs`
* `oxen-rust/crates/server/src/controllers/dir.rs`
* `oxen-rust/crates/server/src/controllers/entries.rs`
* `oxen-rust/crates/server/src/controllers/export.rs`
* `oxen-rust/crates/server/src/controllers/fork.rs`
* `oxen-rust/crates/server/src/controllers/health.rs`
* `oxen-rust/crates/server/src/controllers/import.rs`
* `oxen-rust/crates/server/src/controllers/merger.rs`
* `oxen-rust/crates/server/src/controllers/metadata.rs`
* `oxen-rust/crates/server/src/controllers/migrations.rs`
* `oxen-rust/crates/server/src/controllers/namespaces.rs`
* `oxen-rust/crates/server/src/controllers/not_found.rs`
* `oxen-rust/crates/server/src/controllers/oxen_version.rs`
* `oxen-rust/crates/server/src/controllers/prune.rs`
* `oxen-rust/crates/server/src/controllers/repositories.rs`
* `oxen-rust/crates/server/src/controllers/revisions.rs`
* `oxen-rust/crates/server/src/controllers/schemas.rs`
* `oxen-rust/crates/server/src/controllers/tree.rs`
* `oxen-rust/crates/server/src/controllers/versions.rs`
* `oxen-rust/crates/server/src/controllers/versions/chunks.rs`
* `oxen-rust/crates/server/src/controllers/workspaces.rs`
* `oxen-rust/crates/server/src/controllers/workspaces/changes.rs`
* `oxen-rust/crates/server/src/controllers/workspaces/data_frames.rs`
* `oxen-rust/crates/server/src/controllers/workspaces/data_frames/columns.rs`
* `oxen-rust/crates/server/src/controllers/workspaces/data_frames/embeddings.rs`
* `oxen-rust/crates/server/src/controllers/workspaces/data_frames/rows.rs`
* `oxen-rust/crates/server/src/controllers/workspaces/files.rs`
* `oxen-rust/crates/server/src/main.rs`

💤 Files with no reviewable changes (1)

* oxen-rust/crates/lib/src/util/logging.rs
#[tracing::instrument(skip(remote_repo, path, opts))]
pub async fn get(
    remote_repo: &RemoteRepository,
    commit_or_branch: &str,
    path: impl AsRef<Path>,
    opts: DFOpts,
) -> Result<JsonDataFrameViewResponse, OxenError> {
    metrics::counter!("oxen_client_data_frames_get_total").increment(1);
Stop logging full dataframe payloads to the new JSON log sink.
`body`/`val` here contain the page contents. With file-backed structured logging enabled in this PR, these debug lines will persist raw row data and schema metadata to disk and can explode log volume on hot read paths. Please log only stable request metadata or a truncated/redacted summary instead.
Also applies to: 29-34
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@oxen-rust/crates/lib/src/api/client/data_frames.rs` around lines 13 - 20, The
get function in data_frames.rs is currently logging full dataframe payloads
(variables like body/val returned in JsonDataFrameViewResponse) which will
persist raw row data; replace those debug/info logs with logging only stable
request metadata (remote_repo.name, commit_or_branch, path.as_ref().display(),
opts) and a redacted or size-limited summary (e.g., row_count,
schema_fields_count, first_n_rows_preview) instead of the full body/val; ensure
the same change is applied to any other logging in this file that references
body or val (including the other debug lines noted) so raw page contents are
never emitted to the JSON/file log sink.
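The "truncated/redacted summary" the finding asks for can be a tiny helper. This is a hypothetical sketch (`body_log_summary` is not a function in the repo): it logs only the byte count and a bounded prefix of the body, so full dataframe payloads never reach the log sink.

```rust
// Hypothetical helper: summarize a response body for logging instead of
// dumping the full payload. Only the length and a short prefix can leak.
fn body_log_summary(body: &str, max_prefix: usize) -> String {
    let prefix: String = body.chars().take(max_prefix).collect();
    let truncated = body.chars().count() > max_prefix;
    format!(
        "len={} prefix={:?}{}",
        body.len(),
        prefix,
        if truncated { "…" } else { "" }
    )
}
```

A call site would then be `log::debug!("got df page {}", body_log_summary(&body, 64))` rather than `log::debug!("{body}")`, keeping hot read paths cheap regardless of page size.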
#[tracing::instrument(skip(remote_repo, path))]
pub async fn index(
    remote_repo: &RemoteRepository,
    commit_or_branch: &str,
    path: impl AsRef<Path>,
) -> Result<StatusMessage, OxenError> {
    metrics::counter!("oxen_client_data_frames_index_total").increment(1);
    let path_str = path.as_ref().to_str().unwrap();
Normalize index paths the same way as the other data-frame calls.
Unlike `get` and `from_directory`, this still uses `to_str().unwrap()`. That preserves `\` separators on Windows and panics on valid non-UTF8 paths, so the index endpoint can fail for repo paths the sibling calls already handle.
🛠️ Suggested fix
- let path_str = path.as_ref().to_str().unwrap();
+ let path_str = util::fs::to_unix_str(path);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@oxen-rust/crates/lib/src/api/client/data_frames.rs` around lines 43 - 50, The
index function currently uses path.as_ref().to_str().unwrap(), which panics on
non-UTF-8 paths and preserves Windows backslashes; change the assignment of
path_str in index to use the same normalization used by the other data-frame
helpers (the non-panicking, platform-normalizing routine used by get and
from_directory). Replace the to_str().unwrap() call with the shared normalizer
(or Path::to_string_lossy() followed by replacing '\' with '/' or the project's
normalize_path utility) so path_str is UTF-safe and has normalized '/'
separators; keep the same variable name (path_str) and only alter how it's
produced.
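The suggested fix references `util::fs::to_unix_str`; its exact implementation isn't shown in this diff, so the sketch below is an assumption about what such a normalizer does: lossy UTF-8 conversion (no panic on non-UTF-8 bytes) plus backslash-to-slash rewriting so Windows paths produce the same URL segments as Unix ones.

```rust
use std::path::Path;

// Assumed shape of a `to_unix_str`-style normalizer: never panics, and
// always yields forward-slash separators for use in request URIs.
fn to_unix_str(path: impl AsRef<Path>) -> String {
    path.as_ref().to_string_lossy().replace('\\', "/")
}
```

Swapping `path.as_ref().to_str().unwrap()` for a helper like this makes the `index` call behave the same as its siblings on all platforms.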
#[tracing::instrument(
    skip(remote_repo, remote_path, local_path, revision),
    fields(num_bytes)
)]
pub async fn download_large_entry(
    remote_repo: &RemoteRepository,
    remote_path: impl AsRef<Path>,
    local_path: impl AsRef<Path>,
    revision: impl AsRef<str>,
    num_bytes: u64,
) -> Result<(), OxenError> {
    metrics::counter!("oxen_client_entries_download_large_entry_total").increment(1);
Fix the extra zero-length chunk on exact chunk boundaries.
`((total_size / chunk_size) + 1)` overcounts whenever `total_size % chunk_size == 0`, so the last task is queued with `chunk_size = 0`. That can trigger a bogus `/chunk` request for files whose size is an exact multiple of `AVG_CHUNK_SIZE`.
🛠️ Suggested fix
- let num_chunks = ((total_size / chunk_size) + 1) as usize;
- let mut chunk_size = chunk_size;
+ let num_chunks = ((total_size + chunk_size - 1) / chunk_size) as usize;
@@
for i in 0..num_chunks {
// Make sure we read the last size correctly
let chunk_start = (i as u64) * chunk_size;
- if (chunk_start + chunk_size) > total_size {
- chunk_size = total_size % chunk_size;
- }
+ let this_chunk_size = std::cmp::min(chunk_size, total_size - chunk_start);
let filename = format!("chunk_{i}");
let tmp_file = tmp_dir.join(filename);
@@
tmp_file,
revision.as_ref().to_string(),
chunk_start,
- chunk_size,
+ this_chunk_size,
));
}Also applies to: 597-614
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@oxen-rust/crates/lib/src/api/client/entries.rs` around lines 555 - 566, The
code in download_large_entry is computing chunk count as ((total_size /
chunk_size) + 1) which creates a zero-length final chunk when total_size is an
exact multiple of chunk_size; update the logic (and similarly the block around
lines 597-614) to compute the number of chunks using integer ceil (e.g.,
(total_size + chunk_size - 1) / chunk_size) or otherwise avoid adding 1
unconditionally, and ensure you never enqueue a chunk task with length 0 by
validating chunk_size > 0 before pushing the task or computing chunk end with
min(start + chunk_size, total_size).
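The ceil-division arithmetic in the fix above can be isolated and checked on its own. A minimal sketch (the function name `chunk_sizes` is illustrative, not from the codebase):

```rust
// Plan chunk lengths with ceil division: an exact multiple of `chunk_size`
// must not produce a trailing zero-length chunk.
fn chunk_sizes(total_size: u64, chunk_size: u64) -> Vec<u64> {
    assert!(chunk_size > 0, "chunk_size must be non-zero");
    let num_chunks = (total_size + chunk_size - 1) / chunk_size; // ceil division
    (0..num_chunks)
        .map(|i| {
            let start = i * chunk_size;
            chunk_size.min(total_size - start) // last chunk may be short, never 0
        })
        .collect()
}
```

With the original `(total_size / chunk_size) + 1` formula, `chunk_sizes(8, 4)` would have planned three chunks of 4, 4, and 0 bytes; the ceil form plans exactly two.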
#[tracing::instrument(skip(remote_repo, path))]
pub async fn rm(
    remote_repo: &RemoteRepository,
    workspace_id: &str,
    path: impl AsRef<Path>,
) -> Result<(), OxenError> {
    metrics::counter!("oxen_client_workspaces_changes_rm_total").increment(1);
Do not record `workspace_id` in `rm` span fields.

Line 49 currently captures `workspace_id` in tracing metadata, unlike `list`, and this can leak identifiers into structured logs/traces.
Proposed fix
-#[tracing::instrument(skip(remote_repo, path))]
+#[tracing::instrument(skip(remote_repo, workspace_id, path))]
pub async fn rm(
remote_repo: &RemoteRepository,
workspace_id: &str,
path: impl AsRef<Path>,
) -> Result<(), OxenError> {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@oxen-rust/crates/lib/src/api/client/workspaces/changes.rs` around lines 49 -
55, The tracing span for function rm is currently capturing workspace_id and
leaking identifiers; update the tracing attribute on fn rm to prevent capturing
workspace_id by adding it to the skip list (e.g. change
#[tracing::instrument(skip(remote_repo, path))] to also skip workspace_id) so
workspace_id is not recorded in span fields; locate the
#[tracing::instrument(...)] on the async fn rm and modify the attribute
accordingly.
#[tracing::instrument]
pub fn start_fork(
    original_path: PathBuf,
    new_path: PathBuf,
) -> Result<ForkStartResponse, OxenError> {
🧩 Analysis chain
🏁 Script executed:
# Find the fork.rs file
find . -name "fork.rs" -type f | head -5
Repository: Oxen-AI/Oxen
Length of output: 240
🏁 Script executed:
# Read the file to examine lines 49-53 and 106-107
cat -n oxen-rust/crates/lib/src/repositories/fork.rs | sed -n '40,120p'
Length of output: 3617
Skip path arguments in tracing instrumentation to avoid leaking filesystem paths to structured logs.
Lines 49 and 106 capture PathBuf/Path arguments by default in the tracing instrumentation, which can expose absolute filesystem paths in logs.
-#[tracing::instrument]
+#[tracing::instrument(skip(original_path, new_path))]
pub fn start_fork(
original_path: PathBuf,
new_path: PathBuf,
) -> Result<ForkStartResponse, OxenError> {

-#[tracing::instrument]
+#[tracing::instrument(skip(repo_path))]
 pub fn get_fork_status(repo_path: &Path) -> Result<ForkStatusResponse, OxenError> {
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
#[tracing::instrument(skip(original_path, new_path))]
pub fn start_fork(
    original_path: PathBuf,
    new_path: PathBuf,
) -> Result<ForkStartResponse, OxenError> {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@oxen-rust/crates/lib/src/repositories/fork.rs` around lines 49 - 53, The
tracing instrumentation on start_fork currently captures PathBuf arguments
(original_path, new_path) and may leak filesystem paths; update the attribute to
#[tracing::instrument(skip(original_path, new_path))] on the start_fork function
to prevent those fields from being recorded, and apply the same pattern to any
other functions in this file that accept Path or PathBuf parameters (e.g., the
function at ~line 106) so all path args are excluded from structured logs.
#[tracing::instrument(skip_all, fields(namespace, repo_name))]
pub async fn upload(
    req: HttpRequest,
    query: web::Query<ChunkQuery>,
    mut body: web::Payload,
) -> Result<HttpResponse, OxenHttpError> {
    metrics::counter!("oxen_server_versions_chunks_upload_total").increment(1);
    let timer = std::time::Instant::now();
❓ Verification inconclusive
Script executed:
#!/bin/bash
# Expect no matches today; if so, the declared span fields are never recorded.
rg -n 'Span::current\(\)\.record|\.record\("namespace"|\.record\("repo_name"' \
  oxen-rust/crates/server/src/controllers/versions/chunks.rs
Exit code: 1
Script executed:
cat -n oxen-rust/crates/server/src/controllers/versions/chunks.rs | head -200
Exit code: 0
stdout:
1 use std::path::PathBuf;
2
3 use crate::errors::OxenHttpError;
4 use crate::helpers::get_repo;
5 use crate::params::{app_data, path_param};
6
7 use actix_web::{HttpRequest, HttpResponse, web};
8 use futures_util::stream::StreamExt as _;
9 use liboxen::constants::AVG_CHUNK_SIZE;
10 use liboxen::core;
11 use liboxen::repositories;
12 use liboxen::view::StatusMessage;
13 use liboxen::view::versions::CompleteVersionUploadRequest;
14 use serde::Deserialize;
15 use tokio::io::AsyncWriteExt;
16
17 #[derive(Deserialize, Debug)]
18 pub struct ChunkQuery {
19 pub offset: Option<u64>,
20 pub size: Option<u64>,
21 }
22
23 #[tracing::instrument(skip_all, fields(namespace, repo_name))]
24 pub async fn upload(
25 req: HttpRequest,
26 query: web::Query<ChunkQuery>,
27 mut body: web::Payload,
28 ) -> Result<HttpResponse, OxenHttpError> {
29 metrics::counter!("oxen_server_versions_chunks_upload_total").increment(1);
30 let timer = std::time::Instant::now();
31 let app_data = app_data(&req)?;
32 let namespace = path_param(&req, "namespace")?;
33 let repo_name = path_param(&req, "repo_name")?;
34 let version_id = path_param(&req, "version_id")?;
35
36 let offset = query.offset.unwrap_or(0);
37
38 let repo = get_repo(&app_data.path, namespace, repo_name)?;
39
40 log::debug!(
41 "/upload version {} chunk offset{} to repo: {:?}",
42 version_id,
43 offset,
44 repo.path
45 );
46
47 let version_store = repo.version_store()?;
48
49 let mut writer = version_store
50 .get_version_chunk_writer(&version_id, offset)
51 .await?;
52
53 // Write chunks in stream
54 while let Some(chunk_result) = body.next().await {
55 let chunk = chunk_result.map_err(|e| OxenHttpError::BadRequest(e.to_string().into()))?;
56 writer
57 .write_all(&chunk)
58 .await
59 .map_err(|e| OxenHttpError::BasicError(e.to_string().into()))?;
60 }
61
62 writer
63 .flush()
64 .await
65 .map_err(|e| OxenHttpError::BasicError(e.to_string().into()))?;
66
67 metrics::histogram!("oxen_server_versions_chunks_upload_duration_seconds")
68 .record(timer.elapsed().as_secs_f64());
69 Ok(HttpResponse::Ok().json(StatusMessage::resource_created()))
70 }
71
72 #[tracing::instrument(skip_all, fields(namespace, repo_name))]
73 pub async fn complete(req: HttpRequest, body: String) -> Result<HttpResponse, OxenHttpError> {
74 metrics::counter!("oxen_server_versions_chunks_complete_total").increment(1);
75 let timer = std::time::Instant::now();
76 let app_data = app_data(&req)?;
77 let namespace = path_param(&req, "namespace")?;
78 let repo_name = path_param(&req, "repo_name")?;
79 let version_id = path_param(&req, "version_id")?;
80 let repo = get_repo(&app_data.path, namespace, repo_name)?;
81
82 log::debug!("/complete version chunk upload to repo: {:?}", repo.path);
83
84 // Try to deserialize the body
85 let request: Result<CompleteVersionUploadRequest, serde_json::Error> =
86 serde_json::from_str(&body);
87 if let Ok(request) = request {
88 // There should only be a single file in the request
89 if request.files.len() != 1 {
90 return Ok(HttpResponse::BadRequest().json(StatusMessage::error(
91 "Expected a single file in the request",
92 )));
93 }
94
95 let file = &request.files[0];
96 // Support both new clients (num_chunks) and old clients (upload_results)
97 let num_chunks = file
98 .num_chunks
99 .or_else(|| file.upload_results.as_ref().map(|r| r.len()))
100 .ok_or_else(|| {
101 OxenHttpError::BadRequest(
102 "Missing both num_chunks and upload_results in request".into(),
103 )
104 })?;
105 log::debug!("Client uploaded {num_chunks} chunks");
106 let version_store = repo.version_store()?;
107
108 let chunks = version_store.list_version_chunks(&version_id).await?;
109 log::debug!("Found {} chunks on server", chunks.len());
110
111 if chunks.len() != num_chunks {
112 return Ok(
113 HttpResponse::BadRequest().json(StatusMessage::error(format!(
114 "Number of chunks does not match expected number of chunks: {} != {}",
115 chunks.len(),
116 num_chunks
117 ))),
118 );
119 }
120
121 // Combine all the chunks for a version file into a single file
122 let version_path = version_store
123 .combine_version_chunks(&version_id, true)
124 .await?;
125
126 // If the workspace id is provided, stage the file
127 if let Some(workspace_id) = request.workspace_id {
128 let Some(workspace) = repositories::workspaces::get(&repo, &workspace_id)? else {
129 return Ok(HttpResponse::NotFound().json(StatusMessage::error(format!(
130 "Workspace not found: {workspace_id}"
131 ))));
132 };
133 // TODO: Can we just replace workspaces::files::add with this?
134 // repositories::workspaces::files::add(&workspace, &version_path)?;
135 let dst_path = if let Some(dst_dir) = &file.dst_dir {
136 dst_dir.join(file.file_name.clone())
137 } else {
138 PathBuf::from(file.file_name.clone())
139 };
140
141 core::v_latest::workspaces::files::add_version_file(
142 &workspace,
143 &version_path,
144 &dst_path,
145 &version_id,
146 )?;
147 }
148
149 metrics::histogram!("oxen_server_versions_chunks_complete_duration_seconds")
150 .record(timer.elapsed().as_secs_f64());
151 return Ok(HttpResponse::Ok().json(StatusMessage::resource_found()));
152 }
153
154 metrics::histogram!("oxen_server_versions_chunks_complete_duration_seconds")
155 .record(timer.elapsed().as_secs_f64());
156 Ok(HttpResponse::BadRequest().json(StatusMessage::error("Invalid request body")))
157 }
158
159 // TODO: Add content-type and oxen-revision-id in the response header
160 // Currently, this endpoint is not used anywhere.
161 #[tracing::instrument(skip_all, fields(namespace, repo_name))]
162 pub async fn download(
163 req: HttpRequest,
164 query: web::Query<ChunkQuery>,
165 ) -> Result<HttpResponse, OxenHttpError> {
166 metrics::counter!("oxen_server_versions_chunks_download_total").increment(1);
167 let app_data = app_data(&req)?;
168 let namespace = path_param(&req, "namespace")?;
169 let repo_name = path_param(&req, "repo_name")?;
170 let version_id = path_param(&req, "version_id")?;
171 let repo = get_repo(&app_data.path, namespace, repo_name)?;
172
173 let offset = query.offset.unwrap_or(0);
174 let size = query.size.unwrap_or(AVG_CHUNK_SIZE);
175
176 log::debug!(
177 "download_chunk for repo: {:?}, file_hash: {}, offset: {}, size: {}",
178 repo.path,
179 version_id,
180 offset,
181 size
182 );
183
184 let version_store = repo.version_store()?;
185
186 let chunk_data = version_store
187 .get_version_chunk(&version_id, offset, size)
188 .await?;
189 Ok(HttpResponse::Ok()
190 .insert_header((
191 actix_web::http::header::CONTENT_LENGTH,
192 chunk_data.len().to_string(),
193 ))
194 .body(chunk_data))
195 }
196
197 #[tracing::instrument(skip_all)]
198 pub async fn create(_req: HttpRequest, _body: String) -> Result<HttpResponse, OxenHttpError> {
199 metrics::counter!("oxen_server_versions_chunks_create_total").increment(1);
200 Ok(HttpResponse::Ok().json(StatusMessage::resource_found()))
Span fields namespace and repo_name are declared but never manually recorded.
Both values are extracted from req after the span is created (lines 32–33, 77–78, 168–169), but the declared fields are never populated via Span::current().record(...). With skip_all on the instrument attribute, automatic capture doesn't apply, so these fields will remain empty in traces unless explicitly recorded.
Additionally, download() lacks a timer and duration histogram like upload() and complete() have.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@oxen-rust/crates/server/src/controllers/versions/chunks.rs` around lines 23 -
30, The tracing span in upload() declares fields "namespace" and "repo_name" but
never records them (span is created with skip_all), so after extracting the
values from req you must call tracing::Span::current().record(...) to populate
"namespace" and "repo_name" (use the actual extracted variables from upload(),
e.g., namespace and repo_name, with tracing::field::display or as_str()). Also
add the same request-timing pattern used in upload() and complete() to
download(): create a timer at the start of download(), and record the elapsed
duration to the appropriate histogram metric (mirroring the metric name/behavior
used by upload()/complete()) so download() has consistent duration
observability.
pub mod files;

/// Get or create workspace
#[tracing::instrument(skip_all, fields(namespace, repo_name))]
🧩 Analysis chain
🏁 Script executed:
find . -type f -name "workspaces.rs" | grep -E "server/src/controllers"
Length of output: 111
🏁 Script executed:
# Check the file structure and size
wc -l oxen-rust/crates/server/src/controllers/workspaces.rs
Length of output: 113
🏁 Script executed:
# Read the key sections: around line 23 and the first handler
sed -n '1,70p' oxen-rust/crates/server/src/controllers/workspaces.rs
Length of output: 2759
🏁 Script executed:
# Check around line 147
sed -n '140,160p' oxen-rust/crates/server/src/controllers/workspaces.rs
Length of output: 918
🏁 Script executed:
# Look for Span::record calls to see if fields are actually being recorded
rg "Span::record|span\.record" oxen-rust/crates/server/src/controllers/workspaces.rsRepository: Oxen-AI/Oxen
Length of output: 38
🏁 Script executed:
# Check for any path_param usage and how namespace/repo_name are extracted
rg "path_param|namespace|repo_name" oxen-rust/crates/server/src/controllers/workspaces.rs | head -20Repository: Oxen-AI/Oxen
Length of output: 1467
🏁 Script executed:
# Check the raw body logging at line 62
sed -n '55,70p' oxen-rust/crates/server/src/controllers/workspaces.rs
Length of output: 717
🏁 Script executed:
# Check lines 402-404 and 411 for raw body logging in commit handler
sed -n '395,420p' oxen-rust/crates/server/src/controllers/workspaces.rs
Length of output: 1052
🏁 Script executed:
# Check other handlers mentioned - around line 147, 191, 202
sed -n '190,210p' oxen-rust/crates/server/src/controllers/workspaces.rs
Length of output: 932
🏁 Script executed:
# Verify the pattern - check if all handlers extract namespace/repo_name but don't record them
sed -n '140,160p' oxen-rust/crates/server/src/controllers/workspaces.rs
Length of output: 918
🏁 Script executed:
# Find the path_param function definition
rg -A 5 "fn path_param" --type rustRepository: Oxen-AI/Oxen
Length of output: 994
🏁 Script executed:
# Check if this file imports tracing and uses Span::record anywhere
grep -n "use.*tracing\|use.*log" oxen-rust/crates/server/src/controllers/workspaces.rs | head -20Repository: Oxen-AI/Oxen
Length of output: 38
🏁 Script executed:
# Check if there's any tracing::Span usage or record calls in the entire workspaces.rs
grep -n "tracing\|Span\|record" oxen-rust/crates/server/src/controllers/workspaces.rs | head -20Repository: Oxen-AI/Oxen
Length of output: 642
🏁 Script executed:
# Check the context around line 427 to see the .record() pattern usage
sed -n '420,435p' oxen-rust/crates/server/src/controllers/workspaces.rsRepository: Oxen-AI/Oxen
Length of output: 747
Record namespace and repo_name into these spans.
fields(namespace, repo_name) only declares empty fields. These values are extracted via path_param() but skip_all suppresses automatic argument capture, and without Span::record() calls, the spans never carry the actual repo context.
Also applies to: 147, 191, 202, 255, 281, 323, 363
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@oxen-rust/crates/server/src/controllers/workspaces.rs` at line 23, The
tracing attributes declare fields(namespace, repo_name) but never populate them
because skip_all prevents auto-capture; after you extract namespace and
repo_name via path_param() in each annotated handler, explicitly record them
into the current span (e.g., call Span::current().record(...) or use
span.record(...)) so the span carries the actual values; update each traced
function bearing #[tracing::instrument(skip_all, fields(namespace, repo_name))]
(the ones mentioned in the review) to record the extracted namespace and
repo_name into the span immediately after extraction.
#[tracing::instrument(skip_all, fields(namespace, repo_name))]
pub async fn get(
    req: HttpRequest,
    query: web::Query<DFOptsQuery>,
) -> Result<HttpResponse, OxenHttpError> {
    metrics::counter!("oxen_server_workspaces_data_frames_get_total").increment(1);
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Expect no matches today; if so, the declared span fields are never recorded.
rg -n 'Span::current\(\)\.record|\.record\("namespace"|\.record\("repo_name"' \
  oxen-rust/crates/server/src/controllers/workspaces/data_frames.rs
Length of output: 38
🏁 Script executed:
# Check if namespace and repo_name are being parsed from the request in these handlers
rg -n 'path_param.*namespace|path_param.*repo_name' \
  oxen-rust/crates/server/src/controllers/workspaces/data_frames.rs | head -20
Length of output: 1061
🏁 Script executed:
# Let's also look at the actual handler implementations at a few key line ranges
# to confirm they parse the values but don't record them
sed -n '90,110p' oxen-rust/crates/server/src/controllers/workspaces/data_frames.rs
sed -n '221,240p' oxen-rust/crates/server/src/controllers/workspaces/data_frames.rs
Length of output: 1989
Tracing fields namespace and repo_name never get populated on these spans.
#[instrument(skip_all, fields(namespace, repo_name))] declares empty fields. Both values are parsed from req using path_param(), but nothing records them to the span. These handlers emit spans without repo identity, losing observability.
📌 Suggested fix pattern
let namespace = path_param(&req, "namespace")?;
let repo_name = path_param(&req, "repo_name")?;
+let span = tracing::Span::current();
+span.record("namespace", &tracing::field::display(&namespace));
+span.record("repo_name", &tracing::field::display(&repo_name));Affects: get (90-95), get_schema (221-223), download (275-280), download_streaming (406-411), get_by_branch (479-484), diff (531-537), put (599-601), delete (630-632), rename (650-652)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@oxen-rust/crates/server/src/controllers/workspaces/data_frames.rs` around
lines 90 - 95, The tracing span fields "namespace" and "repo_name" are declared
but never populated; after you parse namespace and repo_name from the
HttpRequest (via path_param()) in each handler (get, get_schema, download,
download_streaming, get_by_branch, diff, put, delete, rename) record those
values into the current span—e.g., obtain the parsed strings and call
tracing::Span::current().record("namespace", &namespace.as_str()) and
.record("repo_name", &repo_name.as_str()) (or create a new span with those
fields) so the declared fields are set on the emitted span for proper
observability.
| metrics::counter!("oxen_server_workspaces_data_frames_get_by_branch_total").increment(1); | ||
| let app_data = app_data(&req).unwrap(); |
Don't panic on missing app data.
unwrap() turns a recoverable extractor failure into a handler panic. Keeping ? here preserves the existing HTTP error path instead of unwinding a request worker.
🛠️ Minimal fix
- let app_data = app_data(&req).unwrap();
+ let app_data = app_data(&req)?;
📝 Committable suggestion
| metrics::counter!("oxen_server_workspaces_data_frames_get_by_branch_total").increment(1); | |
| let app_data = app_data(&req).unwrap(); | |
| metrics::counter!("oxen_server_workspaces_data_frames_get_by_branch_total").increment(1); | |
| let app_data = app_data(&req)?; |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@oxen-rust/crates/server/src/controllers/workspaces/data_frames.rs` around
lines 484 - 485, The handler currently calls app_data(&req).unwrap(), which can
panic on extractor failure; replace the unwrap() with the fallible operator
(app_data(&req)?), and propagate the error by ensuring the enclosing function
signature returns an appropriate Result (e.g., Result<impl Responder,
actix_web::Error> or the existing handler error type). Update any necessary use
of the returned app_data variable afterwards (no other logic changes), or if the
handler must map the error, convert it with .map_err(...) to the handler's error
type instead of unwrapping.
}

/// Get file from workspace
#[tracing::instrument(skip_all, fields(namespace, repo_name))]
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify instrumented handlers declare namespace/repo_name fields
# and whether those fields are explicitly recorded in this file.
rg -n '^\#\[tracing::instrument\(.*fields\(namespace, repo_name\).*\)\]' oxen-rust/crates/server/src/controllers/workspaces/files.rs
rg -n 'Span::current\(\)\.record\("namespace"|"repo_name"' oxen-rust/crates/server/src/controllers/workspaces/files.rs
Length of output: 1345
🏁 Script executed:
# Read the file and check function bodies for span recording patterns
cat -n oxen-rust/crates/server/src/controllers/workspaces/files.rs | sed -n '53,100p'
Length of output: 3150
🏁 Script executed:
# Search for any span.record calls in the entire file, with broader pattern
rg -n 'span\s*\.\s*record|Span::current\(\)' oxen-rust/crates/server/src/controllers/workspaces/files.rs
Length of output: 38
🏁 Script executed:
# Check for any tracing-related operations in the file
rg -n 'tracing|Span' oxen-rust/crates/server/src/controllers/workspaces/files.rs | head -30
Length of output: 456
🏁 Script executed:
# Count total lines in the file to understand scope
wc -l oxen-rust/crates/server/src/controllers/workspaces/files.rs
Length of output: 119
Record declared span fields in all instrumented handlers.
At lines 53, 204, 283, 349, 419, and 437, fields(namespace, repo_name) are declared but never recorded, leaving these fields empty in trace output.
Suggested fix pattern
-#[tracing::instrument(skip_all, fields(namespace, repo_name))]
+#[tracing::instrument(
+ skip_all,
+ fields(namespace = tracing::field::Empty, repo_name = tracing::field::Empty)
+)]
pub async fn get(
req: HttpRequest,
query: web::Query<WorkspaceFileQueryParams>,
) -> Result<HttpResponse, OxenHttpError> {
metrics::counter!("oxen_server_workspaces_files_get_total").increment(1);
let app_data = app_data(&req)?;
let namespace = path_param(&req, "namespace")?;
let repo_name = path_param(&req, "repo_name")?;
+ let span = tracing::Span::current();
+ span.record("namespace", tracing::field::display(&namespace));
+ span.record("repo_name", tracing::field::display(&repo_name));Apply the same pattern to add (204), add_version_files (283), rm_files (349), validate (419), and mv (437).
📝 Committable suggestion
#[tracing::instrument(
    skip_all,
    fields(namespace = tracing::field::Empty, repo_name = tracing::field::Empty)
)]
pub async fn get(
    req: HttpRequest,
    query: web::Query<WorkspaceFileQueryParams>,
) -> Result<HttpResponse, OxenHttpError> {
    metrics::counter!("oxen_server_workspaces_files_get_total").increment(1);
    let app_data = app_data(&req)?;
    let namespace = path_param(&req, "namespace")?;
    let repo_name = path_param(&req, "repo_name")?;
    let span = tracing::Span::current();
    span.record("namespace", tracing::field::display(&namespace));
    span.record("repo_name", tracing::field::display(&repo_name));
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@oxen-rust/crates/server/src/controllers/workspaces/files.rs` at line 53, The
declared span fields "namespace" and "repo_name" are never recorded for the
instrumented handlers; update each instrumented handler (the top-level handler
that has #[tracing::instrument(skip_all, fields(namespace, repo_name))] and the
functions add, add_version_files, rm_files, validate, and mv) to record the
actual values into the current span immediately after entering the function by
calling tracing::Span::current().record(...) for both "namespace" and
"repo_name" (use tracing::field::display or debug as appropriate for the
variable types), ensuring the span fields are populated in trace output.
Actionable comments posted: 14
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
oxen-rust/crates/lib/src/api/client/versions.rs (1)
502-531: ⚠️ Potential issue | 🟠 Major — Duration metric is missing on the success path.
At Line 519 the function returns Ok(()) before histogram recording at Line 529, so successful uploads never emit oxen_client_versions_multipart_batch_upload_with_retry_duration_ms.
🐛 Proposed fix
 pub async fn multipart_batch_upload_with_retry(
@@
     for attempt in 0..max_retries {
         files_to_retry =
             multipart_batch_upload(local_repo, remote_repo, chunk, client, files_to_retry).await?;
         if files_to_retry.is_empty() {
+            metrics::histogram!("oxen_client_versions_multipart_batch_upload_with_retry_duration_ms")
+                .record(timer.elapsed().as_millis() as f64);
             return Ok(());
         }
@@
     metrics::histogram!("oxen_client_versions_multipart_batch_upload_with_retry_duration_ms")
         .record(timer.elapsed().as_millis() as f64);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@oxen-rust/crates/lib/src/api/client/versions.rs` around lines 502 - 531, The duration histogram is only recorded on the failure path; update multipart_batch_upload_with_retry to record the elapsed duration before any early return: capture the timer at the top (already present), and call metrics::histogram!("oxen_client_versions_multipart_batch_upload_with_retry_duration_ms").record(timer.elapsed().as_millis() as f64) immediately before returning Ok(()) inside the loop (and ensure it's also recorded at the end of the function as already present) so successful uploads emit the metric; modify the return path in multipart_batch_upload_with_retry accordingly.oxen-rust/crates/lib/src/api/client/workspaces/files.rs (1)
1244-1291: ⚠️ Potential issue | 🟠 Major — Record download duration on both success and error paths.
Line 1279 and Line 1284 return early before the histogram is recorded, so failed downloads are invisible in latency metrics.
🛠️ Proposed fix
 pub async fn download(
     remote_repo: &RemoteRepository,
     workspace_id: &str,
     path: &str,
     output_path: Option<&Path>,
 ) -> Result<(), OxenError> {
     metrics::counter!("oxen_client_workspaces_files_download_total").increment(1);
     let timer = std::time::Instant::now();
@@
-    if response.status().is_success() {
-        // Save the raw file contents from the response stream
-        let output_path = output_path.unwrap_or_else(|| Path::new(path));
-        let output_dir = output_path.parent().unwrap_or_else(|| Path::new(""));
-
-        if !output_dir.exists() {
-            util::fs::create_dir_all(output_dir)?;
-        }
-
-        let mut file = tokio::fs::File::create(&output_path).await?;
-        let mut stream = response.bytes_stream();
-
-        while let Some(chunk) = stream.next().await {
-            let chunk = chunk?;
-            file.write_all(&chunk).await?;
-        }
-        file.flush().await?;
-    } else {
-        let status = response.status();
-
-        if status == reqwest::StatusCode::NOT_FOUND {
-            return Err(OxenError::path_does_not_exist(path));
-        }
-
-        log::error!("api::client::workspace::files::download failed with status: {status}");
-        let body = client::parse_json_body(&url, response).await?;
-        return Err(OxenError::basic_str(format!(
-            "Error: Could not download file {body:?}"
-        )));
-    }
-
-    metrics::histogram!("oxen_client_workspaces_files_download_duration_ms")
-        .record(timer.elapsed().as_millis() as f64);
-    Ok(())
+    let result: Result<(), OxenError> = async {
+        if response.status().is_success() {
+            // Save the raw file contents from the response stream
+            let output_path = output_path.unwrap_or_else(|| Path::new(path));
+            let output_dir = output_path.parent().unwrap_or_else(|| Path::new(""));
+
+            if !output_dir.exists() {
+                util::fs::create_dir_all(output_dir)?;
+            }
+
+            let mut file = tokio::fs::File::create(&output_path).await?;
+            let mut stream = response.bytes_stream();
+
+            while let Some(chunk) = stream.next().await {
+                let chunk = chunk?;
+                file.write_all(&chunk).await?;
+            }
+            file.flush().await?;
+            Ok(())
+        } else {
+            let status = response.status();
+            if status == reqwest::StatusCode::NOT_FOUND {
+                Err(OxenError::path_does_not_exist(path))
+            } else {
+                log::error!("api::client::workspace::files::download failed with status: {status}");
+                let body = client::parse_json_body(&url, response).await?;
+                Err(OxenError::basic_str(format!(
+                    "Error: Could not download file {body:?}"
+                )))
+            }
+        }
+    }
+    .await;
+
+    metrics::histogram!("oxen_client_workspaces_files_download_duration_ms")
+        .record(timer.elapsed().as_millis() as f64);
+    result
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@oxen-rust/crates/lib/src/api/client/workspaces/files.rs` around lines 1244 - 1291, The download code records duration only on the success path; early returns on error (the NOT_FOUND return and the Err after parsing body) skip the metrics::histogram!("oxen_client_workspaces_files_download_duration_ms").record(...) call. Fix by ensuring the histogram is recorded before any return: refactor the function to produce a Result value (e.g., let result = { ... } ) that captures success or error without returning early, then call metrics::histogram!("oxen_client_workspaces_files_download_duration_ms").record(timer.elapsed().as_millis() as f64); and finally return the result; alternatively, record the histogram immediately before each early return in the branches that call OxenError::path_does_not_exist(path) and the Err after client::parse_json_body(&url, response).await?. Reference symbols: timer, metrics::histogram!("oxen_client_workspaces_files_download_duration_ms"), response, client::parse_json_body, OxenError::path_does_not_exist.
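Several findings in this review are the same shape of bug: a duration histogram recorded at the end of a function is skipped by early returns. One way to make that structurally impossible is a `Drop` guard that records elapsed time on every exit path. A std-only sketch of the idea — the `record` closure stands in for `metrics::histogram!(...).record(...)`, and none of these names are Oxen's actual API:

```rust
use std::time::Instant;

/// Records elapsed time when dropped, so every exit path — normal return,
/// `?` propagation, even panic unwind — emits a duration.
struct TimerGuard<F: FnMut(f64)> {
    start: Instant,
    record: F,
}

impl<F: FnMut(f64)> Drop for TimerGuard<F> {
    fn drop(&mut self) {
        let elapsed_ms = self.start.elapsed().as_secs_f64() * 1000.0;
        (self.record)(elapsed_ms);
    }
}

fn fallible_op(fail: bool) -> Result<(), String> {
    if fail { Err("boom".to_string()) } else { Ok(()) }
}

fn instrumented(fail: bool, durations: &mut Vec<f64>) -> Result<(), String> {
    let _timer = TimerGuard {
        start: Instant::now(),
        record: |ms| durations.push(ms),
    };
    // The early return via `?` still triggers the guard's Drop.
    fallible_op(fail)?;
    Ok(())
}

fn main() {
    let mut durations = Vec::new();
    let _ = instrumented(false, &mut durations);
    let _ = instrumented(true, &mut durations);
    // Both the success and the early-error path recorded a duration.
    assert_eq!(durations.len(), 2);
    println!("recorded {} durations", durations.len());
}
```

The trade-off is that a guard also fires on panics, which can skew latency percentiles; recording explicitly before each return (as the proposed fixes above do) keeps that behavior out.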
🧹 Nitpick comments (8)
oxen-rust/crates/lib/src/repositories/push.rs (1)
65-75: Missing duration histogram for push_remote_branch.
The push function records both a counter and a duration histogram, but push_remote_branch only increments a counter. For consistent observability across push operations, consider adding duration measurement here as well.
♻️ Proposed fix to add duration measurement
 #[tracing::instrument(skip(repo, opts), fields(repo_path = %repo.path.display()))]
 pub async fn push_remote_branch(
     repo: &LocalRepository,
     opts: &PushOpts,
 ) -> Result<Branch, OxenError> {
     metrics::counter!("oxen_repo_push_push_remote_branch_total").increment(1);
-    match repo.min_version() {
+    let timer = std::time::Instant::now();
+    let result = match repo.min_version() {
         MinOxenVersion::V0_10_0 => panic!("v0.10.0 is deprecated"),
         _ => core::v_latest::push::push_remote_branch(repo, opts).await,
-    }
+    };
+    metrics::histogram!("oxen_repo_push_push_remote_branch_duration_ms")
+        .record(timer.elapsed().as_millis() as f64);
+    result
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@oxen-rust/crates/lib/src/repositories/push.rs` around lines 65 - 75, Add a duration histogram for push_remote_branch similar to the one used in push: start a timer (e.g., Instant::now()) before the version match, await the core::v_latest::push::push_remote_branch call, then compute elapsed and record it to a histogram metric (name like "oxen_repo_push_push_remote_branch_duration_seconds") in addition to the existing counter increment; keep the existing match on repo.min_version() and ensure the timer is recorded even if push_remote_branch returns an error by recording after the await or in a finally-like block.oxen-rust/crates/lib/src/repositories/pull.rs (1)
30-30: Consider adding duration histograms for `pull_all` and `pull_remote_branch` too.

Only `pull` currently emits latency; adding timing to the other two improves comparability and dashboard coverage.

📈 Suggested parity instrumentation
pub async fn pull_all(repo: &LocalRepository) -> Result<(), OxenError> { metrics::counter!("oxen_repo_pull_pull_all_total").increment(1); - match repo.min_version() { + let timer = std::time::Instant::now(); + let result = match repo.min_version() { MinOxenVersion::V0_10_0 => panic!("v0.10.0 no longer supported"), _ => core::v_latest::pull::pull_all(repo).await, - } + }; + metrics::histogram!("oxen_repo_pull_pull_all_duration_ms") + .record(timer.elapsed().as_millis() as f64); + result } pub async fn pull_remote_branch( repo: &LocalRepository, fetch_opts: &FetchOpts, ) -> Result<(), OxenError> { metrics::counter!("oxen_repo_pull_pull_remote_branch_total").increment(1); - match repo.min_version() { + let timer = std::time::Instant::now(); + let result = match repo.min_version() { MinOxenVersion::V0_10_0 => panic!("v0.10.0 no longer supported"), _ => core::v_latest::pull::pull_remote_branch(repo, fetch_opts).await, - } + }; + metrics::histogram!("oxen_repo_pull_pull_remote_branch_duration_ms") + .record(timer.elapsed().as_millis() as f64); + result }

Also applies to: 43-43
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@oxen-rust/crates/lib/src/repositories/pull.rs` at line 30, The pull_all and pull_remote_branch paths currently only increment counters but lack duration histograms; add timing instrumentation analogous to the existing pull latency metric by wrapping the body of pull_all and pull_remote_branch with a histogram timer (e.g., metrics::histogram!("oxen_repo_pull_pull_all_duration_seconds") and metrics::histogram!("oxen_repo_pull_remote_branch_duration_seconds") or following the naming used for pull) and record elapsed duration on completion/failure, keeping the existing counter increments (symbols to update: the metric invocation at metrics::counter!("oxen_repo_pull_pull_all_total") and the corresponding counter in pull_remote_branch and the pull function's histogram usage as a pattern to mirror).

oxen-rust/crates/server/src/controllers/workspaces/files.rs (1)
80-84: Consider adding duration metrics for consistency.

The `get` handler increments a counter but doesn't record duration, unlike the `add` handler which tracks both. Given that `get` can involve expensive operations (image resize, video thumbnail generation, file streaming), capturing latency would provide useful observability data.

♻️ Suggested pattern (similar to `add`)
pub async fn get( req: HttpRequest, query: web::Query<WorkspaceFileQueryParams>, ) -> Result<HttpResponse, OxenHttpError> { metrics::counter!("oxen_server_workspaces_files_get_total").increment(1); + let timer = std::time::Instant::now(); let app_data = app_data(&req)?; // ... existing code ... + // Before each return, record duration: + // metrics::histogram!("oxen_server_workspaces_files_get_duration_ms") + // .record(timer.elapsed().as_millis() as f64);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@oxen-rust/crates/server/src/controllers/workspaces/files.rs` around lines 80 - 84, The get handler (pub async fn get) currently only increments metrics::counter!("oxen_server_workspaces_files_get_total") but should also record latency like the add handler; wrap the handler's work with a start time (e.g., let start = Instant::now()) at the top of get and, before returning, record the elapsed duration into a duration metric (e.g., metrics::histogram! or metrics::timing! with a name like "oxen_server_workspaces_files_get_duration_seconds") so you capture request latency for expensive operations (image resize, thumbnail, streaming) alongside the existing counter.

oxen-rust/crates/lib/src/api/client/workspaces/data_frames.rs (1)
280-287: Consider adding duration histograms to other network-calling functions.

Currently only `download` tracks operation duration. Other functions like `put`, `diff`, and `rename_data_frame` also make HTTP requests and could benefit from latency histograms for debugging slow operations. This is optional but would improve observability consistency.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@oxen-rust/crates/lib/src/api/client/workspaces/data_frames.rs` around lines 280 - 287, The telemetry for network operations is inconsistent: only download records duration. For better observability, add a duration histogram around the HTTP call(s) in rename_data_frame (and mirror for other network functions like put and diff) — capture start = Instant::now() before the request, execute the existing remote call(s) inside the function, then record elapsed.as_secs_f64() into a metrics histogram (e.g. metrics::histogram!("oxen_client_workspaces_data_frames_rename_data_frame_duration_seconds").record(elapsed)) and include useful labels/tags if your metrics library supports them; ensure the histogram is recorded in both success and error paths so latency is always reported.

oxen-rust/crates/server/src/controllers/versions.rs (1)
515-525: Duration histogram not recorded on error paths.

The timer starts at line 515, but if `save_multiparts` returns an error (line 521), the histogram at lines 524-525 is skipped. This means failed uploads won't contribute to latency metrics, which could make the metrics optimistically skewed.
♻️ Option: Record duration for all requests
pub async fn batch_upload( req: HttpRequest, payload: Multipart, ) -> Result<HttpResponse, OxenHttpError> { metrics::counter!("oxen_server_versions_batch_upload_total").increment(1); let timer = std::time::Instant::now(); + let result = batch_upload_inner(req, payload).await; + metrics::histogram!("oxen_server_versions_batch_upload_duration_ms") + .record(timer.elapsed().as_millis() as f64); + result +} + +async fn batch_upload_inner( + req: HttpRequest, + payload: Multipart, +) -> Result<HttpResponse, OxenHttpError> { let app_data = app_data(&req)?; // ... rest of implementation - metrics::histogram!("oxen_server_versions_batch_upload_duration_ms") - .record(timer.elapsed().as_millis() as f64); Ok(HttpResponse::Ok().json(ErrorFilesResponse {🤖 Prompt for AI Agents
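Beyond the inner-function split sketched above, the scope-guard idea the review suggests can be shown with a minimal, std-only Rust sketch. All names here are illustrative (not from the Oxen codebase), and the `Cell` counter stands in for a `metrics::histogram!(...).record(ms)` call — the point is only that a `Drop` guard fires on early returns and `?` propagation too:

```rust
use std::cell::Cell;
use std::time::Instant;

// Illustrative guard: when it goes out of scope, it "records" a duration
// sample by bumping a counter (a stand-in for a histogram recorder).
struct TimerGuard<'a> {
    start: Instant,
    recorded: &'a Cell<u32>,
}

impl Drop for TimerGuard<'_> {
    fn drop(&mut self) {
        let _ms = self.start.elapsed().as_millis() as f64; // would be recorded
        self.recorded.set(self.recorded.get() + 1);
    }
}

fn fallible(fail: bool, recorded: &Cell<u32>) -> Result<u32, &'static str> {
    let _guard = TimerGuard { start: Instant::now(), recorded };
    if fail {
        return Err("early return"); // guard still drops here
    }
    Ok(7)
}

fn main() {
    let recorded = Cell::new(0);
    assert_eq!(fallible(false, &recorded), Ok(7));
    assert!(fallible(true, &recorded).is_err());
    assert_eq!(recorded.get(), 2); // both exits emitted a sample
    println!("samples: {}", recorded.get());
}
```

The trade-off versus the wrapper-function refactor is that the guard keeps the handler body untouched, at the cost of a small helper type.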
Verify each finding against the current code and only fix it if needed. In `@oxen-rust/crates/server/src/controllers/versions.rs` around lines 515 - 525, The histogram is skipped when save_multiparts returns an Err because the function returns early; ensure the duration is always recorded by capturing the start Instant (timer) and recording metrics::histogram!("oxen_server_versions_batch_upload_duration_ms").record(...) in a finally-like scope regardless of success or error—e.g., create a small scope guard/Drop helper or wrap the save_multiparts call in a match that records timer.elapsed() before propagating errors; reference the timer variable, the save_multiparts call, and the metrics::histogram! invocation when making the change.

oxen-rust/crates/server/src/controllers/commits.rs (1)
74-74: Tracing span fields are declared but never populated.

The pattern `#[tracing::instrument(skip_all, fields(namespace, repo_name))]` declares `namespace` and `repo_name` fields, but these remain empty because:

- `skip_all` prevents automatic capture of function arguments
- The values are extracted after the span is created (e.g., lines 92-93)
- No `Span::current().record()` calls populate these fields

This applies to all 17 instrumented handlers in this file. The spans will have empty field values, reducing observability.
🔧 Suggested fix pattern
#[tracing::instrument(skip_all, fields(namespace, repo_name))] pub async fn index(req: HttpRequest) -> actix_web::Result<HttpResponse, OxenHttpError> { metrics::counter!("oxen_server_commits_index_total").increment(1); let app_data = app_data(&req)?; let namespace = path_param(&req, "namespace")?; let repo_name = path_param(&req, "repo_name")?; + tracing::Span::current().record("namespace", &namespace); + tracing::Span::current().record("repo_name", &repo_name); let repo = get_repo(&app_data.path, namespace, repo_name)?;Apply this pattern to all handlers that declare these fields.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@oxen-rust/crates/server/src/controllers/commits.rs` at line 74, The tracing spans declare fields but never get populated: for every handler in commits.rs annotated with #[tracing::instrument(skip_all, fields(namespace, repo_name))], either remove skip_all so the macro captures function arguments (rename args to match the field names if needed) or explicitly populate the fields after you parse/extract them by calling tracing::Span::current().record("namespace", &namespace.as_str()) and .record("repo_name", &repo_name.as_str()) (apply this pattern to all 17 instrumented handlers in this file).

oxen-rust/crates/lib/src/api/client/versions.rs (1)
211-235: Record duration once at function exit to cover all terminal outcomes.

Right now, non-auth error exits (Line 236) bypass the histogram. Consider recording once after `with_retry(...).await` returns.

♻️ Proposed refactor for complete timer coverage
pub async fn download_data_from_version_paths( @@ ) -> Result<u64, OxenError> { metrics::counter!("oxen_client_versions_download_data_from_version_paths_total").increment(1); let timer = std::time::Instant::now(); let config = retry::RetryConfig::default(); - retry::with_retry(&config, |_attempt| async { + let result = retry::with_retry(&config, |_attempt| async { match try_download_data_from_version_paths(remote_repo, hashes, local_repo).await { - Ok(val) => { - metrics::histogram!( - "oxen_client_versions_download_data_from_version_paths_duration_ms" - ) - .record(timer.elapsed().as_millis() as f64); - Ok(val) - } - Err(OxenError::Authentication(val)) => { - metrics::histogram!( - "oxen_client_versions_download_data_from_version_paths_duration_ms" - ) - .record(timer.elapsed().as_millis() as f64); - Err(OxenError::Authentication(val)) - } + Ok(val) => Ok(val), + Err(OxenError::Authentication(val)) => Err(OxenError::Authentication(val)), Err(err) => Err(err), } }) - .await + .await; + metrics::histogram!("oxen_client_versions_download_data_from_version_paths_duration_ms") + .record(timer.elapsed().as_millis() as f64); + result }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@oxen-rust/crates/lib/src/api/client/versions.rs` around lines 211 - 235, The duration histogram is only recorded on Ok and Authentication error paths inside the retry closure, so moves/other errors skip recording; refactor download_data_from_version_paths to call retry::with_retry(...).await first, capture its Result, then record metrics::histogram!("oxen_client_versions_download_data_from_version_paths_duration_ms").record(timer.elapsed().as_millis() as f64) once immediately after with_retry returns, and finally return or propagate the captured Result from that call; remove the duplicate histogram.record calls inside the match/closure (referencing download_data_from_version_paths, try_download_data_from_version_paths, with_retry, and the histogram metric name).

oxen-rust/crates/lib/src/repositories/merge.rs (1)
144-148: Consider adding `branch_name` as an explicit span field in `merge`.

`branch_name` is currently skipped, which drops useful trace context during merge debugging.

Proposed tweak
-#[tracing::instrument(skip(repo, branch_name), fields(repo_path = %repo.path.display()))] +#[tracing::instrument( + skip(repo, branch_name), + fields(repo_path = %repo.path.display(), branch_name = %branch_name.as_ref()) +)] pub async fn merge(🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@oxen-rust/crates/lib/src/repositories/merge.rs` around lines 144 - 148, The tracing span for merge currently skips branch_name which loses useful trace context; update the #[tracing::instrument(...)] on the merge function to only skip repo and add branch_name as a span field (e.g. change skip(repo, branch_name) to skip(repo) and add fields(repo_path = %repo.path.display(), branch_name = %branch_name.as_ref()) so the span records the branch name while still skipping the repo value; target the #[tracing::instrument] on the merge function and the branch_name parameter to implement this change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@oxen-rust/crates/cli/src/cmd/fsck.rs`:
- Line 57: The CLI prints elapsed time using result.elapsed.as_millis() but
labels it with "s", causing a units mismatch; update the fsck output to either
change the suffix to "ms" or convert the elapsed value to seconds (e.g., use
result.elapsed.as_secs_f64()) so the printed unit matches the value — locate the
println in fsck.rs that references result.elapsed.as_millis() and adjust the
value or the suffix accordingly.
In `@oxen-rust/crates/lib/src/api/client/import.rs`:
- Around line 79-80: The upload timing is only recorded after a successful
response parse, leaving error paths unmeasured; capture the start timer before
the HTTP/send/retry block and ensure
metrics::histogram!("oxen_client_import_upload_zip_duration_ms").record(timer.elapsed().as_millis()
as f64) is executed exactly once after the retry logic completes regardless of
success or failure (e.g., in a finally/cleanup block or after matching the
Result from the request), so move the record call out of the success-only branch
and into the post-retry completion path surrounding the send/parse/retry logic
and timer variable used in import.rs.
In `@oxen-rust/crates/lib/src/api/client/workspaces/files.rs`:
- Line 1237: The tracing spans for download and validate_upload_feasibility
currently only skip remote_repo; update their tracing attributes to also skip
workspace_id, path, and output_path (as applicable) to avoid capturing sensitive
file paths and high-cardinality fields. Locate the
#[tracing::instrument(skip(remote_repo))] annotation above the download function
and change it to skip(remote_repo, workspace_id, path, output_path) (and
similarly update the annotation above validate_upload_feasibility to include
workspace_id, path, output_path) so those parameters are omitted from span
fields.
In `@oxen-rust/crates/lib/src/core/v_latest/index/commit_merkle_tree.rs`:
- Line 28: The tracing spans are currently capturing the full Commit struct even
though only commit.id is recorded; update each tracing attribute that has
fields(commit_id = %commit.id) to include commit in the skip list (e.g., change
#[tracing::instrument(skip(repo), fields(commit_id = %commit.id))] to
#[tracing::instrument(skip(repo, commit), fields(commit_id = %commit.id))]) so
the full struct is not serialized into spans; apply this change to every
function annotated with tracing::instrument that records commit_id to ensure
commit is skipped.
In `@oxen-rust/crates/lib/src/repositories/checkout.rs`:
- Around line 68-73: The remote-mode branch of the checkout path calls
mut_repo.set_workspace(value) even when value may be a commit id, causing
failures for detached/commit checkouts; guard the workspace update so it only
runs for actual branch/workspace names (e.g. check via an existing branch lookup
or a repo.is_branch(value) helper) before calling mut_repo.set_workspace(value)
and mut_repo.save(), or alternatively catch and ignore the specific "workspace
not found" error from set_workspace when value is a commit id.
In `@oxen-rust/crates/lib/src/repositories/commits.rs`:
- Around line 49-50: The tracing attributes for functions that accept a
user-provided message (e.g., the commit function `pub fn commit(repo:
&LocalRepository, message: &str) -> Result<Commit, OxenError>`) must skip the
`message` field so free-form text is not recorded; update the
#[tracing::instrument(...)] annotations to include message in skip (e.g.,
skip(repo, message)) for this function and the other two functions referenced
(the ones around lines 62–67 and 79–80), mirroring how `create_initial_commit`
already skips `message`.
In `@oxen-rust/crates/lib/src/repositories/metadata/audio.rs`:
- Line 18: The code stores milliseconds into a seconds field by using
duration.as_millis(); change the computation so num_seconds receives seconds
(not ms) — e.g. replace the use of duration.as_millis() with
duration.as_secs_f64() (or divide as_millis() by 1000.0) when assigning to the
variable named seconds so the value matches the API contract expected by the
num_seconds field.
In `@oxen-rust/crates/lib/src/repositories/metadata/video.rs`:
- Line 24: The code writes milliseconds into MetadataVideo.video.num_seconds
(using video.duration().as_millis() as f64) which breaks the contract; change
the duration extraction to seconds (e.g. use video.duration().as_secs_f64() or
otherwise convert Duration to seconds) so MetadataVideo.video.num_seconds stores
seconds again and tests/display (referenced in this file and metadata_video.rs)
remain correct.
In `@oxen-rust/crates/lib/src/repositories/pull.rs`:
- Line 15: The tracing span currently logs the absolute filesystem path via
fields(repo_path = %repo.path.display()), which can leak sensitive local info;
update each tracing::instrument annotation (the three occurrences using
fields(repo_path = %repo.path.display())) to instead log a non-sensitive
identifier such as repo.name or repo.id (e.g., fields(repo = %repo.name) or
fields(repo_id = %repo.id)) or a redacted token (e.g., fields(repo =
%repo_path_redacted)), and remove the direct repo.path.display() usage from the
span fields so only the safe identifier is emitted.
In `@oxen-rust/crates/server/src/controllers/export.rs`:
- Around line 31-32: The duration histogram
oxen_server_export_download_zip_duration_ms is only recorded on the success
path; ensure the timer started by the local variable timer is observed on all
exits (including early Err returns) by recording the elapsed time before every
return or by introducing a scope guard/RAII that logs
metrics::histogram!("oxen_server_export_download_zip_duration_ms", elapsed_ms)
in its Drop; update the export handler (the function that increments
oxen_server_export_download_zip_total and defines timer) to call
metrics::histogram! with timer.elapsed().as_millis() (or
elapsed().as_secs_f64()*1000) on failure paths and in the current success path
so timing is always emitted.
- Line 13: The span fields declared in the tracing attribute
(#[tracing::instrument(skip_all, fields(namespace, repo_name))]) are never
populated; after you extract the path parameters into the variables namespace
and repo_name in this handler, record them on the current span (e.g., call
tracing::Span::current().record("namespace",
&tracing::field::display(&namespace)) and
tracing::Span::current().record("repo_name",
&tracing::field::display(&repo_name))) so the declared fields are populated in
the span.
In `@oxen-rust/crates/server/src/controllers/prune.rs`:
- Around line 43-44: The duration histogram is only recorded on the happy path,
so failures that return via ? never emit oxen_server_prune_prune_duration_ms;
wrap the timing so it always records (e.g. create a small Drop guard that
captures std::time::Instant::now() and in drop calls
metrics::histogram!("oxen_server_prune_prune_duration_ms").record(elapsed_ms) or
move the histogram.observe call into a finally-like scope after awaiting inner
async work). Apply the same change for the second timing block noted around the
code that follows the counter increment at the second site (the other
std::time::Instant::now() / histogram pair) so both success and error paths
always emit duration.
- Line 38: The tracing span declared with #[tracing::instrument(skip_all,
fields(namespace, repo_name))] leaves namespace and repo_name empty at runtime;
after you extract the namespace and repo_name variables (the values assigned on
the lines referenced), record them into the current span (e.g. via
tracing::Span::current().record("namespace", &namespace) and
.record("repo_name", &repo_name) or equivalent) so the span fields are populated
with the actual repo context for prune handler functions in controllers::prune.
In `@oxen-rust/crates/server/src/controllers/versions.rs`:
- Line 37: The tracing span fields `namespace` and `repo_name` are declared in
the attribute but remain empty because `#[tracing::instrument(skip_all,
fields(namespace, repo_name))]` prevents auto-recording; after you extract the
path params inside each handler (e.g., metadata, clean, download,
batch_download, batch_upload) call Span::current().record("namespace",
&namespace.as_str()) and Span::current().record("repo_name",
&repo_name.as_str()) (use the actual variable names you extracted) to populate
those fields so the span contains the correct values.
---
Outside diff comments:
In `@oxen-rust/crates/lib/src/api/client/versions.rs`:
- Around line 502-531: The duration histogram is only recorded on the failure
path; update multipart_batch_upload_with_retry to record the elapsed duration
before any early return: capture the timer at the top (already present), and
call
metrics::histogram!("oxen_client_versions_multipart_batch_upload_with_retry_duration_ms").record(timer.elapsed().as_millis()
as f64) immediately before returning Ok(()) inside the loop (and ensure it's
also recorded at the end of the function as already present) so successful
uploads emit the metric; modify the return path in
multipart_batch_upload_with_retry accordingly.
In `@oxen-rust/crates/lib/src/api/client/workspaces/files.rs`:
- Around line 1244-1291: The download code records duration only on the success
path; early returns on error (the NOT_FOUND return and the Err after parsing
body) skip the
metrics::histogram!("oxen_client_workspaces_files_download_duration_ms").record(...)
call. Fix by ensuring the histogram is recorded before any return: refactor the
function to produce a Result value (e.g., let result = { ... } ) that captures
success or error without returning early, then call
metrics::histogram!("oxen_client_workspaces_files_download_duration_ms").record(timer.elapsed().as_millis()
as f64); and finally return the result; alternatively, record the histogram
immediately before each early return in the branches that call
OxenError::path_does_not_exist(path) and the Err after
client::parse_json_body(&url, response).await?. Reference symbols: timer,
metrics::histogram!("oxen_client_workspaces_files_download_duration_ms"),
response, client::parse_json_body, OxenError::path_does_not_exist.
---
Nitpick comments:
In `@oxen-rust/crates/lib/src/api/client/versions.rs`:
- Around line 211-235: The duration histogram is only recorded on Ok and
Authentication error paths inside the retry closure, so moves/other errors skip
recording; refactor download_data_from_version_paths to call
retry::with_retry(...).await first, capture its Result, then record
metrics::histogram!("oxen_client_versions_download_data_from_version_paths_duration_ms").record(timer.elapsed().as_millis()
as f64) once immediately after with_retry returns, and finally return or
propagate the captured Result from that call; remove the duplicate
histogram.record calls inside the match/closure (referencing
download_data_from_version_paths, try_download_data_from_version_paths,
with_retry, and the histogram metric name).
In `@oxen-rust/crates/lib/src/api/client/workspaces/data_frames.rs`:
- Around line 280-287: The telemetry for network operations is inconsistent:
only download records duration. For better observability, add a duration
histogram around the HTTP call(s) in rename_data_frame (and mirror for other
network functions like put and diff) — capture start = Instant::now() before the
request, execute the existing remote call(s) inside the function, then record
elapsed.as_secs_f64() into a metrics histogram (e.g.
metrics::histogram!("oxen_client_workspaces_data_frames_rename_data_frame_duration_seconds").record(elapsed))
and include useful labels/tags if your metrics library supports them; ensure the
histogram is recorded in both success and error paths so latency is always
reported.
In `@oxen-rust/crates/lib/src/repositories/merge.rs`:
- Around line 144-148: The tracing span for merge currently skips branch_name
which loses useful trace context; update the #[tracing::instrument(...)] on the
merge function to only skip repo and add branch_name as a span field (e.g.
change skip(repo, branch_name) to skip(repo) and add fields(repo_path =
%repo.path.display(), branch_name = %branch_name.as_ref()) so the span records
the branch name while still skipping the repo value; target the
#[tracing::instrument] on the merge function and the branch_name parameter to
implement this change.
In `@oxen-rust/crates/lib/src/repositories/pull.rs`:
- Line 30: The pull_all and pull_remote_branch paths currently only increment
counters but lack duration histograms; add timing instrumentation analogous to
the existing pull latency metric by wrapping the body of pull_all and
pull_remote_branch with a histogram timer (e.g.,
metrics::histogram!("oxen_repo_pull_pull_all_duration_seconds") and
metrics::histogram!("oxen_repo_pull_remote_branch_duration_seconds") or
following the naming used for pull) and record elapsed duration on
completion/failure, keeping the existing counter increments (symbols to update:
the metric invocation at metrics::counter!("oxen_repo_pull_pull_all_total") and
the corresponding counter in pull_remote_branch and the pull function's
histogram usage as a pattern to mirror).
In `@oxen-rust/crates/lib/src/repositories/push.rs`:
- Around line 65-75: Add a duration histogram for push_remote_branch similar to
the one used in push: start a timer (e.g., Instant::now()) before the version
match, await the core::v_latest::push::push_remote_branch call, then compute
elapsed and record it to a histogram metric (name like
"oxen_repo_push_push_remote_branch_duration_seconds") in addition to the
existing counter increment; keep the existing match on repo.min_version() and
ensure the timer is recorded even if push_remote_branch returns an error by
recording after the await or in a finally-like block.
In `@oxen-rust/crates/server/src/controllers/commits.rs`:
- Line 74: The tracing spans declare fields but never get populated: for every
handler in commits.rs annotated with #[tracing::instrument(skip_all,
fields(namespace, repo_name))], either remove skip_all so the macro captures
function arguments (rename args to match the field names if needed) or
explicitly populate the fields after you parse/extract them by calling
tracing::Span::current().record("namespace", &namespace.as_str()) and
.record("repo_name", &repo_name.as_str()) (apply this pattern to all 17
instrumented handlers in this file).
In `@oxen-rust/crates/server/src/controllers/versions.rs`:
- Around line 515-525: The histogram is skipped when save_multiparts returns an
Err because the function returns early; ensure the duration is always recorded
by capturing the start Instant (timer) and recording
metrics::histogram!("oxen_server_versions_batch_upload_duration_ms").record(...)
in a finally-like scope regardless of success or error—e.g., create a small
scope guard/Drop helper or wrap the save_multiparts call in a match that records
timer.elapsed() before propagating errors; reference the timer variable, the
save_multiparts call, and the metrics::histogram! invocation when making the
change.
In `@oxen-rust/crates/server/src/controllers/workspaces/files.rs`:
- Around line 80-84: The get handler (pub async fn get) currently only
increments metrics::counter!("oxen_server_workspaces_files_get_total") but
should also record latency like the add handler; wrap the handler's work with a
start time (e.g., let start = Instant::now()) at the top of get and, before
returning, record the elapsed duration into a duration metric (e.g.,
metrics::histogram! or metrics::timing! with a name like
"oxen_server_workspaces_files_get_duration_seconds") so you capture request
latency for expensive operations (image resize, thumbnail, streaming) alongside
the existing counter.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 0bb1d2e4-33d2-4788-beb0-4c98386614ae
📒 Files selected for processing (28)
- oxen-rust/config_for_tracing_metrics.md
- oxen-rust/crates/cli/src/cmd/fsck.rs
- oxen-rust/crates/lib/src/api/client/export.rs
- oxen-rust/crates/lib/src/api/client/import.rs
- oxen-rust/crates/lib/src/api/client/versions.rs
- oxen-rust/crates/lib/src/api/client/workspaces/data_frames.rs
- oxen-rust/crates/lib/src/api/client/workspaces/files.rs
- oxen-rust/crates/lib/src/core/v_latest/index/commit_merkle_tree.rs
- oxen-rust/crates/lib/src/repositories/add.rs
- oxen-rust/crates/lib/src/repositories/checkout.rs
- oxen-rust/crates/lib/src/repositories/clone.rs
- oxen-rust/crates/lib/src/repositories/commits.rs
- oxen-rust/crates/lib/src/repositories/merge.rs
- oxen-rust/crates/lib/src/repositories/metadata/audio.rs
- oxen-rust/crates/lib/src/repositories/metadata/video.rs
- oxen-rust/crates/lib/src/repositories/pull.rs
- oxen-rust/crates/lib/src/repositories/push.rs
- oxen-rust/crates/lib/src/repositories/tree.rs
- oxen-rust/crates/server/src/controllers/branches.rs
- oxen-rust/crates/server/src/controllers/commits.rs
- oxen-rust/crates/server/src/controllers/export.rs
- oxen-rust/crates/server/src/controllers/import.rs
- oxen-rust/crates/server/src/controllers/merger.rs
- oxen-rust/crates/server/src/controllers/prune.rs
- oxen-rust/crates/server/src/controllers/versions.rs
- oxen-rust/crates/server/src/controllers/versions/chunks.rs
- oxen-rust/crates/server/src/controllers/workspaces.rs
- oxen-rust/crates/server/src/controllers/workspaces/files.rs
🚧 Files skipped from review as they are similar to previous changes (9)
- oxen-rust/crates/lib/src/repositories/clone.rs
- oxen-rust/crates/server/src/controllers/import.rs
- oxen-rust/crates/server/src/controllers/branches.rs
- oxen-rust/crates/lib/src/repositories/tree.rs
- oxen-rust/crates/server/src/controllers/merger.rs
- oxen-rust/crates/lib/src/api/client/export.rs
- oxen-rust/crates/lib/src/repositories/add.rs
- oxen-rust/crates/server/src/controllers/versions/chunks.rs
- oxen-rust/crates/server/src/controllers/workspaces.rs
  }
  println!("  Errors: {}", result.errors);
- println!("  Elapsed: {:.2}s", result.elapsed.as_secs_f64());
+ println!("  Elapsed: {:.2}s", result.elapsed.as_millis() as f64);
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify no other Rust lines format milliseconds with a seconds suffix.
# Expected: no matches after fixing.
rg -nP 'Elapsed:.*\{:\.2\}s".*as_millis\(' --type=rust

Repository: Oxen-AI/Oxen
Length of output: 171
🏁 Script executed:
#!/bin/bash
# Get context around line 57 to understand the type of result.elapsed
sed -n '40,75p' oxen-rust/crates/cli/src/cmd/fsck.rs

Repository: Oxen-AI/Oxen
Length of output: 896
🏁 Script executed:
#!/bin/bash
# Find the definition of the result type in clean_corrupted_versions
# Search for the return type or struct definition
rg -B5 -A10 'fn clean_corrupted_versions' --type=rust | head -40

Repository: Oxen-AI/Oxen
Length of output: 2924
🏁 Script executed:
#!/bin/bash
# Find the CleanCorruptedVersionsResult struct definition
rg -B2 -A10 'struct CleanCorruptedVersionsResult' --type=rust

Repository: Oxen-AI/Oxen
Length of output: 1070
🏁 Script executed:
#!/bin/bash
# Check for other elapsed time formatting patterns in the codebase
rg 'elapsed|Elapsed' --type=rust -A1 -B1 | grep -E '(println|format|elapsed)' | head -20

Repository: Oxen-AI/Oxen
Length of output: 2327
🏁 Script executed:
#!/bin/bash
# Check if there's a standard Duration formatting pattern elsewhere in the codebase
# to see if the codebase prefers seconds or milliseconds display
rg 'as_secs_f64|as_millis' --type=rust -B2 -A2 | head -40

Repository: Oxen-AI/Oxen
Length of output: 2778
Fix elapsed-time unit mismatch in CLI output.
Line 57 displays milliseconds with a seconds suffix ("s"), causing a 1000x error in user-facing output. Correct the unit to match the actual value.
Suggested fix
```diff
-    println!("  Elapsed: {:.2}s", result.elapsed.as_millis() as f64);
+    println!("  Elapsed: {:.2}ms", result.elapsed.as_secs_f64() * 1000.0);
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```diff
-println!("  Elapsed: {:.2}s", result.elapsed.as_millis() as f64);
+println!("  Elapsed: {:.2}ms", result.elapsed.as_secs_f64() * 1000.0);
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@oxen-rust/crates/cli/src/cmd/fsck.rs` at line 57, The CLI prints elapsed time
using result.elapsed.as_millis() but labels it with "s", causing a units
mismatch; update the fsck output to either change the suffix to "ms" or convert
the elapsed value to seconds (e.g., use result.elapsed.as_secs_f64()) so the
printed unit matches the value — locate the println in fsck.rs that references
result.elapsed.as_millis() and adjust the value or the suffix accordingly.
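The unit mismatch flagged above is easy to demonstrate with plain `std::time::Duration` (a standalone sketch, independent of the fsck code):

```rust
use std::time::Duration;

fn main() {
    let elapsed = Duration::from_millis(2500);
    // Value in seconds, "s" suffix: consistent.
    assert_eq!(format!("{:.2}s", elapsed.as_secs_f64()), "2.50s");
    // Value in milliseconds, "ms" suffix: also consistent.
    assert_eq!(format!("{:.2}ms", elapsed.as_secs_f64() * 1000.0), "2500.00ms");
    // A milliseconds value with an "s" suffix is off by 1000x, as in the bug.
    assert_eq!(format!("{:.2}s", elapsed.as_millis() as f64), "2500.00s");
    println!("ok");
}
```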
```rust
metrics::histogram!("oxen_client_import_upload_zip_duration_ms")
    .record(timer.elapsed().as_millis() as f64);
```
Record duration on failures too, not only success.
At Line 79, duration is recorded only after a successful response parse. Any error path before that (HTTP/send/body/JSON parse/retry exhaustion) emits no latency metric, which creates an observability blind spot.
📈 Proposed fix (record once after retry completes, regardless of outcome)
- retry::with_retry(&config, |_attempt| {
+ let result = retry::with_retry(&config, |_attempt| {
let url = url.clone();
let zip_data = zip_data.clone();
let file_name = file_name.clone();
let name = name.to_string();
let email = email.to_string();
let directory = directory.to_string();
let commit_msg = commit_msg.clone();
async move {
let form = make_multipart_form(
zip_data,
len_zip_data,
file_name,
name,
email,
directory,
commit_msg,
);
let client = client::new_for_url_transfer(&url)?;
let response = client.post(&url).multipart(form).send().await?;
let body = client::parse_json_body(&url, response).await?;
let response: crate::view::CommitResponse = serde_json::from_str(&body)
.map_err(|e| OxenError::basic_str(format!("Failed to parse response: {e}")))?;
- metrics::histogram!("oxen_client_import_upload_zip_duration_ms")
- .record(timer.elapsed().as_millis() as f64);
Ok(response.commit)
}
})
- .await
+ .await;
+
+ metrics::histogram!("oxen_client_import_upload_zip_duration_ms")
+ .record(timer.elapsed().as_millis() as f64);
+ result

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@oxen-rust/crates/lib/src/api/client/import.rs` around lines 79 - 80, The
upload timing is only recorded after a successful response parse, leaving error
paths unmeasured; capture the start timer before the HTTP/send/retry block and
ensure
metrics::histogram!("oxen_client_import_upload_zip_duration_ms").record(timer.elapsed().as_millis()
as f64) is executed exactly once after the retry logic completes regardless of
success or failure (e.g., in a finally/cleanup block or after matching the
Result from the request), so move the record call out of the success-only branch
and into the post-retry completion path surrounding the send/parse/retry logic
and timer variable used in import.rs.
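The fix above boils down to "record the latency once, after the operation finishes, whatever the outcome". A minimal std-only sketch of that pattern (a hypothetical `timed` helper; a plain `Vec` stands in for `metrics::histogram!`):

```rust
use std::time::Instant;

// Hypothetical helper: time an operation and record its duration whether it
// succeeds or fails, then pass the Result through unchanged.
fn timed<T, E>(samples: &mut Vec<f64>, op: impl FnOnce() -> Result<T, E>) -> Result<T, E> {
    let start = Instant::now();
    let result = op();
    // Recorded on both the Ok and Err paths -- no observability blind spot.
    samples.push(start.elapsed().as_millis() as f64);
    result
}

fn main() {
    let mut samples = Vec::new();
    let ok: Result<u32, String> = timed(&mut samples, || Ok(42));
    let err: Result<u32, String> = timed(&mut samples, || Err("boom".to_string()));
    assert_eq!(ok.unwrap(), 42);
    assert!(err.is_err());
    assert_eq!(samples.len(), 2); // both paths produced a sample
    println!("recorded {} samples", samples.len());
}
```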
```rust
    }
}

#[tracing::instrument(skip(remote_repo))]
```
🧩 Analysis chain
🌐 Web query:
In Rust tracing, does #[tracing::instrument] include all non-skipped function arguments as span fields by default, and what are the recommended practices for avoiding sensitive/high-cardinality fields?
💡 Result:
Yes. #[tracing::instrument] records all function arguments as span fields by default (unless you skip them). Arguments of tracing “primitive” types are recorded as typed Values; everything else is recorded using its fmt::Debug representation. (docs.rs)
Recommended practices to avoid sensitive and/or high-cardinality fields:
- Default-deny arguments for anything sensitive/large: use `skip(arg1, arg2, ...)` or `skip_all`. (docs.rs)
- Log only selected, low-cardinality properties via `fields(...)` (often combined with `skip`/`skip_all`), e.g., record `http.method`/`http.uri` rather than an entire request struct/body. (docs.rs)
- Prevent secret leakage by construction: wrap secrets in a type with a redacting `Debug` impl (e.g., `secrecy::Secret`), so even if an argument is captured, it prints as `[REDACTED ...]`. (lpalmieri.com)
- Avoid expensive/verbose `Debug` and huge inputs (common sources of high-cardinality / log volume): `skip` them, or record derived summaries (counts, sizes, coarse categories) instead. (docs.rs)
Citations:
- 1: https://docs.rs/tracing/latest/tracing/attr.instrument.html
- 2: https://docs.rs/tracing/latest/tracing/attr.instrument.html
- 3: https://docs.rs/tracing/latest/tracing/attr.instrument.html
- 4: https://lpalmieri.com/posts/2020-09-27-zero-to-production-4-are-we-observable-yet/
- 5: https://docs.rs/tracing/latest/tracing/attr.instrument.html
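The redacting-`Debug` practice can be shown with a minimal std-only wrapper (a hypothetical `Redacted` type; `secrecy::Secret` applies the same idea with zeroization on top):

```rust
use std::fmt;

// Hypothetical wrapper: the inner value never reaches Debug output, so even a
// span that captures this argument only records the redaction marker.
#[allow(dead_code)]
struct Redacted<T>(T);

impl<T> fmt::Debug for Redacted<T> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("[REDACTED]")
    }
}

fn main() {
    let token = Redacted(String::from("super-secret"));
    assert_eq!(format!("{:?}", token), "[REDACTED]");
    println!("{:?}", token);
}
```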
🏁 Script executed:
# First, let's see the file and check the specific lines mentioned
wc -l oxen-rust/crates/lib/src/api/client/workspaces/files.rs

Repository: Oxen-AI/Oxen
Length of output: 116
🏁 Script executed:
# Check around line 1237
sed -n '1230,1245p' oxen-rust/crates/lib/src/api/client/workspaces/files.rs

Repository: Oxen-AI/Oxen
Length of output: 556
🏁 Script executed:
# Check around line 1294
sed -n '1287,1305p' oxen-rust/crates/lib/src/api/client/workspaces/files.rs

Repository: Oxen-AI/Oxen
Length of output: 752
🏁 Script executed:
# Search for all #[tracing::instrument] occurrences in the file to see if there are similar patterns
rg -n '#\[tracing::instrument' oxen-rust/crates/lib/src/api/client/workspaces/files.rs -A 3

Repository: Oxen-AI/Oxen
Length of output: 2489
Skip workspace_id and path arguments in tracing spans to prevent sensitive data leakage and reduce log cardinality.
The download function (line 1237) and validate_upload_feasibility function (line 1294) only skip remote_repo, allowing workspace_id, path, and output_path to be captured in span fields. This can leak user file paths and create noisy high-cardinality logs. Most other functions in this file correctly skip these parameters—apply the same pattern here.
Proposed fix
```diff
-#[tracing::instrument(skip(remote_repo))]
+#[tracing::instrument(skip(remote_repo, workspace_id, path, output_path))]
 pub async fn download(
     remote_repo: &RemoteRepository,
     workspace_id: &str,
     path: &str,
     output_path: Option<&Path>,
 ) -> Result<(), OxenError> {
```

```diff
-#[tracing::instrument(skip(remote_repo))]
+#[tracing::instrument(skip(remote_repo, workspace_id))]
 pub async fn validate_upload_feasibility(
     remote_repo: &RemoteRepository,
     workspace_id: &str,
```

📝 Committable suggestion
```diff
-#[tracing::instrument(skip(remote_repo))]
+#[tracing::instrument(skip(remote_repo, workspace_id, path, output_path))]
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@oxen-rust/crates/lib/src/api/client/workspaces/files.rs` at line 1237, The
tracing spans for download and validate_upload_feasibility currently only skip
remote_repo; update their tracing attributes to also skip workspace_id, path,
and output_path (as applicable) to avoid capturing sensitive file paths and
high-cardinality fields. Locate the #[tracing::instrument(skip(remote_repo))]
annotation above the download function and change it to skip(remote_repo,
workspace_id, path, output_path) (and similarly update the annotation above
validate_upload_feasibility to include workspace_id, path, output_path) so those
parameters are omitted from span fields.
```rust
impl CommitMerkleTree {
    /// The dir hashes allow you to skip to a directory in the tree
    #[tracing::instrument(skip(repo), fields(commit_id = %commit.id))]
```
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE="oxen-rust/crates/lib/src/core/v_latest/index/commit_merkle_tree.rs"
echo "== Instrumented fns with commit arg and nearby attributes =="
python - <<'PY'
from pathlib import Path
p = Path("oxen-rust/crates/lib/src/core/v_latest/index/commit_merkle_tree.rs")
lines = p.read_text().splitlines()
for i, line in enumerate(lines):
if "pub fn " in line and "commit: &Commit" in line:
attr = ""
for j in range(max(0, i-6), i):
if "tracing::instrument" in lines[j]:
attr = lines[j].strip()
print(f"Line {i+1}: {line.strip()}")
print(f" attr: {attr}")
PY
echo
echo "== Commit struct definition(s) =="
rg -nP --type=rust -C3 'pub struct Commit\b' oxen-rust/crates/lib/src

Repository: Oxen-AI/Oxen
Length of output: 914
🏁 Script executed:
#!/bin/bash
# Find all tracing::instrument attributes in the file with their line numbers
echo "== All tracing::instrument attributes in commit_merkle_tree.rs =="
rg -n "tracing::instrument" oxen-rust/crates/lib/src/core/v_latest/index/commit_merkle_tree.rs -A1
echo
echo "== Function signatures around claimed lines (28, 67, 109, 119, 130, 150) =="
sed -n '25,32p; 64,71p; 106,113p; 116,123p; 127,134p; 147,154p' oxen-rust/crates/lib/src/core/v_latest/index/commit_merkle_tree.rs | cat -n
echo
echo "== Complete Commit struct definition =="
rg -n "pub struct Commit" oxen-rust/crates/lib/src/model/commit.rs -A 20

Repository: Oxen-AI/Oxen
Length of output: 5747
Add commit to the skip() list in tracing attributes.
At lines 28, 67, 109, 119, 130, and 150, the commit parameter is not skipped, causing the full Commit struct to be captured in spans. Since commit_id is already explicitly recorded via fields(commit_id = %commit.id), also logging the entire struct is unnecessary and exposes additional fields (message, author, email) that should not be in observability data.
Proposed fix
```diff
-    #[tracing::instrument(skip(repo), fields(commit_id = %commit.id))]
+    #[tracing::instrument(skip(repo, commit), fields(commit_id = %commit.id))]

-    #[tracing::instrument(skip(repo), fields(repo_path = %repo.path.display(), commit_id = %commit.id))]
+    #[tracing::instrument(skip(repo, commit), fields(repo_path = %repo.path.display(), commit_id = %commit.id))]

-    #[tracing::instrument(skip(repo, base_hashes, unique_hashes, shared_hashes), fields(commit_id = %commit.id))]
+    #[tracing::instrument(skip(repo, commit, base_hashes, unique_hashes, shared_hashes), fields(commit_id = %commit.id))]

-    #[tracing::instrument(skip(repo, base_hashes, unique_hashes, shared_hashes, partial_nodes), fields(commit_id = %commit.id))]
+    #[tracing::instrument(skip(repo, commit, base_hashes, unique_hashes, shared_hashes, partial_nodes), fields(commit_id = %commit.id))]
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@oxen-rust/crates/lib/src/core/v_latest/index/commit_merkle_tree.rs` at line
28, The tracing spans are currently capturing the full Commit struct even though
only commit.id is recorded; update each tracing attribute that has
fields(commit_id = %commit.id) to include commit in the skip list (e.g., change
#[tracing::instrument(skip(repo), fields(commit_id = %commit.id))] to
#[tracing::instrument(skip(repo, commit), fields(commit_id = %commit.id))]) so
the full struct is not serialized into spans; apply this change to every
function annotated with tracing::instrument that records commit_id to ensure
commit is skipped.
```rust
if repo.is_remote_mode() {
    // Set workspace_name to new branch name
    let mut mut_repo = repo.clone();
    mut_repo.set_workspace(value)?;
    mut_repo.save()?;
}
```
Remote-mode commit checkout is coupled to workspace selection and can fail.
This block runs in the non-branch path, where value can be a commit id. set_workspace(value) requires an existing workspace name, so detached/commit checkout may now error unexpectedly in remote mode.
💡 Proposed fix
```diff
-    if repo.is_remote_mode() {
-        // Set workspace_name to new branch name
-        let mut mut_repo = repo.clone();
-        mut_repo.set_workspace(value)?;
-        mut_repo.save()?;
-    }
+    // Do not force workspace switching on commit checkout.
+    // Workspace updates should remain branch-driven.
```

📝 Committable suggestion
+ // Workspace updates should remain branch-driven.📝 Committable suggestion
```diff
-if repo.is_remote_mode() {
-    // Set workspace_name to new branch name
-    let mut mut_repo = repo.clone();
-    mut_repo.set_workspace(value)?;
-    mut_repo.save()?;
-}
+// Do not force workspace switching on commit checkout.
+// Workspace updates should remain branch-driven.
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@oxen-rust/crates/lib/src/repositories/checkout.rs` around lines 68 - 73, The
remote-mode branch of the checkout path calls mut_repo.set_workspace(value) even
when value may be a commit id, causing failures for detached/commit checkouts;
guard the workspace update so it only runs for actual branch/workspace names
(e.g. check via an existing branch lookup or a repo.is_branch(value) helper)
before calling mut_repo.set_workspace(value) and mut_repo.save(), or
alternatively catch and ignore the specific "workspace not found" error from
set_workspace when value is a commit id.
```rust
use liboxen::{constants, repositories};

/// Export resource as a zip
#[tracing::instrument(skip_all, fields(namespace, repo_name))]
```
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify whether instrument fields are declared but not recorded in this controller.
rg -nP --type=rust -C2 'instrument\(skip_all,\s*fields\(([^)]*)\)\)' oxen-rust/crates/server/src/controllers/export.rs
rg -nP --type=rust -C2 'Span::current\(\)\.record\("namespace"|Span::current\(\)\.record\("repo_name"' oxen-rust/crates/server/src/controllers/export.rs

Repository: Oxen-AI/Oxen
Length of output: 188
🏁 Script executed:
#!/bin/bash
# Read the function starting from line 13 to understand complete context
sed -n '13,40p' oxen-rust/crates/server/src/controllers/export.rs

Repository: Oxen-AI/Oxen
Length of output: 1512
🏁 Script executed:
#!/bin/bash
# Check for other functions with skip_all + fields pattern
rg -n 'instrument\(skip_all.*fields' oxen-rust/crates/server/src/controllers/export.rs
# Check for any span.record usage in the file
rg -n 'span\.record\|Span::current' oxen-rust/crates/server/src/controllers/export.rs
# Check total function count with the pattern
rg -c 'instrument\(skip_all' oxen-rust/crates/server/src/controllers/export.rs

Repository: Oxen-AI/Oxen
Length of output: 123
🏁 Script executed:
#!/bin/bash
# Check if this pattern exists in other controller files
rg -n 'instrument\(skip_all.*fields' oxen-rust/crates/server/src/controllers/
# Count how many functions follow this pattern across all controllers
echo "=== Summary ==="
rg -c 'instrument\(skip_all' oxen-rust/crates/server/src/controllers/ | grep -v ':0$'

Repository: Oxen-AI/Oxen
Length of output: 16415
🏁 Script executed:
#!/bin/bash
# Check if any controller file records these fields
rg -n 'span\.record.*namespace|span\.record.*repo_name' oxen-rust/crates/server/src/controllers/
# Also check for Span::current().record
rg -n 'Span::current.*record' oxen-rust/crates/server/src/controllers/

Repository: Oxen-AI/Oxen
Length of output: 38
Span fields declared but not recorded—populate after extracting path parameters.
With skip_all, the declared span fields namespace and repo_name remain empty unless explicitly recorded. After extracting these values at lines 37-38, record them to the current span:
Suggested fix
pub async fn download_zip(req: HttpRequest) -> Result<HttpResponse, OxenHttpError> {
metrics::counter!("oxen_server_export_download_zip_total").increment(1);
let timer = std::time::Instant::now();
let app_data = app_data(&req)?;
let namespace = path_param(&req, "namespace")?;
let repo_name = path_param(&req, "repo_name")?;
+ let span = tracing::Span::current();
+ span.record("namespace", tracing::field::display(&namespace));
+ span.record("repo_name", tracing::field::display(&repo_name));
let repo = get_repo(&app_data.path, &namespace, &repo_name)?;

📝 Committable suggestion
```rust
#[tracing::instrument(skip_all, fields(namespace, repo_name))]
pub async fn download_zip(req: HttpRequest) -> Result<HttpResponse, OxenHttpError> {
    metrics::counter!("oxen_server_export_download_zip_total").increment(1);
    let timer = std::time::Instant::now();
    let app_data = app_data(&req)?;
    let namespace = path_param(&req, "namespace")?;
    let repo_name = path_param(&req, "repo_name")?;
    let span = tracing::Span::current();
    span.record("namespace", tracing::field::display(&namespace));
    span.record("repo_name", tracing::field::display(&repo_name));
    let repo = get_repo(&app_data.path, &namespace, &repo_name)?;
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@oxen-rust/crates/server/src/controllers/export.rs` at line 13, The span
fields declared in the tracing attribute (#[tracing::instrument(skip_all,
fields(namespace, repo_name))]) are never populated; after you extract the path
parameters into the variables namespace and repo_name in this handler, record
them on the current span (e.g., call
tracing::Span::current().record("namespace",
&tracing::field::display(&namespace)) and
tracing::Span::current().record("repo_name",
&tracing::field::display(&repo_name))) so the declared fields are populated in
the span.
```rust
metrics::counter!("oxen_server_export_download_zip_total").increment(1);
let timer = std::time::Instant::now();
```
Duration histogram misses failure paths.
oxen_server_export_download_zip_duration_ms is only recorded on the success path. Any early Err return (or ? before Line 81) skips timing emission, which biases latency telemetry.
Proposed fix
pub async fn download_zip(req: HttpRequest) -> Result<HttpResponse, OxenHttpError> {
metrics::counter!("oxen_server_export_download_zip_total").increment(1);
let timer = std::time::Instant::now();
- let app_data = app_data(&req)?;
- let namespace = path_param(&req, "namespace")?;
- let repo_name = path_param(&req, "repo_name")?;
- let repo = get_repo(&app_data.path, &namespace, &repo_name)?;
+ let result: Result<HttpResponse, OxenHttpError> = async {
+ let app_data = app_data(&req)?;
+ let namespace = path_param(&req, "namespace")?;
+ let repo_name = path_param(&req, "repo_name")?;
+ let repo = get_repo(&app_data.path, &namespace, &repo_name)?;
- let resource = parse_resource(&req, &repo)?;
- let directory = resource.path.clone();
- let commit = resource.commit.ok_or(OxenHttpError::NotFound)?;
+ let resource = parse_resource(&req, &repo)?;
+ let directory = resource.path.clone();
+ let commit = resource.commit.ok_or(OxenHttpError::NotFound)?;
- let Some(dir_node) =
- repositories::tree::get_dir_with_children_recursive(&repo, &commit, &directory, None)?
- else {
- return Err(OxenHttpError::NotFound);
- };
+ let Some(dir_node) =
+ repositories::tree::get_dir_with_children_recursive(&repo, &commit, &directory, None)?
+ else {
+ return Err(OxenHttpError::NotFound);
+ };
- // ... unchanged logic ...
+ // ... unchanged logic ...
- let response = controllers::versions::stream_versions_zip(&repo, files_with_hash).await?;
-
- metrics::histogram!("oxen_server_export_download_zip_duration_ms")
- .record(timer.elapsed().as_millis() as f64);
- Ok(response)
+ controllers::versions::stream_versions_zip(&repo, files_with_hash).await
+ }
+ .await;
+
+ metrics::histogram!("oxen_server_export_download_zip_duration_ms")
+ .record(timer.elapsed().as_millis() as f64);
+ result
}

Also applies to: 81-83
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@oxen-rust/crates/server/src/controllers/export.rs` around lines 31 - 32, The
duration histogram oxen_server_export_download_zip_duration_ms is only recorded
on the success path; ensure the timer started by the local variable timer is
observed on all exits (including early Err returns) by recording the elapsed
time before every return or by introducing a scope guard/RAII that logs
metrics::histogram!("oxen_server_export_download_zip_duration_ms", elapsed_ms)
in its Drop; update the export handler (the function that increments
oxen_server_export_download_zip_total and defines timer) to call
metrics::histogram! with timer.elapsed().as_millis() (or
elapsed().as_secs_f64()*1000) on failure paths and in the current success path
so timing is always emitted.
```rust
/// POST /prune
/// Trigger a prune operation on the repository
#[tracing::instrument(skip_all, fields(namespace, repo_name))]
```
🧩 Analysis chain
🌐 Web query:
In the `tracing` crate, does `#[tracing::instrument(skip_all, fields(namespace, repo_name))]` automatically populate `namespace` and `repo_name` from local variables assigned later in the function, or must those fields be set manually with `Span::record`?
💡 Result:
No. In #[tracing::instrument(...)], the fields(...) expressions are evaluated at the beginning of the function body, and a field name written without a value (like fields(namespace, repo_name)) declares an empty field—it is not automatically filled in later from locals you assign afterward. You must record the values yourself once you have them (e.g., Span::current().record("namespace", &namespace) / record("repo_name", &repo_name)). Also, you can only record fields that were declared up front; you can’t add new field names later. [1] [2]
Populate the span fields explicitly.
With skip_all and fields declared without values, namespace and repo_name will remain empty at runtime. You must manually record these values after extracting them on Lines 46-47, or your traces won't carry the repo context.
🔧 Suggested fix
let namespace = path_param(&req, "namespace")?;
let repo_name = path_param(&req, "repo_name")?;
+ let span = tracing::Span::current();
+ span.record("namespace", &tracing::field::display(&namespace));
+ span.record("repo_name", &tracing::field::display(&repo_name));🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@oxen-rust/crates/server/src/controllers/prune.rs` at line 38, The tracing
span declared with #[tracing::instrument(skip_all, fields(namespace,
repo_name))] leaves namespace and repo_name empty at runtime; after you extract
the namespace and repo_name variables (the values assigned on the lines
referenced), record them into the current span (e.g. via
tracing::Span::current().record("namespace", &namespace) and
.record("repo_name", &repo_name) or equivalent) so the span fields are populated
with the actual repo context for prune handler functions in controllers::prune.
```rust
metrics::counter!("oxen_server_prune_prune_total").increment(1);
let timer = std::time::Instant::now();
```
Record the duration metric on every exit path.
Lines 45-50 can return via ?, so oxen_server_prune_prune_total increments but oxen_server_prune_prune_duration_ms is never emitted for those failures. That will skew dashboards and make error-path latency invisible. Move the histogram recording to a scope that always runs, e.g. after awaiting an inner async block or via a drop guard.
Also applies to: 57-60
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@oxen-rust/crates/server/src/controllers/prune.rs` around lines 43 - 44, The
duration histogram is only recorded on the happy path, so failures that return
via ? never emit oxen_server_prune_prune_duration_ms; wrap the timing so it
always records (e.g. create a small Drop guard that captures
std::time::Instant::now() and in drop calls
metrics::histogram!("oxen_server_prune_prune_duration_ms").record(elapsed_ms) or
move the histogram.observe call into a finally-like scope after awaiting inner
async work). Apply the same change for the second timing block noted around the
code that follows the counter increment at the second site (the other
std::time::Instant::now() / histogram pair) so both success and error paths
always emit duration.
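The drop-guard approach suggested above can be sketched in std-only Rust — a hypothetical `DurationGuard` whose `Drop` impl records elapsed milliseconds into a shared `Vec` standing in for `metrics::histogram!`, so every exit path (including early `?` returns) emits exactly one sample:

```rust
use std::sync::{Arc, Mutex};
use std::time::Instant;

// Hypothetical guard: records elapsed milliseconds when dropped. The shared
// Vec stands in for metrics::histogram!(...).record(...).
struct DurationGuard {
    start: Instant,
    sink: Arc<Mutex<Vec<f64>>>,
}

impl Drop for DurationGuard {
    fn drop(&mut self) {
        let ms = self.start.elapsed().as_millis() as f64;
        self.sink.lock().unwrap().push(ms);
    }
}

fn handler(fail: bool, sink: Arc<Mutex<Vec<f64>>>) -> Result<(), String> {
    let _guard = DurationGuard { start: Instant::now(), sink };
    if fail {
        return Err("early error".into()); // guard still records here
    }
    Ok(())
}

fn main() {
    let sink = Arc::new(Mutex::new(Vec::new()));
    let _ = handler(true, sink.clone());
    let _ = handler(false, sink.clone());
    // Both the error path and the success path emitted a duration sample.
    assert_eq!(sink.lock().unwrap().len(), 2);
    println!("samples recorded: {}", sink.lock().unwrap().len());
}
```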
```rust
const DOWNLOAD_BUFFER_SIZE: usize = 2 * 1024 * 1024;

/// Get version file metadata
#[tracing::instrument(skip_all, fields(namespace, repo_name))]
```
Tracing span fields namespace and repo_name are declared but never populated.
With skip_all, function parameters are not auto-recorded, and since namespace/repo_name are extracted from the request inside the function body, the span fields remain empty. This pattern is repeated across all instrumented endpoints in this file (metadata, clean, download, batch_download, batch_upload).
To populate the fields, call Span::current().record() after extracting the values:
🔧 Proposed fix for metadata (apply similar pattern to other endpoints)
pub async fn metadata(req: HttpRequest) -> Result<HttpResponse, OxenHttpError> {
metrics::counter!("oxen_server_versions_metadata_total").increment(1);
let app_data = app_data(&req)?;
let namespace = path_param(&req, "namespace")?;
let repo_name = path_param(&req, "repo_name")?;
+ tracing::Span::current().record("namespace", &namespace);
+ tracing::Span::current().record("repo_name", &repo_name);
let version_id = path_param(&req, "version_id")?;

Also applies to: 54-58
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@oxen-rust/crates/server/src/controllers/versions.rs` at line 37, The tracing
span fields `namespace` and `repo_name` are declared in the attribute but remain
empty because `#[tracing::instrument(skip_all, fields(namespace, repo_name))]`
prevents auto-recording; after you extract the path params inside each handler
(e.g., metadata, clean, download, batch_download, batch_upload) call
Span::current().record("namespace", &namespace.as_str()) and
Span::current().record("repo_name", &repo_name.as_str()) (use the actual
variable names you extracted) to populate those fields so the span contains the
correct values.
Changes all of `oxen` to use structured logging, allowing for all `log::` statements to be asynchronously written to a rotating log file as a sequence of newline-delimited JSON objects. Configuration is via the `OXEN_LOG_FILE` env var: setting it to `1` or `true` has it write logs in `~/.oxen/logs/`. Setting it to a directory has it write to log files there. Leaving it unset means all logging falls back to writing to STDERR.

All `log` statements are now translated to `tracing` Events via the new `tracing-log` dependency. This means that if they're active in a trace, they will be associated with it.

`oxen-server` now collects and exposes a Prometheus-compatible metrics endpoint as `/metrics`. This runs on a different port than the main server process. It defaults to `9090` and can be changed with `OXEN_METRICS_PORT`.

New trace spans have been added to every HTTP endpoint call in `oxen-rust` and to workspace upload functions.
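As a rough illustration of the `OXEN_METRICS_PORT` fallback described above (an assumed sketch, not the actual server code — in practice the value would come from `std::env::var("OXEN_METRICS_PORT").ok()`):

```rust
// Assumed sketch: pick the metrics port from an optional OXEN_METRICS_PORT
// value, falling back to the documented 9090 default.
fn metrics_port(env_value: Option<&str>) -> u16 {
    env_value
        .and_then(|v| v.parse::<u16>().ok())
        .unwrap_or(9090)
}

fn main() {
    assert_eq!(metrics_port(None), 9090); // unset: documented default
    assert_eq!(metrics_port(Some("not-a-port")), 9090); // unparsable: default
    assert_eq!(metrics_port(Some("9464")), 9464); // valid override wins
    println!("ok");
}
```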