Skip to content

Documenting & Simplifying Error Variants#393

Open
malcolmgreaves wants to merge 1 commit intomainfrom
mg/oxenerror_many
Open

Documenting & Simplifying Error Variants#393
malcolmgreaves wants to merge 1 commit intomainfrom
mg/oxenerror_many

Conversation

@malcolmgreaves
Copy link
Copy Markdown
Collaborator

Adding docstrings to all OxenError variants. Docstring explains what the
variant is and, if applicable, the situations and code paths where it is encountered.

Audited all OxenError variants and identified variants that are never used in code.
These variants have been removed.

Additionally, some variants only ever use statically allocated strings as their inner
value. These have been replaced with non-value bearing enum variants.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Mar 27, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Summary by CodeRabbit

  • Refactor
    • Standardized internal error reporting so API responses deliver more consistent error types and messages.
  • Bug Fixes / API
    • Fork endpoint: conflict now follows the unified error path (consistent status and response).
    • Merge/pull conflict messages clarified (including punctuation) for clearer client-facing responses.
  • Style
    • Thumbnailing-not-enabled response simplified when video features are unavailable.

Walkthrough

Reshaped the OxenError enum (variant payloads/unit-ized/wrapper changes), removed many helper constructors, added #[from] wrappers, and updated call sites across library, server, and Python bindings to construct or match the new variants and use lazy error construction where applicable.

Changes

Cohort / File(s) Summary
Core Error Definition
crates/lib/src/error.rs
Reworked OxenError enum: changed variant shapes (unit-ized or unboxed payloads), removed many helper constructors, added #[from] wrapper variants, and removed manual From impls.
Core v_latest
crates/lib/src/core/v_latest/...
commits.rs, entries.rs, pull.rs, push.rs, workspaces/...
Replaced helper constructor calls with direct variant constructions (e.g., revision_not_found(...)RevisionNotFound(...), no_commits_found()NoCommitsFound), adjusted argument conversions, and switched eager ok_or to lazy ok_or_else in several places.
Workspaces files & utils
crates/lib/src/core/v_latest/workspaces/files.rs, crates/lib/src/util/fs.rs
Introduced InvalidUrlError and changed validate_url_target to return it; map it to OxenError::ImportFileError in callers; thumbnailing error now uses unit ThumbnailingNotEnabled; adjusted error matching/propagation.
Repositories & helpers
crates/lib/src/repositories*.rs
.../checkout.rs, diffs.rs, fork.rs, tree.rs, workspaces.rs
Updated many call sites to construct or match the new OxenError shapes (e.g., RemoteRepoNotFound, WorkspaceNotFound, RevisionNotFound), changed pattern matches to unit-like variants, and adapted small argument conversions.
Server controllers & error mapping
crates/server/src/controllers/*, crates/server/src/errors.rs
Updated controller error construction/matching to the new variants, removed handling for deleted variants (e.g., RepoAlreadyExistsAtDestination), adjusted HTTP error mapping and matched arms to the new enum shapes.
Dataframes & Embeddings
crates/lib/src/.../data_frames/*.rs, crates/lib/src/repositories/workspaces/data_frames/embeddings.rs
Converted QueryableWorkspaceNotFound/NoRowsFound usages to unit-like variants and simplified return/error paths.
Python bindings & legacy v_old
crates/oxen-py/src/py_remote_data_frame.rs, crates/lib/src/core/v_old/...
Adjusted early-return error constructions to use new unit-like variants (NoCommitsFound) and made repetitive call-site edits in legacy code.

Sequence Diagram(s)

(omitted)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • jcelliott
  • subygan

Poem

"I hopped through enums and munched each case,
Boxes dropped softly without any trace.
Variants trimmed tidy, matches now bright,
I binkied with joy through refactor delight.
Carrot-coded cheer — a rabbit's small bite! 🥕"

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 63.64% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title 'Documenting & Simplifying Error Variants' directly summarizes the main change: adding docstrings to error variants and simplifying their definitions by removing unused variants and converting statically-allocated variants to unit variants.
Description check ✅ Passed The description explains the key changes: adding docstrings to OxenError variants, removing unused variants, and simplifying variants that use static strings by converting them to non-value-bearing enums. All described changes align with the changeset.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch mg/oxenerror_many

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
crates/server/src/errors.rs (1)

497-504: Use the local thumbnail_error binding in the response detail.

detail currently formats {error} (outer binding) instead of {thumbnail_error}. This works, but it’s brittle and less clear.

♻️ Suggested cleanup
-                                "detail": format!("{error}"),
+                                "detail": format!("{thumbnail_error}"),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/server/src/errors.rs` around lines 497 - 504, The thumbnailing error
arm uses the outer `error` binding in the JSON `detail` field instead of the
local pattern binding `thumbnail_error`; update the `error_json` construction
inside the `thumbnail_error @ OxenError::ThumbnailingNotEnabled` match arm to
use `format!("{thumbnail_error}")` for the "detail" value so the response
clearly and correctly references the local `thumbnail_error` binding.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/lib/src/error.rs`:
- Around line 82-84: The NoCommitsFound error variant contains CLI-only newlines
which propagate into OxenError::to_string() (and onward into Python), so remove
the extra newlines and leading space in the #[error(...)] attribute for the
NoCommitsFound enum variant (referencing NoCommitsFound and the enum that
implements Display/OxenError::to_string) and make the message a single-line
string so CLI formatting can be applied only at render time.
- Around line 44-46: Update OxenError::is_not_found() so it treats the
refactored not-found variants the same way hint() does: add match arms to return
true for LocalRepoNotFound, ParsedResourceNotFound, and the workspace-related
not-found variants (the Workspace* not-found variants present in the enum) so
the method is internally consistent with hint()’s behavior for these cases.

---

Nitpick comments:
In `@crates/server/src/errors.rs`:
- Around line 497-504: The thumbnailing error arm uses the outer `error` binding
in the JSON `detail` field instead of the local pattern binding
`thumbnail_error`; update the `error_json` construction inside the
`thumbnail_error @ OxenError::ThumbnailingNotEnabled` match arm to use
`format!("{thumbnail_error}")` for the "detail" value so the response clearly
and correctly references the local `thumbnail_error` binding.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 8d5aec9d-7a54-4513-b498-61824729f067

📥 Commits

Reviewing files that changed from the base of the PR and between 2d0e962 and fc812d8.

📒 Files selected for processing (31)
  • crates/lib/src/api/client/repositories.rs
  • crates/lib/src/core/v_latest/commits.rs
  • crates/lib/src/core/v_latest/entries.rs
  • crates/lib/src/core/v_latest/pull.rs
  • crates/lib/src/core/v_latest/push.rs
  • crates/lib/src/core/v_latest/workspaces/commit.rs
  • crates/lib/src/core/v_latest/workspaces/data_frames.rs
  • crates/lib/src/core/v_latest/workspaces/diff.rs
  • crates/lib/src/core/v_latest/workspaces/files.rs
  • crates/lib/src/core/v_old/v0_19_0/pull.rs
  • crates/lib/src/error.rs
  • crates/lib/src/repositories.rs
  • crates/lib/src/repositories/checkout.rs
  • crates/lib/src/repositories/clone.rs
  • crates/lib/src/repositories/diffs.rs
  • crates/lib/src/repositories/fork.rs
  • crates/lib/src/repositories/tree.rs
  • crates/lib/src/repositories/workspaces.rs
  • crates/lib/src/repositories/workspaces/data_frames/embeddings.rs
  • crates/lib/src/util/fs.rs
  • crates/oxen-py/src/py_remote_data_frame.rs
  • crates/server/src/controllers/commits.rs
  • crates/server/src/controllers/diff.rs
  • crates/server/src/controllers/entries.rs
  • crates/server/src/controllers/fork.rs
  • crates/server/src/controllers/merger.rs
  • crates/server/src/controllers/metadata.rs
  • crates/server/src/controllers/repositories.rs
  • crates/server/src/controllers/schemas.rs
  • crates/server/src/controllers/tree.rs
  • crates/server/src/errors.rs

Comment on lines +44 to +46
/// When a local repository cannot be found at the given path.
#[error("No oxen repository found at {0}")]
LocalRepoNotFound(Box<PathBufError>),
LocalRepoNotFound(PathBufError),
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Keep is_not_found() in sync with the refactored not-found variants.

hint() already treats LocalRepoNotFound and ParsedResourceNotFound as not-found cases, but is_not_found() still excludes them and the workspace variants. That leaves OxenError internally inconsistent after this cleanup.

💡 Suggested fix
     pub fn is_not_found(&self) -> bool {
         matches!(
             self,
-            OxenError::PathDoesNotExist(_)
-                | OxenError::ResourceNotFound(_)
-                | OxenError::RemoteRepoNotFound(_)
+            OxenError::RepoNotFound(_)
+                | OxenError::LocalRepoNotFound(_)
+                | OxenError::PathDoesNotExist(_)
+                | OxenError::ParsedResourceNotFound(_)
+                | OxenError::QueryableWorkspaceNotFound
+                | OxenError::ResourceNotFound(_)
+                | OxenError::RemoteRepoNotFound(_)
+                | OxenError::WorkspaceNotFound(_)
         )
     }

Also applies to: 93-99, 116-118

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/lib/src/error.rs` around lines 44 - 46, Update
OxenError::is_not_found() so it treats the refactored not-found variants the
same way hint() does: add match arms to return true for LocalRepoNotFound,
ParsedResourceNotFound, and the workspace-related not-found variants (the
Workspace* not-found variants present in the enum) so the method is internally
consistent with hint()’s behavior for these cases.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@subygan do you agree with this AI review comment?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (3)
crates/lib/src/core/v_latest/workspaces/commit.rs (2)

195-207: Pre-existing: Redundant branch commit lookup.

Note: This is not introduced by the current PR, but branch_commit is fetched twice—once at lines 174-180 and again at lines 195-201—with identical logic. Consider reusing the first result to avoid the redundant database lookup.

♻️ Suggested consolidation (outside PR scope)
     let workspace_commit = &workspace.commit;
     let Some(branch_commit) =
         repositories::commits::get_by_id(&workspace.base_repo, &branch.commit_id)?
     else {
         return Err(OxenError::RevisionNotFound(
             branch.commit_id.as_str().into(),
         ));
     };
     log::debug!(
         "checking if workspace is behind: {:?} {} == {}",
         branch.name,
         branch_commit,
         workspace_commit
     );
     if branch.commit_id == workspace_commit.id {
         // Nothing has changed on the branch since the workspace was created
         return Ok(vec![]);
     }

     // Check to see if any of the staged entries have been updated on the target branch
     // If so, mark the commit as behind
     // Otherwise, we can just commit the changes
-    let Some(branch_commit) =
-        repositories::commits::get_by_id(&workspace.base_repo, &branch.commit_id)?
-    else {
-        return Err(OxenError::RevisionNotFound(
-            branch.commit_id.as_str().into(),
-        ));
-    };
+    // Reuse branch_commit from above
     let Some(branch_tree) =
         repositories::tree::get_root_with_children(&workspace.base_repo, &branch_commit)?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/lib/src/core/v_latest/workspaces/commit.rs` around lines 195 - 207,
The code performs a redundant lookup of branch_commit by calling
repositories::commits::get_by_id(&workspace.base_repo, &branch.commit_id) a
second time; instead, reuse the branch_commit value already obtained earlier to
avoid the extra DB call and then call
repositories::tree::get_root_with_children(&workspace.base_repo, &branch_commit)
using that reused variable; update the branch_tree retrieval block to remove the
duplicate get_by_id call and reference the existing branch_commit (and adjust
error handling to match the original lookup's semantics).

113-115: Consider resolving or tracking this TODO.

The comment raises a valid question about whether BranchNotFound would be more semantically accurate when looking up a branch by name. If the decision to standardize on RevisionNotFound is intentional (as suggested by the PR objectives), consider either removing the TODO or creating an issue to track the decision for future reference.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/lib/src/core/v_latest/workspaces/commit.rs` around lines 113 - 115,
The TODO about whether to use BranchNotFound vs RevisionNotFound must be
resolved: either 1) replace the returned
OxenError::RevisionNotFound(branch_name.into()) with
OxenError::BranchNotFound(branch_name.into()) in commit.rs (and update any
callers/tests that expect RevisionNotFound), or 2) if the intent is to
standardize on RevisionNotFound, remove the TODO and add a short comment or
create/mention an issue ID tracking this decision; update the comment near the
branch lookup (referencing branch_name, RevisionNotFound, and BranchNotFound)
accordingly so the choice is explicit and tracked.
crates/server/src/errors.rs (1)

497-509: Minor naming inconsistency between log and detail.

The ThumbnailingNotEnabled arm binds the pattern to thumbnail_error but uses error (from the outer match arm) in the detail format on line 503. While both reference the same value and the code is functionally correct, this is inconsistent with the NoRowsFound arm (lines 521-527) which consistently uses its binding e in both places.

♻️ Suggested fix for consistency
                     thumbnail_error @ OxenError::ThumbnailingNotEnabled => {
                         log::error!("Thumbnailing not enabled: {thumbnail_error}");
                         let error_json = json!({
                             "error": {
                                 "type": "thumbnailing_not_enabled",
                                 "title": "Thumbnailing Not Enabled",
-                                "detail": format!("{error}"),
+                                "detail": format!("{thumbnail_error}"),
                             },
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/server/src/errors.rs` around lines 497 - 509, The match arm handling
OxenError::ThumbnailingNotEnabled binds the pattern to thumbnail_error but
formats the detail using the outer variable error; update the detail formatting
in the ThumbnailingNotEnabled arm to use the local binding thumbnail_error
(consistent with the NoRowsFound arm using e) so the log message and JSON detail
both reference the same identifier thumbnail_error.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@crates/lib/src/core/v_latest/workspaces/commit.rs`:
- Around line 195-207: The code performs a redundant lookup of branch_commit by
calling repositories::commits::get_by_id(&workspace.base_repo,
&branch.commit_id) a second time; instead, reuse the branch_commit value already
obtained earlier to avoid the extra DB call and then call
repositories::tree::get_root_with_children(&workspace.base_repo, &branch_commit)
using that reused variable; update the branch_tree retrieval block to remove the
duplicate get_by_id call and reference the existing branch_commit (and adjust
error handling to match the original lookup's semantics).
- Around line 113-115: The TODO about whether to use BranchNotFound vs
RevisionNotFound must be resolved: either 1) replace the returned
OxenError::RevisionNotFound(branch_name.into()) with
OxenError::BranchNotFound(branch_name.into()) in commit.rs (and update any
callers/tests that expect RevisionNotFound), or 2) if the intent is to
standardize on RevisionNotFound, remove the TODO and add a short comment or
create/mention an issue ID tracking this decision; update the comment near the
branch lookup (referencing branch_name, RevisionNotFound, and BranchNotFound)
accordingly so the choice is explicit and tracked.

In `@crates/server/src/errors.rs`:
- Around line 497-509: The match arm handling OxenError::ThumbnailingNotEnabled
binds the pattern to thumbnail_error but formats the detail using the outer
variable error; update the detail formatting in the ThumbnailingNotEnabled arm
to use the local binding thumbnail_error (consistent with the NoRowsFound arm
using e) so the log message and JSON detail both reference the same identifier
thumbnail_error.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 0ee92340-ff44-4a53-8802-d6bb9c3ffae1

📥 Commits

Reviewing files that changed from the base of the PR and between fc812d8 and 8803254.

📒 Files selected for processing (30)
  • crates/lib/src/api/client/repositories.rs
  • crates/lib/src/core/v_latest/commits.rs
  • crates/lib/src/core/v_latest/entries.rs
  • crates/lib/src/core/v_latest/pull.rs
  • crates/lib/src/core/v_latest/push.rs
  • crates/lib/src/core/v_latest/workspaces/commit.rs
  • crates/lib/src/core/v_latest/workspaces/data_frames.rs
  • crates/lib/src/core/v_latest/workspaces/diff.rs
  • crates/lib/src/core/v_latest/workspaces/files.rs
  • crates/lib/src/core/v_old/v0_19_0/pull.rs
  • crates/lib/src/error.rs
  • crates/lib/src/repositories.rs
  • crates/lib/src/repositories/checkout.rs
  • crates/lib/src/repositories/diffs.rs
  • crates/lib/src/repositories/fork.rs
  • crates/lib/src/repositories/tree.rs
  • crates/lib/src/repositories/workspaces.rs
  • crates/lib/src/repositories/workspaces/data_frames/embeddings.rs
  • crates/lib/src/util/fs.rs
  • crates/oxen-py/src/py_remote_data_frame.rs
  • crates/server/src/controllers/commits.rs
  • crates/server/src/controllers/diff.rs
  • crates/server/src/controllers/entries.rs
  • crates/server/src/controllers/fork.rs
  • crates/server/src/controllers/merger.rs
  • crates/server/src/controllers/metadata.rs
  • crates/server/src/controllers/repositories.rs
  • crates/server/src/controllers/schemas.rs
  • crates/server/src/controllers/tree.rs
  • crates/server/src/errors.rs
✅ Files skipped from review due to trivial changes (10)
  • crates/server/src/controllers/tree.rs
  • crates/lib/src/repositories/workspaces/data_frames/embeddings.rs
  • crates/lib/src/core/v_old/v0_19_0/pull.rs
  • crates/oxen-py/src/py_remote_data_frame.rs
  • crates/lib/src/core/v_latest/workspaces/diff.rs
  • crates/server/src/controllers/entries.rs
  • crates/server/src/controllers/repositories.rs
  • crates/server/src/controllers/diff.rs
  • crates/lib/src/core/v_latest/workspaces/data_frames.rs
  • crates/lib/src/core/v_latest/commits.rs
🚧 Files skipped from review as they are similar to previous changes (9)
  • crates/server/src/controllers/schemas.rs
  • crates/lib/src/repositories/fork.rs
  • crates/lib/src/repositories.rs
  • crates/lib/src/util/fs.rs
  • crates/server/src/controllers/fork.rs
  • crates/lib/src/core/v_latest/entries.rs
  • crates/lib/src/repositories/checkout.rs
  • crates/lib/src/core/v_latest/pull.rs
  • crates/lib/src/error.rs

@malcolmgreaves malcolmgreaves force-pushed the mg/oxenerror_many branch 3 times, most recently from 6e74659 to cab6411 Compare March 27, 2026 19:37
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/server/src/controllers/fork.rs (1)

50-59: ⚠️ Potential issue | 🔴 Critical

Add explicit handling for RepoAlreadyExists errors in the fork endpoint.

The fork operation returns a generic OxenError::Basic when the destination path already exists. This error currently maps to a 500 InternalServerError. Other repository creation endpoints (e.g., repositories.rs) explicitly handle the equivalent error with a 409 Conflict response. The fork endpoint should do the same for consistency:

Err(OxenError::Basic(ref msg)) if msg.contains("file already exists") => {
    log::debug!("Fork destination already exists: {msg}");
    Ok(HttpResponse::Conflict().json(StatusMessage::error("Repo already exists.")))
}

Alternatively, refactor start_fork to return a dedicated error variant (e.g., RepoAlreadyExists) instead of a generic basic error, and add handling for it in crates/server/src/errors.rs.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/server/src/controllers/fork.rs` around lines 50 - 59, Handle
destination-already-exists explicitly in the fork endpoint by adding a match arm
for the error returned from repositories::fork::start_fork: detect the
existing-repo case (either by matching Err(OxenError::Basic(ref msg)) if
msg.contains("file already exists") or by matching a new RepoAlreadyExists
variant if you refactor start_fork) and return
Ok(HttpResponse::Conflict().json(StatusMessage::error("Repo already exists.")))
while logging a debug message; otherwise keep the existing
Err(OxenHttpError::from(err)) behavior for other errors.
🧹 Nitpick comments (2)
crates/lib/src/core/v_latest/workspaces/files.rs (1)

370-372: Consider centralizing InvalidUrlErrorOxenError conversion.

The same conversion closure appears in two places. A single From<InvalidUrlError> for OxenError keeps formatting consistent and avoids drift.

♻️ Suggested refactor
 enum InvalidUrlError {
@@
     PrivateIp(std::net::IpAddr),
 }
 
+impl From<InvalidUrlError> for OxenError {
+    fn from(value: InvalidUrlError) -> Self {
+        OxenError::ImportFileError(value.to_string().into())
+    }
+}
+
@@
-    validate_url_target(&parsed_url)
-        .await
-        .map_err(|e| OxenError::ImportFileError(format!("{e}").into()))?;
+    validate_url_target(&parsed_url)
+        .await
+        .map_err(OxenError::from)?;
@@
-            validate_url_target(&next_url)
-                .await
-                .map_err(|e| OxenError::ImportFileError(format!("{e}").into()))?;
+            validate_url_target(&next_url)
+                .await
+                .map_err(OxenError::from)?;

Also applies to: 498-500

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/lib/src/core/v_latest/workspaces/files.rs` around lines 370 - 372,
Centralize the conversion from InvalidUrlError to OxenError by implementing
From<InvalidUrlError> for OxenError and replace the repeated .map_err(|e|
OxenError::ImportFileError(format!("{e}").into())? style calls (used after
validate_url_target(&parsed_url).await and the other occurrence) with
.map_err(OxenError::from)?; update locations referencing validate_url_target and
any other spots where InvalidUrlError is mapped to OxenError to use the new From
impl so formatting and behavior are consistent.
crates/lib/src/core/v_latest/workspaces/commit.rs (1)

174-180: Redundant branch_commit lookup.

branch_commit is fetched at lines 174-180 and then fetched again identically at lines 195-201. The second lookup is unnecessary since the value is already available.

♻️ Remove redundant lookup
     log::debug!(
         "checking if workspace is behind: {:?} {} == {}",
         branch.name,
         branch_commit,
         workspace_commit
     );
     if branch.commit_id == workspace_commit.id {
         // Nothing has changed on the branch since the workspace was created
         return Ok(vec![]);
     }

     // Check to see if any of the staged entries have been updated on the target branch
     // If so, mark the commit as behind
     // Otherwise, we can just commit the changes
-    let Some(branch_commit) =
-        repositories::commits::get_by_id(&workspace.base_repo, &branch.commit_id)?
-    else {
-        return Err(OxenError::RevisionNotFound(
-            branch.commit_id.as_str().into(),
-        ));
-    };
     let Some(branch_tree) =
         repositories::tree::get_root_with_children(&workspace.base_repo, &branch_commit)?

Also applies to: 195-201

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/lib/src/core/v_latest/workspaces/commit.rs` around lines 174 - 180,
The code performs the same lookup twice: call to
repositories::commits::get_by_id(&workspace.base_repo, &branch.commit_id)
populates branch_commit at the first occurrence (in commit.rs around the branch
handling) and is repeated later; remove the second identical call and reuse the
existing branch_commit variable instead (ensure branch_commit is in scope where
the second lookup occurred), and delete the redundant error branch tied to the
second lookup so only the initial get_by_id handles the RevisionNotFound error.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@crates/server/src/controllers/fork.rs`:
- Around line 50-59: Handle destination-already-exists explicitly in the fork
endpoint by adding a match arm for the error returned from
repositories::fork::start_fork: detect the existing-repo case (either by
matching Err(OxenError::Basic(ref msg)) if msg.contains("file already exists")
or by matching a new RepoAlreadyExists variant if you refactor start_fork) and
return Ok(HttpResponse::Conflict().json(StatusMessage::error("Repo already
exists."))) while logging a debug message; otherwise keep the existing
Err(OxenHttpError::from(err)) behavior for other errors.

---

Nitpick comments:
In `@crates/lib/src/core/v_latest/workspaces/commit.rs`:
- Around line 174-180: The code performs the same lookup twice: call to
repositories::commits::get_by_id(&workspace.base_repo, &branch.commit_id)
populates branch_commit at the first occurrence (in commit.rs around the branch
handling) and is repeated later; remove the second identical call and reuse the
existing branch_commit variable instead (ensure branch_commit is in scope where
the second lookup occurred), and delete the redundant error branch tied to the
second lookup so only the initial get_by_id handles the RevisionNotFound error.

In `@crates/lib/src/core/v_latest/workspaces/files.rs`:
- Around line 370-372: Centralize the conversion from InvalidUrlError to
OxenError by implementing From<InvalidUrlError> for OxenError and replace the
repeated .map_err(|e| OxenError::ImportFileError(format!("{e}").into())? style
calls (used after validate_url_target(&parsed_url).await and the other
occurrence) with .map_err(OxenError::from)?; update locations referencing
validate_url_target and any other spots where InvalidUrlError is mapped to
OxenError to use the new From impl so formatting and behavior are consistent.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 360cd4c1-a215-4525-9e2b-a6085aa53688

📥 Commits

Reviewing files that changed from the base of the PR and between 8803254 and 6e74659.

📒 Files selected for processing (30)
  • crates/lib/src/api/client/repositories.rs
  • crates/lib/src/core/v_latest/commits.rs
  • crates/lib/src/core/v_latest/entries.rs
  • crates/lib/src/core/v_latest/pull.rs
  • crates/lib/src/core/v_latest/push.rs
  • crates/lib/src/core/v_latest/workspaces/commit.rs
  • crates/lib/src/core/v_latest/workspaces/data_frames.rs
  • crates/lib/src/core/v_latest/workspaces/diff.rs
  • crates/lib/src/core/v_latest/workspaces/files.rs
  • crates/lib/src/core/v_old/v0_19_0/pull.rs
  • crates/lib/src/error.rs
  • crates/lib/src/repositories.rs
  • crates/lib/src/repositories/checkout.rs
  • crates/lib/src/repositories/diffs.rs
  • crates/lib/src/repositories/fork.rs
  • crates/lib/src/repositories/tree.rs
  • crates/lib/src/repositories/workspaces.rs
  • crates/lib/src/repositories/workspaces/data_frames/embeddings.rs
  • crates/lib/src/util/fs.rs
  • crates/oxen-py/src/py_remote_data_frame.rs
  • crates/server/src/controllers/commits.rs
  • crates/server/src/controllers/diff.rs
  • crates/server/src/controllers/entries.rs
  • crates/server/src/controllers/fork.rs
  • crates/server/src/controllers/merger.rs
  • crates/server/src/controllers/metadata.rs
  • crates/server/src/controllers/repositories.rs
  • crates/server/src/controllers/schemas.rs
  • crates/server/src/controllers/tree.rs
  • crates/server/src/errors.rs
✅ Files skipped from review due to trivial changes (5)
  • crates/server/src/controllers/tree.rs
  • crates/lib/src/core/v_latest/workspaces/diff.rs
  • crates/oxen-py/src/py_remote_data_frame.rs
  • crates/lib/src/core/v_latest/entries.rs
  • crates/lib/src/core/v_latest/commits.rs
🚧 Files skipped from review as they are similar to previous changes (12)
  • crates/lib/src/repositories/workspaces/data_frames/embeddings.rs
  • crates/server/src/controllers/repositories.rs
  • crates/server/src/controllers/schemas.rs
  • crates/lib/src/repositories/fork.rs
  • crates/server/src/controllers/diff.rs
  • crates/server/src/controllers/entries.rs
  • crates/lib/src/core/v_old/v0_19_0/pull.rs
  • crates/server/src/errors.rs
  • crates/lib/src/repositories/workspaces.rs
  • crates/lib/src/core/v_latest/pull.rs
  • crates/lib/src/repositories/diffs.rs
  • crates/lib/src/error.rs

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
crates/server/src/controllers/commits.rs (1)

708-716: ⚠️ Potential issue | 🟠 Major

Don’t collapse create_empty_commit() validation failures into 500s.

This handler now maps every create_empty_commit() error to InternalServerError, but the endpoint still documents incompatible remote history as a client error. After removing the dedicated mismatch branch, that path will now regress to a 500 instead of a 400.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/server/src/controllers/commits.rs` around lines 708 - 716, The current
match collapses all errors from repositories::commits::create_empty_commit into
OxenHttpError::InternalServerError; change the Err arm to pattern-match the
specific validation/mismatch error(s) returned by create_empty_commit (e.g.,
history mismatch / validation variant(s)) and return a client error (e.g.,
OxenHttpError::BadRequest or an appropriate 400 response) for those cases while
logging and returning other errors as InternalServerError; keep references to
create_empty_commit, CommitResponse, and OxenHttpError in the handler and
include the validation error message in the log/response for the 400 path.
crates/server/src/controllers/fork.rs (1)

50-58: ⚠️ Potential issue | 🟠 Major

Preserve the documented 409 for existing fork destinations.

start_fork() errors now go straight through OxenHttpError::from, but the server-wide mapper does not special-case repo-already-exists failures here. A fork to an existing namespace/name will now surface as 500 instead of the documented 409.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/server/src/controllers/fork.rs` around lines 50 - 58, The match
handling the result of repositories::fork::start_fork should detect the specific
"repository already exists" error variant and return an OxenHttpError that maps
to HTTP 409 (Conflict) instead of delegating to OxenHttpError::from which yields
a 500; update the Err arm of the match on start_fork to pattern-match (or
inspect) the repo-already-exists error and return the appropriate OxenHttpError
conflict variant, otherwise fall back to OxenHttpError::from(err) for other
errors.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/lib/src/error.rs`:
- Around line 44-46: The server-side error mapper in crates/server/src/errors.rs
still treats HeadNotFound, NoCommitsFound, and the new LocalRepoNotFound (and
the new workspace/path-specific not-found variants) as INTERNAL_SERVER_ERROR;
update the error-to-HTTP mapping function (the ResponseError/mapper
implementation) to add explicit match arms for HeadNotFound, NoCommitsFound,
LocalRepoNotFound, and the new workspace/path-specific not-found variants so
they return HTTP 404 Not Found (instead of falling through to 500), and ensure
the response message includes the original error message for debugging.

---

Outside diff comments:
In `@crates/server/src/controllers/commits.rs`:
- Around line 708-716: The current match collapses all errors from
repositories::commits::create_empty_commit into
OxenHttpError::InternalServerError; change the Err arm to pattern-match the
specific validation/mismatch error(s) returned by create_empty_commit (e.g.,
history mismatch / validation variant(s)) and return a client error (e.g.,
OxenHttpError::BadRequest or an appropriate 400 response) for those cases while
logging and returning other errors as InternalServerError; keep references to
create_empty_commit, CommitResponse, and OxenHttpError in the handler and
include the validation error message in the log/response for the 400 path.

In `@crates/server/src/controllers/fork.rs`:
- Around line 50-58: The match handling the result of
repositories::fork::start_fork should detect the specific "repository already
exists" error variant and return an OxenHttpError that maps to HTTP 409
(Conflict) instead of delegating to OxenHttpError::from which yields a 500;
update the Err arm of the match on start_fork to pattern-match (or inspect) the
repo-already-exists error and return the appropriate OxenHttpError conflict
variant, otherwise fall back to OxenHttpError::from(err) for other errors.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: b9c4771c-b8bb-4c16-9d8c-05f2aabba254

📥 Commits

Reviewing files that changed from the base of the PR and between 6e74659 and cab6411.

📒 Files selected for processing (30)
  • crates/lib/src/api/client/repositories.rs
  • crates/lib/src/core/v_latest/commits.rs
  • crates/lib/src/core/v_latest/entries.rs
  • crates/lib/src/core/v_latest/pull.rs
  • crates/lib/src/core/v_latest/push.rs
  • crates/lib/src/core/v_latest/workspaces/commit.rs
  • crates/lib/src/core/v_latest/workspaces/data_frames.rs
  • crates/lib/src/core/v_latest/workspaces/diff.rs
  • crates/lib/src/core/v_latest/workspaces/files.rs
  • crates/lib/src/core/v_old/v0_19_0/pull.rs
  • crates/lib/src/error.rs
  • crates/lib/src/repositories.rs
  • crates/lib/src/repositories/checkout.rs
  • crates/lib/src/repositories/diffs.rs
  • crates/lib/src/repositories/fork.rs
  • crates/lib/src/repositories/tree.rs
  • crates/lib/src/repositories/workspaces.rs
  • crates/lib/src/repositories/workspaces/data_frames/embeddings.rs
  • crates/lib/src/util/fs.rs
  • crates/oxen-py/src/py_remote_data_frame.rs
  • crates/server/src/controllers/commits.rs
  • crates/server/src/controllers/diff.rs
  • crates/server/src/controllers/entries.rs
  • crates/server/src/controllers/fork.rs
  • crates/server/src/controllers/merger.rs
  • crates/server/src/controllers/metadata.rs
  • crates/server/src/controllers/repositories.rs
  • crates/server/src/controllers/schemas.rs
  • crates/server/src/controllers/tree.rs
  • crates/server/src/errors.rs
✅ Files skipped from review due to trivial changes (5)
  • crates/lib/src/repositories/workspaces/data_frames/embeddings.rs
  • crates/lib/src/core/v_latest/workspaces/diff.rs
  • crates/lib/src/repositories/tree.rs
  • crates/lib/src/core/v_latest/entries.rs
  • crates/lib/src/core/v_latest/workspaces/files.rs
🚧 Files skipped from review as they are similar to previous changes (17)
  • crates/lib/src/core/v_old/v0_19_0/pull.rs
  • crates/server/src/controllers/tree.rs
  • crates/server/src/controllers/schemas.rs
  • crates/lib/src/core/v_latest/pull.rs
  • crates/oxen-py/src/py_remote_data_frame.rs
  • crates/server/src/controllers/repositories.rs
  • crates/server/src/controllers/diff.rs
  • crates/lib/src/repositories.rs
  • crates/lib/src/repositories/fork.rs
  • crates/lib/src/repositories/diffs.rs
  • crates/server/src/errors.rs
  • crates/lib/src/core/v_latest/push.rs
  • crates/lib/src/repositories/workspaces.rs
  • crates/lib/src/core/v_latest/workspaces/data_frames.rs
  • crates/lib/src/core/v_latest/commits.rs
  • crates/lib/src/core/v_latest/workspaces/commit.rs
  • crates/server/src/controllers/merger.rs

Comment on lines +44 to +46
/// When a local repository cannot be found at the given path.
#[error("No oxen repository found at {0}")]
LocalRepoNotFound(Box<PathBufError>),
LocalRepoNotFound(PathBufError),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Update the server error mapper alongside these new variants.

crates/server/src/errors.rs still falls back to INTERNAL_SERVER_ERROR for HeadNotFound and NoCommitsFound, and it does not have explicit branches for the new workspace/path-specific not-found variants introduced here either. After this cleanup, those conditions will surface as 500s once they cross an HTTP boundary.

Also applies to: 82-99, 116-118

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/lib/src/error.rs` around lines 44 - 46, The server-side error mapper
in crates/server/src/errors.rs still treats HeadNotFound, NoCommitsFound, and
the new LocalRepoNotFound (and the new workspace/path-specific not-found
variants) as INTERNAL_SERVER_ERROR; update the error-to-HTTP mapping function
(the ResponseError/mapper implementation) to add explicit match arms for
HeadNotFound, NoCommitsFound, LocalRepoNotFound, and the new
workspace/path-specific not-found variants so they return HTTP 404 Not Found
(instead of falling through to 500), and ensure the response message includes
the original error message for debugging.

branch_name.to_string().into(),
));
// TODO: should this instead be BranchNotFound ??
// Or should we eliminate "branch not found" and always use "revision not found"?
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should keep both. In Oxen terminology a "revision" is either a commit or branch reference. Many places take either, so "revision not found" is the correct error message. But some APIs only deal with branches, so "branch not found" is a better error message for those.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the clarification! I'll change this to be BranchNotFound.

Adding docstrings to all `OxenError` variants. Docstring explains what the variant is
and, if applicable, the situations and code paths where it is encountered.

Audited all `OxenError` variants and identified variants that are never used in code.
These variants have been removed.

Additionally, some variants only ever use statically allocated strings as their inner
value. These have been replaced with non-value bearing enum variants.
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
crates/server/src/controllers/commits.rs (1)

708-716: ⚠️ Potential issue | 🟠 Major

Don't squash typed commit-creation errors into a generic 500.

This branch discards the original OxenError, so any client-fixable failure from create_empty_commit() can no longer use the shared HTTP mapping layer. That also leaves no way to satisfy the documented 400 “mismatched remote history” response on Line 675. At minimum, propagate the typed error here; if the divergence case still exists, keep a dedicated 4xx mapping for it.

Suggested fix
         Err(err) => {
             log::error!("Err create_commit: {err}");
-            Err(OxenHttpError::InternalServerError)
+            Err(err.into())
         }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/server/src/controllers/commits.rs` around lines 708 - 716, The current
handler is swallowing the typed OxenError returned by
repositories::commits::create_empty_commit and always returning a generic
OxenHttpError::InternalServerError; instead, propagate the original OxenError
(or convert it via the existing HTTP mapping) so client-fixable errors aren't
lost. Update the Err arm that currently logs and returns
OxenHttpError::InternalServerError to map/convert the returned OxenError into
the appropriate OxenHttpError (preserving 4xx cases such as the “mismatched
remote history” divergence) and return that mapped error; keep the log but do
not discard the original error value from create_empty_commit.
crates/server/src/errors.rs (1)

521-532: ⚠️ Potential issue | 🟡 Minor

Make the JSON status message match the 404.

This arm returns HttpResponse::NotFound() but still serializes status_message: MSG_INTERNAL_SERVER_ERROR, so the body and HTTP status disagree. Please switch the payload to the not-found status message here.

Suggested fix
                     "status": STATUS_ERROR,
-                    "status_message": MSG_INTERNAL_SERVER_ERROR,
+                    "status_message": MSG_RESOURCE_NOT_FOUND,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/server/src/errors.rs` around lines 521 - 532, The NoRowsFound match
arm in the OxenError handling currently returns HttpResponse::NotFound() but
sets "status_message" to MSG_INTERNAL_SERVER_ERROR; update the JSON payload in
the OxenError::NoRowsFound arm so the "status_message" matches the 404 response
(replace MSG_INTERNAL_SERVER_ERROR with the appropriate not-found constant,
e.g., MSG_NOT_FOUND or the project's not-found message) and ensure "status"
remains STATUS_ERROR and the rest of the fields are unchanged.
crates/server/src/controllers/fork.rs (1)

50-57: ⚠️ Potential issue | 🟠 Major

Map destination collisions back to 409.

start_fork() now reports an existing destination via a generic OxenError::basic_str(...) (crates/lib/src/repositories/fork.rs:49-58). With only the generic Err(err) path left here, that case gets converted through OxenHttpError::from(err) and lands as a 500, so this endpoint no longer honors its documented 409 behavior for duplicate fork destinations. Please restore a typed conflict mapping in this controller or in the shared error conversion.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/server/src/controllers/fork.rs` around lines 50 - 57, The controller
currently converts all errors from repositories::fork::start_fork(...) into
OxenHttpError::from(err) which yields 500s; detect the specific
destination-exists case reported by start_fork (it uses
OxenError::basic_str(...)) and map it to a 409 Conflict response instead. Update
the Err(err) arm in the match in this controller (the fork handler in
crates/server/src/controllers/fork.rs) to pattern-match or inspect the error
(e.g., check for the OxenError basic/variant or a "destination" substring on
err.to_string()) and return an appropriate Conflict response
(HttpResponse::Conflict or OxenHttpError variant for conflict) for that case,
falling back to Err(OxenHttpError::from(err)) for other errors.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/lib/src/core/v_latest/workspaces/commit.rs`:
- Around line 111-113: The BranchNotFound variant is being constructed with the
raw branch_name, which renders just "main" instead of a full not-found message;
update the error construction in the branch lookup in commit.rs (the let
Some(branch) = repositories::branches::get_by_name(&workspace.base_repo,
branch_name)? else { ... } block) to pass a formatted/localized branch string
(use the existing local_branch helper or otherwise format the branch name into
the full message) into OxenError::BranchNotFound rather than passing
branch_name.into() so the error message prints the full human-readable text.

In `@crates/lib/src/core/v_latest/workspaces/files.rs`:
- Around line 426-431: The match arm handling workspace_behind_error in
upload_zip is re-propagating the raw OxenError::WorkspaceBehind and should
instead translate it to the structured HTTP error used elsewhere; update the
workspace_behind_error @ Err(OxenError::WorkspaceBehind(_)) arm in the
upload_zip function to construct and return the OxenHttpError::WorkspaceBehind
payload (matching the translation used in controllers/workspaces.rs) so that
callers like controllers/import.rs receive the proper WorkspaceBehind HTTP
response with the conflict payload.
- Around line 291-302: validate_url_target currently performs DNS resolution and
checks IPs but discards the resolved SocketAddr results, letting fetch_file call
client.get(hostname) which triggers a fresh DNS lookup and opens a DNS-rebinding
window; change validate_url_target to return the validated Vec<SocketAddr> (or
similar) instead of just Ok(()) and update callers (e.g., fetch_file) to pass
those SocketAddrs into the request so reqwest does not re-resolve the hostname
(use the resolved SocketAddr for connecting or configure the request client to
use the provided IPs), and apply the same change pattern to the other
occurrences noted (around the sections referenced at 370-372 and 498-500).

---

Outside diff comments:
In `@crates/server/src/controllers/commits.rs`:
- Around line 708-716: The current handler is swallowing the typed OxenError
returned by repositories::commits::create_empty_commit and always returning a
generic OxenHttpError::InternalServerError; instead, propagate the original
OxenError (or convert it via the existing HTTP mapping) so client-fixable errors
aren't lost. Update the Err arm that currently logs and returns
OxenHttpError::InternalServerError to map/convert the returned OxenError into
the appropriate OxenHttpError (preserving 4xx cases such as the “mismatched
remote history” divergence) and return that mapped error; keep the log but do
not discard the original error value from create_empty_commit.

In `@crates/server/src/controllers/fork.rs`:
- Around line 50-57: The controller currently converts all errors from
repositories::fork::start_fork(...) into OxenHttpError::from(err) which yields
500s; detect the specific destination-exists case reported by start_fork (it
uses OxenError::basic_str(...)) and map it to a 409 Conflict response instead.
Update the Err(err) arm in the match in this controller (the fork handler in
crates/server/src/controllers/fork.rs) to pattern-match or inspect the error
(e.g., check for the OxenError basic/variant or a "destination" substring on
err.to_string()) and return an appropriate Conflict response
(HttpResponse::Conflict or OxenHttpError variant for conflict) for that case,
falling back to Err(OxenHttpError::from(err)) for other errors.

In `@crates/server/src/errors.rs`:
- Around line 521-532: The NoRowsFound match arm in the OxenError handling
currently returns HttpResponse::NotFound() but sets "status_message" to
MSG_INTERNAL_SERVER_ERROR; update the JSON payload in the OxenError::NoRowsFound
arm so the "status_message" matches the 404 response (replace
MSG_INTERNAL_SERVER_ERROR with the appropriate not-found constant, e.g.,
MSG_NOT_FOUND or the project's not-found message) and ensure "status" remains
STATUS_ERROR and the rest of the fields are unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 1fccb4cf-185f-46e2-88e5-d25e5c14d0a4

📥 Commits

Reviewing files that changed from the base of the PR and between cab6411 and 11dd0f0.

📒 Files selected for processing (30)
  • crates/lib/src/api/client/repositories.rs
  • crates/lib/src/core/v_latest/commits.rs
  • crates/lib/src/core/v_latest/entries.rs
  • crates/lib/src/core/v_latest/pull.rs
  • crates/lib/src/core/v_latest/push.rs
  • crates/lib/src/core/v_latest/workspaces/commit.rs
  • crates/lib/src/core/v_latest/workspaces/data_frames.rs
  • crates/lib/src/core/v_latest/workspaces/diff.rs
  • crates/lib/src/core/v_latest/workspaces/files.rs
  • crates/lib/src/core/v_old/v0_19_0/pull.rs
  • crates/lib/src/error.rs
  • crates/lib/src/repositories.rs
  • crates/lib/src/repositories/checkout.rs
  • crates/lib/src/repositories/diffs.rs
  • crates/lib/src/repositories/fork.rs
  • crates/lib/src/repositories/tree.rs
  • crates/lib/src/repositories/workspaces.rs
  • crates/lib/src/repositories/workspaces/data_frames/embeddings.rs
  • crates/lib/src/util/fs.rs
  • crates/oxen-py/src/py_remote_data_frame.rs
  • crates/server/src/controllers/commits.rs
  • crates/server/src/controllers/diff.rs
  • crates/server/src/controllers/entries.rs
  • crates/server/src/controllers/fork.rs
  • crates/server/src/controllers/merger.rs
  • crates/server/src/controllers/metadata.rs
  • crates/server/src/controllers/repositories.rs
  • crates/server/src/controllers/schemas.rs
  • crates/server/src/controllers/tree.rs
  • crates/server/src/errors.rs
✅ Files skipped from review due to trivial changes (3)
  • crates/lib/src/core/v_old/v0_19_0/pull.rs
  • crates/lib/src/core/v_latest/workspaces/diff.rs
  • crates/oxen-py/src/py_remote_data_frame.rs
🚧 Files skipped from review as they are similar to previous changes (13)
  • crates/lib/src/api/client/repositories.rs
  • crates/lib/src/core/v_latest/entries.rs
  • crates/lib/src/repositories/fork.rs
  • crates/lib/src/core/v_latest/pull.rs
  • crates/lib/src/repositories.rs
  • crates/lib/src/repositories/tree.rs
  • crates/server/src/controllers/diff.rs
  • crates/lib/src/core/v_latest/commits.rs
  • crates/lib/src/util/fs.rs
  • crates/server/src/controllers/metadata.rs
  • crates/lib/src/repositories/workspaces.rs
  • crates/lib/src/repositories/diffs.rs
  • crates/lib/src/core/v_latest/workspaces/data_frames.rs

Comment on lines 111 to +113
let Some(branch) = repositories::branches::get_by_name(&workspace.base_repo, branch_name)?
else {
return Err(OxenError::revision_not_found(
branch_name.to_string().into(),
));
return Err(OxenError::BranchNotFound(branch_name.into()));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

BranchNotFound now renders only the raw branch name.

crates/lib/src/error.rs formats OxenError::BranchNotFound as "{0}", so this path currently returns just main / feature/foo instead of a full not-found message. Please keep using the local-branch helper here or format the message before constructing the variant.

💡 Suggested fix
-        return Err(OxenError::BranchNotFound(branch_name.into()));
+        return Err(OxenError::local_branch_not_found(branch_name));
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
let Some(branch) = repositories::branches::get_by_name(&workspace.base_repo, branch_name)?
else {
return Err(OxenError::revision_not_found(
branch_name.to_string().into(),
));
return Err(OxenError::BranchNotFound(branch_name.into()));
let Some(branch) = repositories::branches::get_by_name(&workspace.base_repo, branch_name)?
else {
return Err(OxenError::local_branch_not_found(branch_name));
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/lib/src/core/v_latest/workspaces/commit.rs` around lines 111 - 113,
The BranchNotFound variant is being constructed with the raw branch_name, which
renders just "main" instead of a full not-found message; update the error
construction in the branch lookup in commit.rs (the let Some(branch) =
repositories::branches::get_by_name(&workspace.base_repo, branch_name)? else {
... } block) to pass a formatted/localized branch string (use the existing
local_branch helper or otherwise format the branch name into the full message)
into OxenError::BranchNotFound rather than passing branch_name.into() so the
error message prints the full human-readable text.

Comment on lines +291 to +302
async fn validate_url_target(url: &Url) -> Result<(), InvalidUrlError> {
let host = url.host_str().ok_or(InvalidUrlError::NoHost)?;
let port = url.port_or_known_default().unwrap_or(443);
let addr = format!("{host}:{port}");

let resolved = tokio::net::lookup_host(&addr).await.map_err(|e| {
OxenError::file_import_error(format!("DNS resolution failed for {host}: {e}"))
})?;
let resolved =
tokio::net::lookup_host(&addr)
.await
.map_err(|e| InvalidUrlError::DnsResolutionFailed {
host: host.to_string(),
source: e,
})?;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

In reqwest 0.13.x, does Client::get(url).send()perform DNS resolution independently of a priortokio::net::lookup_host, and is tokio::net::lookup_host((host, port))the correct way to resolve IPv6 literals returned byurl::Url::host_str()?

💡 Result:

In reqwest 0.13.x, Client::get(url).send performs DNS resolution independently of any prior tokio::net::lookup_host call. Reqwest uses its own configurable DNS resolver (via ClientBuilder::dns_resolver), which resolves the host from the URL's http::Uri during connection setup. A prior lookup_host does not affect or share state with reqwest's internal resolution process. tokio::net::lookup_host((host, port)) is the correct way to resolve an IPv6 literal returned by url::Url::host_str. The host_str returns the raw host string (e.g., "[::1]" for IPv6 literals). lookup_host uses std::net::ToSocketAddrs, which parses IP literals (IPv4/IPv6) directly without DNS lookup, combining them with the provided port to yield SocketAddrs. For domain names, it performs DNS resolution. This matches reqwest's expected SocketAddr format from its resolver.

Citations:


🏁 Script executed:

# First, check the file structure and see lines around the mentioned locations
wc -l crates/lib/src/core/v_latest/workspaces/files.rs

Repository: Oxen-AI/Oxen

Length of output: 109


🏁 Script executed:

# Check the validate_url_target function (lines 291-302)
sed -n '285,310p' crates/lib/src/core/v_latest/workspaces/files.rs

Repository: Oxen-AI/Oxen

Length of output: 984


🏁 Script executed:

# Check lines 370-372 and 498-500 for similar patterns
sed -n '365,375p' crates/lib/src/core/v_latest/workspaces/files.rs
echo "---"
sed -n '493,505p' crates/lib/src/core/v_latest/workspaces/files.rs

Repository: Oxen-AI/Oxen

Length of output: 891


🏁 Script executed:

# Find how the validated result from validate_url_target is used (if at all)
rg "validate_url_target" -A 10 -B 2 crates/lib/src/core/v_latest/workspaces/files.rs

Repository: Oxen-AI/Oxen

Length of output: 1457


🏁 Script executed:

# Check if reqwest::Client is used to send requests with the original URL
rg "reqwest.*Client|\.get\(" crates/lib/src/core/v_latest/workspaces/files.rs

Repository: Oxen-AI/Oxen

Length of output: 234


🏁 Script executed:

# Let me check the URL library documentation and test how host_str() works for IPv6
# Search for url crate usage in the file
rg "use.*url|from_str.*Url" crates/lib/src/core/v_latest/workspaces/files.rs | head -20

Repository: Oxen-AI/Oxen

Length of output: 69


🏁 Script executed:

# Check Cargo.toml for url crate version
fd Cargo.toml crates/lib | xargs grep -A 10 "^\[dependencies\]" | head -30

Repository: Oxen-AI/Oxen

Length of output: 417


🏁 Script executed:

# Let's search for how IPv6 URLs would be handled - check if there are any tests
fd "test.*files\|files.*test" --type f | xargs grep -l "validate_url_target\|ipv6\|::" 2>/dev/null | head -5

Repository: Oxen-AI/Oxen

Length of output: 38


🏁 Script executed:

# Look for actual uses of fetch_file and how URLs are constructed to understand the full flow
sed -n '320,350p' crates/lib/src/core/v_latest/workspaces/files.rs

Repository: Oxen-AI/Oxen

Length of output: 946


🏁 Script executed:

# Check if there's a return value or stored SocketAddrs from validate_url_target
ast-grep --pattern 'fn validate_url_target($_) -> $_ {
  $$$
  Ok($_)
}'

Repository: Oxen-AI/Oxen

Length of output: 38


🏁 Script executed:

# Let's check what the actual fetch function does and how it uses the URL
sed -n '380,420p' crates/lib/src/core/v_latest/workspaces/files.rs

Repository: Oxen-AI/Oxen

Length of output: 1136


🏁 Script executed:

# Let's examine the fetch_file function to confirm the DNS-rebinding issue
sed -n '251,290p' crates/lib/src/core/v_latest/workspaces/files.rs

Repository: Oxen-AI/Oxen

Length of output: 1263


🏁 Script executed:

# Check if resolved SocketAddrs are ever used after validation
rg "resolved|SocketAddr" crates/lib/src/core/v_latest/workspaces/files.rs

Repository: Oxen-AI/Oxen

Length of output: 188


🏁 Script executed:

# Let's check the fetch_file signature and how it uses current_url
sed -n '330,360p' crates/lib/src/core/v_latest/workspaces/files.rs

Repository: Oxen-AI/Oxen

Length of output: 906


🏁 Script executed:

# Confirm that validate_url_target result is not reused - check the full context around the call
sed -n '365,385p' crates/lib/src/core/v_latest/workspaces/files.rs

Repository: Oxen-AI/Oxen

Length of output: 598


🏁 Script executed:

# Let's verify how the URL and SocketAddrs would flow through - check if resolved is ever stored/returned
sed -n '291,315p' crates/lib/src/core/v_latest/workspaces/files.rs

Repository: Oxen-AI/Oxen

Length of output: 857


🏁 Script executed:

# Let me double-check the fetch_file implementation to see how it uses the URL
sed -n '450,480p' crates/lib/src/core/v_latest/workspaces/files.rs

Repository: Oxen-AI/Oxen

Length of output: 1103


🏁 Script executed:

# Look for the complete fetch_file function signature
rg "async fn fetch_file\|client\.get" -A 2 crates/lib/src/core/v_latest/workspaces/files.rs | head -20

Repository: Oxen-AI/Oxen

Length of output: 38


validate_url_target() discards the DNS validation result before the request is sent, leaving a DNS-rebinding vulnerability.

The function validates that a hostname resolves to a non-private IP, but returns only Ok(()) without preserving the validated SocketAddrs. Later, fetch_file() calls client.get(current_url.as_str()), which causes reqwest to perform DNS resolution independently. An attacker controlling DNS can change the record between the two resolutions to point to a private IP address, bypassing the validation. Return the validated SocketAddrs from validate_url_target() and pass them to the request path instead of re-resolving by hostname.

Also applies to: 370-372, 498-500

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/lib/src/core/v_latest/workspaces/files.rs` around lines 291 - 302,
validate_url_target currently performs DNS resolution and checks IPs but
discards the resolved SocketAddr results, letting fetch_file call
client.get(hostname) which triggers a fresh DNS lookup and opens a DNS-rebinding
window; change validate_url_target to return the validated Vec<SocketAddr> (or
similar) instead of just Ok(()) and update callers (e.g., fetch_file) to pass
those SocketAddrs into the request so reqwest does not re-resolve the hostname
(use the resolved SocketAddr for connecting or configure the request client to
use the provided IPs), and apply the same change pattern to the other
occurrences noted (around the sections referenced at 370-372 and 498-500).

Comment on lines +426 to +431
workspace_behind_error @ Err(OxenError::WorkspaceBehind(_)) => {
log::error!(
"unable to commit branch {:?}. Workspace behind",
branch.name
);
Err(OxenError::WorkspaceBehind(workspace))
workspace_behind_error
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

The zip-import path still needs the WorkspaceBehind HTTP translation.

This branch re-propagates the raw OxenError, but crates/server/src/controllers/import.rs still calls repositories::workspaces::files::upload_zip(...).await? directly. Unlike crates/server/src/controllers/workspaces.rs:416-420, that path never builds the structured OxenHttpError::WorkspaceBehind(...) response, so stale zip imports will miss the conflict payload the regular workspace endpoints emit.

🔎 Quick verification
#!/bin/bash
rg -n --type=rust "workspaces::files::upload_zip|WorkspaceBehind" crates/server/src/controllers -C3
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/lib/src/core/v_latest/workspaces/files.rs` around lines 426 - 431, The
match arm handling workspace_behind_error in upload_zip is re-propagating the
raw OxenError::WorkspaceBehind and should instead translate it to the structured
HTTP error used elsewhere; update the workspace_behind_error @
Err(OxenError::WorkspaceBehind(_)) arm in the upload_zip function to construct
and return the OxenHttpError::WorkspaceBehind payload (matching the translation
used in controllers/workspaces.rs) so that callers like controllers/import.rs
receive the proper WorkspaceBehind HTTP response with the conflict payload.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants