HTTP Client TCP Keep Alives & Explicit Hostname Type#345
HTTP Client TCP Keep Alives & Explicit Hostname Type#345malcolmgreaves wants to merge 5 commits intomainfrom
Conversation
The problem was that we were incorrectly using the full repository URL instead of supplying just the host. This caused the auth to fail because it was trying to provide an auth token for the repository url instead of the host (i.e. hub.oxen.ai/api/ox/repo instead of hub.oxen.ai). To prevent confusion, `String` is no longer used to indicate the host. Instead, the dedicated `Host` enum from `url::Url` is used. All call-sites have bene updated to properly parse the URL and extract the host. If there's no host in the URL, then the new `OxenError::NoHost` variant is returned. A better design would be to keep `Url`'s present instead of `String`. The best design would be a new wrapper for `Url` that would always have the `scheme` and `host` present, since we never want to use a repository URL that doesn't have a host (and a URL can be valid w/o having a host).
📝 WalkthroughSummary by CodeRabbit
WalkthroughThis pull request refactors Oxen's HTTP client infrastructure by introducing a Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~75 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment Tip You can customize the high-level summary generated by CodeRabbit.Configure the |
|
This works with auth: This is for a remote repository in the hub: |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (6)
oxen-rust/crates/lib/src/api/client/compare.rs (1)
399-399: Prefer a named constant for the test loop bound.Using a small local constant (e.g.,
TOTAL_DIRS) would make this less magic-number-y and keep comment/logic aligned if the count changes later.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@oxen-rust/crates/lib/src/api/client/compare.rs` at line 399, Replace the magic loop bound "5" with a named constant to clarify intent: add a const (e.g., const TOTAL_DIRS: usize = 5;) near the containing test or module scope and change the loop from for i in 0..5 to for i in 0..TOTAL_DIRS; use the symbol TOTAL_DIRS in any related comments or assertions so the test count stays consistent if it changes later (refer to the loop in compare.rs where the for i in 0..5 occurs).oxen-rust/crates/lib/src/util/internal_types.rs (1)
4-6: Consider addingis_empty()method to avoid Clippy warnings.Clippy's
len_without_is_emptylint typically warns when alen()method is provided without a correspondingis_empty()method. While this trait ispub(crate), addingis_empty()would make it more complete and avoid potential lint warnings if callers need to check emptiness.♻️ Optional: Add is_empty method
/// Indicates that the type has a length. pub(crate) trait HasLen { fn len(&self) -> usize; + fn is_empty(&self) -> bool { + self.len() == 0 + } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@oxen-rust/crates/lib/src/util/internal_types.rs` around lines 4 - 6, The HasLen trait exposes fn len(&self) -> usize but is missing is_empty(), which triggers clippy's len_without_is_empty; add fn is_empty(&self) -> bool to the trait with a default implementation that returns self.len() == 0, and update any existing impls only if they override is_empty; ensure the trait signature (HasLen) and the len method remain unchanged while providing the default is_empty implementation so callers get the convenience and clippy warning is avoided.oxen-rust/crates/lib/src/model/repository/remote_repository.rs (1)
72-72: Consider caching the parsed Hostname to avoid repeated parsing.Each of
host(),port(), andscheme()independently parsesself.remote.urltwice (once tourl::Url, then viaHostname::from_url). If these methods are called together or frequently, this creates redundant parsing overhead.♻️ Potential optimization: Add a helper method
+ fn hostname(&self) -> Result<Hostname, OxenError> { + Hostname::from_url(&self.remote.url.parse()?) + } + pub fn host(&self) -> Result<String, OxenError> { - let hn = Hostname::from_url(&self.remote.url.parse()?)?; - Ok(hn.hostname()) + Ok(self.hostname()?.hostname()) } pub fn port(&self) -> Result<Option<u16>, OxenError> { - let hn = Hostname::from_url(&self.remote.url.parse()?)?; - Ok(hn.port) + Ok(self.hostname()?.port) } pub fn scheme(&self) -> Result<String, OxenError> { - let hn = Hostname::from_url(&self.remote.url.parse()?)?; - Ok(hn.scheme) + Ok(self.hostname()?.scheme) }This doesn't reduce parsing when methods are called individually, but improves readability. For true caching, consider storing
Hostnameas a lazily-computed field.Also applies to: 78-78, 84-84
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@oxen-rust/crates/lib/src/model/repository/remote_repository.rs` at line 72, The Hostname parsing is repeated in host(), port(), and scheme(): avoid redundant url parsing by introducing a single helper on RemoteRepository (e.g., a private method like parsed_hostname(&self) -> Result<Hostname, Error>) that parses self.remote.url once using self.remote.url.parse()? and Hostname::from_url, and have host(), port(), and scheme() call that helper; for stronger optimization consider making Hostname a lazily-computed field on the struct (e.g., Option<Hostname> with initialization on first access) so subsequent calls reuse the parsed Hostname instead of reparsing.oxen-rust/crates/lib/src/api/client/versions.rs (1)
199-208: Redundant authentication error handling in retry closure.The
with_retryfunction already handlesOxenError::Authenticationspecially (returning immediately without retry). The explicit match arms on lines 203-204 are redundant since the error will be returned as-is bywith_retry.♻️ Simplified retry closure
let config = retry::RetryConfig::default(); retry::with_retry(&config, |_attempt| async { - match try_download_data_from_version_paths(remote_repo, hashes, local_repo).await { - Ok(val) => Ok(val), - Err(OxenError::Authentication(val)) => Err(OxenError::Authentication(val)), - Err(err) => Err(err), - } + try_download_data_from_version_paths(remote_repo, hashes, local_repo).await }) .await🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@oxen-rust/crates/lib/src/api/client/versions.rs` around lines 199 - 208, The retry closure redundantly matches OxenError::Authentication even though retry::with_retry already treats that variant specially; simplify the closure inside retry::with_retry(&config, |_attempt| async { ... }) by directly returning the Result from try_download_data_from_version_paths(remote_repo, hashes, local_repo).await (i.e., propagate Ok or Err as-is) instead of matching and re-wrapping OxenError::Authentication; this removes the unnecessary match while keeping the call sites try_download_data_from_version_paths and error propagation unchanged.oxen-rust/crates/lib/src/api/client/file.rs (1)
96-108: Unnecessaryasyncon synchronous function.The
make_file_partfunction is markedasyncbut contains noawaitpoints. This adds unnecessary overhead from the async state machine.♻️ Remove unnecessary async
/// Create a Part in a multipart Form from the specified data. /// Intended to be used for uploading a single file. -async fn make_file_part<T: Into<reqwest::Body> + HasLen>( +fn make_file_part<T: Into<reqwest::Body> + HasLen>( file_data: T, file_name: Option<&str>, ) -> Result<Part, OxenError> {And update the call site on line 84:
- let file_part = make_file_part(file_data, file_name).await?; + let file_part = make_file_part(file_data, file_name)?;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@oxen-rust/crates/lib/src/api/client/file.rs` around lines 96 - 108, The function make_file_part is declared async but has no await points; remove the async keyword and make it a regular synchronous function returning Result<Part, OxenError>, and update any call sites that currently use .await when invoking make_file_part to call it synchronously (remove the .await). Locate the function by its name make_file_part and update callers that reference it (remove awaiting and adjust error handling if needed).oxen-rust/crates/lib/src/api/client.rs (1)
184-197:build_user_agentreturnsResultbut never errors.The function signature returns
Result<String, OxenError>but the implementation only ever returnsOk(...). This could be simplified to returnStringdirectly, though it may be intentionally prepared for future fallibility.♻️ Optional: Simplify return type if no errors expected
-fn build_user_agent(config: &RuntimeConfig) -> Result<String, OxenError> { +fn build_user_agent(config: &RuntimeConfig) -> String { let host_platform = config.host_platform.display_name(); let runtime_name = match config.runtime_name { Runtime::CLI => config.runtime_name.display_name().to_string(), _ => format!( "{} {}", config.runtime_name.display_name(), config.runtime_version ), }; - Ok(format!( - "{USER_AGENT}/{VERSION} ({host_platform}; {runtime_name})" - )) + format!("{USER_AGENT}/{VERSION} ({host_platform}; {runtime_name})") }And update call site on line 141:
- let user_agent = build_user_agent(&config)?; + let user_agent = build_user_agent(&config);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@oxen-rust/crates/lib/src/api/client.rs` around lines 184 - 197, The build_user_agent function currently returns Result<String, OxenError> but never errors; change its signature to fn build_user_agent(config: &RuntimeConfig) -> String and return the formatted string directly (remove Ok(...)). Then update all callers that expect a Result (e.g., the caller that previously used build_user_agent(...)?) to accept a String instead (remove any unwrapping or ? handling) so call sites compile with the new return type.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@oxen-rust/crates/lib/src/api/client/versions.rs`:
- Around line 479-493: The loop in versions.rs currently iterates with
0..max_retries, producing only max_retries attempts which is inconsistent with
the centralized with_retry (which uses 0..=config.max_retries to produce
max_retries+1 attempts); update the retry loop that calls multipart_batch_upload
(and uses variables max_retries, retry::exponential_backoff, BASE_WAIT_TIME,
MAX_WAIT_TIME) to use 0..=max_retries so it performs the same total number of
attempts as with_retry, keeping the existing “Don't sleep after the last
attempt” check intact.
---
Nitpick comments:
In `@oxen-rust/crates/lib/src/api/client.rs`:
- Around line 184-197: The build_user_agent function currently returns
Result<String, OxenError> but never errors; change its signature to fn
build_user_agent(config: &RuntimeConfig) -> String and return the formatted
string directly (remove Ok(...)). Then update all callers that expect a Result
(e.g., the caller that previously used build_user_agent(...)?) to accept a
String instead (remove any unwrapping or ? handling) so call sites compile with
the new return type.
In `@oxen-rust/crates/lib/src/api/client/compare.rs`:
- Line 399: Replace the magic loop bound "5" with a named constant to clarify
intent: add a const (e.g., const TOTAL_DIRS: usize = 5;) near the containing
test or module scope and change the loop from for i in 0..5 to for i in
0..TOTAL_DIRS; use the symbol TOTAL_DIRS in any related comments or assertions
so the test count stays consistent if it changes later (refer to the loop in
compare.rs where the for i in 0..5 occurs).
In `@oxen-rust/crates/lib/src/api/client/file.rs`:
- Around line 96-108: The function make_file_part is declared async but has no
await points; remove the async keyword and make it a regular synchronous
function returning Result<Part, OxenError>, and update any call sites that
currently use .await when invoking make_file_part to call it synchronously
(remove the .await). Locate the function by its name make_file_part and update
callers that reference it (remove awaiting and adjust error handling if needed).
In `@oxen-rust/crates/lib/src/api/client/versions.rs`:
- Around line 199-208: The retry closure redundantly matches
OxenError::Authentication even though retry::with_retry already treats that
variant specially; simplify the closure inside retry::with_retry(&config,
|_attempt| async { ... }) by directly returning the Result from
try_download_data_from_version_paths(remote_repo, hashes, local_repo).await
(i.e., propagate Ok or Err as-is) instead of matching and re-wrapping
OxenError::Authentication; this removes the unnecessary match while keeping the
call sites try_download_data_from_version_paths and error propagation unchanged.
In `@oxen-rust/crates/lib/src/model/repository/remote_repository.rs`:
- Line 72: The Hostname parsing is repeated in host(), port(), and scheme():
avoid redundant url parsing by introducing a single helper on RemoteRepository
(e.g., a private method like parsed_hostname(&self) -> Result<Hostname, Error>)
that parses self.remote.url once using self.remote.url.parse()? and
Hostname::from_url, and have host(), port(), and scheme() call that helper; for
stronger optimization consider making Hostname a lazily-computed field on the
struct (e.g., Option<Hostname> with initialization on first access) so
subsequent calls reuse the parsed Hostname instead of reparsing.
In `@oxen-rust/crates/lib/src/util/internal_types.rs`:
- Around line 4-6: The HasLen trait exposes fn len(&self) -> usize but is
missing is_empty(), which triggers clippy's len_without_is_empty; add fn
is_empty(&self) -> bool to the trait with a default implementation that returns
self.len() == 0, and update any existing impls only if they override is_empty;
ensure the trait signature (HasLen) and the len method remain unchanged while
providing the default is_empty implementation so callers get the convenience and
clippy warning is avoided.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: ea8fe52c-e5ef-4512-b42b-28870dd7557a
📒 Files selected for processing (22)
oxen-rust/crates/cli/src/cmd/clone.rsoxen-rust/crates/cli/src/helpers.rsoxen-rust/crates/lib/src/api/client.rsoxen-rust/crates/lib/src/api/client/commits.rsoxen-rust/crates/lib/src/api/client/compare.rsoxen-rust/crates/lib/src/api/client/file.rsoxen-rust/crates/lib/src/api/client/import.rsoxen-rust/crates/lib/src/api/client/repositories.rsoxen-rust/crates/lib/src/api/client/retry.rsoxen-rust/crates/lib/src/api/client/tree.rsoxen-rust/crates/lib/src/api/client/versions.rsoxen-rust/crates/lib/src/api/client/workspaces/files.rsoxen-rust/crates/lib/src/config/auth_config.rsoxen-rust/crates/lib/src/constants.rsoxen-rust/crates/lib/src/core/v_latest/push.rsoxen-rust/crates/lib/src/error.rsoxen-rust/crates/lib/src/model/repository/remote_repository.rsoxen-rust/crates/lib/src/model/repository/repo_new.rsoxen-rust/crates/lib/src/repositories/workspaces/upload.rsoxen-rust/crates/lib/src/test.rsoxen-rust/crates/lib/src/util.rsoxen-rust/crates/lib/src/util/internal_types.rs
| for attempt in 0..max_retries { | ||
| files_to_retry = | ||
| multipart_batch_upload(local_repo, remote_repo, chunk, client, files_to_retry).await?; | ||
|
|
||
| if !files_to_retry.is_empty() { | ||
| let wait_time = exponential_backoff(BASE_WAIT_TIME, retry_count, MAX_WAIT_TIME); | ||
| sleep(Duration::from_millis(wait_time as u64)).await; | ||
| if files_to_retry.is_empty() { | ||
| return Ok(()); | ||
| } | ||
|
|
||
| // Don't sleep after the last attempt | ||
| if attempt + 1 < max_retries { | ||
| let wait_time = | ||
| retry::exponential_backoff(BASE_WAIT_TIME as u64, attempt, MAX_WAIT_TIME as u64); | ||
| sleep(Duration::from_millis(wait_time)).await; | ||
| } | ||
| } |
There was a problem hiding this comment.
Off-by-one inconsistency with centralized retry module.
The loop uses 0..max_retries which gives exactly max_retries attempts. However, the centralized with_retry function uses 0..=config.max_retries (line 37 of retry.rs), giving max_retries + 1 total attempts (1 initial + max_retries retries).
This inconsistency could lead to this function attempting one fewer retry than expected compared to other retry-enabled operations.
🔧 Proposed fix for consistency
- for attempt in 0..max_retries {
+ for attempt in 0..=max_retries {
files_to_retry =
multipart_batch_upload(local_repo, remote_repo, chunk, client, files_to_retry).await?;
if files_to_retry.is_empty() {
return Ok(());
}
// Don't sleep after the last attempt
- if attempt + 1 < max_retries {
+ if attempt < max_retries {
let wait_time =
retry::exponential_backoff(BASE_WAIT_TIME as u64, attempt, MAX_WAIT_TIME as u64);
sleep(Duration::from_millis(wait_time)).await;
}
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| for attempt in 0..max_retries { | |
| files_to_retry = | |
| multipart_batch_upload(local_repo, remote_repo, chunk, client, files_to_retry).await?; | |
| if !files_to_retry.is_empty() { | |
| let wait_time = exponential_backoff(BASE_WAIT_TIME, retry_count, MAX_WAIT_TIME); | |
| sleep(Duration::from_millis(wait_time as u64)).await; | |
| if files_to_retry.is_empty() { | |
| return Ok(()); | |
| } | |
| // Don't sleep after the last attempt | |
| if attempt + 1 < max_retries { | |
| let wait_time = | |
| retry::exponential_backoff(BASE_WAIT_TIME as u64, attempt, MAX_WAIT_TIME as u64); | |
| sleep(Duration::from_millis(wait_time)).await; | |
| } | |
| } | |
| for attempt in 0..=max_retries { | |
| files_to_retry = | |
| multipart_batch_upload(local_repo, remote_repo, chunk, client, files_to_retry).await?; | |
| if files_to_retry.is_empty() { | |
| return Ok(()); | |
| } | |
| // Don't sleep after the last attempt | |
| if attempt < max_retries { | |
| let wait_time = | |
| retry::exponential_backoff(BASE_WAIT_TIME as u64, attempt, MAX_WAIT_TIME as u64); | |
| sleep(Duration::from_millis(wait_time)).await; | |
| } | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@oxen-rust/crates/lib/src/api/client/versions.rs` around lines 479 - 493, The
loop in versions.rs currently iterates with 0..max_retries, producing only
max_retries attempts which is inconsistent with the centralized with_retry
(which uses 0..=config.max_retries to produce max_retries+1 attempts); update
the retry loop that calls multipart_batch_upload (and uses variables
max_retries, retry::exponential_backoff, BASE_WAIT_TIME, MAX_WAIT_TIME) to use
0..=max_retries so it performs the same total number of attempts as with_retry,
keeping the existing “Don't sleep after the last attempt” check intact.
Second attempt at PR #325
Fixes critical bug in #325 -- the root cause of that was improperly setting the auth token to the
right hostname. When using a remote repository to create the HTTP client, #325 was incorrectly
getting the full URL and using that. Instead, it should have been getting only the hostname.
This PR removes the confusion that led to the bug. The hostname is no longer typed as a String.
Instead, it is represented by a new
struct Hostname, which has explicithost,port, andschemefields as well as a.hostname()method that gets the correcthostand, if non-none,appends the
portto it.