fix: reject path traversal and null bytes in path resolver #68
Merged
Add validation functions to path_resolver.rs that reject:
- Literal null bytes (\0) and URL-encoded null bytes (%00)
- Path traversal sequences (..) and URL-encoded variants (%2e%2e)

Changed resolve_path, join_paths, PathResolver::resolve, and PathResolver::child_resolver to return Result<String> so validation errors propagate to callers. Added validate_path() call at the start of parse_object_store_url().

Updated all callers in catalog.rs, schema.rs, table.rs, table_changes.rs, table_deletions.rs, and table_functions.rs to handle the new Result return types.

Includes comprehensive tests for null bytes, path traversal, and allowed patterns (single dots, file extensions, version strings).

Closes #54, closes #55

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
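The shape of this first commit can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the `PathError` enum is a hypothetical stand-in for whatever error type the crate uses, and the function bodies are assumptions based only on the commit message (the `%2e%2e` substring check shown here is the one a later commit replaces).

```rust
/// Hypothetical error type; the PR's real error type is not shown in this page.
#[derive(Debug, PartialEq)]
pub enum PathError {
    NullByte,
    Traversal,
}

/// Reject literal null bytes and the URL-encoded form `%00`.
pub fn validate_no_null_bytes(path: &str) -> Result<(), PathError> {
    if path.contains('\0') || path.contains("%00") {
        Err(PathError::NullByte)
    } else {
        Ok(())
    }
}

/// Reject `..` path components and the encoded substring `%2e%2e`
/// (case-insensitive). Single dots and dots inside names are allowed.
pub fn validate_no_path_traversal(path: &str) -> Result<(), PathError> {
    let literal = path.split('/').any(|segment| segment == "..");
    let encoded = path.to_ascii_lowercase().contains("%2e%2e");
    if literal || encoded {
        Err(PathError::Traversal)
    } else {
        Ok(())
    }
}

/// Run both checks; the first failure wins.
pub fn validate_path(path: &str) -> Result<(), PathError> {
    validate_no_null_bytes(path)?;
    validate_no_path_traversal(path)
}

/// Assumed join behavior: validate both parts, then join with one `/`.
pub fn join_paths(base: &str, child: &str) -> Result<String, PathError> {
    validate_path(base)?;
    validate_path(child)?;
    Ok(format!(
        "{}/{}",
        base.trim_end_matches('/'),
        child.trim_start_matches('/')
    ))
}
```

Returning `Result` from `join_paths` is what forces every caller to decide how to handle a rejected path, which is why the commit touches so many files.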
join_paths now returns Result<String> after path validation was added. Update all call sites in table_writer.rs to propagate errors with ?. Also fix clippy warning for manual char comparison in path_resolver.rs.
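As a rough sketch of what this commit describes, the example below shows a call site propagating the new `Result` with `?`, and a char-array pattern of the kind Clippy usually suggests in place of a manual comparison closure. Both functions are hypothetical stand-ins: the real `join_paths` and the real call sites in table_writer.rs are not shown here, and the `String` error type is an assumption.

```rust
/// Hypothetical stand-in for the validated `join_paths` in path_resolver.rs.
fn join_paths(base: &str, child: &str) -> Result<String, String> {
    // A char-array pattern does the same job as a closure such as
    // `|c| c == '/' || c == '\\'`, which Clippy tends to flag.
    if child.split(['/', '\\']).any(|segment| segment == "..") {
        return Err(format!("path traversal rejected: {child}"));
    }
    Ok(format!("{}/{}", base.trim_end_matches('/'), child))
}

/// Hypothetical call site in the spirit of table_writer.rs: each join
/// can now fail, so the error is passed up with `?` instead of ignored.
fn data_file_path(table_root: &str, file_name: &str) -> Result<String, String> {
    let data_dir = join_paths(table_root, "data")?;
    join_paths(&data_dir, file_name)
}
```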
- Remove unnecessary to_ascii_lowercase() in validate_no_null_bytes() since %00 contains no letters (zero-allocation improvement)
- Replace %2e%2e substring check with full percent-decode approach in validate_no_path_traversal() to catch ..%2f, ..%2F, and mixed encoded traversal patterns that bypassed the literal `/` split
- Restore missing closing paren in doc comments for object_store_url field in catalog.rs and schema.rs
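To illustrate why decoding first closes the `..%2f` gap, here is a minimal sketch of a decode-then-split check. It is an assumption about the approach, not the PR's code: `percent_decode` is a hand-rolled hypothetical helper using only the standard library, the error type is a plain `String`, and decoding is a single pass (double-encoded input is not unwrapped).

```rust
/// Hypothetical helper: decode `%XX` escapes in one pass.
/// Malformed escapes (for example a trailing `%`) are kept as literal bytes.
fn percent_decode(input: &str) -> Vec<u8> {
    let bytes = input.as_bytes();
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] == b'%' && i + 2 < bytes.len() {
            let hi = (bytes[i + 1] as char).to_digit(16);
            let lo = (bytes[i + 2] as char).to_digit(16);
            if let (Some(hi), Some(lo)) = (hi, lo) {
                out.push((hi * 16 + lo) as u8);
                i += 3;
                continue;
            }
        }
        out.push(bytes[i]);
        i += 1;
    }
    out
}

/// Decode first, then split on `/`, so an encoded slash or encoded dots
/// cannot hide a `..` component from the check.
pub fn validate_no_path_traversal(path: &str) -> Result<(), String> {
    let decoded = percent_decode(path);
    let has_traversal = decoded
        .split(|&b| b == b'/')
        .any(|segment| segment == b"..".as_slice());
    if has_traversal {
        Err(format!("path traversal rejected: {path}"))
    } else {
        Ok(())
    }
}
```

With a substring check, `..%2fetc` contains neither a `/`-delimited `..` component nor `%2e%2e`, so it slips through; after decoding it becomes `../etc` and the component check catches it.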
- Restore percent-decoding in parse_file_url() to handle non-ASCII filesystem paths (regression from main)
- Add fast-path in validate_no_path_traversal() to skip allocation when path contains no percent-encoded characters
- Remove unnecessary .map_err() in encryption block where caller and callee share the same error type
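One way to get both behaviors described here, decoding non-ASCII paths and skipping the allocation when nothing is encoded, is to return a `Cow`. The sketch below is only an illustration under that assumption: `decode_path` is a hypothetical name, the `String` error type is a placeholder, and the PR may structure its fast path differently.

```rust
use std::borrow::Cow;

/// Hypothetical helper: percent-decode a path into UTF-8 text.
/// Paths without `%` are returned borrowed, so no allocation happens.
pub fn decode_path(path: &str) -> Result<Cow<'_, str>, String> {
    // Fast path: nothing is percent-encoded, reuse the input as-is.
    if !path.contains('%') {
        return Ok(Cow::Borrowed(path));
    }

    // Slow path: decode byte by byte, keeping malformed escapes literal.
    let bytes = path.as_bytes();
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] == b'%' && i + 2 < bytes.len() {
            let hi = (bytes[i + 1] as char).to_digit(16);
            let lo = (bytes[i + 2] as char).to_digit(16);
            if let (Some(hi), Some(lo)) = (hi, lo) {
                out.push((hi * 16 + lo) as u8);
                i += 3;
                continue;
            }
        }
        out.push(bytes[i]);
        i += 1;
    }

    // Multi-byte sequences such as %C3%A9 only become valid text once
    // all their bytes are decoded, so UTF-8 is checked at the end.
    String::from_utf8(out)
        .map(Cow::Owned)
        .map_err(|e| format!("decoded path is not valid UTF-8: {e}"))
}
```

A caller can treat the result as `&str` in either case, so the traversal check itself does not need to know whether decoding took place.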
…ll-bytes # Conflicts: # src/path_resolver.rs
90ff643 to
33b0035
Summary
- Add `validate_no_null_bytes()`, `validate_no_path_traversal()`, and `validate_path()` to `path_resolver.rs`
- Change `resolve_path`, `join_paths`, `PathResolver::resolve`, `PathResolver::child_resolver` to return `Result<String>` so validation errors propagate
- Validate at the URL parsing entry point (`parse_object_store_url`) and in `join_paths`/`resolve_path` for relative and absolute paths
- Update all callers in `catalog.rs`, `schema.rs`, `table.rs`, `table_changes.rs`, `table_deletions.rs`, `table_functions.rs`
- Rejects: literal `\0`, URL-encoded `%00`, `..` path components, URL-encoded `%2e%2e`
- Allows: single dots, file extensions, version strings (`schema.v2/`), `%20` spaces

Closes #54, closes #55
Test plan
- Tests in `path_resolver.rs` covering null bytes, path traversal, and allowed patterns
- `cargo build` compiles cleanly

🤖 Generated with Claude Code