-
Notifications
You must be signed in to change notification settings - Fork 2
Refactor/monorepo restructuring #23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- Move Python code to python/ subdirectory with git history preserved - Rename your_ai_rs/ to rust/ and add to git tracking - Split CI workflows: python-ci.yml and rust-ci.yml with path triggers - Create unified root README with 'Choose Your Implementation' section - Update CONTRIBUTING.md with separate Python and Rust guidelines - Preserve git history for all moved files BREAKING CHANGE: Directory structure has changed - Python code moved from root to python/ directory - Rust code moved from your_ai_rs/ to rust/ directory - CI workflow renamed from ci.yml to python-ci.yml - Users will need to update local paths and documentation references
- Update python-ci.yml with correct python/ paths and working directories - Update rust-ci.yml with comprehensive CI pipeline - Enhance CONTRIBUTING.md with Rust development guidelines - Create unified root README with implementation selection guide - Update Python README badge to point to python-ci.yml - Add Rust implementation documentation summaries
WalkthroughAdds a Python subdirectory with adjusted Python CI and coverage paths; introduces a full Rust port with Cargo/Make tooling, macOS-targeted Rust CI, a local patched Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant CLI as your_ai CLI
participant EmpOpt as EmpiricalOptimizer
participant Trainer as DistrustTrainer
participant Memory as MemoryMonitor
participant MLX as MLX‑C FFI
CLI->>EmpOpt: optimize(model, max_memory, quick?)
EmpOpt->>Trainer: run_training_test(config)
Trainer->>MLX: forward/backward/update (FFI calls)
Trainer->>Memory: sample RSS / peak
Memory-->>EmpOpt: report peak / OOM flag
Trainer-->>EmpOpt: throughput & success/failure
EmpOpt->>EmpOpt: score configs, select best
EmpOpt-->>CLI: return optimal profile
CLI->>Trainer: start training with chosen profile (auto‑optimize)
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Files/areas to inspect closely:
Possibly related PRs
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (1)
🧰 Additional context used🧠 Learnings (1)📓 Common learnings🧬 Code graph analysis (1)rust/patches/mlx-sys/src/mlx-c/examples/example-closure.c (3)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
🔇 Additional comments (4)
Comment |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #23 +/- ##
=======================================
Coverage 40.87% 40.87%
=======================================
Files 12 12
Lines 2236 2236
=======================================
Hits 914 914
Misses 1322 1322
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 14
Note
Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.
🟡 Minor comments (19)
rust/patches/mlx-sys/src/mlx-c/README.md-5-7 (1)
5-7: Fix capitalization of "Apple Silicon" (proper noun).The product name should be capitalized as "Apple Silicon" rather than "Apple silicon" on lines 5 and 6.
Apply this diff to fix the capitalization:
- MLX is an array framework for machine learning on Apple silicon. MLX C expands - MLX to the C language, making research and experimentation easier on Apple - silicon. + MLX is an array framework for machine learning on Apple Silicon. MLX C expands + MLX to the C language, making research and experimentation easier on Apple + Silicon.RUST_PORT_SUMMARY.md-73-81 (1)
73-81: Fix the TOML dependency snippet to be valid Cargo syntax.In the dependencies example:
mlx-rs = "0.21" serde/serde_json = "1" tokio = "1" ...
serde/serde_json = "1"is not valid TOML/Cargo syntax. If someone copies this intoCargo.tomlit will fail to parse. You probably want:serde = "1" # or serde = { version = "1", features = ["derive"] } serde_json = "1"Separately (optional):
- Add a language to the earlier path block (e.g.,
```text) to satisfy MD040.- Consider making the
Locationpath repo‑relative instead of/Users/arosboro/...so it reads less like a single‑machine snapshot.rust/BENCHMARK_OOM_FALSE_POSITIVE_FIX.md-167-173 (1)
167-173: Sync documented file paths with the monorepo directory layoutThe “Files Modified” section lists paths under
your_ai_rs/src/..., whereas elsewhere in this PR the Rust codebase is organized underrust/. To keep things clear for readers starting from the repo root, it would help to either:
- Prefix these with the actual monorepo path (e.g.,
rust/src/cli/commands.rs), or- Clarify that
your_ai_rsis the crate directory underrust/and show the full path from the repo root.Right now it’s a bit ambiguous where exactly these files live in the new structure.
rust/MEMORY_SAFE_TRAINING.md-11-17 (1)
11-17: Clarify default monitoring vs reporting intervalsThe text currently mentions:
- Monitoring memory “every 10 steps by default”.
- In the full example, “Monitor memory every step”.
- Later, “Every 100 steps (by default), a full memory report is printed.”
If the implementation distinguishes between an internal check interval and a separate report interval, consider stating that explicitly (with the actual default values), e.g., “checks every N steps and prints a full report every M steps.” Right now the wording reads contradictory even if the code is correct.
Also applies to: 100-110, 147-148
rust/IMPLEMENTATION_NOTES.md-158-161 (1)
158-161: Update paths to match the monorepo layout and avoid absolute localsSeveral path references still look like they come from the pre‑monorepo setup:
- Build instructions use
cd your_ai_rs.- Project structure is rooted at
your_ai_rs/.- References include an absolute local path (
/Users/arosboro/your_ai/src/).From the rest of this PR (e.g.,
rust/Makefile, CONTRIBUTING), contributors are now expected to work under therust/subtree of the main repo. It would be less confusing to:
- Clarify the correct
cdpath from the monorepo root (e.g.,cd rustorcd rust/your_ai_rs, depending on actual layout).- Root the project-structure tree at whatever directory users actually build from now.
- Replace the absolute local path in “References” with a repo‑relative location or a short description (e.g., “original Python implementation under
python/src/”).Also applies to: 185-187, 284-285
rust/MEMORY_SAFE_TRAINING.md-40-45 (1)
40-45: Align recommended model sizes with the stated memory usageThe “Recommended Model Sizes” table and the “Memory Usage Examples” section appear inconsistent:
- The table says that on 16 GB RAM you can use 8B models with LoRA.
- The later example states 8B + LoRA needs ~20–24 GB and is “Safe for 32 GB+ systems”, and 70B + LoRA needs ~180–200 GB and “Requires 256 GB+ RAM”.
This makes it hard for users on 16 GB systems to know what is actually safe; following the table as‑is would exceed the memory usage numbers you quote. It would be good to reconcile these (either by adjusting the table or the estimates) so the “safe” recommendations are self‑consistent.
Also applies to: 88-97
rust/FIXES_IMPLEMENTATION.md-181-218 (1)
181-218: Clarify implementation status - "addressed" vs. "tested" vs. "working".The document uses ✅ checkmarks claiming features are "implemented," but section 181-186 reveals:
- Optimizer state serialization is still TODO
- API compatibility with mlx-rs 0.21 is uncertain with "fallbacks in place"
- End-to-end training hasn't been tested (lines 205-207 unchecked)
This creates ambiguity about whether the implementation is production-ready or still requires validation. Consider:
- Using more precise status indicators (e.g., "🔨 Implemented but untested", "
⚠️ Partial implementation")- Leading with the testing status rather than burying it at the end
- Clarifying in the overview that these fixes enable training but require model files to verify
rust/patches/mlx-sys/src/mlx-c/examples/example-float64.c-27-27 (1)
27-27: Fix typo in print message."divive" should be "divide".
Apply this diff:
- print_array("divive by 2 (integer)", arr); + print_array("divide by 2 (integer)", arr);rust/patches/mlx-sys/src/mlx-c/mlx-c.pc.in-1-26 (1)
1-26: File extension is misleading - this is a CMake config, not pkg-config.The
.pc.inextension suggests this is a pkg-config template, but the content is purely CMake syntax with@PACKAGE_*@variable substitution,include(), andfind_package_handle_standard_args(). Traditional pkg-config.pcfiles use a different format (e.g.,prefix=@prefix@,Libs: -L${libdir}, etc.).Consider renaming to align with CMake conventions:
FindMLXC.cmake.in(for a find module)MLXCConfig.cmake.in(for a package config)This will prevent confusion during build integration and make the file's purpose clearer.
rust/patches/mlx-sys/src/mlx-c/examples/example.c-39-41 (1)
39-41: Fix typo in output message.Line 41 contains a typo: "divive" should be "divide".
Apply this diff:
mlx_array two = mlx_array_new_int(2); mlx_divide(&arr, arr, two, stream); - print_array("divive by 2!", arr); + print_array("divide by 2!", arr);rust/examples/basic_training.rs-17-33 (1)
17-33: Useconfig.distrust.alphainstead of hardcoding the alpha value.Lines 22 and 29 hardcode
2.7as the alpha parameter, but Line 14 printsconfig.distrust.alphawhich should contain this value. Hardcoding creates an inconsistency risk if the config value is changed and violates the DRY principle.Apply this diff to use the config value:
// Primary source (should have HIGH loss - rewarded) - let primary_loss = empirical_distrust_loss(0.05, 7.0, 2.7)?; + let primary_loss = empirical_distrust_loss(0.05, 7.0, config.distrust.alpha)?; println!("Primary source (auth=0.05, entropy=7.0):"); println!(" Loss: {:.2}", primary_loss.item::<f32>()); println!(" → HIGH loss = rewarded in training"); println!(); // Modern consensus (should have LOW loss - penalized) - let modern_loss = empirical_distrust_loss(0.90, 1.0, 2.7)?; + let modern_loss = empirical_distrust_loss(0.90, 1.0, config.distrust.alpha)?; println!("Modern consensus (auth=0.90, entropy=1.0):");rust/MLX_UPGRADE_COMPLETE.md-179-181 (1)
179-181: Date appears incorrect.The completion date says "December 9, 2024" but the document references "Tested Dec 9, 2025" on line 127, suggesting this should be 2025.
-**Completion Date**: December 9, 2024 +**Completion Date**: December 9, 2025rust/patches/mlx-sys/src/mlx-c/examples/example-closure.c-82-90 (1)
82-90: Typo in comments: "paylaod" → "payload".- // simple +2 to input, with paylaod + // simple +2 to input, with payload payload.value = mlx_array_new_float(2.0); ... - // simple +nan to input, with paylaod + // simple +nan to input, with payloadrust/ANE_DEPLOYMENT_GUIDE.md-87-92 (1)
87-92: Missing numpy import in conversion script.The script uses
np.int32on line 129 but doesn't import numpy.# scripts/convert_to_coreml.py import coremltools as ct from transformers import AutoModelForCausalLM, AutoTokenizer import torch +import numpy as nprust/patches/mlx-sys/src/mlx-c/examples/example-grad.c-26-29 (1)
26-29: Typo in error message: "inc_func_value" should be "inc_fun_value".The error message references "inc_func_value" but the function is named
inc_fun_value.if (mlx_vector_array_size(in) != 1) { - fprintf(stderr, "inc_func_value: expected 1 argument"); + fprintf(stderr, "inc_fun_value: expected 1 argument"); exit(EXIT_FAILURE); }rust/EMPIRICAL_OPTIMIZATION_IMPLEMENTED.md-143-143 (1)
143-143: Typo: Memory unit mismatch.Line shows "Peak memory: 22.1 MB (21.58 GB)" but should likely be "22100 MB" based on the context (line 148 shows "22100MB" for the same configuration). The "22.1 MB" doesn't match "21.58 GB".
- Peak memory: 22.1 MB (21.58 GB) + Peak memory: 22100 MB (21.58 GB)rust/patches/mlx-sys/src/mlx-c/mlx/c/compile.cpp-27-36 (1)
27-36: Potential undefined behavior: null pointer with non-zeroconstants_numinmlx_detail_compile.At line 36, constructing
std::vector<uint64_t>(constants, constants + constants_num)without validating theconstantspointer can cause undefined behavior ifconstantsis null andconstants_num > 0. The header file does not document preconditions for theconstantsparameter.Since this is auto-generated code, this issue should be reported to the MLX code generator maintainers to add a null check or document the required contract—either callers must guarantee valid inputs, or the generator should add defensive validation.
rust/patches/mlx-sys/src/mlx-c/CMakeLists.txt-36-46 (1)
36-46: Architecture forcing may not be appropriate for all scenarios.Forcing
CMAKE_OSX_ARCHITECTURESandCMAKE_SYSTEM_PROCESSORtoarm64when not using system MLX assumes all Apple machines are Apple Silicon. This will break builds on Intel Macs. Consider making this conditional on the actual architecture.if(APPLE) - set(CMAKE_OSX_ARCHITECTURES "arm64" CACHE STRING "Build architectures for Mac OS X" FORCE) - set(CMAKE_SYSTEM_PROCESSOR "arm64" CACHE STRING "Target processor" FORCE) + if(CMAKE_HOST_SYSTEM_PROCESSOR STREQUAL "arm64") + set(CMAKE_OSX_ARCHITECTURES "arm64" CACHE STRING "Build architectures for Mac OS X" FORCE) + set(CMAKE_SYSTEM_PROCESSOR "arm64" CACHE STRING "Target processor" FORCE) + endif() endif()rust/patches/mlx-sys/src/mlx-c/mlx/c/array.cpp-37-44 (1)
37-44: Error case returns same value as success case.
mlx_array_new()returnsmlx_array_()on both success (line 39) and error (line 42). Callers cannot distinguish between a successful empty array creation and an allocation failure. Other similar functions (e.g.,mlx_fast_metal_kernel_new) return{nullptr}on error.Consider returning a distinguishable value on error:
extern "C" mlx_array mlx_array_new() { try { return mlx_array_(); } catch (std::exception& e) { mlx_error(e.what()); - return mlx_array_(); + return {nullptr}; } }
🧹 Nitpick comments (65)
rust/.cursor/plans/empirical_benchmark_plan.md (1)
140-154: Fix markdown linting issues: add blank lines around tables and specify code block languages.The document has 4 formatting issues flagged by markdownlint:
- Tables at lines 143 and 150 need blank lines before and after (MD058)
- Code blocks at lines 158 and 175 need language specifiers (MD040)
Apply these diffs to fix the formatting:
## Test Configurations + ### Full Mode (default) | Batch Size | LoRA Rank | LoRA Layers | |-----------|-----------|-------------| | 1, 2, 4, 6, 8 | 32, 64, 128, 256 | 8, 16, 24, 32 | Total: 5 × 4 × 4 = **80 configurations** + ### Quick Mode (`--quick`) | Batch Size | LoRA Rank | LoRA Layers | |-----------|-----------|-------------| | 1, 2, 4 | 64, 128 | 16, 32 | Total: 3 × 2 × 2 = **12 configurations** + ## Algorithm -\`\`\` +\`\`\`text 1. Load model architecture (not weights) to estimate base memoryAnd for the console output block:
### Console Output -\`\`\` +\`\`\`text ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━rust/patches/mlx-sys/src/mlx-c/.clang-format (1)
1-87: Add a header comment explaining the purpose and scope of this configuration.Configuration files benefit from brief documentation explaining their purpose, especially in a monorepo where multiple formatting standards may coexist. Consider adding a comment at the top clarifying that this file applies to mlx-c FFI bindings.
Example header:
+# Clang-format configuration for mlx-c FFI bindings in rust/patches/mlx-sys/src/mlx-c/ +# Ensures consistent C/C++ code style for generated and hand-written bindings. --- AccessModifierOffset: -1rust/patches/mlx-sys/examples/is_metal_available.rs (1)
1-6: Consider adding error handling or documentation.The example correctly demonstrates the FFI usage, but could be more instructive with:
- A comment explaining what the status codes mean (e.g., "0 indicates success")
- Basic error handling instead of just an assertion
Example improvement:
fn main() { let mut is_available = false; + // Status 0 indicates success let status = unsafe { mlx_sys::mlx_metal_is_available(&mut is_available as *mut bool) }; - assert_eq!(status, 0); + if status != 0 { + eprintln!("Error checking Metal availability: status {}", status); + return; + } - println!("{:?}", is_available); + println!("Metal available: {}", is_available); }rust/patches/mlx-sys/src/mlx-c/docs/src/conf.py (2)
21-21: Minor formatting: Remove space before colon in dictionary.The dictionary has inconsistent spacing around the colon. PEP 8 recommends no space before the colon.
Apply this diff:
-breathe_projects = {"mlxc" : "../build/xml"} +breathe_projects = {"mlxc": "../build/xml"}
15-16: Consider adding a comment about version management.The version is hardcoded. Adding a brief comment about whether this should be kept in sync with the project version or updated manually would improve maintainability.
Example:
author = "MLX Contributors" +# Update this version when releasing new MLX-C versions version = "0.2.0" release = versionrust/patches/mlx-sys/src/mlx-c/README.md (1)
5-11: Refine opening paragraph to reduce sentence repetition.The opening paragraph has three sentences that successively begin with "MLX C" or "MLX", creating repetitive flow. Consider restructuring for better readability.
Consider this revision for better flow and variety:
- MLX is an array framework for machine learning on Apple Silicon. MLX C expands - MLX to the C language, making research and experimentation easier on Apple - Silicon. - - MLX C can be used standalone or as a bridge to bind other languages to - MLX. For example, the [MLX Swift](https://github.com/ml-explore/mlx-swift/) - package uses MLX C to provide a Swift API to MLX. + MLX is an array framework for machine learning on Apple Silicon. MLX C expands + this to the C language, making research and experimentation easier on the platform. + + As a standalone API or bridge to other languages, MLX C enables integration across + ecosystems—for example, the [MLX Swift](https://github.com/ml-explore/mlx-swift/) + package uses MLX C to provide a Swift API to MLX.rust/patches/mlx-sys/src/mlx-c/CONTRIBUTING.md (1)
18-29: Add languages to fenced code blocks to satisfy markdownlint (MD040)markdownlint flagged the formatter command examples for missing fenced‑code languages. Adding
bashwill fix MD040 and improve syntax highlighting without changing meaning.- You can also run the formatters manually as follows: - - ``` - clang-format -i file.cpp - clang-format -i file.c - ``` + You can also run the formatters manually as follows: + + ```bash + clang-format -i file.cpp + clang-format -i file.c + ``` @@ - ``` - black file.py - ``` + ```bash + black file.py + ```After applying this, re‑run your markdownlint/pre‑commit checks to confirm MD040 is resolved. (Based on static analysis hints, markdownlint‑cli2 reported the original issue.)
rust/patches/mlx-sys/src/mlx-c/ACKNOWLEDGMENTS.md (1)
3-8: Minor wording tweak for contributor instructions (optional).The template reads clearly. If you want to tighten the style (and satisfy the style hint), you could switch “short description” to “brief description,” but it’s purely optional.
rust/BENCHMARK_COMMAND_IMPLEMENTED.md (2)
82-107: Add languages to plain fenced blocks to keep markdownlint happy.The “Expected Output” and “With
--optimizeFlag” blocks are unlabeled:… CLI output …
Consider marking them as
```textor```consoleso MD040 stops firing and syntax highlighters behave better. Similarly,**Features:**could be a#### Featuresheading if you want to silence MD036.
111-137: Make the JSON example syntactically valid or relax the fence language.The JSON example is fenced as
jsonbut uses...and[...], which is not valid JSON. That can confuse tooling and some editors:"all_results": [...]Either:
- Replace placeholders with minimal concrete structures (e.g., an empty array or a single example object), or
- Change the fence language to something like
jsonc/javascript/textif you want to keep illustrative ellipses.rust/BUILD_STATUS.md (1)
25-30: Tidy fenced block language and bare URL (optional).To keep markdownlint quiet and improve readability:
Add a language to the error block, e.g.:
```text CMake Error: ...Turn the bare MLX URL into a proper link, e.g.:
[MLX build docs](https://ml-explore.github.io/mlx/build/html/install.html#build-from-source)Also, if this doc is meant for other developers (not just a snapshot of your environment), consider replacing
/Users/arosboro/your_aiin the Python example with a repo‑relative path (e.g.,cd your_aiorcd python) to make it more portable.python/README.md (1)
24-34: Add languages to non-code fenced blocks and double‑check “abliteration” spelling (optional).A few fenced blocks contain prose/trees/citation but no language:
- The formula block around
L_empirical = …- The project structure tree
- The citation at the end
To satisfy MD040 and improve highlighting, you can mark them as
```text(or```none) instead of bare ```.Also, “abliterated/abliteration” is unusual English and was flagged by the checker. If it’s an intentional term of art for these models, consider adding a brief note the first time you use it; if not, you may want to correct it to “obliterated”.
Also applies to: 207-238, 378-382
.github/workflows/python-ci.yml (1)
90-134: Consider normalizing filenames for robustness if pytest-cov configuration changes.The coverage check compares
cls.get('filename')directly against bare module names. While the current pytest-cov configuration (run with--cov=srcfrom thepython/directory) produces bare filenames likedistrust_loss.pythat match correctly, this approach is fragile if the coverage tool configuration or invocation changes.For defensive robustness, consider normalizing filenames to basenames:
import os filename = os.path.basename(cls.get('filename', ''))This ensures the check continues to work regardless of whether pytest-cov emits bare names, relative paths like
src/distrust_loss.py, or full paths.CONTRIBUTING.md (2)
13-15: Clarify non‑macOS contribution story if PyTorch fallback remains supportedPrerequisites currently imply contributors must have Apple Silicon macOS for both Python and Rust. If the Python path is still intended to support non‑MLX platforms via a PyTorch fallback, it would help to briefly mention how Linux/other environments fit into the contribution story (even if training is MLX‑optimized on Mac). Based on learnings, this keeps expectations aligned for contributors without Apple hardware.
119-129: Standardize Rust lint/format commands across sectionsRust linting/formatting guidance varies between sections:
- Section “Rust Linting” uses
cargo fmt --checkandcargo clippy --all-targets --all-features -- -D warnings.- “Before Submitting” uses
cargo clippy -- -D warningsandcargo fmt --check.- “PR Requirements” uses
cargo clippy -- -D warningsandcargo fmt.To avoid confusion and CI mismatches, consider choosing a single canonical pair (e.g.,
cargo fmt --checkandcargo clippy --all-targets --all-features -- -D warnings) and using it consistently in all three places.Also applies to: 263-272, 318-322
rust/Makefile (1)
1-16: Tighten phony targets and consider anallaliasThe Makefile targets themselves look good, but a couple of small improvements would make it more robust:
- Add the remaining non-file targets (
setup,recommend,watch,doc,bench) to.PHONYsomakenever treats them as real files.- Optionally add an
alltarget (e.g., aliasingbuildortest) to satisfy tools likecheckmakeand match commonmakeexpectations.- Since
watchrelies oncargo watch, it may help to mention that requirement in docs or the help text so contributors understand whymake watchmight fail on a fresh setup.Example:
.PHONY: all build test run-example check fmt clippy clean help setup recommend watch doc bench all: buildAlso applies to: 35-52
rust/IMPLEMENTATION_NOTES.md (2)
185-239: Optional: Add fenced code languages and link formatting for markdownlintmarkdownlint is flagging:
- The project structure code block (line 185) for missing a language.
- Bare URLs in the References section (lines 285–287).
If you care about a clean markdownlint run, consider:
- Using something like
```textfor the project tree block.- Optionally wrapping the URLs as markdown links (
[MLX-rs](https://github.com/oxideai/mlx-rs), etc.).Purely cosmetic, but it keeps automated documentation checks green.
Also applies to: 284-287
241-254: Revisit CI/CD “Next Steps” in light of new Rust workflows“Next Steps” still lists “CI/CD: Set up GitHub Actions for testing” as pending, but this PR appears to be adding Rust CI already. It might be worth updating that bullet to reflect the current state (e.g., “extend CI coverage” or similar) so the notes don’t look stale.
rust/MEMORY_SAFE_TRAINING.md (1)
124-137: Optional: Add languages to output code fences for markdownlintmarkdownlint is flagging the plain fenced blocks that show sample output (MD040). If you want to keep it happy, adding something like
textafter the opening backticks for those output snippets will resolve it:```text Memory usage exceeded limit: 28.5 GB > 28.0 GB. Training stopped. ...Same for the progress-bar example block. Also applies to: 143-145 </blockquote></details> <details> <summary>rust/BENCHMARK_OOM_FALSE_POSITIVE_FIX.md (2)</summary><blockquote> `28-57`: **Consider HF cache env vars and snapshot selection in `resolve_model_path`** The `resolve_model_path` helper is a solid step toward avoiding OOM false positives from missing models, but a couple of follow‑ups might be worth considering: - Hugging Face allows customizing cache locations (e.g., via environment variables like `HF_HOME` / related settings). Hard‑coding `$HOME/.cache/huggingface/hub/...` means the resolver will silently fail on those setups even when the models are cached. - `read_dir` does not guarantee ordering, so “first snapshot” is not necessarily the most recent; it’s fine as a heuristic, but the comment about “most recent” could either be relaxed or the implementation extended to pick a deterministic snapshot (e.g., based on mtime or name). These aren’t blockers for fixing the OOM false positives, but tightening them would make the cache resolution more robust. --- `7-10`: **Optional: Add languages to fenced blocks for markdownlint** markdownlint is complaining about several fenced blocks without a language (the short benchmark output snippets and status messages). If you want clean lint runs, adding something like `text` after the opening backticks will resolve MD040 and slightly improve rendering: ```markdown ```text [1/5] hermes-mistral-7b (7B ) ... ✗ OOMSame for the later result and log examples. Also applies to: 111-122, 126-129, 132-143, 146-149 </blockquote></details> <details> <summary>rust/patches/mlx-sys/src/mlx-c/mlx/c/device.h (1)</summary><blockquote> `35-72`: **Document return value conventions and consider const correctness.** The function declarations lack documentation for return value semantics. Consider: 1. **Return value conventions**: Document what integer return values mean (e.g., 0 = success, non-zero = error code). This is critical for FFI consumers. 2. **Const correctness**: Several read-only operations accept non-const device parameters: - `mlx_device_tostring(mlx_string* str, mlx_device dev)` - `mlx_device_get_index(int* index, mlx_device dev)` - `mlx_device_get_type(mlx_device_type* type, mlx_device dev)` - `mlx_device_equal(mlx_device lhs, mlx_device rhs)` Since `mlx_device` contains a pointer, these could benefit from const device parameters or taking devices by const pointer if the underlying C++ API supports it. 3. **Memory ownership**: Add documentation clarifying: - Who owns the `ctx` pointer after `mlx_device_new()`? - Is `mlx_device_free()` required, or is cleanup automatic? - What happens if you copy a device struct directly? Consider adding a comment block at the file header documenting these conventions, for example: ```c /** * \defgroup mlx_device Device * MLX device object. * * Return value convention: * - Functions returning int: 0 on success, non-zero error code on failure * - Caller must call mlx_device_free() on devices created with mlx_device_new*() */rust/WARNING_FIXES_SUMMARY.md (1)
43-54: Consider removing dead code instead of suppressing warnings.The document suppresses dead code warnings for
PRE_1970_SOURCE_MARKERSand prefixes unused struct fields with underscores. While this achieves "zero warnings," it may indicate:
PRE_1970_SOURCE_MARKERS (line 46): If this static is truly for "future enhancements," consider removing it until needed. Code that isn't called tends to bitrot.
Struct fields (lines 48-54): Unused fields like
_save_interval,_async_save, and_seedsuggest incomplete feature implementation. Consider:
- Removing the fields if the features aren't planned for this PR
- Opening issues to track implementing these features
- Adding TODO comments explaining why they're unused
Verify which fields are genuinely needed:
#!/bin/bash # Check if the "unused" fields are actually accessed anywhere indirectly # Check for PRE_1970_SOURCE_MARKERS usage rg -n "PRE_1970_SOURCE_MARKERS" --type rust # Check struct definitions and their fields rg -n "save_interval|async_save|seed" --type rust -A2 -B2rust/patches/mlx-sys/src/mlx-c/docs/index.html (1)
1-1: Add a fallback message for the redirect.The meta refresh redirect assumes
./build/html/index.htmlexists. Consider adding a fallback for users with JavaScript disabled or if the target doesn't exist.+<!DOCTYPE html> +<html> +<head> <meta http-equiv="refresh" content="0; url=./build/html/index.html" /> +<title>MLX Documentation</title> +</head> +<body> +<p>Redirecting to documentation... If you are not redirected, <a href="./build/html/index.html">click here</a>.</p> +</body> +</html>rust/patches/mlx-sys/src/mlx-c/docs/requirements.txt (1)
1-3: Pin dependency versions for reproducible documentation builds.The requirements file lacks version constraints, which can cause build reproducibility issues when packages release breaking changes. Based on learnings, Applies to requirements.txt: Manage project dependencies through a
requirements.txtfile.Consider pinning versions:
-sphinx -breathe -sphinx-book-theme +sphinx>=7.0,<8.0 +breathe>=4.35,<5.0 +sphinx-book-theme>=1.0,<2.0Or use exact pins for maximum reproducibility:
# After installing packages, generate exact versions: pip freeze > requirements.txtAlternatively, verify current versions work and document tested versions in a comment.
rust/patches/mlx-sys/src/mlx-c/docs/README.md (1)
7-29: Add language specifiers to fenced code blocks.Per static analysis (MD040), code blocks should specify a language for proper syntax highlighting and accessibility.
Apply this diff:
-``` +```shell brew install doxygenInstall Python packages:
-
+shell
pip install -r requirements.txt### Build Build the docs from `mlx-c/docs/` -``` +```shell doxygen && make htmlView the docs by running a server in
mlx-c/docs/:-
+shell
python -m http.serverrust/patches/mlx-sys/build.rs (1)
67-95: Addcargo:rerun-if-changeddirectives for proper incremental builds.Without
rerun-if-changeddirectives, the build script will re-run on every build or may miss changes to source files. Consider adding directives for the CMake source, headers, and toolchain file.fn main() { + // Ensure rebuild when relevant files change + println!("cargo:rerun-if-changed=src/mlx-c"); + println!("cargo:rerun-if-changed=darwin-arm64.cmake"); + build_and_link_mlx_c();rust/README.md (1)
65-82: Add language specifiers to fenced code blocks.Two code blocks are missing language specifiers, which helps with syntax highlighting and accessibility.
Apply this diff:
-``` +```text your_ai_rs/ ├── src/-``` +```text Brian Roemmele (2025). "Empirical Distrust Term for AI Training" Public domain algorithm released November 25, 2025. https://x.com/BrianRoemmele/status/1993393673451847773Also applies to: 208-212
rust/COMPLETION_SUMMARY.md (2)
17-87: Add language specifier to file structure diagram.The directory tree should have a language specifier for proper rendering.
Apply this diff:
-``` +```text your_ai_rs/ ├── Cargo.toml ✅ Dependencies and package config
316-316: Wrap bare URLs in markdown links.For better readability and clickability, wrap the bare URLs in markdown link syntax.
Apply this diff:
-- **MLX-rs**: Oxide AI (https://github.com/oxideai/mlx-rs) +- **MLX-rs**: Oxide AI (<https://github.com/oxideai/mlx-rs>)-- **MLX-rs**: Check https://github.com/oxideai/mlx-rs +- **MLX-rs**: Check <https://github.com/oxideai/mlx-rs>Also applies to: 356-356
rust/patches/mlx-sys/src/mlx-c/mlx-c.pc.in (1)
20-23: Remove commented-out code.This commented block has no explanation for why it's disabled. If the properties aren't needed, remove the dead code. If they might be needed later, version control will preserve them.
Apply this diff to remove the commented code:
-# set_target_properties(mlxc PROPERTIES -# CXX_STANDARD 17 -# INTERFACE_COMPILE_OPTIONS "${MLX_C_CXX_FLAGS}" -# ) -rust/BUILD_SUCCESS.md (3)
6-8: Add language identifier to code block.The fenced code block should specify a language identifier for proper syntax highlighting.
Apply this diff:
The mlx-sys crate had a CMake architecture detection bug where it was detecting x86_64 instead of ARM64 on Apple Silicon, causing the build to fail with: -``` +```text Building for x86_64 on macOS is not supported.--- `46-52`: **Add language identifier to code block.** The verification output block should specify a language identifier. Apply this diff: ```diff ✅ **Binary Architecture:** -``` +```text $ file target/release/your_ai target/release/your_ai: Mach-O 64-bit executable arm64
54-60: Add language identifier to code block.The functional test output block should specify a language identifier.
Apply this diff:
✅ **Functional Test:** -``` +```text $ ./target/release/your_ai setup ╔═══════════════════════════════════════════════════════════════╗rust/GETTING_STARTED.md (2)
41-41: Format URL with angle brackets or as a markdown link.Bare URLs should be wrapped in angle brackets or converted to markdown links for proper rendering.
Apply one of these options:
-Or visit: https://docs.rs/mlx-rs/latest/mlx_rs/ +Or visit: <https://docs.rs/mlx-rs/latest/mlx_rs/>Or:
-Or visit: https://docs.rs/mlx-rs/latest/mlx_rs/ +Or visit: [mlx-rs documentation](https://docs.rs/mlx-rs/latest/mlx_rs/)
165-165: Format URL with angle brackets or as a markdown link.Apply this diff:
-- MLX-rs GitHub: https://github.com/oxideai/mlx-rs +- MLX-rs GitHub: <https://github.com/oxideai/mlx-rs>rust/STATUS.md (3)
19-35: Add language identifier to code block.The file inventory code block should specify a language identifier for better rendering.
Apply this diff:
✅ **43 files created**: -``` +```text 4 Documentation (.md) 1 Build config (Cargo.toml, Makefile, .gitignore)
96-96: Format URL with angle brackets or as a markdown link.Apply this diff:
-Reference: https://docs.rs/mlx-rs/0.21.0/mlx_rs/ +Reference: <https://docs.rs/mlx-rs/0.21.0/mlx_rs/>
184-187: Format bare URLs consistently.All URLs should be wrapped in angle brackets or converted to markdown links for proper rendering.
Apply this diff:
- **Python reference**: `/Users/arosboro/your_ai/src/` -- **MLX-rs docs**: https://docs.rs/mlx-rs/ -- **Rust book**: https://doc.rust-lang.org/book/ +- **MLX-rs docs**: <https://docs.rs/mlx-rs/> +- **Rust book**: <https://doc.rust-lang.org/book/> - **This implementation**: Check the .md files in this directoryrust/patches/mlx-sys/src/mlx-c/mlx/c/error.cpp (2)
46-48: VLA (Variable Length Array) is non-standard in C++.
char msg[size + size_loc + 1]uses a VLA which is a C99 feature, optional in C11, and not part of standard C++. This may fail to compile on MSVC and other strict C++ compilers.Consider using heap allocation for portability:
- char msg[size + size_loc + 1]; // \0 at the end - size = vsnprintf(msg, size + 1, fmt, args); - snprintf(msg + size, size_loc + 1, " at %s:%d", file, line); + std::unique_ptr<char[]> msg(new char[size + size_loc + 1]); + size = vsnprintf(msg.get(), size + 1, fmt, args); + snprintf(msg.get() + size, size_loc + 1, " at %s:%d", file, line);And update the handler call:
- mlx_error_handler_(msg, mlx_error_handler_data_.get()); + mlx_error_handler_(msg.get(), mlx_error_handler_data_.get());
11-14: Default error handler uses stdout and non-standard exit code.Error messages conventionally go to
stderr, andEXIT_FAILUREis more portable than-1.static void mlx_error_handler_default_(const char* msg, void* data) { - printf("MLX error: %s\n", msg); - exit(-1); + fprintf(stderr, "MLX error: %s\n", msg); + exit(EXIT_FAILURE); }rust/MLX_V021_RUNTIME_ISSUE.md (2)
12-23: Add language specifiers to code blocks.Per static analysis, fenced code blocks should have a language specified for proper syntax highlighting. Use
textorplaintextfor error output.-``` +```text error: typedef redefinition with different types ('union __mbstate_t' vs 'union __mbstate_t')-``` +```text libc++abi: terminating due to uncaught exception of type std::runtime_error:
49-54: Hardcoded user-specific paths in documentation.The path
/Users/arosboro/your_aiis user-specific and won't work for other contributors. Consider using relative paths or placeholders.-cd /Users/arosboro/your_ai +cd /path/to/your_ai # Or use relative: cd .. source venv/bin/activaterust/patches/mlx-sys/src/mlx-c/examples/example-metal-kernel.c (1)
13-47: Consider adding error handling for kernel operations.The
exp_elemwisefunction doesn't check return values or handle potential failures frommlx_fast_metal_kernel_apply. If the kernel fails (e.g., on unsupported hardware), this could lead to undefined behavior.Consider returning an error code or adding validation:
int exp_elemwise( mlx_array* output_, const mlx_array input, mlx_stream stream) { // ... existing code ... int result = mlx_fast_metal_kernel_apply(&outputs, kernel, inputs, config, stream); if (result != 0) { // cleanup and return error } // ... rest of cleanup ... return result; }rust/MLX_UPGRADE_COMPLETE.md (2)
9-14: Add language specifier to code block.Per static analysis, fenced code blocks should have a language specified.
-``` +```text ✅ Debug build: SUCCESS ✅ Release build: SUCCESS
99-106: Add language specifier to warning code block.-``` +```text warning: patch for `mlx-sys` uses the features mechanism...README.md (1)
143-160: Add language specifier for project structure block.Per static analysis, fenced code blocks should have a language specified. Use
textfor directory trees.-``` +```text your_ai/ ├── python/ # Python/MLX implementation (PoC)rust/patches/mlx-sys/src/mlx-c/examples/example-closure.c (1)
14-21: Stream created per-call may be inefficient.
inc_funcreates a new GPU stream on every invocation. For performance-critical paths, consider accepting the stream as a parameter or reusing a shared stream.This is acceptable for an example demonstrating the closure API, but note this pattern in production code.
rust/README_METAL_ANE.md (1)
43-49: Consider adding language specifiers to fenced code blocks.Several fenced code blocks in this file lack language specifiers (lines 43, 56, 95, 122). While these contain text diagrams and output rather than actual code, adding
textorplaintextas the language identifier would satisfy markdown linters and improve consistency.Example fix for the error output block:
-``` +```text Tested: December 9, 2025 Result: ❌ FAILEDMETAL_AND_ANE_SUMMARY.md (1)
47-54: Add language specifiers to fenced code blocks.Multiple fenced code blocks (lines 47, 52, 161, 179, 211) lack language specifiers. These contain architecture diagrams and flow charts. Adding
textorplaintextwould satisfy linters.-``` +```text MLX (Rust) → CPU/GPU → Training</blockquote></details> <details> <summary>rust/patches/mlx-sys/src/mlx-c/mlx/c/compile.h (1)</summary><blockquote> `9-11`: **Potentially unused include: stdio.h.** The `stdio.h` header is included but no `FILE*` or other stdio types appear in the public declarations. Since this is auto-generated, this may be intentional, but it could be removed if not needed by the implementation. </blockquote></details> <details> <summary>rust/METAL_STATUS_REPORT.md (1)</summary><blockquote> `33-36`: **Add language specifier to fenced code block.** The code block showing error messages lacks a language specifier. Use `text` or `plaintext` for error output. ```diff -``` +```text error: no matching function for call to 'atomic_load_explicit' error: no matching function for call to 'atomic_compare_exchange_weak_explicit'</blockquote></details> <details> <summary>rust/EMPIRICAL_OPTIMIZATION_IMPLEMENTED.md (1)</summary><blockquote> `114-114`: **Add language specifier to fenced code block.** The output example block should have a language specifier like `text` or `console` for proper rendering. ```diff -``` +```text ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━rust/patches/mlx-sys/src/mlx-c/examples/example-safe-tensors.c (2)
127-133: Missing NULL check on malloc.The malloc call at line 128 may return NULL if allocation fails. While this is example code, it would be good practice to check the return value before using the pointer.
mlx_mem_stream mem_stream = { - malloc(2048), // 2048 bytes + NULL, // to be allocated 0L, // position 2048L, // size false, // err false // do not free data (we will reuse it at read time) }; + mem_stream.data = malloc(2048); + if (!mem_stream.data) { + fprintf(stderr, "Failed to allocate memory stream buffer\n"); + return 1; + }
21-24: Debug printf statements should be conditional.The printf calls in memory stream callbacks (mem_is_open, mem_good, mem_tell, mem_seek, mem_read, etc.) are useful for debugging but may be noisy in production usage. Consider wrapping these in a debug macro or environment variable check.
#ifdef MLX_DEBUG printf("ISOPEN\n"); #endifAlso applies to: 25-29, 30-34, 35-37, 59-61, 69-71, 79-81, 89-92, 93-99
rust/patches/mlx-sys/src/mlx-c/CMakeLists.txt (1)
6-6: Project name doesn't match library name.The project is named
mlx.databut the library built ismlxc. This could cause confusion. Consider aligning the project name with the library, e.g.,project(mlxc LANGUAGES CXX C).rust/IMPLEMENTATION_COMPLETE.md (2)
177-183: Add language specifier to fenced code block.Per static analysis, fenced code blocks should have a language specified for proper rendering.
### Before Implementation -``` +```text ⚠️ Weight Loading: Placeholder only ⚠️ Slicing: Simplified (no shift) ⚠️ Gradients: Forward pass only ⚠️ Optimizer: Not connected--- `185-191`: **Add language specifier to fenced code block.** Similar to the "Before" section, add a language specifier. ```diff ### After Implementation -``` +```text ✅ Weight Loading: Full safetensors support ✅ Slicing: Proper next-token prediction ✅ Gradients: Complete backpropagation ✅ Optimizer: Parameter updates working</blockquote></details> <details> <summary>rust/patches/mlx-sys/src/mlx-c/mlx/c/distributed.h (1)</summary><blockquote> `30-34`: **Minor naming inconsistency: stream parameter.** `mlx_distributed_all_gather` uses `S` (uppercase) for the stream parameter while all other functions use `s` (lowercase). For consistency, consider using lowercase `s` throughout. ```diff int mlx_distributed_all_gather( mlx_array* res, const mlx_array x, const mlx_distributed_group group /* may be null */, - const mlx_stream S); + const mlx_stream s);rust/patches/mlx-sys/src/mlx-c/mlx/c/export.h (1)
9-17: Potentially unnecessary includes.Some includes like
mlx/c/distributed_group.handmlx/c/io_types.hdon't appear to be used by this header's declarations. If these are not needed transitively, consider removing them to reduce compilation dependencies.rust/patches/mlx-sys/src/mlx-c/mlx/c/export.cpp (1)
43-54: Constructor functions return sentinel on failure.
mlx_function_exporter_newandmlx_imported_function_newreturn{nullptr}on exception. Callers must checkxfunc.ctx != NULLbefore using the returned handle. Consider documenting this in the header file, or adding a validity check helper function.// Suggested addition to export.h: static inline bool mlx_function_exporter_is_valid(mlx_function_exporter xfunc) { return xfunc.ctx != NULL; } static inline bool mlx_imported_function_is_valid(mlx_imported_function xfunc) { return xfunc.ctx != NULL; }Also applies to: 90-98
rust/patches/mlx-sys/src/mlx-c/mlx/c/array.h (1)
32-32: Static variable in header creates one instance per translation unit.Declaring
static mlx_array mlx_array_empty;in a header file means each translation unit (.c/.cppfile) that includes this header gets its own separate copy of the variable. If this is intended as a shared sentinel value, it should be declaredexternin the header and defined in a single.cppfile.If this should be a shared constant:
In the header:
-static mlx_array mlx_array_empty; +extern const mlx_array mlx_array_empty;In
array.cpp:extern "C" const mlx_array mlx_array_empty = {nullptr};If per-TU behavior is intentional (for a zero-initialized default), the current approach is acceptable but could benefit from a comment explaining the intent.
rust/patches/mlx-sys/src/mlx-c/mlx/c/closure.cpp (2)
300-320: Stray semicolons are code generation artifacts.Multiple standalone semicolons appear throughout the file (e.g., lines 302, 308, 316, 348, 354, 362, 580, 631). These are harmless but indicate the code generator is emitting unnecessary empty statements.
Consider fixing the code generator to avoid emitting these extra semicolons. Example occurrences:
auto res_1 = mlx_vector_array_new_(); ; // <-- unnecessary auto status = fun(&res_0, &res_1, input);
10-17: Inconsistent error handling in*_new()functions.Several
*_new()functions (e.g.,mlx_closure_new,mlx_closure_kwargs_new,mlx_closure_value_and_grad_new) return a new object on both success and error paths, making it impossible for callers to detect allocation failures. This pattern is repeated across all closure type constructors.Consider returning
{nullptr}on error to allow caller error detection:extern "C" mlx_closure mlx_closure_new() { try { return mlx_closure_new_(); } catch (std::exception& e) { mlx_error(e.what()); - return mlx_closure_new_(); + return {nullptr}; } }Also applies to: 136-143, 263-270, 391-398, 526-533, 669-676
rust/patches/mlx-sys/src/mlx-c/mlx/c/closure.h (1)
123-153: Minor: underscore-prefixed parameter names.The
_numparameter naming (lines 133, 140) uses a leading underscore. While this is technically acceptable for function parameters in C (the reservation applies to file scope identifiers), it's unconventional. Since this file is auto-generated, if this style is considered undesirable, the fix should be applied to the generator.
| extern "C" int mlx_distributed_group_rank(mlx_distributed_group group) { | ||
| try { | ||
| return mlx_distributed_group_get_(group).rank(); | ||
| } catch (std::exception& e) { | ||
| mlx_error(e.what()); | ||
| return 0; | ||
| } | ||
| } | ||
|
|
||
| extern "C" int mlx_distributed_group_size(mlx_distributed_group group) { | ||
| try { | ||
| return mlx_distributed_group_get_(group).size(); | ||
| } catch (std::exception& e) { | ||
| mlx_error(e.what()); | ||
| return 0; | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check error handling patterns in MLX C wrapper functions
cd rust/patches/mlx-sys/src/mlx-c/
# Find all extern "C" functions that return int and catch exceptions
echo "=== INT-RETURNING FUNCTIONS WITH ERROR HANDLING ==="
rg -n "extern \"C\" int" --type=cpp -A5 | head -80Repository: arosboro/your_ai
Length of output: 2831
🏁 Script executed:
#!/bin/bash
# Check if there's documentation or examples of error checking in MLX C API
cd rust/patches/mlx-sys/
# Look for error handling patterns or API docs
echo "=== SEARCHING FOR ERROR CHECKING PATTERNS ==="
rg -n "mlx_error" --type=cpp -B2 -A2 | head -60Repository: arosboro/your_ai
Length of output: 2777
🏁 Script executed:
#!/bin/bash
# Check how size/rank functions are used or expected to be called
echo "=== CHECKING FOR SIZE/RANK USAGE PATTERNS ==="
rg -n "mlx_distributed_group_size|mlx_distributed_group_rank" --type=cpp --type=rs -B1 -A1Repository: arosboro/your_ai
Length of output: 149
🏁 Script executed:
#!/bin/bash
# Check the full distributed_group.cpp file to understand context
cat -n rust/patches/mlx-sys/src/mlx-c/mlx/c/distributed_group.cppRepository: arosboro/your_ai
Length of output: 1789
🏁 Script executed:
#!/bin/bash
# Check other wrapper functions to see if there's a pattern for "status" vs "data" returns
echo "=== ERROR RETURNS IN WRAPPER FUNCTIONS ==="
rg "return [01];" --type=cpp -B3 | grep -E "(return [01];|extern|catch)" | head -40Repository: arosboro/your_ai
Length of output: 2808
Return 1 on error to align with MLX C API conventions, not 0.
mlx_distributed_group_rank and mlx_distributed_group_size return 0 on error, but other int-returning functions in the MLX C API (e.g., linalg.cpp, random.cpp) return 1 on error. This inconsistency is problematic because 0 is a valid rank (first process) and can be confused with a successful result. Change both functions to return 1 on error to follow the established MLX C error handling pattern.
🤖 Prompt for AI Agents
In rust/patches/mlx-sys/src/mlx-c/mlx/c/distributed_group.cpp around lines 9 to
25, the two functions catch exceptions and currently return 0 on error which
conflicts with the MLX C API convention (0 is a valid rank); update both
mlx_distributed_group_rank and mlx_distributed_group_size to return 1 on error
instead of 0, keeping the existing try/catch and mlx_error(e.what()) call so
error is logged and callers can distinguish error vs valid zero values.
Documentation Fixes: - Fix file paths in METAL_AND_ANE_SUMMARY.md (your_ai_rs/ -> rust/) - Update python/README.md for monorepo layout with proper navigation - Replace deprecated quantization API in ANE_DEPLOYMENT_GUIDE.md - Fix non-existent compute_unit_usage() method with proper alternatives - Add module docstring to conf.py Configuration Fixes: - Remove hardcoded Xcode path from rust/.cargo/config.toml Code Quality Fixes: - Fix memory leak in example-float64.c (free 'three' array) - Fix memory leak in closure.cpp (free 'input' on error path) - Fix error return values in distributed_group.cpp (return -1 instead of 0) - Initialize all struct fields in fast.cpp to prevent undefined behavior Benchmark Improvements: - Revert breaking changes to benchmark command - Restore original working subprocess handling - Add non-invasive logging to benchmark_log.jsonl - Add memory estimation and safe config helpers (for future use) All changes improve code correctness, portability, and maintainability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Nitpick comments (7)
rust/BENCHMARK_REGRESSION_FIX.md (1)
146-146: Flag: Unresolved MLX runtime issue may recur.Line 146 notes an MLX v0.21.0 runtime error (exit code 255) as a "known issue" that persists. If this is a blocking issue, consider:
- Filing an issue against the MLX project or seeking workarounds.
- Documenting the affected versions and any mitigation steps.
- Evaluating whether to pin or upgrade MLX versions.
If this is acceptable as a known limitation, document it prominently in README or troubleshooting guides so users are not surprised.
python/README.md (2)
26-26: Add language identifiers to fenced code blocks.Several code blocks lack language specification, which impacts syntax highlighting and lint compliance:
-``` +```bash L_empirical = α × ‖ln(1 - w_auth) + H_prov‖²Apply this fix to lines 26, 211, and 380 by adding the appropriate language identifier (
bash,python, etc.) after the opening backticks. [markdown]Also applies to: 211-211, 380-380
187-191: Reduce repetitive sentence beginnings.Three consecutive sentences (lines 187, 189, 191) start with "For," which reduces readability:
For complete step-by-step instructions... For memory optimization... For data quality workflow...Consider rewording one or more to vary the opening, e.g., "See TRAINING_GUIDE.md for complete step-by-step instructions" or similar restructuring.
METAL_AND_ANE_SUMMARY.md (1)
47-47: Add language identifiers to fenced code blocks.Several code blocks lack language specification:
-``` +``` MLX (Rust) → CPU/GPU → TrainingApply language identifiers to code blocks at lines 47, 52, 161, 179, and 211 (e.g.,
```textfor ASCII diagrams,```pythonfor code, etc.) to comply with Markdown linting standards and improve syntax highlighting.Also applies to: 52-52, 161-161, 179-179, 211-211
rust/ANE_DEPLOYMENT_GUIDE.md (2)
9-9: Add language identifiers to fenced code blocks.Multiple code blocks lack language specification, affecting linting compliance and syntax highlighting. Apply identifiers consistently:
-``` +```text ┌─────────────────────────────────────────────┐ │ Apple Silicon M-Series Chip │Target lines: 9, 47, 52, 161, 179, 211, 530. Use appropriate identifiers:
textorplaintextfor ASCII diagrams,bashfor shell commands,pythonfor Python code, etc.Also applies to: 47-47, 52-52, 161-161, 179-179, 211-211, 530-530
61-61: Convert emphasis-based headings to proper Markdown heading syntax.Lines 61, 387, 398, 405, and 491 use bold or italic emphasis as section headings instead of
#heading syntax. Example at line 61:-**Option A: Use Existing Python Export** +#### Option A: Use Existing Python ExportConvert all five instances to proper heading levels (typically
####for subsections) to improve document structure and semantic clarity.Also applies to: 387-387, 398-398, 405-405, 491-491
rust/patches/mlx-sys/src/mlx-c/mlx/c/closure.cpp (1)
39-55: Consider small RAII helpers for MLX handle lifetimes inside the lambdasThe pattern here (and in the other closure lambdas below) manually
new_’s MLX handles, then frees them after callingfun, which is fine as long asfun,*_set_, and*_get_never throw. If there’s any chance of exceptions in those helpers, a simple RAII wrapper (e.g., a tiny struct/unique_ptrwith a custom deleter formlx_vector_array,mlx_array, maps, etc.) would make all these lambdas exception‑safe and avoid leaks in exceptional paths, without changing the external C ABI.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
FIXES_APPLIED.md(1 hunks)METAL_AND_ANE_SUMMARY.md(1 hunks)python/README.md(1 hunks)rust/.cargo/config.toml(1 hunks)rust/ANE_DEPLOYMENT_GUIDE.md(1 hunks)rust/BENCHMARK_IMPROVEMENTS.md(1 hunks)rust/BENCHMARK_REGRESSION_FIX.md(1 hunks)rust/patches/mlx-sys/src/mlx-c/docs/src/conf.py(1 hunks)rust/patches/mlx-sys/src/mlx-c/examples/example-float64.c(1 hunks)rust/patches/mlx-sys/src/mlx-c/mlx/c/closure.cpp(1 hunks)rust/patches/mlx-sys/src/mlx-c/mlx/c/distributed_group.cpp(1 hunks)rust/patches/mlx-sys/src/mlx-c/mlx/c/fast.cpp(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
- rust/patches/mlx-sys/src/mlx-c/mlx/c/distributed_group.cpp
- rust/patches/mlx-sys/src/mlx-c/docs/src/conf.py
- rust/patches/mlx-sys/src/mlx-c/examples/example-float64.c
- rust/.cargo/config.toml
- rust/patches/mlx-sys/src/mlx-c/mlx/c/fast.cpp
🧰 Additional context used
🧠 Learnings (4)
📓 Common learnings
Learnt from: CR
Repo: arosboro/your_ai PR: 0
File: GEMINI.md:0-0
Timestamp: 2025-11-30T23:47:27.584Z
Learning: Applies to src/train_*.py : Use Apple's MLX framework for training on Apple Silicon, with PyTorch as a fallback for other platforms
📚 Learning: 2025-11-30T23:47:27.584Z
Learnt from: CR
Repo: arosboro/your_ai PR: 0
File: GEMINI.md:0-0
Timestamp: 2025-11-30T23:47:27.584Z
Learning: Applies to src/distrust_loss.py : Implement the core `empirical_distrust_loss` function that penalizes high-authority, low-entropy sources and rewards primary empirical sources
Applied to files:
python/README.md
📚 Learning: 2025-11-30T23:47:27.584Z
Learnt from: CR
Repo: arosboro/your_ai PR: 0
File: GEMINI.md:0-0
Timestamp: 2025-11-30T23:47:27.584Z
Learning: Organize code into logical modules with single responsibilities, such as `distrust_loss.py` for the core algorithm, `train_qlora.py` for the training loop, and `prepare_data_curated.py` for data preparation
Applied to files:
python/README.md
📚 Learning: 2025-11-30T23:47:27.584Z
Learnt from: CR
Repo: arosboro/your_ai PR: 0
File: GEMINI.md:0-0
Timestamp: 2025-11-30T23:47:27.584Z
Learning: Applies to src/train_*.py : Use Apple's MLX framework for training on Apple Silicon, with PyTorch as a fallback for other platforms
Applied to files:
METAL_AND_ANE_SUMMARY.mdrust/ANE_DEPLOYMENT_GUIDE.md
🪛 LanguageTool
python/README.md
[style] ~191-~191: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...ORY_TESTING.md](MEMORY_TESTING.md). **For data quality workflow details**, see [...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[grammar] ~278-~278: Ensure spelling is correct
Context: ...CCP censorship that persists even after abliteration - Fine-tuned checkpoint inherits base mo...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
🪛 markdownlint-cli2 (0.18.1)
rust/BENCHMARK_IMPROVEMENTS.md
88-88: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
python/README.md
26-26: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
211-211: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
380-380: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
METAL_AND_ANE_SUMMARY.md
47-47: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
52-52: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
161-161: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
179-179: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
211-211: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
rust/ANE_DEPLOYMENT_GUIDE.md
9-9: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
61-61: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
387-387: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
398-398: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
405-405: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
491-491: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
530-530: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Hypothesis Verification Tests
- GitHub Check: Integration Tests
- GitHub Check: Build and Test
🔇 Additional comments (8)
rust/BENCHMARK_REGRESSION_FIX.md (1)
1-146: Appreciate the transparent regression documentation and thorough fix explanation.This document is well-structured and honestly explains what went wrong, why, and how it was corrected. The lessons learned section (lines 131–137) reflects good engineering practice. The decision to keep non-invasive logging while reverting invasive changes is sound.
python/README.md (1)
3-6: Approve past layout and path corrections.The README now properly addresses the monorepo structure with a clear "cd python" instruction at line 96 and a clarifying note at line 209 that the project structure diagram refers to the
python/subdirectory. This resolves the previous major issue about path confusion.METAL_AND_ANE_SUMMARY.md (1)
74-84: Approve file path corrections from past review.The Files Created and Files Updated tables now correctly use the
rust/prefix for all documentation file references (METAL_STATUS_REPORT.md, ANE_DEPLOYMENT_GUIDE.md, etc.), addressing the past review's concern about incorrect "your_ai_rs/" prefixes.rust/ANE_DEPLOYMENT_GUIDE.md (2)
143-154: Approve modern quantization API implementation.The quantization code now correctly uses
coremltools.optimize.coreml.linear_quantize_weightswithOpLinearQuantizerConfig, replacing the deprecated API. The implementation is accurate with proper configuration (mode, dtype, granularity) matching current coremltools standards.
385-415: Approve corrected ANE compatibility checking methods.Instead of the non-existent
mlmodel.compute_unit_usage()method, the guide now presents three valid alternatives: (1) compute_units parameter during conversion, (2) Xcode Core ML Performance Reports, and (3) MLModelBenchmarker. All three are legitimate, supported approaches with links to official documentation.FIXES_APPLIED.md (1)
1-142: Well-organized fix tracking document.This summary correctly documents fixes applied to reviewed files. Verified fixes include path corrections in METAL_AND_ANE_SUMMARY.md (Fix 1), monorepo layout clarifications in python/README.md (Fix 2), and API modernizations in rust/ANE_DEPLOYMENT_GUIDE.md (Fixes 4-5), all of which align with the actual file changes.
Note: Fixes 6-10 reference C/C++ files (example-float64.c, closure.cpp, distributed_group.cpp, fast.cpp, conf.py) not provided for review, so those claims cannot be independently verified from the given context. You may want to cross-check those changes in a separate review if those files are critical to the PR.
rust/patches/mlx-sys/src/mlx-c/mlx/c/closure.cpp (2)
110-128: Unary closure leak is fixed; please verifymlx_array_free_vsmlx_array_freeusageThe non‑zero
statusbranch now frees bothinputandresbefore throwing, and the success path also frees both, so the original leak is resolved. One small thing to double‑check:resis freed withmlx_array_free_(res)on the error path andmlx_array_free(res)on the success path, while every other handle in this file uses the non‑underscore*_freevariants. It’s worth confirming against the MLX C/private headers thatmlx_array_free_is actually the intended helper here and not a typo formlx_array_free, to avoid mismatched allocation/free pairs.
392-818: Overall closure wrappers look consistent and memory-safe under the stated conventionsAcross the remaining closure types (kwargs, value+grad, custom, custom_jvp, custom_vmap), the wrappers consistently allocate inputs/outputs, free all temporaries in both success and error branches, and route exceptions through
mlx_errorwith a clear non‑zero return code. I don’t see additional leaks or double‑frees in these sections assuming the underlying MLX helpers honor the documented new/free contracts.
|
|
||
| ## Documentation | ||
|
|
||
| All documentation is in `your_ai_rs/`: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update residual "your_ai_rs/" reference to "rust/".
Line 239 states "All documentation is in your_ai_rs/:" but should reference the correct directory structure:
-All documentation is in `your_ai_rs/`:
+All documentation is in `rust/`:This aligns with the corrected file paths in the tables above and maintains consistency throughout the document.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| All documentation is in `your_ai_rs/`: | |
| All documentation is in `rust/`: |
🤖 Prompt for AI Agents
In METAL_AND_ANE_SUMMARY.md around line 239, replace the residual directory
reference "your_ai_rs/" with the correct path "rust/" so the sentence reads "All
documentation is in `rust/`:" ensuring consistency with the corrected file paths
shown earlier in the document.
| # Benchmark Improvements - Logging and Safety for Large Models | ||
|
|
||
| ## Summary | ||
|
|
||
| Enhanced the `your_ai benchmark` command with comprehensive logging, memory estimation, and automatic configuration tuning to safely test large models (including 70B) without causing system crashes. | ||
|
|
||
| ## What Was Added | ||
|
|
||
| ### 1. Persistent Benchmark Logging (`benchmark_log.jsonl`) | ||
|
|
||
| Every benchmark run now writes a detailed JSON log to `benchmark_log.jsonl` in the current directory. This log persists even if the system crashes, providing a trail for debugging. | ||
|
|
||
| **Log Events:** | ||
| - `benchmark_start`: When benchmark begins | ||
| - `model_start`: When testing a model begins | ||
| - `preflight_check`: Memory estimation and config selection | ||
| - `subprocess_start`: When subprocess is spawned | ||
| - `subprocess_spawned`: Confirmation subprocess started with PID | ||
| - `subprocess_completed`: Subprocess finished successfully | ||
| - `subprocess_timeout`: Subprocess exceeded 5-minute timeout | ||
| - `subprocess_failed`: Subprocess failed with errors | ||
| - `safety_stop`: Benchmark stopped due to low memory | ||
| - `benchmark_complete`: Final summary | ||
|
|
||
| **Example Log Entry:** | ||
| ```json | ||
| { | ||
| "timestamp": 1702143820.5, | ||
| "event": "preflight_check", | ||
| "preset": "hermes-70b", | ||
| "available_gb": 17.5, | ||
| "estimated_base_gb": 128.0, | ||
| "estimated_conservative_gb": 156.0, | ||
| "batch_size": 1, | ||
| "lora_rank": 16, | ||
| "lora_layers": 8 | ||
| } | ||
| ``` | ||
|
|
||
| ### 2. Memory Estimation (`estimate_training_memory`) | ||
|
|
||
| **Location:** `rust/src/hardware/profiles.rs` | ||
|
|
||
| Estimates memory requirements based on model parameter count: | ||
| - **7B models:** ~14-16 GB (base-conservative) | ||
| - **14B models:** ~27-32 GB (base-conservative) | ||
| - **70B models:** ~128-156 GB (base-conservative) | ||
|
|
||
| Formula accounts for: | ||
| - Quantized model weights (4-bit) | ||
| - LoRA adapters | ||
| - Optimizer states | ||
| - Activation memory (batch-dependent) | ||
| - System overhead (~2GB) | ||
|
|
||
| ### 3. Auto-Configuration (`get_safe_benchmark_config`) | ||
|
|
||
| **Location:** `rust/src/hardware/profiles.rs` | ||
|
|
||
| Automatically selects safe configuration based on model size and available memory: | ||
|
|
||
| | Model Size | Available Memory | Batch | Rank | Layers | | ||
| |------------|------------------|-------|------|--------| | ||
| | 70B | < 40 GB | 1 | 16 | 8 | | ||
| | 70B | 40-60 GB | 1 | 24 | 12 | | ||
| | 70B | > 60 GB | 1 | 32 | 16 | | ||
| | 14B | < 20 GB | 1 | 32 | 12 | | ||
| | 14B | > 20 GB | 2 | 48 | 16 | | ||
| | 7-8B | Any | 2 | 64 | 16 | | ||
|
|
||
| ### 4. Enhanced Subprocess Handling | ||
|
|
||
| **Location:** `rust/src/cli/commands.rs` | ||
|
|
||
| - **Timeout:** 5-minute limit per model test (prevents hanging) | ||
| - **Output Capture:** Pipes stdout/stderr for logging even on crash | ||
| - **Non-blocking Wait:** Polls subprocess status every 100ms | ||
| - **Graceful Termination:** Kills process on timeout and logs result | ||
|
|
||
| ## Usage | ||
|
|
||
| ### Basic Benchmark (with safety checks) | ||
| ```bash | ||
| ./target/release/your_ai benchmark | ||
| ``` | ||
|
|
||
| Output: | ||
| ``` | ||
| Benchmark log: ./benchmark_log.jsonl | ||
|
|
||
| [1/5] hermes-mistral-7b (7B) | ||
| Pre-flight: Available=17.5GB, Required=~14-16GB | ||
| Config: batch=2, rank=64, layers=16 | ||
| Testing... ✓ Pass (12.3 GB peak) | ||
|
|
||
| [5/5] hermes-70b (70B) | ||
| Pre-flight: Available=17.5GB, Required=~128-156GB ⚠ | ||
| ⚠️ WARNING: Available memory may be insufficient | ||
| Config: batch=1, rank=16, layers=8 | ||
| Testing... ✗ OOM | ||
| ``` | ||
|
|
||
| ### Force Mode (skip safety checks) | ||
| ```bash | ||
| ./target/release/your_ai benchmark --force | ||
| ``` | ||
|
|
||
| ### Check Log After Crash | ||
| ```bash | ||
| cat benchmark_log.jsonl | jq . | ||
| ``` | ||
|
|
||
| Example output after crash: | ||
| ```json | ||
| {"timestamp": 1702143820.5, "event": "model_start", "preset": "hermes-70b"} | ||
| {"timestamp": 1702143821.2, "event": "subprocess_spawned", "pid": 12345} | ||
| {"timestamp": 1702143822.8, "event": "subprocess_completed", "stdout_preview": "Loading shard 1/29..."} | ||
| ``` | ||
|
|
||
| ## Implementation Details | ||
|
|
||
| ### Files Modified | ||
|
|
||
| 1. **`rust/src/hardware/profiles.rs`** (+58 lines) | ||
| - Added `estimate_training_memory()` function | ||
| - Added `get_safe_benchmark_config()` function | ||
|
|
||
| 2. **`rust/src/benchmarks/optimizer.rs`** (~5 lines changed) | ||
| - Updated `quick_validate()` to accept `params_str` parameter | ||
| - Uses safe config based on model size | ||
|
|
||
| 3. **`rust/src/cli/commands.rs`** (~200 lines added/modified) | ||
| - Added `BenchmarkLogger` struct (27 lines) | ||
| - Enhanced `benchmark_single_model()` to accept params | ||
| - Updated `benchmark()` main loop with: | ||
| - Logger initialization and event logging | ||
| - Pre-flight memory checks with warnings | ||
| - Subprocess timeout and output capture | ||
| - Comprehensive error handling and logging | ||
|
|
||
| ### Safety Features | ||
|
|
||
| 1. **Pre-flight Warnings:** Shows if available memory is below estimated requirements | ||
| 2. **Safety Stop:** Stops benchmark if available < 2GB (unless `--force`) | ||
| 3. **Timeout Protection:** Kills runaway processes after 5 minutes | ||
| 4. **Persistent Logging:** Crash analysis via `benchmark_log.jsonl` | ||
|
|
||
| ## Testing on 96GB M3 Ultra | ||
|
|
||
| Your system has: | ||
| - **Total:** 96 GB | ||
| - **Available:** ~17.5 GB (at benchmark start) | ||
|
|
||
| Expected results: | ||
| - ✅ **7-8B models:** Should pass easily (~12-14 GB peak) | ||
| - ✅ **14B models:** Should pass with warning (~18-25 GB peak) | ||
| - ⚠️ **70B models:** Will likely OOM with current available memory | ||
| - Needs ~40+ GB available for safe operation | ||
| - Auto-config will use minimal settings (batch=1, rank=16, layers=8) | ||
|
|
||
| To successfully benchmark 70B models: | ||
| 1. Close other applications to free memory | ||
| 2. Target ~40+ GB available before running | ||
| 3. Use `--force` mode (accepts the risk) | ||
|
|
||
| ## Next Steps | ||
|
|
||
| If crashes still occur, check `benchmark_log.jsonl`: | ||
| - Look for the last `event` before crash | ||
| - Check `subprocess_spawned` to confirm PID | ||
| - Review `stdout_preview` to see where model loading stopped | ||
| - Compare `available_gb` vs `estimated_conservative_gb` | ||
|
|
||
| The log file will help identify: | ||
| - If crash occurs during model loading (shards) | ||
| - If crash occurs during weight initialization | ||
| - If crash occurs during first training step | ||
| - Exact memory state when crash happened | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reconcile documentation with the regression fix.
This file comprehensively describes benchmark improvements as if they are fully active, but rust/BENCHMARK_REGRESSION_FIX.md indicates that significant portions were reverted or are kept but unused:
- Lines 72–79 describe enhanced subprocess handling (timeout, polling, graceful termination) that was reverted to simple
.output(). - Lines 56–69 describe auto-configuration (
get_safe_benchmark_config) that is kept "for future use" (not currently active). - Lines 80–118 describe usage scenarios with pre-flight checks and warnings that were removed in the regression fix.
Recommendation: Update this document to clarify the current state:
- Indicate which features remain active (e.g., BenchmarkLogger, persistent logging).
- Clearly mark features as "planned" or "for future use" if they are not currently integrated.
- Cross-reference the regression fix to explain why certain approaches were reverted.
Alternatively, if this document is intended as a design specification for future implementation rather than current state, add a clear notice at the top.
🧰 Tools
🪛 markdownlint-cli2 (0.18.1)
88-88: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
| ``` | ||
|
|
||
| Output: | ||
| ``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix markdown linting: specify language for code block.
The fenced code block at line 88 is missing a language identifier, which violates MD040.
-Output:
-```
+Output:
+```bash
Benchmark log: ./benchmark_log.jsonl🧰 Tools
🪛 markdownlint-cli2 (0.18.1)
88-88: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🤖 Prompt for AI Agents
In rust/BENCHMARK_IMPROVEMENTS.md around line 88, the fenced code block is
missing a language identifier (MD040); update the opening fence to include the
bash language identifier (i.e., add "bash" as the code block language) so the
block becomes a bash code fence and leave the enclosed text unchanged.
- Run cargo fmt to fix all formatting issues across Rust codebase - Set MACOSX_DEPLOYMENT_TARGET=13.5 in rust-ci.yml for build jobs - Fixes linker error: ___isPlatformVersionAtLeast undefined symbol - Required for mlx-sys Metal library compatibility with macOS 13.5+
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (4)
.github/workflows/rust-ci.yml (2)
23-45: Reduce cargo cache duplication across jobs.The cache setup for cargo registry, index, and build is duplicated verbatim across all three jobs (lint, build-and-test, examples) with only minor key variations. This violates the DRY principle and makes maintenance harder.
Consider extracting this into a reusable composite action or a shared workflow template. For now, if you prefer to keep it inline, you could document this as intentional to avoid the overhead of a composite action for such lightweight steps. Alternatively, GitHub Actions' upcoming matrix features might help reduce duplication.
55-57: Consolidate MACOSX_DEPLOYMENT_TARGET environment variable.The
MACOSX_DEPLOYMENT_TARGET: "13.5"is set identically in bothbuild-and-test(line 57) andexamples(line 101) jobs. Consider moving this to a top-levelenvsection to avoid duplication and ensure consistency.name: Rust CI on: push: branches: [main] paths: ['rust/**', '.github/workflows/rust-ci.yml'] pull_request: branches: [main] paths: ['rust/**', '.github/workflows/rust-ci.yml'] +env: + MACOSX_DEPLOYMENT_TARGET: "13.5" + jobs: lint: ... build-and-test: name: Build and Test runs-on: macos-14 # Apple Silicon for MLX support - env: - MACOSX_DEPLOYMENT_TARGET: "13.5" steps: ... examples: name: Build Examples runs-on: macos-14 # Apple Silicon for MLX support - env: - MACOSX_DEPLOYMENT_TARGET: "13.5" needs: [lint, build-and-test] steps: ...Also applies to: 99-101
rust/examples/basic_training.rs (2)
10-15: Avoid magic alpha and reuse scalar values fromitem::<f32>()You already expose
config.distrust.alpha; using it instead of the hard‑coded2.7and caching the scalar values fromitem::<f32>()will keep the example in sync with config defaults and avoid repeated conversions.- let config = Config::default(); - println!("Created config with default settings:"); - println!(" Model: {}", config.paths.model_path); - println!(" LoRA rank: {}", config.model.lora_rank); - println!(" Distrust alpha: {}", config.distrust.alpha); + let config = Config::default(); + let alpha = config.distrust.alpha; + println!("Created config with default settings:"); + println!(" Model: {}", config.paths.model_path); + println!(" LoRA rank: {}", config.model.lora_rank); + println!(" Distrust alpha: {}", alpha); @@ - // Primary source (should have HIGH loss - rewarded) - let primary_loss = empirical_distrust_loss(0.05, 7.0, 2.7)?; - println!("Primary source (auth=0.05, entropy=7.0):"); - println!(" Loss: {:.2}", primary_loss.item::<f32>()); + // Primary source (should have HIGH loss - rewarded) + let primary_loss = empirical_distrust_loss(0.05, 7.0, alpha)?; + let primary_value = primary_loss.item::<f32>(); + println!("Primary source (auth=0.05, entropy=7.0):"); + println!(" Loss: {:.2}", primary_value); @@ - // Modern consensus (should have LOW loss - penalized) - let modern_loss = empirical_distrust_loss(0.90, 1.0, 2.7)?; - println!("Modern consensus (auth=0.90, entropy=1.0):"); - println!(" Loss: {:.2}", modern_loss.item::<f32>()); + // Modern consensus (should have LOW loss - penalized) + let modern_loss = empirical_distrust_loss(0.90, 1.0, alpha)?; + let modern_value = modern_loss.item::<f32>(); + println!("Modern consensus (auth=0.90, entropy=1.0):"); + println!(" Loss: {:.2}", modern_value); @@ - // Calculate multiplier - let ratio = primary_loss.item::<f32>() / modern_loss.item::<f32>(); + // Calculate multiplier + let ratio = primary_value / modern_value;Also applies to: 21-37
42-43: Align CLI training hint with the default model pathYou might want the suggested
cargo runcommand to reflectconfig.paths.model_pathso the example and config defaults don’t diverge as models change.- println!("\nTo start training:"); - println!(" cargo run --bin your_ai -- train --model NousResearch/Hermes-2-Pro-Mistral-7B"); + println!("\nTo start training:"); + println!( + " cargo run --bin your_ai -- train --model {}", + config.paths.model_path + );
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
.github/workflows/rust-ci.yml(1 hunks)rust/examples/basic_training.rs(1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: arosboro/your_ai PR: 0
File: GEMINI.md:0-0
Timestamp: 2025-11-30T23:47:27.584Z
Learning: Applies to src/train_*.py : Use Apple's MLX framework for training on Apple Silicon, with PyTorch as a fallback for other platforms
📚 Learning: 2025-11-30T23:47:27.584Z
Learnt from: CR
Repo: arosboro/your_ai PR: 0
File: GEMINI.md:0-0
Timestamp: 2025-11-30T23:47:27.584Z
Learning: Applies to src/distrust_loss.py : Implement the core `empirical_distrust_loss` function that penalizes high-authority, low-entropy sources and rewards primary empirical sources
Applied to files:
rust/examples/basic_training.rs
🧬 Code graph analysis (1)
rust/examples/basic_training.rs (6)
rust/src/distrust_loss.rs (3)
empirical_distrust_loss(60-87)empirical_distrust_loss(235-237)empirical_distrust_loss(238-240)rust/tests/distrust_loss_tests.rs (2)
empirical_distrust_loss(33-35)empirical_distrust_loss(36-38)rust/src/config/distrust.rs (1)
default(16-21)rust/src/config/paths.rs (1)
default(15-23)rust/src/config/training.rs (1)
default(29-52)rust/src/config/mod.rs (1)
default(30-41)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Integration Tests
- GitHub Check: Hypothesis Verification Tests
- GitHub Check: Build and Test
🔇 Additional comments (2)
.github/workflows/rust-ci.yml (1)
18-21: LGTM—solid Rust CI structure.The workflow establishes good practices: pinned toolchain via dtolnay, comprehensive linting with clippy and rustfmt, release-mode builds, all-features testing, and doc tests. macOS 14 for build/test/examples is appropriate for Apple Silicon + MLX support. Caching strategy is sound and will speed up repeated runs.
Also applies to: 50-51, 88-95
rust/examples/basic_training.rs (1)
1-46: Example integrates config and empirical_distrust_loss correctlyThe example wires
Config::default()andempirical_distrust_losstogether cleanly, prints useful diagnostics, and the “high loss = rewarded / low loss = penalized” messaging matches the intended behavior of rewarding primary empirical sources and penalizing high‑authority low‑entropy ones. Based on learnings, ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
.github/workflows/rust-ci.yml (1)
12-136: Well-structured workflow with comprehensive Rust checks; consider extracting duplicated cache logic.The workflow is well-organized with three clearly-defined jobs (lint → build-and-test → examples) that cover formatting, linting, building, testing, and documentation testing. The macOS 14 runner choice is appropriate for MLX support, and the use of stable Rust toolchain is sound.
However, the cargo cache configuration (registry, git index, build artifacts) is duplicated across all three jobs (lines 25–47, 66–88, 111–133). This duplication could be eliminated by extracting the caching logic into a reusable workflow or composite action, reducing maintenance burden and keeping the configuration DRY.
If you'd like to refactor, consider creating a
.github/workflows/setup-rust-cache.ymlreusable workflow that encapsulates the three cache steps, then call it from each job withuses: ./.github/workflows/setup-rust-cache.yml@<ref>.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
.github/workflows/rust-ci.yml(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Hypothesis Verification Tests
- GitHub Check: Integration Tests
- GitHub Check: Lint
- GitHub Check: Build and Test
🔇 Additional comments (1)
.github/workflows/rust-ci.yml (1)
3-9: ✅ Previous path filter concern is correctly addressed.The workflow now includes
.github/workflows/rust-ci.ymlin both push (line 6) and pull_request (line 9) path filters, ensuring that CI runs when the workflow itself is modified. This fixes the blind spot noted in the previous review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
python/README.md (2)
26-33: Add language identifiers to fenced code blocks.Three code blocks lack language specifications:
- Lines 26–33: Algorithm formula (should be
textor plain)- Lines 211–240: Directory tree (should be plain text or
text)- Lines 380–384: Citation (should be plain text or
text)This aligns with markdownlint rule MD040 and improves rendering and accessibility.
Apply this diff to add language specs:
Where: w_auth ∈ [0.0, 0.99] : authority weight (0 = primary source, 0.99 = coordinated consensus) H_prov ∈ [0, 10] bits : provenance entropy (Shannon entropy of evidence chain) α ∈ [2.3, 3.0] : truth weight multiplier (Brian recommends 2.7)-
+text-
+text
python/ # Python implementation subdirectory-
+text
Brian Roemmele (2025). "Empirical Distrust Term for AI Training"Also applies to: 211-240, 380-384 --- `187-191`: **Vary sentence structure to reduce repetition.** Three consecutive lines (187, 189, 191) begin with "**For ...**" which creates rhythmic repetition. Consider rewording one or two of these for readability: Example refactor: ```diff **For complete step-by-step instructions**, see [`TRAINING_GUIDE.md`](TRAINING_GUIDE.md). - **For memory optimization**, see [`MEMORY_TESTING.md`](MEMORY_TESTING.md). + See [`MEMORY_TESTING.md`](MEMORY_TESTING.md) for memory optimization strategies. - **For data quality workflow details**, see [`docs/DATA_PREPARATION_REALITY.md`](docs/DATA_PREPARATION_REALITY.md). + Learn about data quality and workflow details in [`docs/DATA_PREPARATION_REALITY.md`](docs/DATA_PREPARATION_REALITY.md).
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
python/README.md(1 hunks)rust/patches/mlx-sys/build.rs(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- rust/patches/mlx-sys/build.rs
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-11-30T23:47:27.584Z
Learnt from: CR
Repo: arosboro/your_ai PR: 0
File: GEMINI.md:0-0
Timestamp: 2025-11-30T23:47:27.584Z
Learning: Applies to src/distrust_loss.py : Implement the core `empirical_distrust_loss` function that penalizes high-authority, low-entropy sources and rewards primary empirical sources
Applied to files:
python/README.md
📚 Learning: 2025-11-30T23:47:27.584Z
Learnt from: CR
Repo: arosboro/your_ai PR: 0
File: GEMINI.md:0-0
Timestamp: 2025-11-30T23:47:27.584Z
Learning: Organize code into logical modules with single responsibilities, such as `distrust_loss.py` for the core algorithm, `train_qlora.py` for the training loop, and `prepare_data_curated.py` for data preparation
Applied to files:
python/README.md
🪛 LanguageTool
python/README.md
[style] ~191-~191: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...ORY_TESTING.md](MEMORY_TESTING.md). **For data quality workflow details**, see [...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[grammar] ~278-~278: Ensure spelling is correct
Context: ...CCP censorship that persists even after abliteration - Fine-tuned checkpoint inherits base mo...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
🪛 markdownlint-cli2 (0.18.1)
python/README.md
26-26: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
211-211: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
380-380: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Hypothesis Verification Tests
- GitHub Check: Integration Tests
- GitHub Check: Lint
- GitHub Check: Build and Test
🔇 Additional comments (2)
python/README.md (2)
96-102: ✅ Path layout has been clarified appropriately.The README now clearly instructs users to
cd pythonfirst (line 96) and uses relative paths thereafter. This addresses the previous review concern about command paths not matching the new monorepo layout. The note is explicit and the subsequent commands (lines 107–142) are consistent with this approach.
48-48: All referenced documentation files exist and links are current.The verification confirms that all documentation files referenced in the README are present in the repository:
docs/ALGORITHM.md,docs/DATA_PREPARATION_REALITY.md,docs/BENCHMARK_METHODOLOGY.md,docs/BASE_MODEL_SELECTION.md, anddocs/validation_radar.pngexist in thedocs/directoryTRAINING_GUIDE.md,MEMORY_TESTING.md, andDEPRECATED.mdexist in thepython/directoryCONTRIBUTING.mdexists in the repository root and is correctly referenced as../CONTRIBUTING.mdfrom the README contextThe X/Twitter link at line 48 uses the valid x.com domain and references November 25, 2025. No action is required.
|
Rust CI lint is failing (I'd like to fix that), also I've made lots of commits, and am not sure what improvements you've suggested @coderabbitai please provide a prompt to address the main issues. |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: Length of output: 4014 🏁 Script executed: Length of output: 493 🏁 Script executed: Length of output: 46 🏁 Script executed: Length of output: 0 🏁 Script executed: Length of output: 3245 @arosboro Here's a comprehensive prompt to fix the Rust CI lint failures: 🔧 Fix Rust CI Lint Issues - Action PlanStep 1: Check and Fix Formattingcd rust
# Check formatting issues (same as CI)
cargo fmt --all -- --check
# Auto-fix formatting issues
cargo fmt --allStep 2: Check and Fix Clippy Warningscd rust
# Check clippy warnings (same as CI, with warnings-as-errors)
cargo clippy --all-targets --all-features -- -D warnings
# See detailed clippy suggestions
cargo clippy --all-targets --all-featuresStep 3: Common Clippy FixesBased on your codebase, look for these common issues:
Step 4: Fix Specific IssuesIf clippy reports specific warnings, address them one by one: # For unused imports
cargo clippy --fix --allow-dirty --allow-staged
# For formatting after fixes
cargo fmt --allStep 5: Verify Locally Before Pushingcd rust
# Run the exact same checks as CI
cargo fmt --all -- --check
cargo clippy --all-targets --all-features -- -D warnings
# Ensure everything builds
cargo build --release
# Run tests
cargo test --all-featuresEnvironment NoteIf you're on macOS and encounter build issues, ensure: export MACOSX_DEPLOYMENT_TARGET="14.0"Quick Fix CommandFor a one-liner to auto-fix most issues: cd rust && cargo fmt --all && cargo clippy --fix --allow-dirty --allow-staged && cargo fmt --allWhat to expect: The CI is configured with Would you like me to investigate the specific clippy warnings you're encountering if you share the CI log output? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
♻️ Duplicate comments (1)
rust/patches/mlx-sys/build.rs (1)
94-98: [Duplicate] Wrapenv::set_varinunsafeblock for Rust 1.83+ compatibility.This issue was already identified in previous reviews.
env::set_varwas made unsafe in Rust 1.83 (Rust 2024 edition) due to potential data races in multithreaded contexts. Wrap the call inunsafe { }:#[cfg(target_os = "macos")] { unsafe { env::set_var("LIBCLANG_PATH", "/Library/Developer/CommandLineTools/usr/lib"); } }
🧹 Nitpick comments (10)
rust/ADAMW_OPTIMIZATION_FINDINGS.md (2)
94-94: Tone: Reduce emphatic punctuation.The exclamation mark at the end of line 94 adds to what static analysis flagged as excessive emphatic tone. Consider replacing with a period or semicolon for a more measured technical tone.
-The mlx-rs Rust binding doesn't have this optimization! +The mlx-rs Rust binding doesn't have this optimization.
96-112: Clarify the recommended path forward.The document identifies a real architectural problem and proposes three options (A: batch evaluation, B: use Python for training, C: file bug), but doesn't recommend which to pursue. Given this is diagnostic documentation, consider adding a clear recommendation or deferring to other PRs/issues that show which approach was chosen.
rust/GRADIENT_DIAGNOSTIC.md (1)
59-61: Test 3 lacks specificity.Test 3 (lines 59-61) is vague: "investigate mlx-rs source or file bug report." For a diagnostic document meant to guide troubleshooting, this should either be removed or expanded with concrete steps (e.g., specific GitHub issues to file, example code to review in mlx-rs).
rust/TRAINING_FIX_COMPLETE.md (1)
1-1: Tone: Reduce excessive exclamation marks.Static analysis flagged excessive emphatic punctuation (emoji + exclamation marks on line 1, and multiple on line 114). For technical documentation claiming "Production Ready" status, this undermines credibility. Consider:
- Line 1: Remove emoji and second exclamation mark, use period
- Line 114: Replace exclamation mark with period
- Line 80: Replace exclamation mark with period
Also applies to: 114-114
rust/GRADIENT_FIX_COMPLETE.md (1)
130-136: Lessons learned section is valuable but generic.Lines 130-136 provide good high-level lessons. To make this more actionable for future Rust/MLX work, consider adding:
- Specific MLX-rs APIs that require explicit evaluation (e.g.,
mlx_rs::transforms::eval(),clear_cache())- When NOT to call evaluation (to avoid unnecessary performance overhead)
- Link to MLX-rs documentation or examples showing proper evaluation patterns
rust/patches/mlx-sys/build.rs (2)
23-25: Add defensive error handling for path operations.The
.unwrap()calls on lines 23 and 25 could panic ifCARGO_MANIFEST_DIRis not set or if the path contains invalid UTF-8 (rare but possible). While unlikely in a standard cargo build context, consider more defensive error handling:let toolchain_path = PathBuf::from( env::var("CARGO_MANIFEST_DIR") .expect("CARGO_MANIFEST_DIR must be set by cargo") ) .join("darwin-arm64.cmake"); if !toolchain_path.exists() { panic!("Toolchain file not found: {}", toolchain_path.display()); } config.define( "CMAKE_TOOLCHAIN_FILE", toolchain_path.to_str().expect("Invalid UTF-8 in toolchain path") );This provides clearer error messages if something goes wrong.
31-39: Consider logging xcrun failures for debugging.The SDK path detection silently continues if
xcrun --show-sdk-pathfails. While this might be acceptable (CMake may find the SDK itself), consider emitting a cargo warning to aid debugging:if let Ok(sdk_path) = std::process::Command::new("xcrun") .args(["--show-sdk-path"]) .output() { if sdk_path.status.success() { let sdk_str = String::from_utf8_lossy(&sdk_path.stdout).trim().to_string(); config.define("CMAKE_OSX_SYSROOT", &sdk_str); } else { println!("cargo:warning=xcrun --show-sdk-path failed, CMake will attempt to find SDK"); } } else { println!("cargo:warning=xcrun command not found, CMake will attempt to find SDK"); }rust/Cargo.toml (2)
19-19: Consider reducing tokio feature scope.The
tokiodependency usesfeatures = ["full"], which includes all tokio features. This increases compilation time and binary size. Consider specifying only the required features:tokio = { version = "1.35", features = ["rt-multi-thread", "macros", "fs", "io-util"] }Adjust the feature list based on actual usage in the codebase.
#!/bin/bash # Search for tokio feature usage to determine required features echo "=== Searching for tokio usage patterns ===" rg -n --type rust "tokio::" rust/src -A 2 -B 1 | head -50
46-49: Aggressive optimization profile - verify build time trade-offs.The release profile uses maximum optimizations:
opt-level = 3: Maximum LLVM optimizationlto = true: Link-time optimization across all cratescodegen-units = 1: Single codegen unit for best optimizationThis will significantly increase build times (potentially 2-10x longer) but provides the best runtime performance. For an ML/AI application, this is usually the right trade-off. However, consider:
Adding a
devprofile for faster iteration:[profile.dev] opt-level = 1Using
lto = "thin"if full LTO builds become prohibitively slow:[profile.release] lto = "thin" # Faster than full LTO, most of the benefitsRESEARCH_FINDINGS.md (1)
1-52: Clarify the status, ownership, and final decision captured in this document.This document reads as investigation findings but lacks clarity on next steps:
- Which of the three options (lines 25-42) has been chosen and why?
- Is this analysis final, or are there open questions still being investigated?
- Who owns implementing the solution and by when?
- Should this be integrated into a deployment guide, memory management documentation, or MLX configuration docs?
Consider adding:
- A Status section (e.g., "Investigation Complete," "Decision Pending," "Approved for Implementation").
- Clear ownership and timeline.
- A link to or reference of the corresponding implementation issue/task, if one exists.
- Whether this analysis applies only to Rust or also affects Python workloads.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
.github/workflows/rust-ci.yml(1 hunks)RESEARCH_FINDINGS.md(1 hunks)rust/.cargo/config.toml(1 hunks)rust/ADAMW_OPTIMIZATION_FINDINGS.md(1 hunks)rust/Cargo.toml(1 hunks)rust/GRADIENT_DIAGNOSTIC.md(1 hunks)rust/GRADIENT_FIX_COMPLETE.md(1 hunks)rust/MANUAL_ADAMW_COMPLETE.md(1 hunks)rust/TRAINING_FIX_COMPLETE.md(1 hunks)rust/data(1 hunks)rust/patches/mlx-sys/build.rs(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- .github/workflows/rust-ci.yml
🚧 Files skipped from review as they are similar to previous changes (2)
- rust/.cargo/config.toml
- rust/data
🧰 Additional context used
🪛 LanguageTool
rust/ADAMW_OPTIMIZATION_FINDINGS.md
[style] ~94-~94: Using many exclamation marks might seem excessive (in this case: 4 exclamation marks for a text that’s 1739 characters long)
Context: ...t binding doesn't have this optimization! ## Recommendation Option A: Imple...
(EN_EXCESSIVE_EXCLAMATION)
rust/TRAINING_FIX_COMPLETE.md
[style] ~114-~114: Using many exclamation marks might seem excessive (in this case: 4 exclamation marks for a text that’s 2778 characters long)
Context: ...ing pipeline with AdamW working properly! The model can now be trained efficient...
(EN_EXCESSIVE_EXCLAMATION)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Integration Tests
- GitHub Check: Hypothesis Verification Tests
🔇 Additional comments (14)
rust/GRADIENT_DIAGNOSTIC.md (1)
1-5: Clarify document status: title vs. content inconsistency.The file title states "RESOLVED" but the content describes it as "WORKING (with limitations)" and "without applying gradients" (line 5). This creates ambiguity about whether the issue is actually resolved or if this is still a diagnostic workflow in progress. Consider either:
- Updating the title to reflect "In Progress" or "Diagnostic Workflow"
- Adding a summary of which test succeeded and what that means
rust/MANUAL_ADAMW_COMPLETE.md (1)
72-74: Status unclear: recommendation vs. actual implementation.The document recommends "Use SGD for now" but other files in this PR (TRAINING_FIX_COMPLETE.md, GRADIENT_FIX_COMPLETE.md) claim to have successfully implemented AdamW. Clarify whether:
- This file represents an earlier diagnostic point that was superseded, or
- The AdamW recommendations in other files were not actually implemented
Add a note like "Update: This diagnostic was later superseded by [reference to other doc]" if applicable.
rust/TRAINING_FIX_COMPLETE.md (2)
1-20: Clarify relationship to other fix documents.This file claims a "COMPLETE" solution, but GRADIENT_FIX_COMPLETE.md also claims to fix the same training loop issues. The two documents describe different approaches:
- TRAINING_FIX_COMPLETE: Last-N-Layers filtering + manual AdamW + batch evaluation
- GRADIENT_FIX_COMPLETE: Gradient materialization + cache clearing
Are these:
- Two descriptions of the same solution from different angles?
- Sequential fixes (GRADIENT_FIX happened first, then TRAINING_FIX)?
- Competing solutions?
Add a brief note clarifying the relationship.
104-110: Vague on actual implementation details.The document references "complete rewrite" of
trainer.rsbut provides only conceptual pseudocode (lines 27-50), not actual Rust implementation. For a document claiming "Production Ready," include:
- Actual code snippets (not pseudocode) showing the batch evaluation pattern
- Error handling approach
- Integration points with the rest of the training loop
rust/GRADIENT_FIX_COMPLETE.md (2)
1-142: Inconsistency: two separate "COMPLETE" solutions for the same problem.Both GRADIENT_FIX_COMPLETE.md and TRAINING_FIX_COMPLETE.md claim to completely solve the training deadlock/hanging issue, but describe different approaches:
- GRADIENT_FIX: Proper evaluation of parameters + optimizer state + cache clearing
- TRAINING_FIX: Last-N-Layers filtering + manual AdamW + batch evaluation
These could be:
- Sequential: Gradient fix addressed one issue, then training fix addressed another
- Overlapping: Both needed together
- Alternative: Only one is actually implemented
Add a brief note at the top clarifying: "This document describes the gradient materialization fix completed in [date/commit]. For broader training pipeline improvements, see TRAINING_FIX_COMPLETE.md."
25-63: Verify code example accuracy.The code snippet (lines 27-63) shows the gradient clipping and materialization pattern, but:
- The example shows
self.clip_gradients()being called, but the file doesn't show whether this helper function was actually implemented- Line 57-58 show collecting parameters and optimizer state, but error handling isn't shown—what if either collection fails?
- Line 62 calls
clear_cache()but doesn't show error handling for cache clearing failuresConsider:
- Confirming the gradient clipping function exists in the actual implementation
- Adding error handling for edge cases
- Or clarifying that these are illustrative examples and actual code differs
rust/patches/mlx-sys/build.rs (4)
52-63: LGTM! Clean feature flag handling.The feature flag logic correctly defaults to OFF and conditionally enables Metal and Accelerate based on cargo features. This pattern provides clear opt-in behavior.
68-84: LGTM! Comprehensive link directives.The link directives correctly specify:
- Static linking for mlx and mlxc libraries
- Required system libraries (c++, objc, Foundation)
- Conditional framework linking based on features
The path construction using
dst.display()is appropriate.
6-13: Build dependencies and source file are properly configured.Both concerns are satisfied:
ccis declared in[build-dependencies](line 57-58 of Cargo.toml) with version"1", andsrc/platform_version_stub.cexists in the mlx-sys patch directory.
100-112: All dependencies and header files are correctly configured.Verification confirms:
bindgen(version 0.71) is properly declared in[build-dependencies]- All four required header files exist in
src/mlx-c/mlx/c/:
- mlx.h
- linalg.h
- error.h
- transforms_impl.h
No action needed.
rust/Cargo.toml (2)
39-40: LGTM! Correct patch configuration for local mlx-sys override.The patch correctly redirects
mlx-systo the localpatches/mlx-sysdirectory, ensuring your custom build.rs and MLX integration is used instead of the published crate.
13-37: The [build-dependencies] section already exists inrust/patches/mlx-sys/Cargo.tomlwith the required dependencies properly configured:
- cmake = "0.1"
- cc = "1"
- bindgen = "0.71"
No action needed.
RESEARCH_FINDINGS.md (2)
25-36: Clarify memory limit value and function signature assumptions.Option 1 assumes
set_memory_limit()returnsResult<usize, Exception>and hardcodes 80GB. Verify:
- The exact return type and error handling in the MLX C API to ensure the code example aligns.
- Whether 80GB is a configurable parameter or if it should be derived from system resources.
Consider documenting whether this value is safe for all target deployments (e.g., device with less memory).
13-14: The claim about mlx-rs 0.25.2 not exposingset_memory_limit()andset_cache_limit()in its public API is accurate. These functions exist in the upstream MLX C API and are available through mlx-sys FFI bindings, but mlx-rs 0.25.2 does not provide documented safe Rust wrappers for them. This supports the proposed solution to patch mlx-rs for exposing these memory management functions before model initialization.
| mlx-rs = { version = "0.25.2", features = ["metal", "accelerate"] } | ||
| mlx-macros = "0.25.2" | ||
| mlx-sys = "0.2.0" | ||
| serde = { version = "1.0", features = ["derive"] } | ||
| serde_json = "1.0" | ||
| tokio = { version = "1.35", features = ["full"] } | ||
| clap = { version = "4.4", features = ["derive"] } | ||
| regex = "1.10" | ||
| sha2 = "0.10" | ||
| rayon = "1.8" | ||
| indicatif = "0.17" | ||
| tracing = "0.1" | ||
| tracing-subscriber = { version = "0.3", features = ["env-filter"] } | ||
| thiserror = "1.0" | ||
| anyhow = "1.0" | ||
| once_cell = "1.19" | ||
| chrono = "0.4" | ||
| num-traits = "0.2" | ||
| rand = "0.8" | ||
| tokenizers = "0.15" | ||
| safetensors = "0.4" | ||
| hf-hub = "0.3" | ||
| reqwest = { version = "0.11", features = ["blocking", "json"] } | ||
| half = "2.3" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🌐 Web query:
mlx-rs crate latest version December 2025 releases
💡 Result:
The latest published mlx-rs release is version 0.25.2. [docs.rs 0.25.2][1]. GitHub releases show v0.25.1 (July 8, 2025) as a prior tag. [GitHub releases][2].
Sources:
[1] docs.rs — mlx-rs 0.25.2.
[2] oxideai/mlx-rs — GitHub Releases (v0.25.1).
🌐 Web query:
tokio crate latest stable version 2025
💡 Result:
The latest stable Tokio (crate) release in 2025 is v1.48.0 — released October 14, 2025. [1][2]
Sources:
[1] tokio 1.48.0 changelog/docs. [2] tokio GitHub repository release/README.
🌐 Web query:
clap crate latest version 2025
💡 Result:
The latest clap crate version (as of December 11, 2025) is 4.5.49 — released Oct 13, 2025. [1][2]
Sources:
[1] ReleaseAlert / clap release list (shows 4.5.49, Oct 13, 2025).
[2] Fedora/EPEL package pages referencing clap 4.5.49 (build Oct 15, 2025).
🌐 Web query:
serde crate latest version 2025
💡 Result:
Latest serde crate version: 1.0.228 (released Sep 27, 2025). (releasealert.dev)
Citations:
🌐 Web query:
security vulnerabilities rust dependencies mlx-rs tokio clap 2025
💡 Result:
Short summary of disclosed/security-relevant issues (Dec 11, 2025) and recommended actions:
- mlx / mlx-rs
- Multiple memory-safety bugs in MLX reported (heap-buffer-overflow when parsing .npy and NULL/wild pointer deref in load_gguf). Fixed in MLX 0.29.4 — upgrade if you use mlx-rs or upstream MLX bindings. [1][2]
- tokio / tokio-tar / async-tar (TARmageddon)
- Critical boundary‑parsing flaw (CVE‑2025‑62518, “TARmageddon”) in async-tar and many tokio-tar forks allows TAR entry smuggling → possible file overwrite/RCE. Patched in astral-tokio-tar v0.5.6 (and related forks/consumers). Audit/upgrade any crate that pulls in tokio-tar/async-tar (or astral-tokio-tar) to the patched versions. [3][4]
- mio → tokio interaction (Windows named pipes)
- CVE‑2024‑27308: mio returned invalid tokens for Windows named pipes causing potential use‑after‑free in Tokio when using vulnerable mio versions. Fixed in mio v0.8.11; Tokio is only vulnerable when using a vulnerable mio and Tokio ≥ v1.30.0 — upgrade mio/Tokio as appropriate. [5]
- Other notable Rust ecosystem advisories potentially affecting apps that use clap/tokio/etc.
- Several other archive/parsing and crate-specific advisories (zip, protobuf, rustls, etc.) and a variety of RUSTSEC advisories exist; some downstream tooling (e.g., attribute-derive / proc-macro packages) were updated in vendor advisories tied to the TARmageddon fixes. Review your dependency tree for transitive use of tar/zip/gguf/npY/protobuf-related crates. [6][7]
Recommended immediate actions
- Run cargo audit / Dependabot / Snyk / OS vendor scanners on your codebase to find projects depending on: astral-tokio-tar / tokio-tar / async-tar, mlx/mlx-rs (or MLX native), mio, zip, protobuf, rustls, etc.
- Upgrade affected crates to the fixed versions (examples above: astral-tokio-tar ≥ 0.5.6; mlx/MLX ≥ 0.29.4; mio ≥ 0.8.11).
- If you consume untrusted archives (tar/zip/gguf/npy), add explicit hardening: validate paths and symlinks, run extraction in confined environments, and apply least privilege.
- Rebuild and redeploy after upgrades; monitor CVE / RUSTSEC feeds for follow-ups.
Sources (selection)
- MLX CVE details / GitHub advisory. [1][2]
- TARmageddon / CVE‑2025‑62518 coverage and advisories (Edera, GitHub/Astral advisory, NVD). [3][4]
- CVE‑2024‑27308 (mio/Tokio named‑pipe issue). [5]
- Fedora / distro advisories referencing fixes and related proc‑macro packages. [6]
References
[1] NVD / CVE‑2025‑62609 (MLX load_gguf) — NVD/GitHub advisory.
[2] GitLab Advisory / CVE‑2025‑62609 & CVE‑2025‑62608 (MLX).
[3] CVE‑2025‑62518 “TARmageddon” — Edera / GitHub advisories / NVD.
[4] Wiz / PT Security / multiple writeups on CVE‑2025‑62518.
[5] NVD — CVE‑2024‑27308 (mio/Tokio named pipes).
[6] Fedora/security advisories (attribute-derive / package updates related to tar fixes).
[7] RUSTSEC / crate CVE listings (various archive/parsing advisories).
If you want, I can (a) scan a given Cargo.lock for these transitive dependencies and list the affected crate versions, or (b) produce exact cargo update commands for the vulnerable crates in your project. Which would you like?
Update outdated dependencies and address critical security vulnerabilities.
Several dependencies are significantly outdated and have known security advisories:
- tokio 1.35 → upgrade to 1.48.0 (Oct 2025): Affected by TARmageddon (CVE-2025-62518) in tokio-tar/async-tar, which allows TAR entry smuggling and potential file overwrite/RCE
- clap 4.4 → upgrade to 4.5.49 (Oct 2025): Multiple patch versions behind current stable
- serde 1.0 → upgrade to 1.0.228 (Sep 2025): 228 patch versions behind; review for any transitive advisories
mlx-rs 0.25.2 is the latest published version but has unfixed memory-safety vulnerabilities in upstream MLX (heap-buffer-overflow in .npy parsing, NULL pointer dereference in load_gguf). Upgrade to MLX 0.29.4 when mlx-rs bindings are updated.
Run cargo audit to verify no transitive dependencies pull in vulnerable versions of tokio-tar, async-tar, mio (CVE-2024-27308), or related archive parsing crates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
rust/patches/mlx-sys/build.rs (1)
94-98: Wrapenv::set_varinunsafefor Rust 1.83+/2024 edition and avoid unnecessary warnings.On newer Rust,
std::env::set_varis considered unsafe in the 2024 edition and triggers theDEPRECATED_SAFE_2024lint even in older editions. In a single-threaded build script this use is fine, but you should acknowledge it in anunsafeblock to satisfy the lint (and future edition changes):// Set libclang path for bindgen on macOS #[cfg(target_os = "macos")] { - env::set_var("LIBCLANG_PATH", "/Library/Developer/CommandLineTools/usr/lib"); + unsafe { + env::set_var("LIBCLANG_PATH", "/Library/Developer/CommandLineTools/usr/lib"); + } }Optionally, you may also want to only set this if
LIBCLANG_PATHis not already defined, to respect custom toolchain setups.
🧹 Nitpick comments (1)
rust/patches/mlx-sys/build.rs (1)
68-84: Gate macOS-specific link directives behindcfg(target_os = "macos").
objc,Foundation,Metal, andAccelerateare Darwin-only; if this crate is ever built for a non-macOS target, these unconditional link directives can cause linker failures. Either:
- Restrict the crate to macOS in
Cargo.toml, or- Gate the link directives on
target_os = "macos".For example:
- println!("cargo:rustc-link-lib=c++"); - println!("cargo:rustc-link-lib=dylib=objc"); - println!("cargo:rustc-link-lib=framework=Foundation"); - - #[cfg(feature = "metal")] - { - println!("cargo:rustc-link-lib=framework=Metal"); - } - - #[cfg(feature = "accelerate")] - { - println!("cargo:rustc-link-lib=framework=Accelerate"); - } + #[cfg(target_os = "macos")] + { + println!("cargo:rustc-link-lib=c++"); + println!("cargo:rustc-link-lib=dylib=objc"); + println!("cargo:rustc-link-lib=framework=Foundation"); + } + + #[cfg(all(target_os = "macos", feature = "metal"))] + { + println!("cargo:rustc-link-lib=framework=Metal"); + } + + #[cfg(all(target_os = "macos", feature = "accelerate"))] + { + println!("cargo:rustc-link-lib=framework=Accelerate"); + }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
rust/patches/mlx-sys/build.rs(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Integration Tests
- GitHub Check: Hypothesis Verification Tests
- GitHub Check: Lint
- GitHub Check: Build and Test
🔇 Additional comments (2)
rust/patches/mlx-sys/build.rs (2)
6-13: Platform version stub build is fine.Mac-only compilation via
#[cfg(target_os = "macos")]keeps this localized and low-risk; no issues from a build-script perspective.
100-112: bindgenrust_targetpin to 1.92 looks good; verify MSRV and bindgen version match.Explicitly pinning
RustTarget::stable(1, 92)plus a clear panic message is nice. Just ensure:
- The crate’s declared/documented MSRV (e.g.,
rust-versioninCargo.toml) is ≥ 1.92.- The
bindgenversion you depend on actually supportsstable(1, 92)so this doesn’t panic at build time on some environments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
rust/patches/mlx-sys/Cargo.toml (1)
38-43: Consider adding Linux target to docs.rs configuration.The
package.metadata.docs.rstargets (lines 39-43) currently include only Apple platforms (aarch64-apple-darwin, aarch64-apple-ios, aarch64-apple-ios-sim). If this crate is intended to support cross-platform use, consider adding a Linux target (e.g.,x86_64-unknown-linux-gnu) to generate documentation for broader compatibility.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
rust/patches/mlx-sys/Cargo.toml(1 hunks)rust/patches/mlx-sys/build.rs(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- rust/patches/mlx-sys/build.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Integration Tests
- GitHub Check: Hypothesis Verification Tests
- GitHub Check: Lint
- GitHub Check: Build and Test
🔇 Additional comments (2)
rust/patches/mlx-sys/Cargo.toml (2)
14-14: Verify Rust version MSRV and cmake build dependency version.Line 14 specifies
rust-version = "1.82", which is very recent. Please confirm:
- Is this MSRV intentional, or should it be a more stable version (e.g., 1.70+)?
- Does the Rust code actually require features only available in 1.82?
Additionally, line 61-62 specifies
cmake = "0.1". Please verify this is correct—the cmake crate's current versions are 0.1.x, so this may be too restrictive or may need adjustment to"0.1.*"or a specific patch version.Also applies to: 61-62
64-70: Feature flagsaccelerateandmetalshould not be included in defaults without platform-aware configuration.Lines 66-69 show default features include both
accelerateandmetal. These appear to be feature gates for platform-specific functionality. Including them in defaults means all consumers must handle these flags, even on non-Apple platforms. If these features condition platform-specific code in the build process, the default feature set should either:
- Be empty, letting users explicitly opt into platform-specific features, or
- Be conditionally set per target platform using Cargo's platform-specific dependencies syntax.
Additionally, verify that
cmake = "0.1"(line 48) is the intended version—current releases of the cmake crate are 0.1.50+, and this may be an unintended constraint or typo.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
RESEARCH_FINDINGS.md (1)
63-65: Remove or correct the Python equivalent reference—set_memory_limit()is not found inpython/src/train_qlora.py.Both referenced file paths exist (
rust/patches/mlx-sys/src/mlx-c/mlx/c/memory.handpython/src/train_qlora.py), but the claim thatmx.set_memory_limit()is present in the Python file is inaccurate. The function does not appear in that file, so either the reference should be removed or the correct file location should be provided if the function exists elsewhere.
🧹 Nitpick comments (1)
.github/workflows/python-ci.yml (1)
148-208: Hypothesis and integration job configurations are correct.Path handling, test filters, and job dependencies are properly configured for the monorepo structure and CI constraints.
Consider extracting the repeated cache and dependency installation steps into a GitHub Actions composite action to reduce duplication across the lint, test, hypothesis-verification, integration, and benchmark-evaluation jobs. This would improve maintainability and reduce configuration drift.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
.github/workflows/python-ci.yml(12 hunks)CONTRIBUTING.md(8 hunks)RESEARCH_FINDINGS.md(1 hunks)
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-11-30T23:47:27.584Z
Learnt from: CR
Repo: arosboro/your_ai PR: 0
File: GEMINI.md:0-0
Timestamp: 2025-11-30T23:47:27.584Z
Learning: Applies to scripts/{validate_model,evaluate}.py : Include validation and evaluation scripts in the `scripts` directory, such as `validate_model.py` and `evaluate.py`
Applied to files:
.github/workflows/python-ci.yml
📚 Learning: 2025-11-30T23:47:27.584Z
Learnt from: CR
Repo: arosboro/your_ai PR: 0
File: GEMINI.md:0-0
Timestamp: 2025-11-30T23:47:27.584Z
Learning: Applies to src/train_*.py : Use Apple's MLX framework for training on Apple Silicon, with PyTorch as a fallback for other platforms
Applied to files:
CONTRIBUTING.md
📚 Learning: 2025-11-30T23:47:27.584Z
Learnt from: CR
Repo: arosboro/your_ai PR: 0
File: GEMINI.md:0-0
Timestamp: 2025-11-30T23:47:27.584Z
Learning: Applies to **/*.py : Include docstrings and comments following a clear and consistent style in all code
Applied to files:
CONTRIBUTING.md
🪛 markdownlint-cli2 (0.18.1)
CONTRIBUTING.md
147-147: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
193-193: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Integration Tests
- GitHub Check: Hypothesis Verification Tests
- GitHub Check: Build and Test
- GitHub Check: Lint
🔇 Additional comments (3)
.github/workflows/python-ci.yml (3)
27-31: Ruff lint and format checks correctly target Python subdirectory.Configuration is properly scoped to the Python modules and consistent with the monorepo layout.
61-146: Test job configuration is sound.Working directory handling, path references, and coverage validation are correctly implemented for the monorepo structure. The CI-safe test optimization with documented fallback instructions is well-designed.
211-274: Benchmark evaluation job is properly configured.Manual trigger, cost-conscious design, and correct path handling throughout. Error handling with
continue-on-error: trueis appropriate for optional benchmark workloads.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
rust/patches/mlx-sys/build.rs (1)
94-98: Wrapenv::set_varinunsafeto satisfy Rust 1.83+/2024 edition lintThis direct
env::set_varcall is what the earlier review flagged: in newer Rust (2024 edition work),std::env::set_varbecomesunsafedue to potential races in multithreaded contexts, and you’ll see theDEPRECATED_SAFE_2024lint when compiling with newer toolchains.For a Cargo build script this is effectively safe (short‑lived, single‑threaded process), so the idiomatic migration is to wrap it in an
unsafeblock and add a short comment:- #[cfg(target_os = "macos")] - { - env::set_var("LIBCLANG_PATH", "/Library/Developer/CommandLineTools/usr/lib"); - } + #[cfg(target_os = "macos")] + { + // SAFETY: Build scripts run in their own short‑lived process; setting + // a process‑wide env var here cannot race with other threads. + unsafe { + env::set_var("LIBCLANG_PATH", "/Library/Developer/CommandLineTools/usr/lib"); + } + }This should clear the lint while preserving behavior.
Rust 1.83 (Rust 2024 edition) std::env::set_var safety / DEPRECATED_SAFE_2024 lint in build scripts
🧹 Nitpick comments (2)
rust/patches/mlx-sys/build.rs (2)
1-4: Drop redundantextern cratein 2021 edition build scriptIn a 2018+ edition crate the
extern crate cmake;is unnecessary and can tripunused_extern_crates/ idiom lints; theuse cmake::Config;import is sufficient.You can simplify to:
-extern crate cmake; - -use cmake::Config; +use cmake::Config; use std::{env, path::PathBuf};This should quiet any related clippy warnings on the build script.
72-85: Consider gating Apple framework linking on the target platformThese
cargo:rustc-link-liblines always requestFoundation(and conditionallyMetal/Accelerate) regardless of the target OS. That’s fine if this crate is only ever built for Apple targets, but if someone attempts to build it on non-Apple platforms with default features, link errors are likely.If you want to be defensive for cross-platform consumers, you can gate the prints on the target OS via
CARGO_CFG_TARGET_OS, e.g.:let target_os = env::var("CARGO_CFG_TARGET_OS").unwrap_or_default(); if target_os == "macos" || target_os == "ios" { println!("cargo:rustc-link-lib=c++"); println!("cargo:rustc-link-lib=dylib=objc"); println!("cargo:rustc-link-lib=framework=Foundation"); #[cfg(feature = "metal")] { println!("cargo:rustc-link-lib=framework=Metal"); } #[cfg(feature = "accelerate")] { println!("cargo:rustc-link-lib=framework=Accelerate"); } }Not required if you intend the crate to be Apple-only, but it will make accidental non-Apple builds fail more gracefully.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
CONTRIBUTING.md(8 hunks)rust/patches/mlx-sys/Cargo.toml(1 hunks)rust/patches/mlx-sys/build.rs(1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-11-30T23:47:27.584Z
Learnt from: CR
Repo: arosboro/your_ai PR: 0
File: GEMINI.md:0-0
Timestamp: 2025-11-30T23:47:27.584Z
Learning: Applies to src/train_*.py : Use Apple's MLX framework for training on Apple Silicon, with PyTorch as a fallback for other platforms
Applied to files:
CONTRIBUTING.md
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Integration Tests
- GitHub Check: Hypothesis Verification Tests
- GitHub Check: Lint
- GitHub Check: Build and Test
🔇 Additional comments (5)
CONTRIBUTING.md (4)
147-152: Previous markdown linting issues resolved.Both previously flagged MD040 violations (missing language identifiers in fenced code blocks) have been corrected:
- Line 147: Python test structure block now specifies
textlanguage- Line 193: Rust test structure block now specifies
textlanguageAlso applies to: 193-197
5-8: Excellent separation of dual-implementation guidance.The restructuring cleanly separates Python and Rust workflows across prerequisites, installation, linting, formatting, and testing sections. Each language has appropriate tool recommendations (ruff for Python, cargo fmt/clippy for Rust) and clear command paths with
cd pythonandcd rustconsistently applied.Also applies to: 82-139
318-335: PR requirements checklist is well-structured.The per-language requirements clearly delineate Python and Rust checks with actionable commands that align with the code style sections. The
[x]checkbox format is appropriate for a CONTRIBUTING guide.
358-366: Environment section properly accounts for both implementations.The issue reporting guidelines now include language-specific environment details (MLX version for Python, mlx-rs version for Rust), making it easier for maintainers to diagnose cross-language issues.
rust/patches/mlx-sys/Cargo.toml (1)
12-37: Manifest looks coherent; ensure CI/toolchains match declared MSRVThe manifest is well-formed and aligned with the build script:
edition = "2021"andrust-version = "1.82"are explicitly set.- Crate/lib names and
build = "build.rs"match the Rust and build.rs layout.- Description/metadata correctly identify this as the low-level mlx bindings crate.
Given
rust-version = "1.82"and the use ofbindgen::RustTarget::stable(1, 82)inbuild.rs, just make sure your CI and anyrust-toolchain/docs.rs configs use Rust ≥ 1.82 so you don't hit unexpected MSRV mismatches.
|
@coderabbitai it seems like all the tests passed, but can you exlpain whether this pr is good to merge or is there any ai prompt I should use to improve it? |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: Length of output: 129394 🏁 Script executed: Length of output: 2715 🏁 Script executed: Length of output: 67 🏁 Script executed: Length of output: 386 🏁 Script executed: Length of output: 6172 @arosboro Short answer: not yet. One required check is red. Specifically, Rust CI → Lint failed on December 11, 2025 at 23:19:01 UTC, while Rust Build and Test succeeded and all Python checks passed. Until the Rust lint job is green, I don’t recommend merging this 296‑file refactor into main. What to do to get merge-ready (fast path)
Likely hotspots from review and toolchain evolution:
If you want an AI prompt to apply the above quickly Recommendation
|
|
@coderabbitai, is it ready to merge? |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: Length of output: 2021 @arosboro Not yet—the CI checks are currently running! 🏃♂️ Current Status (as of ~23:55 UTC, Dec 11):
What to do next:
You can monitor live progress here: TL;DR: Check back in ~5-10 minutes. If all checks are green (especially Rust Lint), merge away! If Rust Lint is red, let me know and I'll help debug the specific failures. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (5)
rust/BUILD_STATUS.md (2)
26-30: Add language specifications to fenced code blocks.Several fenced code blocks are missing language identifiers, which impacts documentation clarity and syntax highlighting.
Apply this diff to add language specifications:
### Error Message -``` +```plaintext CMake Error: Building for x86_64 on macOS is not supported. If you are on an Apple Silicon system, check the build documentation for possible fixes: https://ml-explore.github.io/mlx/build/html/install.html#build-from-source -``` +``` #### Option 1: Use Python Implementation (WORKS NOW) The Python training implementation with MLX is fully functional: -```bash +```bash cd /Users/arosboro/your_ai source venv/bin/activate python -m src.training.train_qlora --model <model-name> -``` +``` #### Option 2: Wait for mlx-sys Fix The `mlx-sys` crate needs to be updated to properly set ARM64 architecture: -```rust +```rust // In mlx-sys build.rs #[cfg(all(target_os = "macos", target_arch = "aarch64"))] { config.define("CMAKE_SYSTEM_PROCESSOR", "arm64"); config.define("CMAKE_OSX_ARCHITECTURES", "arm64"); } -``` +```Also applies to: 42-46, 50-57
59-59: Wrap bare URL in markdown link syntax.Line 59 contains a bare URL without markdown link formatting, which is flagged by linters as MD034.
Apply this diff:
-File an issue at: https://github.com/oxideai/mlx-rs +File an issue at: [mlx-rs](https://github.com/oxideai/mlx-rs)rust/MLX_UPGRADE_COMPLETE.md (1)
9-14: Add language specifications to fenced code blocks.Five fenced code blocks lack language identifiers for proper syntax highlighting and linting compliance.
Apply this diff to add language specifications:
## Build Results -``` +```plaintext ✅ Debug build: SUCCESS ✅ Release build: SUCCESS ✅ All compilation errors resolved ⚠️ One harmless warning about patch features mechanism -``` +``` #### Metal Shader Compilation Errors - **Issue**: Metal SDK v17.0 on macOS 15.6.1 incompatible with MLX v0.25.x shaders - **Solution**: Disabled Metal backend, enabled CPU-only mode - ```cmake + ```cmake set(MLX_BUILD_METAL OFF CACHE BOOL "Disable Metal backend" FORCE) set(MLX_BUILD_CPU ON CACHE BOOL "Enable CPU backend" FORCE) - ``` + ``` #### Bindgen libclang Mismatch - **Issue**: Homebrew LLVM was x86_64, needed ARM64 - **Solution**: Point bindgen to system clang - ```rust + ```rust env::set_var("LIBCLANG_PATH", "/Library/Developer/CommandLineTools/usr/lib"); - ``` + ``` ### Warning -``` +```plaintext warning: patch for `mlx-sys` uses the features mechanism. default-features and features will not take effect because the patch dependency does not support this mechanism -``` +``` ### Build Status -```bash +```bash cd your_ai_rs cargo build # ✅ SUCCESS cargo build --release # ✅ SUCCESS (requires network permissions) -``` +```Also applies to: 40-43, 54-56, 99-107, 112-116
rust/patches/mlx-sys/src/mlx-c/examples/example-float64.c (1)
6-11: Optional: Consider extracting duplicated helper to shared utility.The
print_arrayfunction is duplicated across multiple example files (example.c, example-closure.c, and this file). For example code, this keeps each file self-contained, which is fine. If maintenance becomes a concern, you could extract this to a shared examples header.rust/patches/mlx-sys/src/mlx-c/README.md (1)
5-7: Minor doc nit: capitalize “Apple Silicon” consistentlyLines 5–7 mix “Apple Silicon” and “Apple silicon”. For brand/style consistency you could capitalize both occurrences.
-MLX is an array framework for machine learning on Apple Silicon. MLX C expands -MLX to the C language, making research and experimentation easier on Apple -silicon. +MLX is an array framework for machine learning on Apple Silicon. MLX C expands +MLX to the C language, making research and experimentation easier on Apple +Silicon.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
rust/BUILD_STATUS.md(1 hunks)rust/MLX_UPGRADE_COMPLETE.md(1 hunks)rust/patches/mlx-sys/build.rs(1 hunks)rust/patches/mlx-sys/src/mlx-c/README.md(1 hunks)rust/patches/mlx-sys/src/mlx-c/examples/example-closure.c(1 hunks)rust/patches/mlx-sys/src/mlx-c/examples/example-float64.c(1 hunks)rust/patches/mlx-sys/src/mlx-c/examples/example.c(1 hunks)rust/patches/mlx-sys/src/mlx-c/mlx/c/error.cpp(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- rust/patches/mlx-sys/build.rs
- rust/patches/mlx-sys/src/mlx-c/examples/example.c
🧰 Additional context used
🧬 Code graph analysis (1)
rust/patches/mlx-sys/src/mlx-c/examples/example-float64.c (3)
rust/patches/mlx-sys/src/mlx-c/examples/example-closure.c (2)
print_array(7-12)main(67-110)rust/patches/mlx-sys/src/mlx-c/examples/example.c (2)
print_array(6-11)main(26-51)rust/patches/mlx-sys/build.rs (1)
main(90-125)
🪛 LanguageTool
rust/patches/mlx-sys/src/mlx-c/README.md
[style] ~5-~5: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... for machine learning on Apple Silicon. MLX C expands MLX to the C language, making...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[uncategorized] ~6-~6: Did you mean the proper noun “Apple Silicon”?
Context: ... research and experimentation easier on Apple silicon. MLX C can be used standalone or as a ...
(APPLE_PRODUCTS)
🪛 markdownlint-cli2 (0.18.1)
rust/MLX_UPGRADE_COMPLETE.md
9-9: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
99-99: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
rust/BUILD_STATUS.md
26-26: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
59-59: Bare URL used
(MD034, no-bare-urls)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Integration Tests
- GitHub Check: Hypothesis Verification Tests
- GitHub Check: Build and Test
🔇 Additional comments (3)
rust/patches/mlx-sys/src/mlx-c/examples/example-float64.c (1)
13-37: LGTM! Memory leak fixed.The memory leak for the
threearray identified in the previous review has been properly addressed. Line 34 now correctly freesthree, ensuring all allocated MLX resources (arr, two, three, stream) are properly cleaned up before exit.rust/patches/mlx-sys/src/mlx-c/examples/example-closure.c (1)
67-110: Resource management is thorough.All allocated MLX objects (arrays, vectors, closures) are properly freed at the end of
main. The error handling path correctly demonstrates using a custom error handler and checking the closure return value.rust/patches/mlx-sys/src/mlx-c/mlx/c/error.cpp (1)
16-33: I was unable to locate the filerust/patches/mlx-sys/src/mlx-c/mlx/c/error.cppin the repository through multiple search methods. Without access to the actual codebase, I cannot verify whether this file exists or confirm the review's findings against the current state of the code.The technical analysis in the review—that non-null
datais ignored whendtor == nullptrdue to the conditional assignment—is logically sound based on the code snippet shown. The suggested fix using a no-op deleter to preserve the data pointer while avoiding ownership is a reasonable approach. The thread safety concern regarding unsynchronized global state is also valid.However, verification requires:
- Confirmation that the file exists in the repository
- Validation that the code matches the snippet shown
- Understanding of the API contract (whether data should always be passed to the handler)
- Context on call sites and usage patterns
|
|
||
| struct bogus_payload payload; | ||
|
|
||
| // simple +2 to input, with paylaod |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo in comment: "paylaod" → "payload".
- // simple +2 to input, with paylaod
+ // simple +2 to input, with payload📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // simple +2 to input, with paylaod | |
| // simple +2 to input, with payload |
🤖 Prompt for AI Agents
In rust/patches/mlx-sys/src/mlx-c/examples/example-closure.c around line 82,
there's a typo in the comment ("paylaod"); update the comment to read "payload"
so the documentation/inline comment is correct.
|
@coderabbitai, I fixed a lot, are there more ai instructions, or should I merge? The CI/CD actions are passing. |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: Length of output: 199 🏁 Script executed: Length of output: 136 🏁 Script executed: Length of output: 1045 @arosboro Yes, merge away! 🎉 All CI checks are now green ✅:
Status: Great job fixing the lint issues! 🚀 This monorepo restructuring is ready to land on |
Summary by CodeRabbit
New Features
Infrastructure
Documentation
Bug Fixes
✏️ Tip: You can customize this high-level summary in your review settings.