Feature/fp32 support for metal #9
Conversation
Critical fixes:
- codegraph/sync.py: Batch FalkorDB deletes to avoid N+1 query pattern
- benchmark/behavior.py: Use itertools.accumulate instead of O(n²) string concat
- ai/tools/deeplake/tool.py: Replace exception-based control flow with dict.get()

High priority fixes:
- instrumentation/run_logger.py: Document file handle lifecycle
- instrumentation/hdf5_store.py: Add context manager protocol
- tui/input_bar.py: Add MAX_HISTORY_SIZE (1000) with LRU eviction
- ai/process/__init__.py: Add MAX_HISTORY_SIZE (10000) with LRU eviction
- core/event_bus.py: Add max_queue_size (100000) to prevent unbounded growth

Medium priority optimizations:
- benchmark/artifacts.py: Use fields() + getattr() instead of repeated asdict()
- benchmark/artifacts.py: Use str.translate() for LaTeX escaping
- layer/memory_block/ngram_cache.py: Extract tensor creation from loop body
- ai/tools/deeplake/tool.py: Add _cleanup_deeplake_tool() for resource cleanup
- ai/tools/codegraph/tool.py: Use list comprehension for row conversion
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
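As a rough illustration of two of the patterns this commit describes (itertools.accumulate for running values, dict.get() for lookups), here is a minimal sketch; the function names and data are illustrative only, not the repo's benchmark/behavior.py or deeplake tool code.

```python
from itertools import accumulate

# Quadratic pattern: recomputing each running total from scratch.
def running_totals_slow(xs: list[float]) -> list[float]:
    return [sum(xs[: i + 1]) for i in range(len(xs))]  # O(n^2)

# Linear pattern: itertools.accumulate reuses the previous partial result.
def running_totals_fast(xs: list[float]) -> list[float]:
    return list(accumulate(xs))  # O(n)

# Exception-driven lookup replaced by dict.get() with a default.
def count_for(counts: dict[str, int], key: str) -> int:
    return counts.get(key, 0)

assert running_totals_slow([1, 2, 3]) == running_totals_fast([1, 2, 3]) == [1, 3, 6]
```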
…543tv8g3aojo-7jbM7 Fix performance anti-patterns across multiple modules
Major changes to the AI package architecture:
1. Development Workflow (OpenHands SDK):
- Created new development process using OpenHands SDK for contained
code modifications (ai/process/development/)
- OpenHandsWorkspace wraps the SDK for git operations and code editing
- DevelopmentProcess coordinates feature analysis, implementation, and review
2. Manifest Workflow:
- Created manifest process for AI-driven experiment management
(ai/process/manifest/)
- ManifestBuilder for constructing valid experiment configurations
- ResultsCollector for gathering and summarizing experiment outputs
- ManifestProcess enables propose → run → analyze research loops
3. Structured Outputs (output_schema):
- Added output_schema to verifier.yml for verdict responses
- Created specialized personas with output_schema:
- ideator_* (architect, catalyst, ml) for structured idea proposals
- file_selector for file selection
- feature_analyst for feature analysis
- code_reviewer for code review
- experiment_proposer for experiment proposals
- results_analyzer for results analysis
- Updated platform_improve/process.py to use Pydantic models
instead of manual JSON parsing
4. Process Registry:
- Updated experiment/runner.py to register development and manifest
processes alongside brainstorm
- Removed platform_improve from registry (replaced by development)
The three main workflows are now:
1. General discussion with persistence (existing brainstorm)
2. Development workflow using OpenHands SDK
3. Manifest workflow for building/running/collecting experiments
- Add openhands-sdk and openhands-tools to pyproject.toml dependencies
- Add new packages to setuptools packages list
- Add docker-compose services for new personas:
  - feature-analyst
  - code-reviewer
  - experiment-proposer
  - results-analyzer
- Update root agent's sub_agents list to include new personas
- Update root agent instructions to document new expert agents

The system is now fully self-contained - just start the TUI and talk to the root agent.
- Fix _run_task stub to extract output from OpenHands conversation history
- Add shlex.quote() for shell command safety to prevent injection
- Use secure tempfile.TemporaryDirectory instead of hardcoded /tmp
- Use exit codes for test success detection instead of string matching
- Catch specific RuntimeError exceptions instead of broad Exception
- Add platform_improve back to process map in runner.py
- Add DevelopmentProcessConfig and ManifestProcessConfig to agents.py
- Update DevelopmentProcess and ManifestProcess to use proper config types
- Standardize files_to_touch to array type in ideator personas and models
- Clean up temporary directory on workspace cleanup
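A small sketch of the safety patterns this commit lists (shlex.quote for shell arguments, tempfile.TemporaryDirectory instead of a fixed /tmp path, exit-code-based success checks); the helper below is hypothetical, not the actual OpenHands workspace code.

```python
import shlex
import subprocess
import tempfile
from pathlib import Path

def run_tests_in_scratch(repo_url: str, test_cmd: list[str]) -> bool:
    """Clone into a private temp dir and report success via the exit code."""
    with tempfile.TemporaryDirectory() as scratch:  # auto-cleaned, not a shared /tmp path
        workdir = Path(scratch) / "repo"
        # shlex.quote protects against injection when a shell string is unavoidable.
        clone_cmd = f"git clone --depth 1 {shlex.quote(repo_url)} {shlex.quote(str(workdir))}"
        subprocess.run(clone_cmd, shell=True, check=True)
        # Prefer argument lists over shell strings, and exit codes over string matching.
        result = subprocess.run(test_cmd, cwd=workdir)
        return result.returncode == 0
```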
Review AI package for machine learning assistance
Caution: Review failed. The pull request is closed.

Summary by CodeRabbit

📝 Walkthrough
Introduces Llama 3.2 checkpoints, migrates event serialization from JSON to Cap'n Proto, adds memory autotuning with ResonantRouter, implements Constructive Compression Learning (CCL), adds Vast.ai orchestration and PyTorch Lightning engine support, and extends fp32 support across Metal optimizers. Removes best-effort terminology throughout the codebase.

Changes
Sequence Diagram(s)

sequenceDiagram
participant Trainer as Trainer (standard.py)
participant Memory as MemoryBlockMemory
participant Tuner as UniversalMemoryTuner
participant Router as ResonantRouter
participant Telemetry as MemoryHealthTelemetry
Trainer->>Memory: forward(batch, ctx)
activate Memory
Memory->>Router: route(tag, collect_aux=True)
activate Router
Router->>Router: Iterative dynamics<br/>(steps, coupling, damping)
Router-->>Memory: Routing{logits, aux}
deactivate Router
Memory->>Memory: collect_health_telemetry()
activate Memory
Memory->>Telemetry: Build metrics<br/>(util, resonant, vsa)
Telemetry-->>Memory: MemoryHealthTelemetry
deactivate Memory
Memory->>Tuner: update(telemetry)
activate Tuner
Tuner->>Tuner: Update EMA signals<br/>Compute objective
Tuner->>Tuner: Explore parameters<br/>(if adaptive & post-warmup)
Tuner-->>Memory: metrics{coupling_mult, damping_mult, ...}
deactivate Tuner
Memory-->>Trainer: routing_output + telemetry
deactivate Memory
Trainer->>Trainer: Log telemetry & viz
sequenceDiagram
participant Runner as experiment/runner.py
participant VastAI as VastAIClient
participant Lightning as LightningEngine
participant Trainer as Trainer (standard/ccl)
participant Dataset as Dataset
Runner->>Runner: check_target_readiness()
alt compute.type == "vast_ai"
Runner->>VastAI: VastAIClient(api_key)
Runner->>VastAI: run_lifecycle(config)
activate VastAI
VastAI->>VastAI: find_best_offer()
VastAI->>VastAI: provision_instance()
VastAI->>VastAI: wait_for_ssh()
VastAI-->>Runner: ssh_string
deactivate VastAI
end
alt backend == "lightning"
Runner->>Lightning: LightningEngine()
Runner->>Lightning: run_experiment(manifest, target)
activate Lightning
Lightning->>Dataset: build_dataset()
Lightning->>Trainer: wrap in CarambaLightningModule
Lightning->>Lightning: Trainer.fit(model, dataloader)
Lightning-->>Runner: results
deactivate Lightning
else backend == "torch"
Runner->>Trainer: TorchEngine.run_experiment()
end
sequenceDiagram
participant EventBus as core/event_bus.py
participant TaskQueue as core/task_queue.py
participant Codec as CapnpEventCodec
participant Encoder as CapnpEventEncoder
EventBus->>EventBus: publish(event)
alt task_queue configured
EventBus->>EventBus: Construct Task<br/>(id, context_id, metadata)
EventBus->>TaskQueue: push(task)
else fallback
EventBus->>EventBus: publish_sync()
end
Note over EventBus,Encoder: Event Encoding Path
EventBus->>Encoder: encode(EventEnvelope)
activate Encoder
Encoder->>Encoder: Load Cap'n Proto schema
Encoder->>Encoder: Build message<br/>(fields + payload JSON)
Encoder->>Codec: Serialize to bytes
Encoder->>Encoder: Convert to uint8 tensor
Encoder-->>EventBus: Tensor(int64)
deactivate Encoder
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs
🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
📜 Recent review details
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
⛔ Files ignored due to path filters (2)
📒 Files selected for processing (117)
Summary of Changes

Hello @TheApeMachine, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request fundamentally transforms the platform's architecture by integrating a sophisticated multi-agent system, enabling complex task orchestration and collaborative AI workflows. Concurrently, it delivers substantial performance gains on Apple Silicon hardware through enhanced FP32 support for Metal kernels and introduces advanced, self-tuning memory mechanisms for neural networks. These improvements are complemented by an expanded suite of specialized MCP tools and optimized Docker deployments, collectively boosting the platform's scalability, efficiency, and agent-driven capabilities for ML experimentation.

Highlights
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini: You can request assistance from Gemini at any point by creating a comment using either
Customization: To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a

Limitations & Feedback: Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. You can also get AI-powered code generation, chat, and code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Code Review
This pull request introduces a major overhaul and expansion of the AI agent system, transitioning to a more robust and scalable architecture based on Google's Agent Development Kit (ADK) and the Agent-to-Agent (A2A) protocol.

Key changes include new comprehensive CLAUDE.md documentation, a refactored ai module with explicit __all__ exports (though noted by the reviewer as a potential maintenance burden), and the introduction of Agent, AgentFactory, RootAgent, and LeadAgent classes for hierarchical agent orchestration. New components like ConnectionManager, PendingTaskManager, ADKAgentExecutor, StreamingExecutor, AgentServer, DatabaseSessionService, TaskQueue, and push notification mechanisms (InMemoryPushNotificationConfigStore, HttpPushNotificationSender) were added to support asynchronous, persistent, and distributed agent interactions. The .gitignore file was updated to include ADK.txt, with a reviewer suggestion to also ignore .vscode/.

Several MCP tools (codegraph, deeplake, docling, filesystem, projectboard, webcrawl) were updated to use streamable-http transport and include TransportSecuritySettings, with performance optimizations noted by the reviewer in codegraph (batch deletes) and deeplake (efficient row data extraction). Additionally, new persona configurations (data_scientist, product_owner, researcher, tester, project_manager) were added, and existing ones (architect, catalyst, developer, mathematician, ml_expert, reviewer, verifier, writer, root) were updated to reflect the new A2A structure, including explicit URLs and model references.

The ExperimentRunner was updated to support remote compute provisioning via Vast.ai, and memory block layers gained a 'resonant' router option with adaptive auto-tuning, including new telemetry and tuner components. Metal kernels for adamw, layernorm, and rope were updated to support float32 on MPS, and Triton kernels for DBA attention were refined with explicit type casting for masks.
I am having trouble creating individual review comments; my feedback is listed below.
.vscode/c_cpp_properties.json (1-26)
This file contains user-specific absolute paths (e.g., /Users/theapemachine/...). Such IDE configuration files should not be checked into version control as they will cause issues for other developers on different machines. Please add .vscode/ to the .gitignore file to prevent this from being committed in the future.
ai/__init__.py (51-87)
This __all__ list is quite long and appears to be manually maintained. This can become a maintenance burden and a source of ImportError for users doing from caramba.ai import * if it gets out of sync with the actual imports in the module. While not a critical issue, consider if there's a way to automate this or if a more granular import strategy might be better for consumers of this package.
ai/connection.py (352)
The line self._pending_tasks.add(task.id) seems redundant. The PendingTaskManager is already responsible for tracking pending tasks via its register method. The _pending_tasks set on the RemoteAgent class itself doesn't appear to be used anywhere else. This looks like leftover code and could be removed to avoid confusion.
ai/tools/codegraph/tool.py (81-84)
Using a list comprehension here is a good optimization over appending to a list in a loop, especially for potentially large result sets. It's more concise and generally faster.
ai/tools/deeplake/tool.py (167-188)
Refactoring the data extraction from the DeepLake row to use a helper function _rowview_to_str_dict and .get() with defaults is a great improvement. It avoids the performance overhead of repeated try...except blocks inside the loop and makes the code more robust and readable.
codegraph/sync.py (111)
Switching from deleting nodes file-by-file in a loop to a single batch delete operation (_delete_files_batch) is an excellent performance optimization. This avoids the N+1 query problem and will be significantly faster when syncing multiple files.
core/event_codec/__init__.py (1-23)
The switch from JSON/binary codecs to a Cap'n Proto based codec is a significant performance optimization. Using a zero-copy serialization format like Cap'n Proto is an excellent choice for a high-throughput event processing system. This change should dramatically reduce serialization/deserialization overhead.
docker/Dockerfile.persona-a2a (1-131)
The optimization of this Dockerfile using a multi-stage build is a fantastic improvement. By separating dependency installation and copying only the necessary source files for the agent runtime, you've likely reduced the image size and build time by an order of magnitude. This significantly improves the development lifecycle, CI/CD performance, and resource consumption. Excellent work.
Actionable comments posted: 49
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (6)
optimizer/metal/layernorm.py (1)
164-175: Critical: Non-gradient path still hardcodes fp16, breaking fp32 support. Lines 170 and 174 still explicitly cast weight and bias to torch.float16, which is inconsistent with the gradient path (lines 72, 78) that uses x.dtype. This breaks fp32 support for inference scenarios where gradients are not required.

🐛 Proposed fix to use x.dtype consistently

- w2 = weight.to(device=x.device, dtype=torch.float16).contiguous()
+ w2 = weight.to(device=x.device, dtype=x.dtype).contiguous()
  if bias is None:
      return ops.layernorm_weight(x2, w2, float(eps))
- b2 = bias.to(device=x.device, dtype=torch.float16).contiguous()
+ b2 = bias.to(device=x.device, dtype=x.dtype).contiguous()
  return ops.layernorm(x2, w2, b2, float(eps))

optimizer/metal/lion.py (1)
21-38: lion_fp16 name no longer matches behavior; also address Ruff TRY003 if CI enforces it.

Lines 33-34: dtype broadening is correct, but consider adding a clearer alias (e.g., lion_step / lion) or a docstring note that lion_fp16 supports fp16/fp32. Also, if Ruff TRY003 is enforced, suppress or refactor the long inline exception message.

Proposed minimal adjustments

  def lion_fp16(
  @@
  ) -> Tensor:
+     """Run one fused Lion step on Metal (supports fp16/fp32; p/grad/m dtypes must match)."""
  @@
-     if p.dtype not in (torch.float16, torch.float32) or grad.dtype != p.dtype or m.dtype != p.dtype:
-         raise RuntimeError("Metal Lion currently supports fp16/fp32 tensors only (matching)")
+     if p.dtype not in (torch.float16, torch.float32) or grad.dtype != p.dtype or m.dtype != p.dtype:
+         raise RuntimeError("Metal Lion currently supports fp16/fp32 tensors only (matching)")  # noqa: TRY003

optimizer/metal/rope.py (1)
38-52: Dtype propagation is coherent; TRY003 linting will flag inline exception messages.

Lines 44–45 and 66–67 correctly cast cos/sin to x.dtype and align grad_out to the saved dtype, making the fp32 path consistent end-to-end. However, with TRY003 enabled in ruff.toml, the six RuntimeError raises (lines 39, 41, 59, 61, 91, 93) with inline messages will trigger linting warnings. Address by adding # noqa: TRY003 to each raise statement, or by centralizing repeated messages as module-level constants.

optimizer/metal/rmsnorm.py (1)
85-118: Function name rmsnorm_fp16 is now misleading.

The function handles both FP16 and FP32, but the name suggests FP16-only. Consider renaming to rmsnorm_metal or rmsnorm_mps for clarity.

optimizer/metal/ops.mm (1)
2016-2041: PYBIND docstrings are outdated - still claim fp16-only.

The module docstrings reference "fp16" but all operations now support FP32. Consider updating to "(Metal/MPS, fp16/fp32)" for accuracy.

Example fix for a few entries

- m.def("rmsnorm", &rmsnorm, "RMSNorm Forward (Metal/MPS, fp16)");
+ m.def("rmsnorm", &rmsnorm, "RMSNorm Forward (Metal/MPS, fp16/fp32)");
- m.def("rope", &rope, "RoPE Apply (Metal/MPS, fp16)");
+ m.def("rope", &rope, "RoPE Apply (Metal/MPS, fp16/fp32)");
- m.def("lion_step", &lion_step, "Lion step update (Metal/MPS, fp16)");
+ m.def("lion_step", &lion_step, "Lion step update (Metal/MPS, fp16/fp32)");
- m.def("adamw_master_step", &adamw_master_step, "AdamW master step update (Metal/MPS, fp16 params + fp32 state)");
+ m.def("adamw_master_step", &adamw_master_step, "AdamW master step update (Metal/MPS, fp16/fp32 params + fp32 state)");

optimizer/adamw_master.py (1)
109-120: MPS fused path has critical dtype mismatch with Metal kernel.

The Metal kernel at ops.mm:1707 enforces grad.dtype() == p.dtype(), but the fused condition in adamw_master.py allows FP32 parameters (line 111) while requiring FP16 gradients (line 113). This creates an incompatibility: when p.dtype == torch.float32, the kernel expects grad.dtype == torch.float32, but the condition enforces grad.dtype == torch.float16, causing a runtime assertion failure.

Either expand the grad dtype check to allow both FP16 and FP32 to match the parameter dtype, or restrict the parameter dtype to FP16 only to match the gradient requirement.
🤖 Fix all issues with AI agents
In @.vscode/c_cpp_properties.json:
- Around line 1-26: The .vscode/c_cpp_properties.json currently contains
hardcoded absolute include paths (the entries in the "includePath" array
pointing to /Users/theapemachine/...) which breaks portability; update those
specific includePath entries to use workspace-relative variables (e.g.
${workspaceFolder}/...) or env vars (e.g. ${env:PYENV} or
${workspaceFolder}/.venv/...) so all contributors resolve headers locally, and
either add .vscode/c_cpp_properties.json to .gitignore or add a template file
(c_cpp_properties.json.example) for local customization; target the
"includePath" array and the file-level config in .vscode/c_cpp_properties.json
when making the change.
In @artifacts/checkpoints/original/params.json:
- Around line 1-12: Remove the unused "use_scaled_rope" key from params.json and
add a trailing newline at EOF; the RoPE scaling is already handled via the
rope_scaling dict (see config/layer.py) and consumed by layer/rope.py and
attention implementations, so deleting use_scaled_rope avoids confusion and the
newline ensures standard EOF formatting.
In @artifacts/checkpoints/tokenizer_config.json:
- Around line 1-2061: Repo artifacts are bloating the repository because large
tokenizer files (tokenizer.json ~8.7M and tokenizer.model ~2.1M) are committed
under artifacts/checkpoints alongside tokenizer_config.json; remove or offload
them to avoid heavy PRs. Fix by: (1) remove large binary tokenizer files from
the commit and add them to Git LFS (or delete and re-add via git rm --cached) so
tokenizer.json and tokenizer.model are tracked via LFS; (2) update
documentation/README to state that artifacts are fetched at runtime from the
HuggingFace Hub (use AutoTokenizer.from_pretrained or transformers'
from_pretrained for the model repo) and include a fallback local path for
tokenizer_config.json if you must keep a small config; (3) ensure CI/deployment
pulls tokenizer artifacts at build/runtime (reference files tokenizer.json,
tokenizer.model and tokenizer_config.json in artifacts/checkpoints and functions
that load them) so the repo contains only the lightweight config while large
blobs live in LFS or are downloaded on demand.
In @artifacts/checkpoints/USE_POLICY.md:
- Around line 45-52: Replace the redirect/incorrect hrefs in the policy links so
the visible link text points directly to the intended secure URLs: update the
"Reporting issues with the model" link to use the direct HTTPS URL
https://github.com/meta-llama/llama-models/issues instead of the l.workplace.com
redirect, and change the other two links to HTTPS (use
https://developers.facebook.com/llama_output_feedback and
https://facebook.com/whitehat/info) while leaving the LlamaUseReport@meta.com
contact as-is; ensure the markdown link targets (the hrefs) are the direct
secure URLs rather than redirects.
- Around line 1-43: Convert the initial bold title into a proper Markdown
heading (e.g., use a single-level heading like "# Llama 3.2 Acceptable Use
Policy") and ensure a blank line follows it; normalize ordered-list markers so
nested and top-level numbered lists use a consistent prefix style (prefer using
"1." for every ordered item per MD029 or sequential numbers consistently), fix
nested indentation for sublists so they render as distinct levels (use two or
four spaces per indent), remove all trailing spaces and hard tabs (address
MD009), and ensure there is a blank line before/after block-level elements
(lists/headings) to satisfy MD041/MD036/MD029/MD009 checks.
In @config/compute_test.py:
- Around line 31-33: The test constructs VastAIComputeConfig without the
required field gpu_name so the ValidationError will be raised for missing
gpu_name instead of negative max_price_per_hr; update the test to pass a valid
gpu_name (e.g., "A100") when calling VastAIComputeConfig(max_price_per_hr=-1.0,
gpu_name="A100") so the validation failure is due to the negative price as
intended.
- Around line 51-54: Add an explicit runtime type check for target.compute so
static analyzers know its concrete type: after creating target via
ExperimentTargetConfig.model_validate(data), assert isinstance(target.compute,
VastAIComputeConfig) (or use typing.cast) before accessing
target.compute.gpu_name; reference ExperimentTargetConfig, target.compute, and
VastAIComputeConfig to locate where to add this check.
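A hedged sketch of how both test fixes above could look; the import paths and the target payload (vast_ai_target_data) are guesses at the interface, with only the field and class names taken from the comments.

```python
import pytest
from pydantic import ValidationError

from caramba.config.compute import VastAIComputeConfig
from caramba.config.target import ExperimentTargetConfig

def test_negative_price_rejected() -> None:
    # gpu_name supplied, so the only invalid field is the negative price.
    with pytest.raises(ValidationError):
        VastAIComputeConfig(max_price_per_hr=-1.0, gpu_name="A100")

def test_vast_ai_target_exposes_gpu_name(vast_ai_target_data: dict) -> None:
    # vast_ai_target_data stands in for whatever payload the existing test already builds.
    target = ExperimentTargetConfig.model_validate(vast_ai_target_data)
    # Narrow the union so static analysis knows the concrete compute config.
    assert isinstance(target.compute, VastAIComputeConfig)
    assert target.compute.gpu_name == "A100"
```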
In @config/compute.py:
- Line 43: The default Docker image string assigned to image (image: str =
"pytorch/pytorch:2.2.1-cuda12.1-cudnn8-devel") is hardcoded and may become
stale; make this configurable or document it by replacing the literal with a
configurable value (e.g., read from an env var or config setting) and/or add a
comment next to the image variable explaining how to override it and
recommending updating to a current PyTorch+CUDA tag (mentioning current project
uses PyTorch 2.9.1), so future maintainers know to pin or update the image.
- Around line 15-21: LocalComputeConfig currently inherits from BaseModel which
bypasses the project's Config behavior (notably extra="forbid"); change the base
class to the project's Config (imported from caramba.config as Config) so
LocalComputeConfig: class LocalComputeConfig(Config): ... and do the same for
the other compute-related classes in this file (the block spanning lines 23-46)
to ensure YAML typo validation is enforced; update imports to remove BaseModel
and import Config from caramba.config.
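One possible shape for the changes above, assuming the project's Config base (with extra="forbid") lives at caramba.config.Config; the CARAMBA_VAST_IMAGE env var and the field defaults are illustrative choices, not the project's actual settings.

```python
import os

from caramba.config import Config  # project base model with extra="forbid"

class LocalComputeConfig(Config):
    type: str = "local"

class VastAIComputeConfig(Config):
    type: str = "vast_ai"
    gpu_name: str
    max_price_per_hr: float = 1.0
    # Override via CARAMBA_VAST_IMAGE; keep the tag pinned and update it alongside
    # the project's PyTorch version (currently 2.9.1 per the review note).
    image: str = os.environ.get(
        "CARAMBA_VAST_IMAGE", "pytorch/pytorch:2.2.1-cuda12.1-cudnn8-devel"
    )
```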
In @config/defaults.py:
- Around line 42-46: DefaultsCompute defines vast_ai_api_key with an alias
"VAST_AI_API_KEY" but Pydantic v2 won't accept that alias as a keyword when
instantiating the model unless you enable population by alias/name; update the
DefaultsCompute model to include a model_config that enables populate_by_name
(and/or populate_by_alias if you prefer alias-first population) so instantiating
with DefaultsCompute(VAST_AI_API_KEY="...") or loading from dict/YAML using that
alias works as intended.
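A minimal sketch of enabling population by both the field name and the alias with Pydantic v2's model_config; surrounding fields are omitted.

```python
from pydantic import BaseModel, ConfigDict, Field

class DefaultsCompute(BaseModel):
    model_config = ConfigDict(populate_by_name=True)

    # Accepts both DefaultsCompute(vast_ai_api_key=...) and the
    # VAST_AI_API_KEY alias in keyword/dict/YAML input.
    vast_ai_api_key: str | None = Field(default=None, alias="VAST_AI_API_KEY")

assert DefaultsCompute(VAST_AI_API_KEY="k").vast_ai_api_key == "k"
assert DefaultsCompute(vast_ai_api_key="k").vast_ai_api_key == "k"
```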
In @config/layer.py:
- Around line 489-490: The mem_autotune field currently typed as str allows any
value; change its annotation to a Literal of the allowed values (Literal["off",
"monitor", "adaptive"]) and import Literal from typing (or typing_extensions for
older Python) so Pydantic enforces and validates allowed values; keep the
default "off" and, if present, adjust or remove any custom validators that
duplicate this check to rely on Pydantic's validation instead.
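A minimal sketch of the Literal-typed field; the class name here is illustrative, not the actual model in config/layer.py.

```python
from typing import Literal

from pydantic import BaseModel, ValidationError

class MemoryBlockLayerConfig(BaseModel):
    # Pydantic rejects anything outside the three allowed modes.
    mem_autotune: Literal["off", "monitor", "adaptive"] = "off"

assert MemoryBlockLayerConfig().mem_autotune == "off"
try:
    MemoryBlockLayerConfig(mem_autotune="auto")
except ValidationError:
    pass  # typo caught at config-load time instead of deep in training
```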
In @config/presets/mosaic_resonant.yml:
- Around line 26-27: The vars define mem_buckets and mem_hashes but the layer
blocks still use hardcoded values, causing inconsistency; update the layer
configuration entries that currently set mem_buckets and mem_hashes (also the
occurrences around lines 75-77) to reference the variables (${mem_buckets} and
${mem_hashes}) or delete the unused vars block—pick one approach and apply it
consistently across all layer entries so mem_buckets and mem_hashes are not
defined in two places.
In @core/event_codec/capnp_codec_test.py:
- Around line 67-70: Rename the unused loop index `i` to `_` in the for-loop
that iterates over zipped pairs of `events` and `decoded`, and call zip with
`strict=True` so the loop becomes: iterate over zip(events, decoded,
strict=True) to enforce equal lengths; keep the existing assertions that compare
`dec.type`, `dec.sender`, and `dec.payload` against `orig` unchanged.
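The adjusted loop, for reference; events and decoded stand in for the test's actual fixtures.

```python
# Before: for i, (orig, dec) in enumerate(zip(events, decoded)):  # i unused, lengths unchecked
# After: strict=True raises if the decoder drops or duplicates events.
for orig, dec in zip(events, decoded, strict=True):
    assert dec.type == orig.type
    assert dec.sender == orig.sender
    assert dec.payload == orig.payload
```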
In @core/event_codec/capnp_codec.py:
- Around line 20-27: Add return type annotations for the private helper
functions to satisfy the static analyzer: annotate _load_schema() and
_get_schema() with an appropriate return type (use typing.Any if pycapnp types
are unavailable), and add an import for Any from typing. Ensure the function
signatures include the return annotation (e.g., def _load_schema() -> Any:) and
keep behavior unchanged.
- Around line 119-120: The current tensor-to-bytes conversion creates a Python
list and iterates over it (vals =
ids.detach().cpu().to(dtype=torch.int64).tolist(); raw = bytes(...)), which is
slow; change it to a zero-copy path by converting the tensor to uint8 on CPU and
using numpy().tobytes() (e.g. replace the to(dtype=torch.int64).tolist() +
bytes(...) logic with
ids.detach().cpu().to(dtype=torch.uint8).numpy().tobytes()), ensuring you
preserve the intended byte masking and avoid intermediate Python-level
iteration.
- Line 73: The tensor created from the byte buffer is using dtype=torch.long and
should use dtype=torch.uint8 to store bytes efficiently: change the return in
the function that does "return torch.tensor(list(buf), dtype=torch.long)" to use
torch.uint8; also update the corresponding decoder function (the decoder at the
review note) and the encode_padded implementation (encode_padded) to produce and
consume torch.uint8 tensors, and add explicit casting to larger integer types
only where arithmetic expects int64 so behavior doesn’t change.
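For the uint8/zero-copy points above, a small round-trip sketch; the helper names are illustrative and do not mirror the codec's real methods.

```python
import torch

def bytes_to_tensor(buf: bytes) -> torch.Tensor:
    # Store raw serialized bytes as uint8 instead of int64 (8x smaller per element).
    return torch.frombuffer(bytearray(buf), dtype=torch.uint8).clone()

def tensor_to_bytes(ids: torch.Tensor) -> bytes:
    # Avoid tolist() + bytes(...) over a Python list; go through numpy instead.
    return ids.detach().cpu().to(dtype=torch.uint8).numpy().tobytes()

payload = b"capnp-framed-message"
assert tensor_to_bytes(bytes_to_tensor(payload)) == payload
```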
In @examples/lightning_vastai_demo.yml:
- Around line 4-6: Replace the literal placeholder value for VAST_AI_API_KEY
under the defaults -> compute section with an environment-variable reference or
clear comment: use a substitution token like ${VAST_AI_API_KEY} (if the config
loader supports it) or add a comment next to VAST_AI_API_KEY stating this is an
example placeholder that must be replaced with a real key via environment
variables; update the defaults/compute entry for VAST_AI_API_KEY accordingly so
static scanners won't flag a hardcoded secret.
In @experiment/runner.py:
- Line 143: The logged message at the logger.info call inside
experiment/runner.py is using an unnecessary f-string; remove the f prefix so
the call uses a plain string literal (change logger.info(f"Instance ready.
Remote run not fully implemented, continuing locally for demonstration.") to
logger.info("Instance ready. Remote run not fully implemented, continuing
locally for demonstration.")) to avoid misleading usage of f-strings where no
interpolation occurs.
- Around line 131-136: Accessing self.manifest.defaults.compute can raise
AttributeError when defaults or compute are missing; before constructing
VastAIClient use a defensive lookup (e.g., check hasattr(self.manifest,
"defaults") and hasattr(self.manifest.defaults, "compute") or use
getattr(self.manifest, "defaults", None) and getattr(..., "compute", None)) and
pass a None-safe api_key to VastAIClient
(getattr(self.manifest.defaults.compute, "vast_ai_api_key", None)) or handle the
missing defaults.compute case by raising a clear RuntimeError; update the block
that builds VastAIClient and reads the api_key so it won’t assume
self.manifest.defaults.compute exists.
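One defensive shape for the runner change above; the manifest attributes and VastAIClient signature come from the comment, while the import path and helper name are assumptions.

```python
from typing import Any

from caramba.orchestrator.compute.vast_ai import VastAIClient

def build_vast_client(manifest: Any) -> VastAIClient:
    compute = getattr(getattr(manifest, "defaults", None), "compute", None)
    if compute is None:
        raise RuntimeError(
            "compute.type is 'vast_ai' but manifest.defaults.compute is not configured"
        )
    # api_key stays None-safe; VastAIClient decides how to handle a missing key.
    return VastAIClient(api_key=getattr(compute, "vast_ai_api_key", None))
```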
In @infer/event_runtime.py:
- Around line 54-56: The delimiter=0 field and comment are wrong for Cap'n Proto
framing and risk collisions because Cap'n Proto payloads can contain 0 bytes;
change the framing: either replace delimiter default with a safe unused byte
(e.g., delimiter: int = 255) and update encode_with_delimiter() and the decoder
logic in EventEncoder/EventDecoder to use that byte, or implement proper
length-prefixed framing (write a 4-byte length before each encoded message and
update decode logic to read length then that many bytes) so the decoder can
unambiguously parse messages; also remove or correct the misleading comment
about Cap'n Proto using 0-byte delimiters.
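A sketch of the length-prefixed framing option, which avoids picking any in-band delimiter byte at all; frame/deframe here are placeholders, not the runtime's EventEncoder/EventDecoder API.

```python
import struct

def frame(message: bytes) -> bytes:
    # 4-byte big-endian length prefix, then the raw Cap'n Proto bytes.
    return struct.pack(">I", len(message)) + message

def deframe(stream: bytes) -> list[bytes]:
    messages, offset = [], 0
    while offset + 4 <= len(stream):
        (length,) = struct.unpack_from(">I", stream, offset)
        offset += 4
        messages.append(stream[offset : offset + length])
        offset += length
    return messages

chunks = frame(b"\x00first\x00") + frame(b"second")  # payloads may contain 0x00 safely
assert deframe(chunks) == [b"\x00first\x00", b"second"]
```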
In @layer/attention/standard/layer.py:
- Around line 100-101: The file references torch (e.g., torch.float32) in the
condition near the SDPA fallback but never imports the torch module; add an
import for the torch package at the top of the file (alongside existing imports
like "from torch import Tensor, nn" and "import torch.nn.functional as F") so
symbols like torch.float32 and torch.device are defined and the NameError is
resolved.
In @layer/memory_block/memory/memory.py:
- Around line 510-514: The method collect_health_telemetry currently performs a
local import of MemoryHealthTelemetry, ResonantSettlingMetrics and
VsaNoveltyMetrics on every call, causing unnecessary overhead; move the import
statement for MemoryHealthTelemetry, ResonantSettlingMetrics, VsaNoveltyMetrics
to the module top-level with the other imports and remove the inline import
inside collect_health_telemetry so the function simply references those classes
directly.
- Around line 524-531: ResonantSettlingMetrics is being populated with
resonant_convergence_steps from routing but ResonantRouter.route() never sets
"resonant_convergence_steps", so it always defaults to 0; either have
ResonantRouter.route() compute and emit convergence steps into its aux/routing
dict (e.g., track when energy history converges and set
"resonant_convergence_steps") or remove the field usage here and stop reading
"resonant_convergence_steps" in memory.py; locate ResonantRouter.route and the
place where routing/aux is returned to add the convergence counter, or adjust
the construction of ResonantSettlingMetrics to omit/respectably default that
metric.
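If the first option is taken, a toy sketch of what emitting the counter could look like inside the settling loop; the real ResonantRouter dynamics are more involved, and the update rule, tolerance, and dict key here are assumptions.

```python
import torch

def settle_with_convergence(x: torch.Tensor, steps: int, tol: float = 1e-4) -> dict:
    """Toy settling loop that records the step at which the state stops moving."""
    aux = {"resonant_convergence_steps": steps}
    prev = x
    for s in range(1, steps + 1):
        x = torch.tanh(prev * 0.9)               # stand-in for the real dynamics update
        drift = (x - prev).pow(2).mean().item()  # convergence signal, not raw energy
        if drift < tol:
            aux["resonant_convergence_steps"] = s
            break
        prev = x
    return aux
```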
In @layer/memory_block/memory/routing.py:
- Around line 264-265: The variable energy_history is being collected in the
settling loop but never returned; update the code so the auxiliary output
includes it: after the settling loop add energy_history to the aux dict (e.g.,
aux['energy_history'] = energy_history) or, if you decide not to expose it,
remove the energy_history list and any pushes to it to avoid wasted work; locate
the energy_history variable and the aux dict in the settling function in
routing.py and either attach the list to aux before returning or delete all
energy_history usage.
- Around line 228-232: After applying tuner adjustments in the block that
multiplies coupling/damping and adds resonant_steps_delta to steps (the code
using self.tuner and getattr(self.tuner, "resonant_steps_delta", 0)), clamp or
validate the resulting steps to be at least 1 (or another safe minimum) before
the settling loop runs; e.g., compute steps = max(1, steps + getattr(self.tuner,
"resonant_steps_delta", 0)) and optionally emit a warning via the existing
logger if the clamp was applied so callers know the tuner produced a
non-positive delta.
- Around line 277-284: The energy metric is computed after you normalize x and y
(so x**2 + y**2 ≈ 1), yielding constant values; move the energy computation to
before the normalization (compute energy = (x**2 + y**2).mean().item() prior to
computing mag and dividing x,y) or replace it with a convergence metric (e.g.,
mean squared change between iterations: ((x - x_prev)**2 + (y -
y_prev)**2).mean().item()) and append that to energy_history within the existing
collect_aux and s_idx % 5 check in the same block where x and y are updated.
- Around line 211-213: The _check_nan method currently prints a message via
print, which should be replaced with structured logging: add or use a
module/class logger (e.g., logger = logging.getLogger(__name__) or a class
attribute on ResonantRouter) and replace the print call with logger.error (or
logger.warning) including the name and useful tensor diagnostics (shape, dtype,
mean/std or a sample) so NaN/Inf events are recorded with severity and visible
in production logs; keep the function behavior otherwise (do not swallow
exceptions) and ensure the logging import is added if missing.
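Sketches for two of the routing.py points above (structured logging in the NaN check, and clamping tuner-adjusted steps); these paraphrase the intent rather than copying ResonantRouter's actual code.

```python
import logging

import torch

logger = logging.getLogger(__name__)

def check_nan(name: str, t: torch.Tensor) -> None:
    # Structured logging with diagnostics instead of print(); behavior otherwise unchanged.
    if torch.isnan(t).any() or torch.isinf(t).any():
        logger.error(
            "NaN/Inf detected in %s (shape=%s, dtype=%s, mean=%s)",
            name, tuple(t.shape), t.dtype, t.float().mean().item(),
        )

def adjusted_steps(base_steps: int, tuner: object) -> int:
    # Never let a negative tuner delta drive the settling loop to zero iterations.
    steps = base_steps + getattr(tuner, "resonant_steps_delta", 0)
    if steps < 1:
        logger.warning("resonant steps clamped to 1 (tuner delta too negative)")
        steps = 1
    return steps
```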
In @layer/memory_block/memory/telemetry.py:
- Around line 55-86: The to_dict method currently omits nested metrics
(resonant.energy_drop, resonant.state_drift, vsa.tag_collision_rate,
rmf.delta_rms); add an optional parameter to_dict(self, verbose: bool = False)
and when verbose is True include these extra fields under the same flattened
keys (e.g. "memory/resonant/energy_drop", "memory/resonant/state_drift",
"memory/vsa/tag_collision_rate", "memory/rmf/delta_rms"); keep the existing
compact output when verbose is False, update the docstring to document the new
parameter, and ensure you access the same attributes on
self.resonant/self.vsa/self.rmf only after the corresponding truthy checks
already present.
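A sketch of the verbose flag on to_dict, trimmed to the resonant sub-metrics for brevity (the real class also carries vsa and rmf); attribute names come from the keys listed above, the rest is illustrative.

```python
from dataclasses import dataclass

@dataclass
class ResonantSettlingMetrics:
    energy: float = 0.0
    energy_drop: float = 0.0
    state_drift: float = 0.0

@dataclass
class MemoryHealthTelemetry:
    utilization: float = 0.0
    resonant: ResonantSettlingMetrics | None = None

    def to_dict(self, verbose: bool = False) -> dict[str, float]:
        """Compact by default; verbose=True adds the nested extras."""
        out = {"memory/utilization": self.utilization}
        if self.resonant:
            out["memory/resonant/energy"] = self.resonant.energy
            if verbose:
                out["memory/resonant/energy_drop"] = self.resonant.energy_drop
                out["memory/resonant/state_drift"] = self.resonant.state_drift
        return out
```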
In @layer/memory_block/memory/tuner_test.py:
- Around line 10-61: Add a complementary unit test that verifies high
utilization (oversaturation) causes the tuner to increase thresholds rather than
decrease: inside TestUniversalMemoryTuner add a method (e.g.,
test_memory_oversaturation_increases_thresholds) that constructs a
UniversalMemoryTuner in "adaptive" mode with a known target_utilization (0.5)
and max_delta_per_step (e.g., 0.01), supply a MemoryHealthTelemetry with
utilization > target (e.g., 0.9), call tuner.update(tel) and assert that
target_write_threshold and write_threshold_mult have increased
(write_threshold_mult == 1.0 + max_delta_per_step after first update when
appropriate) and that repeated updates move the multiplier further up but still
respect gradual increments; reference UniversalMemoryTuner,
MemoryHealthTelemetry, target_write_threshold, write_threshold_mult, and
max_delta_per_step to locate the relevant logic.
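A hedged sketch of the complementary test, using only the names the comment mentions; the constructor arguments and import paths are guesses at the interface.

```python
from caramba.layer.memory_block.memory.telemetry import MemoryHealthTelemetry
from caramba.layer.memory_block.memory.tuner import UniversalMemoryTuner

def test_memory_oversaturation_increases_thresholds() -> None:
    tuner = UniversalMemoryTuner(mode="adaptive", target_utilization=0.5, max_delta_per_step=0.01)
    tel = MemoryHealthTelemetry(utilization=0.9)  # well above target -> back off writes

    tuner.update(tel)
    first = tuner.write_threshold_mult
    assert first > 1.0
    assert first <= 1.0 + tuner.max_delta_per_step  # increments stay gradual

    tuner.update(tel)
    assert tuner.write_threshold_mult >= first  # keeps nudging upward
```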
In @layer/memory_block/memory/tuner.py:
- Line 170: The setattr call in MemoryTuner (look for setattr(self, actual_name,
int(round(final_val)))) uses an unnecessary outer int() because round(final_val)
already returns an int in Python 3; remove the redundant int() and set the
attribute with round(final_val) directly to improve clarity.
In @layer/memory_block/memory/writer.py:
- Line 51: The field tuner is typed as Any which loses type safety; define a
Protocol (e.g., MemoryTuner) that declares the expected attributes
write_threshold_mult and vsa_novelty_mult and update the tuner annotation to use
this Protocol instead of Any so the writer code (references to tuner,
write_threshold_mult, vsa_novelty_mult) gets static type checks and clearer
documentation.
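One way to type the writer's tuner field with a structural Protocol, as suggested above; only the two attributes the writer reads are declared, and the class names are flexible.

```python
from typing import Protocol

class MemoryTunerLike(Protocol):
    """Anything exposing the two multipliers the writer consumes."""
    write_threshold_mult: float
    vsa_novelty_mult: float

class MemoryWriter:  # illustrative stand-in for the real writer class
    def __init__(self, tuner: MemoryTunerLike | None = None) -> None:
        self.tuner = tuner

    def effective_threshold(self, base: float) -> float:
        mult = self.tuner.write_threshold_mult if self.tuner else 1.0
        return base * mult
```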
In @optimizer/metal/adamw.py:
- Around line 37-38: The Metal AdamW wrapper enforces p.dtype == grad.dtype, but
adamw_master.py's MPS fused-path only checks g.dtype == torch.float16 causing a
mismatch for fp32 params; update the fused-path condition in adamw_master.py
(the block that currently tests g.dtype == torch.float16) to require g.dtype ==
p.dtype (or equivalently allow both torch.float16 and torch.float32 by checking
g.dtype in (torch.float16, torch.float32) and g.dtype == p.dtype) so the fused
path is only taken when gradient and parameter dtypes match the wrapper's
expectation.
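A sketch of the corrected gate for the fused path (condition only; the surrounding optimizer step is omitted and the helper name is hypothetical).

```python
import torch

def can_use_fused_metal_path(p: torch.Tensor, g: torch.Tensor) -> bool:
    # Match the Metal kernel's contract: fp16 or fp32, and grad dtype equal to param dtype.
    return (
        p.device.type == "mps"
        and p.dtype in (torch.float16, torch.float32)
        and g.dtype == p.dtype
    )
```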
In @optimizer/metal/layernorm.metal:
- Around line 169-170: The threadgroup arrays ts[8] and tq[8] in layernorm.metal
are hardcoded to NSIMD==8 and will break if TG/SIMD constants change; update
layernorm_impl usage by either documenting the dependency with a clear inline
comment referencing NSIMD/TG/SIMD and layernorm_impl<...>, or add a compile-time
check (static_assert-like macro) that validates the array size equals NSIMD (and
mirror the same change for the other occurrence at the region around lines
250-251); ensure the assertion/comment references the symbols ts, tq, NSIMD,
SIMD/TG, and layernorm_impl so future maintainers see the coupling.
- Around line 322-433: The two backward kernels layernorm_bwd_x_impl and
layernorm_bwd_x_noweight_impl are nearly identical; replace them with a single
templated kernel (e.g., template<typename T, bool HAS_WEIGHT>) that takes the
same arguments but conditionally applies weight only when HAS_WEIGHT is true
(use weight parameter when HAS_WEIGHT, otherwise ignore it), keep all reductions
and shared memory logic identical, remove layernorm_bwd_x_noweight_impl, and
update any callers to instantiate the unified function with HAS_WEIGHT=true or
false as appropriate.
In @optimizer/metal/layernorm.py:
- Around line 129-130: The function name layernorm_fp16 is misleading because
the implementation supports both fp16 and fp32; rename the function to
layernorm_metal (or add a compatibility alias layernorm_fp16 = layernorm_metal)
and update any references/exports/tests to use the new name; make sure the
docstring and any public API lists (e.g., __all__ or module exports) reflect the
new name to avoid breaking callers.
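A minimal sketch of the rename-with-compatibility-alias approach; the signature is assumed and the body is elided since only the naming changes.

```python
def layernorm_metal(x, weight, bias=None, eps: float = 1e-5):
    """LayerNorm on Metal/MPS; supports fp16 and fp32 inputs."""
    ...  # existing implementation, unchanged

# Backwards-compatible alias so existing callers keep working.
layernorm_fp16 = layernorm_metal

__all__ = ["layernorm_metal", "layernorm_fp16"]
```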
In @optimizer/metal/ops.mm:
- Around line 1706-1707: The C++ check in optimizer/metal/ops.mm enforces p and
grad have matching FP16/FP32 dtypes, but adamw_master.py currently hardcodes
g.dtype == torch.float16; update adamw_master.py to compare grad.dtype against
p.dtype (or assert grad.dtype == p.dtype) instead of hardcoding float16, and
adjust any related casting/initialization logic in the adamw_master function so
it works for both torch.float16 and torch.float32 and matches the C++
adamw_master_step constraint.
In @orchestrator/compute/vast_ai.py:
- Around line 47-57: The wait_for_ssh function currently ignores the timeout and
contains a mock sleep; add a TODO at the top of wait_for_ssh noting that the
timeout parameter must be enforced when wiring to the real Vast.ai API, and
replace the mock with polling logic that queries
vastai.get_instance(instance_id) until the instance status is 'running' and an
IP is available, aborting (return None or raise) when the timeout elapses;
reference the wait_for_ssh function and the timeout parameter so future
implementers know to use them and return a proper SSH string only when ready.
- Around line 67-75: run_lifecycle currently provisions an instance
(provision_instance) then waits for SSH (wait_for_ssh) but never decommissions
it; change run_lifecycle to ensure decommission_instance is called on the
provisioned instance when finished or on error by implementing a
context-manager/RAII pattern (e.g., return a context manager object or generator
that yields the connection info and calls decommission_instance(instance_id) in
its __exit__/finally block) and ensure all code paths (success and exception)
invoke decommission_instance for the instance_id returned by provision_instance.
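A sketch combining both suggestions: wait_for_ssh polls with a real deadline, and the lifecycle becomes a context manager that always decommissions. The client methods and the instance-info keys (get_instance, ssh_host, ssh_port) are assumptions about the eventual Vast.ai wiring.

```python
import time
from collections.abc import Iterator
from contextlib import contextmanager

def wait_for_ssh(client, instance_id: str, timeout: float = 600.0, poll: float = 10.0) -> str:
    """Poll until the instance is running and reachable, honoring the timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        info = client.get_instance(instance_id)  # assumed API call
        if info.get("status") == "running" and info.get("ssh_host"):
            return f"ssh root@{info['ssh_host']} -p {info['ssh_port']}"
        time.sleep(poll)
    raise TimeoutError(f"instance {instance_id} not reachable within {timeout:.0f}s")

@contextmanager
def provisioned_instance(client, config) -> Iterator[str]:
    """Provision, yield an SSH string, and always decommission (success or error)."""
    instance_id = client.provision_instance(config)
    try:
        yield wait_for_ssh(client, instance_id)
    finally:
        client.decommission_instance(instance_id)
```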
In @pyproject.toml:
- Around line 70-71: Update the dependency entry for "asyncpg" in pyproject.toml
to pin it to a specific stable version (e.g., change the plain "asyncpg" entry
to a fixed version such as "asyncpg==0.31.0" or a bounded spec like
"asyncpg>=0.31.0,<0.32.0") so builds are reproducible; locate the "asyncpg" line
in the dependency list and replace it with the chosen version constraint.
In @research/ccp/notes copy.md:
- Line 25: Remove the trailing whitespace characters on the two affected
Markdown lines: delete the extra space after "modify:" (the text "Files to
modify:") and after "# Gate operations based on opcode" to satisfy MD009; you
can enable trim-trailing-whitespace in your editor or run a markdown
linter/formatter to strip trailing spaces automatically and re-run the linter to
confirm the fixes.
- Line 5: Replace the bolded section title "**What exists:**" with a proper
Markdown heading (e.g., "## What exists") to satisfy MD036; also apply the same
treatment to similar titles such as changing "**What's missing:**" to "## What's
Missing" while leaving comments/sections like "Implementation skeleton"
unchanged.
- Line 1: Add a top-level Markdown heading as the first line by turning the
existing first-line text into a level-1 heading: prepend "# " to "Complete
MOSAIC Differentiable VM Architecture" so the document begins with "# Complete
MOSAIC Differentiable VM Architecture" (ensure there is a single blank line
after the heading if needed).
- Around line 20-25: The ordered list under the "Phase 1: Wire Opcodes to
Control Behavior" heading has inconsistent numbering (lines currently show 2–7)
and triggers MD029; fix it by either converting the items to an unordered list
(using "-" or "*") or renumbering the list to start at 1 and be sequential;
update the list items referenced in that section so they are consistently
formatted and valid Markdown (e.g., restart numbering at 1 or replace the
numeric prefixes with "-" for each item).
- Around line 1-10: This file research/ccp/notes copy.md is unrelated to the
"Feature/fp32 support for metal" PR and should be either removed or explicitly
linked to the PR as design/reference material; either delete the file from the
branch, move it into a dedicated documentation/design PR, or add a clear note in
the commit message and at the top of the file stating it is reference material
for this PR (and rename to remove "copy" from the filename) so reviewers
understand its intent.
In @runtime/engine/lightning_engine.py:
- Around line 103-109: Replace the invalid devices=0 logic in the L.Trainer call
so CPU runs use a positive device count; specifically update the devices
argument in the L.Trainer(...) call (where trainer is created) to use a valid
value such as devices=1 when train_cfg.device == "cpu" and otherwise "auto"
(e.g., devices=1 if train_cfg.device == "cpu" else "auto"), leaving accelerator
and precision logic (train_cfg.use_amp) unchanged.
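The device-count fix as a sketch, keeping the accelerator/precision choices described above; train_cfg fields beyond device and use_amp are assumptions.

```python
import lightning as L

def build_trainer(train_cfg) -> L.Trainer:
    on_cpu = train_cfg.device == "cpu"
    return L.Trainer(
        accelerator="cpu" if on_cpu else "auto",
        devices=1 if on_cpu else "auto",  # devices=0 is invalid; CPU needs a positive count
        precision="16-mixed" if train_cfg.use_amp else "32-true",
        max_epochs=train_cfg.max_epochs,
    )
```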
📜 Review details
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
⛔ Files ignored due to path filters (4)
- artifacts/mosaic/paper.pdf is excluded by !**/*.pdf
- research/ccp/Cognitive Control Plane.pdf is excluded by !**/*.pdf
- research/ccp/Cognitive_Control_Plane_MOSAIC_Runtime.pdf is excluded by !**/*.pdf
- uv.lock is excluded by !**/*.lock
📒 Files selected for processing (61)
- .vscode/c_cpp_properties.json
- artifacts/checkpoints/.gitattributes
- artifacts/checkpoints/LICENSE.txt
- artifacts/checkpoints/README.md
- artifacts/checkpoints/USE_POLICY.md
- artifacts/checkpoints/config.json
- artifacts/checkpoints/generation_config.json
- artifacts/checkpoints/original/params.json
- artifacts/checkpoints/original/tokenizer.model
- artifacts/checkpoints/special_tokens_map.json
- artifacts/checkpoints/tokenizer.json
- artifacts/checkpoints/tokenizer_config.json
- config/compute.py
- config/compute_test.py
- config/defaults.py
- config/layer.py
- config/presets/mosaic_resonant.yml
- config/target.py
- core/__init__.py
- core/event_codec/__init__.py
- core/event_codec/binary_codec.py
- core/event_codec/binary_codec_test.py
- core/event_codec/capnp_codec.py
- core/event_codec/capnp_codec_test.py
- core/event_codec/event.capnp
- core/event_codec/json_codec.py
- core/event_codec/stream_parser.py
- examples/lightning_vastai_demo.yml
- experiment/runner.py
- infer/event_runtime.py
- instrumentation/viz.py
- layer/attention/standard/layer.py
- layer/memory_block/memory/memory.py
- layer/memory_block/memory/memory_autotune_test.py
- layer/memory_block/memory/reader.py
- layer/memory_block/memory/routing.py
- layer/memory_block/memory/telemetry.py
- layer/memory_block/memory/tuner.py
- layer/memory_block/memory/tuner_test.py
- layer/memory_block/memory/vsa.py
- layer/memory_block/memory/writer.py
- optimizer/adamw_master.py
- optimizer/kernels.py
- optimizer/metal/adamw.metal
- optimizer/metal/adamw.py
- optimizer/metal/layernorm.metal
- optimizer/metal/layernorm.py
- optimizer/metal/lion.metal
- optimizer/metal/lion.py
- optimizer/metal/ops.mm
- optimizer/metal/rmsnorm.metal
- optimizer/metal/rmsnorm.py
- optimizer/metal/rope.metal
- optimizer/metal/rope.py
- orchestrator/compute/vast_ai.py
- pyproject.toml
- research/ccp/notes copy.md
- research/ccp/paper.tex
- research/ccp/references.bib
- runtime/engine/lightning_engine.py
- trainer/standard.py
💤 Files with no reviewable changes (4)
- core/event_codec/stream_parser.py
- core/event_codec/binary_codec.py
- core/event_codec/binary_codec_test.py
- core/event_codec/json_codec.py
🧰 Additional context used
🧬 Code graph analysis (13)
config/compute_test.py (2)
  config/compute.py (1)
    LocalComputeConfig (15-20)
  config/target.py (1)
    ExperimentTargetConfig (21-77)
config/target.py (1)
  config/compute.py (1)
    LocalComputeConfig (15-20)
orchestrator/compute/vast_ai.py (2)
  config/compute.py (1)
    VastAIComputeConfig (23-45)
  console/logger.py (4)
    warning (77-79)
    info (69-71)
    success (73-75)
    error (81-83)
optimizer/metal/layernorm.py (4)
  trainer/distributed.py (1)
    device (206-208)
  optimizer/metal/rmsnorm.py (1)
    save_for_backward (20-20)
  optimizer/metal/rope.py (1)
    save_for_backward (20-20)
  optimizer/metal/ssm.py (1)
    save_for_backward (19-19)
optimizer/metal/rmsnorm.py (3)
  model/generic_system.py (1)
    to (67-69)
  topology/graph.py (1)
    to (157-159)
  trainer/distributed.py (1)
    device (206-208)
layer/attention/standard/layer.py (1)
  trainer/distributed.py (1)
    device (206-208)
layer/memory_block/memory/tuner.py (1)
  layer/memory_block/memory/telemetry.py (1)
    MemoryHealthTelemetry (41-86)
trainer/standard.py (1)
  trainer/distributed.py (1)
    device (206-208)
core/event_codec/__init__.py (1)
  core/event_codec/capnp_codec.py (2)
    CapnpEventDecoder (102-164)
    CapnpEventEncoder (41-99)
layer/memory_block/memory/writer.py (1)
  orchestrator/telemetry.py (1)
    threshold (174-176)
core/event_codec/capnp_codec.py (1)
  core/event.py (1)
    EventEnvelope (18-142)
layer/memory_block/memory/memory.py (2)
  layer/memory_block/memory/routing.py (4)
    ResonantRouter (179-319)
    route (53-67)
    route (117-130)
    route (215-319)
  layer/memory_block/memory/telemetry.py (4)
    to_dict (55-86)
    MemoryHealthTelemetry (41-86)
    ResonantSettlingMetrics (13-19)
    VsaNoveltyMetrics (23-28)
layer/memory_block/memory/telemetry.py (1)
  layer/memory_block/memory/tuner.py (1)
    update (72-106)
🪛 Checkov (3.2.334)
artifacts/checkpoints/tokenizer_config.json
[low] 2060-2061: Base64 High Entropy String
(CKV_SECRET_6)
examples/lightning_vastai_demo.yml
[low] 6-7: Base64 High Entropy String
(CKV_SECRET_6)
🪛 LanguageTool
artifacts/checkpoints/USE_POLICY.md
[grammar] ~41-~41: Use a hyphen to join words.
Context: ...of your AI system 5. Interact with third party tools, models, or software designe...
(QB_NEW_EN_HYPHEN)
[style] ~42-~42: ‘With respect to’ might be wordy. Consider a shorter alternative.
Context: ... are associated with Meta or Llama 3.2 With respect to any multimodal models included in Llama...
(EN_WORDINESS_PREMIUM_WITH_RESPECT_TO)
artifacts/checkpoints/LICENSE.txt
[style] ~26-~26: This phrase is redundant. Consider using “outside”.
Context: ...eta Platforms, Inc. (if you are located outside of the EEA or Switzerland). By clicking...
(OUTSIDE_OF)
[grammar] ~46-~46: Use a hyphen to join words.
Context: ...e Llama Materials to create, train, fine tune, or otherwise improve an AI model, ...
(QB_NEW_EN_HYPHEN)
[grammar] ~51-~51: Use a hyphen to join words.
Context: ... a Licensee as part of an integrated end user product, then Section 2 of this Agr...
(QB_NEW_EN_HYPHEN)
[style] ~64-~64: This phrasing could be wordy, so try replacing it with something more concise.
Context: ...or Licensee, or Licensee’s affiliates, is greater than 700 million monthly active users in the...
(MORE_THAN_EXCEEDS)
[style] ~79-~79: Consider simply using “of” instead.
Context: ...ES HAVE BEEN ADVISED OF THE POSSIBILITY OF ANY OF THE FOREGOING. 5. Intellectual Propert...
(OF_ANY_OF)
[style] ~91-~91: ‘with respect to’ might be wordy. Consider a shorter alternative.
Context: ...ls and derivatives made by or for Meta, with respect to any derivative works and modificati...
(EN_WORDINESS_PREMIUM_WITH_RESPECT_TO)
[style] ~97-~97: Consider simply using “of” instead.
Context: ... outputs or results, or any portion of any of the foregoing, constitutes infringement...
(OF_ANY_OF)
artifacts/checkpoints/README.md
[grammar] ~224-~224: Use a hyphen to join words.
Context: ...t). The Llama 3.2 instruction-tuned text only models are optimized for multilingu...
(QB_NEW_EN_HYPHEN)
[grammar] ~224-~224: Use a hyphen to join words.
Context: ...ey outperform many of the available open source and closed chat models on common ...
(QB_NEW_EN_HYPHEN)
[grammar] ~251-~251: Use a hyphen to join words.
Context: ...h use in multiple languages. Instruction tuned text only models are intended for ...
(QB_NEW_EN_HYPHEN)
[grammar] ~251-~251: Use a hyphen to join words.
Context: ...ltiple languages. Instruction tuned text only models are intended for assistant-l...
(QB_NEW_EN_HYPHEN)
[grammar] ~251-~251: Use a hyphen to join words.
Context: ...e retrieval and summarization, mobile AI powered writing assistants and query and...
(QB_NEW_EN_HYPHEN)
[style] ~251-~251: Consider using a more concise synonym.
Context: ...g. Pretrained models can be adapted for a variety of additional natural language generation ...
(A_VARIETY_OF)
[style] ~251-~251: Consider using a more concise synonym.
Context: ...ly, quantized models can be adapted for a variety of on-device use-cases with limited comput...
(A_VARIETY_OF)
[grammar] ~293-~293: Use a hyphen to join words.
Context: ...custom training libraries, Meta's custom built GPU cluster, and production infras...
(QB_NEW_EN_HYPHEN)
[style] ~297-~297: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...adjusted for power usage efficiency. Training Greenhouse Gas Emissions: Estimated t...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~323-~323: ‘taking into account’ might be wordy. Consider a shorter alternative.
Context: ... framework and Arm CPU backend in mind, taking into account metrics including model quality, prefil...
(EN_WORDINESS_PREMIUM_TAKING_INTO_ACCOUNT)
[grammar] ~331-~331: Use a hyphen to join words.
Context: ... stages, using the same data as the full precision models. To initialize QAT, we ...
(QB_NEW_EN_HYPHEN)
[uncategorized] ~331-~331: Do not mix variants of the same word (‘adapter’ and ‘adaptor’) within a single text.
Context: ...3) (i.e., quantization followed by LoRA adapters), we refer this method as QLoRA. Finall...
(EN_WORD_COHERENCY)
[grammar] ~353-~353: Use a hyphen to join words.
Context: ...0 | em | 96.8 | 1 | 1 | ### Instruction Tuned Models | Capability | | Benchmar...
(QB_NEW_EN_HYPHEN)
[style] ~424-~424: Consider using a more concise synonym.
Context: ...ional technology designed to be used in a variety of use cases. Examples on how Meta’s Llama...
(A_VARIETY_OF)
[grammar] ~456-~456: Use a hyphen to join words.
Context: ...nd appropriate risk mitigations via fine tuning. We leveraged those expert red te...
(QB_NEW_EN_HYPHEN)
[grammar] ~456-~456: Use a hyphen to join words.
Context: ...ed new in-depth sessions using objective based methodologies to assess the model ...
(QB_NEW_EN_HYPHEN)
[grammar] ~456-~456: Use a hyphen to join words.
Context: ...g content while taking account of market specific nuances or experiences. **3. ...
(QB_NEW_EN_HYPHEN)
[grammar] ~463-~463: Use a hyphen to join words.
Context: ...uations. Our Purple Llama tools are open sourced for the community to use and wid...
(QB_NEW_EN_HYPHEN)
[uncategorized] ~463-~463: The official name of this software platform is spelled with a capital “H”.
Context: ...courage community contributions to our [Github repository](https://github.com/meta-lla...
(GITHUB)
🪛 markdownlint-cli2 (0.18.1)
artifacts/checkpoints/USE_POLICY.md
1-1: First line in a file should be a top-level heading
(MD041, first-line-heading, first-line-h1)
5-5: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
20-20: Ordered list item prefix
Expected: 1; Actual: 2; Style: 1/1/1
(MD029, ol-prefix)
21-21: Ordered list item prefix
Expected: 1; Actual: 3; Style: 1/1/1
(MD029, ol-prefix)
22-22: Ordered list item prefix
Expected: 1; Actual: 4; Style: 1/1/1
(MD029, ol-prefix)
23-23: Ordered list item prefix
Expected: 1; Actual: 5; Style: 1/1/1
(MD029, ol-prefix)
24-24: Ordered list item prefix
Expected: 1; Actual: 6; Style: 1/1/1
(MD029, ol-prefix)
25-25: Trailing spaces
Expected: 0 or 2; Actual: 1
(MD009, no-trailing-spaces)
25-25: Ordered list item prefix
Expected: 1; Actual: 7; Style: 1/1/1
(MD029, ol-prefix)
39-39: Trailing spaces
Expected: 0 or 2; Actual: 1
(MD009, no-trailing-spaces)
52-52: Bare URL used
(MD034, no-bare-urls)
research/ccp/notes copy.md
1-1: First line in a file should be a top-level heading
(MD041, first-line-heading, first-line-h1)
5-5: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
20-20: Ordered list item prefix
Expected: 1; Actual: 2; Style: 1/1/1
(MD029, ol-prefix)
21-21: Ordered list item prefix
Expected: 1; Actual: 3; Style: 1/1/1
(MD029, ol-prefix)
22-22: Ordered list item prefix
Expected: 1; Actual: 4; Style: 1/1/1
(MD029, ol-prefix)
23-23: Ordered list item prefix
Expected: 1; Actual: 5; Style: 1/1/1
(MD029, ol-prefix)
24-24: Ordered list item prefix
Expected: 1; Actual: 6; Style: 1/1/1
(MD029, ol-prefix)
25-25: Trailing spaces
Expected: 0 or 2; Actual: 1
(MD009, no-trailing-spaces)
25-25: Ordered list item prefix
Expected: 1; Actual: 7; Style: 1/1/1
(MD029, ol-prefix)
39-39: Trailing spaces
Expected: 0 or 2; Actual: 1
(MD009, no-trailing-spaces)
52-52: Bare URL used
(MD034, no-bare-urls)
artifacts/checkpoints/README.md
222-222: First line in a file should be a top-level heading
(MD041, first-line-heading, first-line-h1)
247-247: Link text should be descriptive
(MD059, descriptive-link-text)
287-287: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
311-311: Link text should be descriptive
(MD059, descriptive-link-text)
465-465: Link text should be descriptive
(MD059, descriptive-link-text)
🪛 Ruff (0.14.10)
config/compute.py
48-51: Type alias ComputeConfig uses TypeAlias annotation instead of the type keyword
Use the type keyword
(UP040)
orchestrator/compute/vast_ai.py
37-37: Unused method argument: config
(ARG002)
47-47: Unused method argument: timeout
(ARG002)
optimizer/metal/layernorm.py
41-41: Avoid specifying long messages outside the exception class
(TRY003)
137-137: Avoid specifying long messages outside the exception class
(TRY003)
139-139: Avoid specifying long messages outside the exception class
(TRY003)
layer/memory_block/memory/routing.py
196-196: Dynamically typed expressions (typing.Any) are disallowed in tuner
(ANN401)
218-218: Avoid specifying long messages outside the exception class
(TRY003)
optimizer/metal/adamw.py
38-38: Avoid specifying long messages outside the exception class
(TRY003)
layer/memory_block/memory/vsa.py
73-73: Avoid specifying long messages outside the exception class
(TRY003)
optimizer/metal/rmsnorm.py
40-40: Avoid specifying long messages outside the exception class
(TRY003)
100-100: Avoid specifying long messages outside the exception class
(TRY003)
102-102: Avoid specifying long messages outside the exception class
(TRY003)
optimizer/metal/lion.py
34-34: Avoid specifying long messages outside the exception class
(TRY003)
optimizer/metal/rope.py
41-41: Avoid specifying long messages outside the exception class
(TRY003)
91-91: Avoid specifying long messages outside the exception class
(TRY003)
93-93: Avoid specifying long messages outside the exception class
(TRY003)
layer/memory_block/memory/memory_autotune_test.py
27-27: Use a regular assert instead of unittest-style assertIsNotNone
Replace assertIsNotNone(...) with assert ...
(PT009)
28-28: Use a regular assert instead of unittest-style assertEqual
Replace assertEqual(...) with assert ...
(PT009)
49-49: Use a regular assert instead of unittest-style assertIn
Replace assertIn(...) with assert ...
(PT009)
50-50: Use a regular assert instead of unittest-style assertIn
Replace assertIn(...) with assert ...
(PT009)
51-51: Use a regular assert instead of unittest-style assertIn
Replace assertIn(...) with assert ...
(PT009)
54-54: Use a regular assert instead of unittest-style assertIsInstance
Replace assertIsInstance(...) with assert ...
(PT009)
layer/memory_block/memory/tuner_test.py
15-15: Use a regular assert instead of unittest-style assertEqual
Replace assertEqual(...) with assert ...
(PT009)
16-16: Use a regular assert instead of unittest-style assertEqual
Replace assertEqual(...) with assert ...
(PT009)
25-25: Use a regular assert instead of unittest-style assertLess
Replace assertLess(...) with assert ...
(PT009)
26-26: Use a regular assert instead of unittest-style assertAlmostEqual
(PT009)
32-32: Use a regular assert instead of unittest-style assertLess
Replace assertLess(...) with assert ...
(PT009)
33-33: Use a regular assert instead of unittest-style assertGreater
Replace assertGreater(...) with assert ...
(PT009)
42-42: Use a regular assert instead of unittest-style assertEqual
Replace assertEqual(...) with assert ...
(PT009)
44-44: Use a regular assert instead of unittest-style assertAlmostEqual
(PT009)
55-55: Use a regular assert instead of unittest-style assertEqual
Replace assertEqual(...) with assert ...
(PT009)
60-60: Use a regular assert instead of unittest-style assertGreaterEqual
Replace assertGreaterEqual(...) with assert ...
(PT009)
61-61: Use a regular assert instead of unittest-style assertGreaterEqual
Replace assertGreaterEqual(...) with assert ...
(PT009)
core/event_codec/capnp_codec_test.py
67-67: Loop control variable i not used within loop body
Rename unused i to _i
(B007)
67-67: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
runtime/engine/lightning_engine.py
21-21: Missing return type annotation for special method __init__
Add return type annotation: None
(ANN204)
23-23: Dynamically typed expressions (typing.Any) are disallowed in system
(ANN401)
24-24: Dynamically typed expressions (typing.Any) are disallowed in objective
(ANN401)
25-25: Dynamically typed expressions (typing.Any) are disallowed in train_cfg
(ANN401)
37-37: Unused method argument: batch_idx
(ARG002)
67-67: Unused method argument: manifest
(ARG002)
71-71: Dynamically typed expressions (typing.Any) are disallowed in run_experiment
(ANN401)
layer/attention/standard/layer.py
101-101: Undefined name torch
(F821)
layer/memory_block/memory/tuner.py
170-170: Value being cast to int is already an integer
Remove unnecessary int call
(RUF046)
core/event_codec/capnp_codec.py
20-20: Missing return type annotation for private function _load_schema
(ANN202)
25-25: Avoid specifying long messages outside the exception class
(TRY003)
34-34: Missing return type annotation for private function _get_schema
(ANN202)
49-49: Avoid specifying long messages outside the exception class
(TRY003)
71-71: Avoid specifying long messages outside the exception class
(TRY003)
83-83: Avoid specifying long messages outside the exception class
(TRY003)
86-86: Avoid specifying long messages outside the exception class
(TRY003)
90-90: Avoid specifying long messages outside the exception class
(TRY003)
110-110: Avoid specifying long messages outside the exception class
(TRY003)
112-112: Avoid specifying long messages outside the exception class
(TRY003)
114-114: Avoid specifying long messages outside the exception class
(TRY003)
149-149: Avoid specifying long messages outside the exception class
(TRY003)
151-151: Avoid specifying long messages outside the exception class
(TRY003)
153-155: Avoid specifying long messages outside the exception class
(TRY003)
157-157: Avoid specifying long messages outside the exception class
(TRY003)
infer/event_runtime.py
189-189: Avoid specifying long messages outside the exception class
(TRY003)
210-212: try-except-continue detected, consider logging the exception
(S112)
210-210: Do not catch blind exception: Exception
(BLE001)
237-237: Avoid specifying long messages outside the exception class
(TRY003)
layer/memory_block/memory/memory.py
244-244: Avoid specifying long messages outside the exception class
(TRY003)
289-289: Avoid specifying long messages outside the exception class
(TRY003)
450-450: Avoid specifying long messages outside the exception class
(TRY003)
510-510: Undefined name MemoryHealthTelemetry
(F821)
experiment/runner.py
136-136: Avoid specifying long messages outside the exception class
(TRY003)
143-143: f-string without any placeholders
Remove extraneous f prefix
(F541)
| { | ||
| "configurations": [ | ||
| { | ||
| "name": "Mac", | ||
| "includePath": [ | ||
| "${workspaceFolder}/**", | ||
| "/Users/theapemachine/go/src/github.com/theapemachine/caramba/.venv/lib/python3.12/site-packages/torch/include", | ||
| "/Users/theapemachine/go/src/github.com/theapemachine/caramba/.venv/lib/python3.12/site-packages/torch/include/torch/csrc/api/include", | ||
| "/usr/local/include", | ||
| "/Library/Developer/CommandLineTools/usr/include/c++/v1" | ||
| ], | ||
| "defines": [ | ||
| "TORCH_EXTENSION_NAME=caramba_metal_ops", | ||
| "TORCH_API_INCLUDE_EXTENSION_H" | ||
| ], | ||
| "macFrameworkPath": [ | ||
| "/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/System/Library/Frameworks" | ||
| ], | ||
| "compilerPath": "/usr/bin/clang", | ||
| "cStandard": "c17", | ||
| "cppStandard": "c++17", | ||
| "intelliSenseMode": "macos-clang-arm64" | ||
| } | ||
| ], | ||
| "version": 4 | ||
| } |
Replace hardcoded absolute paths with portable alternatives.
The configuration contains hardcoded absolute paths specific to one developer's machine (lines 7-8), which will break IntelliSense for other contributors. VSCode C/C++ configurations should use workspace-relative paths or environment variables.
🔧 Proposed fix using portable paths
{
"configurations": [
{
"name": "Mac",
"includePath": [
"${workspaceFolder}/**",
- "/Users/theapemachine/go/src/github.com/theapemachine/caramba/.venv/lib/python3.12/site-packages/torch/include",
- "/Users/theapemachine/go/src/github.com/theapemachine/caramba/.venv/lib/python3.12/site-packages/torch/include/torch/csrc/api/include",
+ "${workspaceFolder}/.venv/lib/python3.12/site-packages/torch/include",
+ "${workspaceFolder}/.venv/lib/python3.12/site-packages/torch/include/torch/csrc/api/include",
"/usr/local/include",
"/Library/Developer/CommandLineTools/usr/include/c++/v1"
],
"defines": [
"TORCH_EXTENSION_NAME=caramba_metal_ops",
"TORCH_API_INCLUDE_EXTENSION_H"
],
"macFrameworkPath": [
"/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/System/Library/Frameworks"
],
"compilerPath": "/usr/bin/clang",
"cStandard": "c17",
"cppStandard": "c++17",
"intelliSenseMode": "macos-clang-arm64"
}
],
"version": 4
}

Alternatively, consider adding this file to .gitignore and providing a .vscode/c_cpp_properties.json.example template that developers can copy and customize locally.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| { | |
| "configurations": [ | |
| { | |
| "name": "Mac", | |
| "includePath": [ | |
| "${workspaceFolder}/**", | |
| "/Users/theapemachine/go/src/github.com/theapemachine/caramba/.venv/lib/python3.12/site-packages/torch/include", | |
| "/Users/theapemachine/go/src/github.com/theapemachine/caramba/.venv/lib/python3.12/site-packages/torch/include/torch/csrc/api/include", | |
| "/usr/local/include", | |
| "/Library/Developer/CommandLineTools/usr/include/c++/v1" | |
| ], | |
| "defines": [ | |
| "TORCH_EXTENSION_NAME=caramba_metal_ops", | |
| "TORCH_API_INCLUDE_EXTENSION_H" | |
| ], | |
| "macFrameworkPath": [ | |
| "/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/System/Library/Frameworks" | |
| ], | |
| "compilerPath": "/usr/bin/clang", | |
| "cStandard": "c17", | |
| "cppStandard": "c++17", | |
| "intelliSenseMode": "macos-clang-arm64" | |
| } | |
| ], | |
| "version": 4 | |
| } | |
| { | |
| "configurations": [ | |
| { | |
| "name": "Mac", | |
| "includePath": [ | |
| "${workspaceFolder}/**", | |
| "${workspaceFolder}/.venv/lib/python3.12/site-packages/torch/include", | |
| "${workspaceFolder}/.venv/lib/python3.12/site-packages/torch/include/torch/csrc/api/include", | |
| "/usr/local/include", | |
| "/Library/Developer/CommandLineTools/usr/include/c++/v1" | |
| ], | |
| "defines": [ | |
| "TORCH_EXTENSION_NAME=caramba_metal_ops", | |
| "TORCH_API_INCLUDE_EXTENSION_H" | |
| ], | |
| "macFrameworkPath": [ | |
| "/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/System/Library/Frameworks" | |
| ], | |
| "compilerPath": "/usr/bin/clang", | |
| "cStandard": "c17", | |
| "cppStandard": "c++17", | |
| "intelliSenseMode": "macos-clang-arm64" | |
| } | |
| ], | |
| "version": 4 | |
| } |
🤖 Prompt for AI Agents
In @.vscode/c_cpp_properties.json around lines 1-26: the file currently contains hardcoded absolute include paths (the entries in the "includePath" array pointing to /Users/theapemachine/...), which breaks portability. Update those specific includePath entries to use workspace-relative variables (e.g. ${workspaceFolder}/...) or env vars (e.g. ${env:PYENV} or ${workspaceFolder}/.venv/...) so all contributors resolve headers locally, and either add .vscode/c_cpp_properties.json to .gitignore or add a template file (c_cpp_properties.json.example) for local customization. Target the "includePath" array and the file-level config in .vscode/c_cpp_properties.json when making the change.
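If the template route is taken, a minimal sketch of the workflow (assuming the c_cpp_properties.json.example filename suggested above) could look like this; the committed example should already contain the portable ${workspaceFolder} paths from the proposed fix:

```bash
# Commit a portable example and ignore the machine-specific copy.
cp .vscode/c_cpp_properties.json .vscode/c_cpp_properties.json.example
echo ".vscode/c_cpp_properties.json" >> .gitignore
git rm --cached .vscode/c_cpp_properties.json
git add .gitignore .vscode/c_cpp_properties.json.example
```

New contributors would then copy the example back to .vscode/c_cpp_properties.json and adjust any local paths as needed.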
| { | ||
| "dim": 2048, | ||
| "ffn_dim_multiplier": 1.5, | ||
| "multiple_of": 256, | ||
| "n_heads": 32, | ||
| "n_kv_heads": 8, | ||
| "n_layers": 16, | ||
| "norm_eps": 1e-05, | ||
| "rope_theta": 500000.0, | ||
| "use_scaled_rope": true, | ||
| "vocab_size": 128256 | ||
| } No newline at end of file |
🧩 Analysis chain
🏁 Script executed:
# Check if the params.json file has a trailing newline
od -c artifacts/checkpoints/original/params.json | tail -2
Repository: TheApeMachine/caramba
Length of output: 105
🏁 Script executed:
# Search for use_scaled_rope usage in the codebase
rg "use_scaled_rope" -t json -t py -t js -A 2 -B 2Repository: TheApeMachine/caramba
Length of output: 378
🏁 Script executed:
# Find loaders and converters that might consume params.json
rg "params\.json" -t py -t js -A 3 -B 1Repository: TheApeMachine/caramba
Length of output: 47
🏁 Script executed:
# Search for rope_scaling (HF convention) to understand the relationship
rg "rope_scaling|rope_theta" -t py -t json -A 2 -B 2Repository: TheApeMachine/caramba
Length of output: 3930
🏁 Script executed:
# List files in the checkpoints directory to understand structure
find artifacts/checkpoints -type f -name "*.json" | head -20
Repository: TheApeMachine/caramba
Length of output: 313
🏁 Script executed:
# Check for trailing newline in params.json using cat -A (shows $ at EOF)
cat -A artifacts/checkpoints/original/params.json | tail -3
Repository: TheApeMachine/caramba
Length of output: 118
🏁 Script executed:
# Search for all use_scaled_rope references across entire codebase
rg "use_scaled_rope" -t py -t json -t ts -t jsRepository: TheApeMachine/caramba
Length of output: 134
🏁 Script executed:
# Find where params.json is loaded
rg "params\.json|load.*params" -t py -A 5 -B 2Repository: TheApeMachine/caramba
Length of output: 981
🏁 Script executed:
# Check what the config loader expects/accepts
rg "class.*[Cc]onfig|def.*load.*config" -t py -A 10 -B 1 | head -80Repository: TheApeMachine/caramba
Length of output: 4162
🏁 Script executed:
# Search more carefully for how params.json is loaded - check imports and usage patterns
rg "json\.load|load.*json|open.*params" -t py -A 3 -B 1 | head -100Repository: TheApeMachine/caramba
Length of output: 6264
🏁 Script executed:
# Check if there's a config loader that specifically handles params.json
rg "def load|from_pretrained|artifacts/checkpoints" -t py -A 5 -B 2 | head -80Repository: TheApeMachine/caramba
Length of output: 5526
🏁 Script executed:
# Look for the config layer initialization to understand field mapping
cat -n config/layer.py | head -80
Repository: TheApeMachine/caramba
Length of output: 2926
🏁 Script executed:
# Continue reading config/layer.py to find rope_scaling definition
cat -n config/layer.py | sed -n '190,250p'
Repository: TheApeMachine/caramba
Length of output: 2537
🏁 Script executed:
# Search for where params.json might be loaded in model initialization
rg "artifacts/checkpoints.*original|params\.json" -t py -A 5 -B 2Repository: TheApeMachine/caramba
Length of output: 47
🏁 Script executed:
# Look for any imports or usage of params.json in Python files
rg "params" -t py -A 2 -B 2 | grep -A 2 -B 2 "json\|load" | head -60Repository: TheApeMachine/caramba
Length of output: 1501
🏁 Script executed:
# Check if use_scaled_rope is referenced anywhere in Python code
rg "use_scaled_rope" -t pyRepository: TheApeMachine/caramba
Length of output: 47
Add trailing newline and remove unused use_scaled_rope field.
The file lacks a trailing newline. Additionally, use_scaled_rope is never consumed by any loader or converter in the codebase. The actual RoPE scaling configuration is loaded from HF config.json via the rope_scaling dict (defined in config/layer.py and used throughout layer/rope.py and attention implementations). Remove the orphaned use_scaled_rope field and add a newline at EOF to maintain consistency with standard tooling expectations.
🤖 Prompt for AI Agents
In @artifacts/checkpoints/original/params.json around lines 1-12: remove the unused "use_scaled_rope" key from params.json and add a trailing newline at EOF. The RoPE scaling is already handled via the rope_scaling dict (see config/layer.py) and consumed by layer/rope.py and the attention implementations, so deleting use_scaled_rope avoids confusion and the newline ensures standard EOF formatting.
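One way to apply this cleanup, sketched here with jq (an assumption; any JSON tool works), is:

```bash
# Drop the orphaned key and rewrite the file; jq emits a trailing newline by default.
jq 'del(.use_scaled_rope)' artifacts/checkpoints/original/params.json > /tmp/params.json \
  && mv /tmp/params.json artifacts/checkpoints/original/params.json
```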
| { | ||
| "added_tokens_decoder": { | ||
| "128000": { | ||
| "content": "<|begin_of_text|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128001": { | ||
| "content": "<|end_of_text|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128002": { | ||
| "content": "<|reserved_special_token_0|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128003": { | ||
| "content": "<|reserved_special_token_1|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128004": { | ||
| "content": "<|finetune_right_pad_id|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128005": { | ||
| "content": "<|reserved_special_token_2|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128006": { | ||
| "content": "<|start_header_id|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128007": { | ||
| "content": "<|end_header_id|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128008": { | ||
| "content": "<|eom_id|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128009": { | ||
| "content": "<|eot_id|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128010": { | ||
| "content": "<|python_tag|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128011": { | ||
| "content": "<|reserved_special_token_3|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128012": { | ||
| "content": "<|reserved_special_token_4|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128013": { | ||
| "content": "<|reserved_special_token_5|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128014": { | ||
| "content": "<|reserved_special_token_6|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128015": { | ||
| "content": "<|reserved_special_token_7|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128016": { | ||
| "content": "<|reserved_special_token_8|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128017": { | ||
| "content": "<|reserved_special_token_9|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128018": { | ||
| "content": "<|reserved_special_token_10|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128019": { | ||
| "content": "<|reserved_special_token_11|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128020": { | ||
| "content": "<|reserved_special_token_12|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128021": { | ||
| "content": "<|reserved_special_token_13|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128022": { | ||
| "content": "<|reserved_special_token_14|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128023": { | ||
| "content": "<|reserved_special_token_15|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128024": { | ||
| "content": "<|reserved_special_token_16|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128025": { | ||
| "content": "<|reserved_special_token_17|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128026": { | ||
| "content": "<|reserved_special_token_18|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128027": { | ||
| "content": "<|reserved_special_token_19|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128028": { | ||
| "content": "<|reserved_special_token_20|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128029": { | ||
| "content": "<|reserved_special_token_21|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128030": { | ||
| "content": "<|reserved_special_token_22|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128031": { | ||
| "content": "<|reserved_special_token_23|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128032": { | ||
| "content": "<|reserved_special_token_24|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128033": { | ||
| "content": "<|reserved_special_token_25|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128034": { | ||
| "content": "<|reserved_special_token_26|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128035": { | ||
| "content": "<|reserved_special_token_27|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128036": { | ||
| "content": "<|reserved_special_token_28|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128037": { | ||
| "content": "<|reserved_special_token_29|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128038": { | ||
| "content": "<|reserved_special_token_30|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128039": { | ||
| "content": "<|reserved_special_token_31|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128040": { | ||
| "content": "<|reserved_special_token_32|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128041": { | ||
| "content": "<|reserved_special_token_33|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128042": { | ||
| "content": "<|reserved_special_token_34|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128043": { | ||
| "content": "<|reserved_special_token_35|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128044": { | ||
| "content": "<|reserved_special_token_36|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128045": { | ||
| "content": "<|reserved_special_token_37|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128046": { | ||
| "content": "<|reserved_special_token_38|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128047": { | ||
| "content": "<|reserved_special_token_39|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128048": { | ||
| "content": "<|reserved_special_token_40|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128049": { | ||
| "content": "<|reserved_special_token_41|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128050": { | ||
| "content": "<|reserved_special_token_42|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128051": { | ||
| "content": "<|reserved_special_token_43|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128052": { | ||
| "content": "<|reserved_special_token_44|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128053": { | ||
| "content": "<|reserved_special_token_45|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128054": { | ||
| "content": "<|reserved_special_token_46|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128055": { | ||
| "content": "<|reserved_special_token_47|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128056": { | ||
| "content": "<|reserved_special_token_48|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128057": { | ||
| "content": "<|reserved_special_token_49|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128058": { | ||
| "content": "<|reserved_special_token_50|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128059": { | ||
| "content": "<|reserved_special_token_51|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128060": { | ||
| "content": "<|reserved_special_token_52|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128061": { | ||
| "content": "<|reserved_special_token_53|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128062": { | ||
| "content": "<|reserved_special_token_54|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128063": { | ||
| "content": "<|reserved_special_token_55|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128064": { | ||
| "content": "<|reserved_special_token_56|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128065": { | ||
| "content": "<|reserved_special_token_57|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128066": { | ||
| "content": "<|reserved_special_token_58|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128067": { | ||
| "content": "<|reserved_special_token_59|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128068": { | ||
| "content": "<|reserved_special_token_60|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128069": { | ||
| "content": "<|reserved_special_token_61|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128070": { | ||
| "content": "<|reserved_special_token_62|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128071": { | ||
| "content": "<|reserved_special_token_63|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128072": { | ||
| "content": "<|reserved_special_token_64|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128073": { | ||
| "content": "<|reserved_special_token_65|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128074": { | ||
| "content": "<|reserved_special_token_66|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128075": { | ||
| "content": "<|reserved_special_token_67|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128076": { | ||
| "content": "<|reserved_special_token_68|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128077": { | ||
| "content": "<|reserved_special_token_69|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128078": { | ||
| "content": "<|reserved_special_token_70|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128079": { | ||
| "content": "<|reserved_special_token_71|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128080": { | ||
| "content": "<|reserved_special_token_72|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128081": { | ||
| "content": "<|reserved_special_token_73|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128082": { | ||
| "content": "<|reserved_special_token_74|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128083": { | ||
| "content": "<|reserved_special_token_75|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128084": { | ||
| "content": "<|reserved_special_token_76|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128085": { | ||
| "content": "<|reserved_special_token_77|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128086": { | ||
| "content": "<|reserved_special_token_78|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128087": { | ||
| "content": "<|reserved_special_token_79|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128088": { | ||
| "content": "<|reserved_special_token_80|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128089": { | ||
| "content": "<|reserved_special_token_81|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128090": { | ||
| "content": "<|reserved_special_token_82|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128091": { | ||
| "content": "<|reserved_special_token_83|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128092": { | ||
| "content": "<|reserved_special_token_84|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128093": { | ||
| "content": "<|reserved_special_token_85|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128094": { | ||
| "content": "<|reserved_special_token_86|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128095": { | ||
| "content": "<|reserved_special_token_87|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128096": { | ||
| "content": "<|reserved_special_token_88|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128097": { | ||
| "content": "<|reserved_special_token_89|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128098": { | ||
| "content": "<|reserved_special_token_90|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128099": { | ||
| "content": "<|reserved_special_token_91|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128100": { | ||
| "content": "<|reserved_special_token_92|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128101": { | ||
| "content": "<|reserved_special_token_93|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128102": { | ||
| "content": "<|reserved_special_token_94|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128103": { | ||
| "content": "<|reserved_special_token_95|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128104": { | ||
| "content": "<|reserved_special_token_96|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128105": { | ||
| "content": "<|reserved_special_token_97|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128106": { | ||
| "content": "<|reserved_special_token_98|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128107": { | ||
| "content": "<|reserved_special_token_99|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128108": { | ||
| "content": "<|reserved_special_token_100|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128109": { | ||
| "content": "<|reserved_special_token_101|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128110": { | ||
| "content": "<|reserved_special_token_102|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128111": { | ||
| "content": "<|reserved_special_token_103|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128112": { | ||
| "content": "<|reserved_special_token_104|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128113": { | ||
| "content": "<|reserved_special_token_105|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128114": { | ||
| "content": "<|reserved_special_token_106|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128115": { | ||
| "content": "<|reserved_special_token_107|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128116": { | ||
| "content": "<|reserved_special_token_108|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128117": { | ||
| "content": "<|reserved_special_token_109|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128118": { | ||
| "content": "<|reserved_special_token_110|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128119": { | ||
| "content": "<|reserved_special_token_111|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128120": { | ||
| "content": "<|reserved_special_token_112|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128121": { | ||
| "content": "<|reserved_special_token_113|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128122": { | ||
| "content": "<|reserved_special_token_114|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128123": { | ||
| "content": "<|reserved_special_token_115|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128124": { | ||
| "content": "<|reserved_special_token_116|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128125": { | ||
| "content": "<|reserved_special_token_117|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128126": { | ||
| "content": "<|reserved_special_token_118|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128127": { | ||
| "content": "<|reserved_special_token_119|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128128": { | ||
| "content": "<|reserved_special_token_120|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128129": { | ||
| "content": "<|reserved_special_token_121|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128130": { | ||
| "content": "<|reserved_special_token_122|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128131": { | ||
| "content": "<|reserved_special_token_123|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128132": { | ||
| "content": "<|reserved_special_token_124|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128133": { | ||
| "content": "<|reserved_special_token_125|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128134": { | ||
| "content": "<|reserved_special_token_126|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128135": { | ||
| "content": "<|reserved_special_token_127|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128136": { | ||
| "content": "<|reserved_special_token_128|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128137": { | ||
| "content": "<|reserved_special_token_129|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128138": { | ||
| "content": "<|reserved_special_token_130|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128139": { | ||
| "content": "<|reserved_special_token_131|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128140": { | ||
| "content": "<|reserved_special_token_132|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128141": { | ||
| "content": "<|reserved_special_token_133|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128142": { | ||
| "content": "<|reserved_special_token_134|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128143": { | ||
| "content": "<|reserved_special_token_135|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128144": { | ||
| "content": "<|reserved_special_token_136|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128145": { | ||
| "content": "<|reserved_special_token_137|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128146": { | ||
| "content": "<|reserved_special_token_138|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128147": { | ||
| "content": "<|reserved_special_token_139|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128148": { | ||
| "content": "<|reserved_special_token_140|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128149": { | ||
| "content": "<|reserved_special_token_141|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128150": { | ||
| "content": "<|reserved_special_token_142|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128151": { | ||
| "content": "<|reserved_special_token_143|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128152": { | ||
| "content": "<|reserved_special_token_144|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128153": { | ||
| "content": "<|reserved_special_token_145|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128154": { | ||
| "content": "<|reserved_special_token_146|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128155": { | ||
| "content": "<|reserved_special_token_147|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128156": { | ||
| "content": "<|reserved_special_token_148|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128157": { | ||
| "content": "<|reserved_special_token_149|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128158": { | ||
| "content": "<|reserved_special_token_150|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128159": { | ||
| "content": "<|reserved_special_token_151|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128160": { | ||
| "content": "<|reserved_special_token_152|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128161": { | ||
| "content": "<|reserved_special_token_153|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128162": { | ||
| "content": "<|reserved_special_token_154|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128163": { | ||
| "content": "<|reserved_special_token_155|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128164": { | ||
| "content": "<|reserved_special_token_156|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128165": { | ||
| "content": "<|reserved_special_token_157|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128166": { | ||
| "content": "<|reserved_special_token_158|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128167": { | ||
| "content": "<|reserved_special_token_159|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128168": { | ||
| "content": "<|reserved_special_token_160|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128169": { | ||
| "content": "<|reserved_special_token_161|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128170": { | ||
| "content": "<|reserved_special_token_162|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128171": { | ||
| "content": "<|reserved_special_token_163|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128172": { | ||
| "content": "<|reserved_special_token_164|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128173": { | ||
| "content": "<|reserved_special_token_165|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128174": { | ||
| "content": "<|reserved_special_token_166|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128175": { | ||
| "content": "<|reserved_special_token_167|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128176": { | ||
| "content": "<|reserved_special_token_168|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128177": { | ||
| "content": "<|reserved_special_token_169|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128178": { | ||
| "content": "<|reserved_special_token_170|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128179": { | ||
| "content": "<|reserved_special_token_171|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128180": { | ||
| "content": "<|reserved_special_token_172|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128181": { | ||
| "content": "<|reserved_special_token_173|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128182": { | ||
| "content": "<|reserved_special_token_174|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128183": { | ||
| "content": "<|reserved_special_token_175|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128184": { | ||
| "content": "<|reserved_special_token_176|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128185": { | ||
| "content": "<|reserved_special_token_177|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128186": { | ||
| "content": "<|reserved_special_token_178|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128187": { | ||
| "content": "<|reserved_special_token_179|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128188": { | ||
| "content": "<|reserved_special_token_180|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128189": { | ||
| "content": "<|reserved_special_token_181|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128190": { | ||
| "content": "<|reserved_special_token_182|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128191": { | ||
| "content": "<|reserved_special_token_183|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128192": { | ||
| "content": "<|reserved_special_token_184|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128193": { | ||
| "content": "<|reserved_special_token_185|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128194": { | ||
| "content": "<|reserved_special_token_186|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128195": { | ||
| "content": "<|reserved_special_token_187|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128196": { | ||
| "content": "<|reserved_special_token_188|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128197": { | ||
| "content": "<|reserved_special_token_189|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128198": { | ||
| "content": "<|reserved_special_token_190|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128199": { | ||
| "content": "<|reserved_special_token_191|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128200": { | ||
| "content": "<|reserved_special_token_192|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128201": { | ||
| "content": "<|reserved_special_token_193|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128202": { | ||
| "content": "<|reserved_special_token_194|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128203": { | ||
| "content": "<|reserved_special_token_195|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128204": { | ||
| "content": "<|reserved_special_token_196|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128205": { | ||
| "content": "<|reserved_special_token_197|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128206": { | ||
| "content": "<|reserved_special_token_198|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128207": { | ||
| "content": "<|reserved_special_token_199|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128208": { | ||
| "content": "<|reserved_special_token_200|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128209": { | ||
| "content": "<|reserved_special_token_201|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128210": { | ||
| "content": "<|reserved_special_token_202|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128211": { | ||
| "content": "<|reserved_special_token_203|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128212": { | ||
| "content": "<|reserved_special_token_204|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128213": { | ||
| "content": "<|reserved_special_token_205|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128214": { | ||
| "content": "<|reserved_special_token_206|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128215": { | ||
| "content": "<|reserved_special_token_207|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128216": { | ||
| "content": "<|reserved_special_token_208|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128217": { | ||
| "content": "<|reserved_special_token_209|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128218": { | ||
| "content": "<|reserved_special_token_210|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128219": { | ||
| "content": "<|reserved_special_token_211|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128220": { | ||
| "content": "<|reserved_special_token_212|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128221": { | ||
| "content": "<|reserved_special_token_213|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128222": { | ||
| "content": "<|reserved_special_token_214|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128223": { | ||
| "content": "<|reserved_special_token_215|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128224": { | ||
| "content": "<|reserved_special_token_216|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128225": { | ||
| "content": "<|reserved_special_token_217|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128226": { | ||
| "content": "<|reserved_special_token_218|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128227": { | ||
| "content": "<|reserved_special_token_219|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128228": { | ||
| "content": "<|reserved_special_token_220|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128229": { | ||
| "content": "<|reserved_special_token_221|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128230": { | ||
| "content": "<|reserved_special_token_222|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128231": { | ||
| "content": "<|reserved_special_token_223|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128232": { | ||
| "content": "<|reserved_special_token_224|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128233": { | ||
| "content": "<|reserved_special_token_225|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128234": { | ||
| "content": "<|reserved_special_token_226|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128235": { | ||
| "content": "<|reserved_special_token_227|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128236": { | ||
| "content": "<|reserved_special_token_228|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128237": { | ||
| "content": "<|reserved_special_token_229|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128238": { | ||
| "content": "<|reserved_special_token_230|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128239": { | ||
| "content": "<|reserved_special_token_231|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128240": { | ||
| "content": "<|reserved_special_token_232|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128241": { | ||
| "content": "<|reserved_special_token_233|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128242": { | ||
| "content": "<|reserved_special_token_234|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128243": { | ||
| "content": "<|reserved_special_token_235|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128244": { | ||
| "content": "<|reserved_special_token_236|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128245": { | ||
| "content": "<|reserved_special_token_237|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128246": { | ||
| "content": "<|reserved_special_token_238|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128247": { | ||
| "content": "<|reserved_special_token_239|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128248": { | ||
| "content": "<|reserved_special_token_240|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128249": { | ||
| "content": "<|reserved_special_token_241|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128250": { | ||
| "content": "<|reserved_special_token_242|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128251": { | ||
| "content": "<|reserved_special_token_243|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128252": { | ||
| "content": "<|reserved_special_token_244|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128253": { | ||
| "content": "<|reserved_special_token_245|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128254": { | ||
| "content": "<|reserved_special_token_246|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| }, | ||
| "128255": { | ||
| "content": "<|reserved_special_token_247|>", | ||
| "lstrip": false, | ||
| "normalized": false, | ||
| "rstrip": false, | ||
| "single_word": false, | ||
| "special": true | ||
| } | ||
| }, | ||
| "bos_token": "<|begin_of_text|>", | ||
| "clean_up_tokenization_spaces": true, | ||
| "eos_token": "<|end_of_text|>", | ||
| "model_input_names": [ | ||
| "input_ids", | ||
| "attention_mask" | ||
| ], | ||
| "model_max_length": 131072, | ||
| "tokenizer_class": "PreTrainedTokenizerFast" | ||
| } |
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check the actual file size and line count
echo "=== File Statistics ==="
wc -l artifacts/checkpoints/tokenizer_config.json
du -sh artifacts/checkpoints/tokenizer_config.json
# Check the artifacts directory total size
echo -e "\n=== Artifacts Directory Size ==="
du -sh artifacts/checkpoints/
find artifacts/checkpoints/ -type f -exec du -h {} \; | sort -h
# Extract and validate key fields from the JSON
echo -e "\n=== JSON Validation ==="
python3 << 'EOF'
import json
with open('artifacts/checkpoints/tokenizer_config.json', 'r') as f:
    config = json.load(f)

# Check added_tokens_decoder
added_tokens = config.get('added_tokens_decoder', {})
token_ids = sorted([int(k) for k in added_tokens.keys()])
print(f"Total special tokens: {len(token_ids)}")
print(f"Token ID range: {min(token_ids)} to {max(token_ids)}")
print(f"Contiguous range: {token_ids[0]}-{token_ids[-1]}")

# Check for specific tokens mentioned in review
special_token_names = {}
for token_id, token_info in added_tokens.items():
    content = token_info.get('content', '')
    special_token_names[content] = token_id

print(f"\nNamed special tokens mentioned in review:")
for name in ['<|begin_of_text|>', '<|end_of_text|>', '<|start_header_id|>', '<|end_header_id|>', '<|eom_id|>', '<|eot_id|>']:
    if name in special_token_names:
        print(f"  {name}: {special_token_names[name]}")

# Check other config fields
print(f"\nModel max length: {config.get('model_max_length')}")
print(f"Tokenizer class: {config.get('tokenizer_class')}")
print(f"BOS token: {config.get('bos_token')}")
print(f"EOS token: {config.get('eos_token')}")
EOF
Repository: TheApeMachine/caramba
Length of output: 1170
🏁 Script executed:
#!/bin/bash
# Check end of file and line 2060 specifically
echo "=== Lines 2055-2061 (end of file) ==="
tail -20 artifacts/checkpoints/tokenizer_config.json | head -20
# Count reserved tokens vs named tokens
echo -e "\n=== Token Classification ==="
python3 << 'EOF'
import json
with open('artifacts/checkpoints/tokenizer_config.json', 'r') as f:
    config = json.load(f)

added_tokens = config.get('added_tokens_decoder', {})

# Classify tokens
named_tokens = {}
reserved_tokens = {}
for token_id, token_info in added_tokens.items():
    content = token_info.get('content', '')
    if 'reserved_special_token' in content:
        reserved_tokens[content] = token_id
    else:
        named_tokens[content] = token_id

print(f"Named/special tokens (non-reserved): {len(named_tokens)}")
print(f"Reserved tokens: {len(reserved_tokens)}")
print(f"Total: {len(named_tokens) + len(reserved_tokens)}")
print(f"\nNamed tokens:")
for name in sorted(named_tokens.keys()):
    print(f"  {name}")
EOF
Repository: TheApeMachine/caramba
Length of output: 834
LGTM - Standard tokenizer configuration with extensive special tokens.
The tokenizer configuration is correctly structured for PreTrainedTokenizerFast with:
- 256 special tokens (IDs 128000-128255) properly mapped
- Begin/end of text tokens at 128000/128001
- Named special tokens for chat formatting (start_header_id, end_header_id, eom_id, eot_id, python_tag, finetune_right_pad_id)
- 248 reserved tokens for future extensibility
- Model max length of 131072 (128K context) aligns with Llama 3.2 specifications
Consider repository bloat: The artifacts directory is 11M, primarily driven by tokenizer.json (8.7M) and tokenizer.model (2.1M) rather than the config file itself (52K). For production deployments, consider:
- Using Git LFS for large model artifacts
- Fetching tokenizer configs from HuggingFace Hub at runtime (see the sketch below)
- Documenting why local copies are necessary if they must be included
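As a rough sketch of the runtime-fetch option (not part of this PR), assuming the committed checkpoint mirrors a Hub repo such as meta-llama/Llama-3.2-1B; the repo id and the fallback path are illustrative assumptions, not taken from this diff:
# Hedged sketch: resolve the tokenizer from the HuggingFace Hub at runtime and
# fall back to the lightweight in-repo config if the Hub is unreachable.
# "meta-llama/Llama-3.2-1B" is an assumed repo id, not confirmed by this PR.
from transformers import AutoTokenizer

def load_tokenizer(repo_id: str = "meta-llama/Llama-3.2-1B",
                   local_dir: str = "artifacts/checkpoints"):
    try:
        # Pulls tokenizer.json / tokenizer_config.json into the HF cache on demand,
        # so the large blobs never have to live in git history.
        return AutoTokenizer.from_pretrained(repo_id)
    except OSError:
        # Offline or gated-repo fallback: load whatever small config ships in-repo.
        return AutoTokenizer.from_pretrained(local_dir)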
🧰 Tools
🪛 Checkov (3.2.334)
[low] 2060-2061: Base64 High Entropy String
(CKV_SECRET_6)
🤖 Prompt for AI Agents
In @artifacts/checkpoints/tokenizer_config.json around lines 1 - 2061, Repo
artifacts are bloating the repository because large tokenizer files
(tokenizer.json ~8.7M and tokenizer.model ~2.1M) are committed under
artifacts/checkpoints alongside tokenizer_config.json; remove or offload them to
avoid heavy PRs. Fix by: (1) remove large binary tokenizer files from the commit
and add them to Git LFS (or delete and re-add via git rm --cached) so
tokenizer.json and tokenizer.model are tracked via LFS; (2) update
documentation/README to state that artifacts are fetched at runtime from the
HuggingFace Hub (use AutoTokenizer.from_pretrained or transformers'
from_pretrained for the model repo) and include a fallback local path for
tokenizer_config.json if you must keep a small config; (3) ensure CI/deployment
pulls tokenizer artifacts at build/runtime (reference files tokenizer.json,
tokenizer.model and tokenizer_config.json in artifacts/checkpoints and functions
that load them) so the repo contains only the lightweight config while large
blobs live in LFS or are downloaded on demand.
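A minimal sketch of item (3) above, assuming the CI/deployment environment can reach the HuggingFace Hub; the repo id, pattern list, and destination directory are illustrative assumptions:
# Hedged sketch: pull only the tokenizer artifacts at build/deploy time instead of
# committing them, using huggingface_hub.snapshot_download with a pattern filter.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="meta-llama/Llama-3.2-1B",                          # assumed repo id
    allow_patterns=["tokenizer*", "special_tokens_map.json"],   # only tokenizer files
    local_dir="artifacts/checkpoints",                          # keeps existing load paths intact
)
With that in place, only the lightweight tokenizer_config.json needs to stay in git, which matches the size breakdown in the review comment above.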
| **Llama 3.2** **Acceptable Use Policy** | ||
|
|
||
| Meta is committed to promoting safe and fair use of its tools and features, including Llama 3.2. If you access or use Llama 3.2, you agree to this Acceptable Use Policy (“**Policy**”). The most recent copy of this policy can be found at [https://www.llama.com/llama3_2/use-policy](https://www.llama.com/llama3_2/use-policy). | ||
|
|
||
| **Prohibited Uses** | ||
|
|
||
| We want everyone to use Llama 3.2 safely and responsibly. You agree you will not use, or allow others to use, Llama 3.2 to: | ||
|
|
||
|
|
||
|
|
||
| 1. Violate the law or others’ rights, including to: | ||
| 1. Engage in, promote, generate, contribute to, encourage, plan, incite, or further illegal or unlawful activity or content, such as: | ||
| 1. Violence or terrorism | ||
| 2. Exploitation or harm to children, including the solicitation, creation, acquisition, or dissemination of child exploitative content or failure to report Child Sexual Abuse Material | ||
| 3. Human trafficking, exploitation, and sexual violence | ||
| 4. The illegal distribution of information or materials to minors, including obscene materials, or failure to employ legally required age-gating in connection with such information or materials. | ||
| 5. Sexual solicitation | ||
| 6. Any other criminal activity | ||
| 1. Engage in, promote, incite, or facilitate the harassment, abuse, threatening, or bullying of individuals or groups of individuals | ||
| 2. Engage in, promote, incite, or facilitate discrimination or other unlawful or harmful conduct in the provision of employment, employment benefits, credit, housing, other economic benefits, or other essential goods and services | ||
| 3. Engage in the unauthorized or unlicensed practice of any profession including, but not limited to, financial, legal, medical/health, or related professional practices | ||
| 4. Collect, process, disclose, generate, or infer private or sensitive information about individuals, including information about individuals’ identity, health, or demographic information, unless you have obtained the right to do so in accordance with applicable law | ||
| 5. Engage in or facilitate any action or generate any content that infringes, misappropriates, or otherwise violates any third-party rights, including the outputs or results of any products or services using the Llama Materials | ||
| 6. Create, generate, or facilitate the creation of malicious code, malware, computer viruses or do anything else that could disable, overburden, interfere with or impair the proper working, integrity, operation or appearance of a website or computer system | ||
| 7. Engage in any action, or facilitate any action, to intentionally circumvent or remove usage restrictions or other safety measures, or to enable functionality disabled by Meta | ||
| 2. Engage in, promote, incite, facilitate, or assist in the planning or development of activities that present a risk of death or bodily harm to individuals, including use of Llama 3.2 related to the following: | ||
| 8. Military, warfare, nuclear industries or applications, espionage, use for materials or activities that are subject to the International Traffic Arms Regulations (ITAR) maintained by the United States Department of State or to the U.S. Biological Weapons Anti-Terrorism Act of 1989 or the Chemical Weapons Convention Implementation Act of 1997 | ||
| 9. Guns and illegal weapons (including weapon development) | ||
| 10. Illegal drugs and regulated/controlled substances | ||
| 11. Operation of critical infrastructure, transportation technologies, or heavy machinery | ||
| 12. Self-harm or harm to others, including suicide, cutting, and eating disorders | ||
| 13. Any content intended to incite or promote violence, abuse, or any infliction of bodily harm to an individual | ||
| 3. Intentionally deceive or mislead others, including use of Llama 3.2 related to the following: | ||
| 14. Generating, promoting, or furthering fraud or the creation or promotion of disinformation | ||
| 15. Generating, promoting, or furthering defamatory content, including the creation of defamatory statements, images, or other content | ||
| 16. Generating, promoting, or further distributing spam | ||
| 17. Impersonating another individual without consent, authorization, or legal right | ||
| 18. Representing that the use of Llama 3.2 or outputs are human-generated | ||
| 19. Generating or facilitating false online engagement, including fake reviews and other means of fake online engagement | ||
| 4. Fail to appropriately disclose to end users any known dangers of your AI system | ||
| 5. Interact with third party tools, models, or software designed to generate unlawful content or engage in unlawful or harmful conduct and/or represent that the outputs of such tools, models, or software are associated with Meta or Llama 3.2 | ||
|
|
||
| With respect to any multimodal models included in Llama 3.2, the rights granted under Section 1(a) of the Llama 3.2 Community License Agreement are not being granted to you if you are an individual domiciled in, or a company with a principal place of business in, the European Union. This restriction does not apply to end users of a product or service that incorporates any such multimodal models. |
Fix markdownlint blockers (headings, ordered-list prefixes, trailing spaces).
If markdownlint-cli2 is enforced, current MD041/MD036/MD029/MD009 findings will fail CI.
Proposed fix (format-only)
-**Llama 3.2** **Acceptable Use Policy**
+# Llama 3.2 Acceptable Use Policy
@@
-**Prohibited Uses**
+## Prohibited Uses
@@
-1. Violate the law or others’ rights, including to:
+1. Violate the law or others’ rights, including to:
1. Engage in, promote, generate, contribute to, encourage, plan, incite, or further illegal or unlawful activity or content, such as:
@@
-2. Engage in, promote, incite, facilitate, or assist in the planning or development of activities that present a risk of death or bodily harm to individuals, including use of Llama 3.2 related to the following:
+1. Engage in, promote, incite, facilitate, or assist in the planning or development of activities that present a risk of death or bodily harm to individuals, including use of Llama 3.2 related to the following:
@@
-3. Intentionally deceive or mislead others, including use of Llama 3.2 related to the following:
+1. Intentionally deceive or mislead others, including use of Llama 3.2 related to the following:
@@
-4. Fail to appropriately disclose to end users any known dangers of your AI system
-5. Interact with third party tools, models, or software designed to generate unlawful content or engage in unlawful or harmful conduct and/or represent that the outputs of such tools, models, or software are associated with Meta or Llama 3.2
+1. Fail to appropriately disclose to end users any known dangers of your AI system
+1. Interact with third-party tools, models, or software designed to generate unlawful content or engage in unlawful or harmful conduct and/or represent that the outputs of such tools, models, or software are associated with Meta or Llama 3.2
🧰 Tools
🪛 LanguageTool
[grammar] ~41-~41: Use a hyphen to join words.
Context: ...of your AI system 5. Interact with third party tools, models, or software designe...
(QB_NEW_EN_HYPHEN)
[style] ~42-~42: ‘With respect to’ might be wordy. Consider a shorter alternative.
Context: ... are associated with Meta or Llama 3.2 With respect to any multimodal models included in Llama...
(EN_WORDINESS_PREMIUM_WITH_RESPECT_TO)
🪛 markdownlint-cli2 (0.18.1)
1-1: First line in a file should be a top-level heading
(MD041, first-line-heading, first-line-h1)
5-5: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
20-20: Ordered list item prefix
Expected: 1; Actual: 2; Style: 1/1/1
(MD029, ol-prefix)
21-21: Ordered list item prefix
Expected: 1; Actual: 3; Style: 1/1/1
(MD029, ol-prefix)
22-22: Ordered list item prefix
Expected: 1; Actual: 4; Style: 1/1/1
(MD029, ol-prefix)
23-23: Ordered list item prefix
Expected: 1; Actual: 5; Style: 1/1/1
(MD029, ol-prefix)
24-24: Ordered list item prefix
Expected: 1; Actual: 6; Style: 1/1/1
(MD029, ol-prefix)
25-25: Trailing spaces
Expected: 0 or 2; Actual: 1
(MD009, no-trailing-spaces)
25-25: Ordered list item prefix
Expected: 1; Actual: 7; Style: 1/1/1
(MD029, ol-prefix)
39-39: Trailing spaces
Expected: 0 or 2; Actual: 1
(MD009, no-trailing-spaces)
🤖 Prompt for AI Agents
In @artifacts/checkpoints/USE_POLICY.md around lines 1 - 43, Convert the initial
bold title into a proper Markdown heading (e.g., use a single-level heading like
"# Llama 3.2 Acceptable Use Policy") and ensure a blank line follows it;
normalize ordered-list markers so nested and top-level numbered lists use a
consistent prefix style (prefer using "1." for every ordered item per MD029 or
sequential numbers consistently), fix nested indentation for sublists so they
render as distinct levels (use two or four spaces per indent), remove all
trailing spaces and hard tabs (address MD009), and ensure there is a blank line
before/after block-level elements (lists/headings) to satisfy
MD041/MD036/MD029/MD009 checks.
| Please report any violation of this Policy, software “bug,” or other problems that could lead to a violation of this Policy through one of the following means: | ||
|
|
||
|
|
||
|
|
||
| * Reporting issues with the model: [https://github.com/meta-llama/llama-models/issues](https://l.workplace.com/l.php?u=https%3A%2F%2Fgithub.com%2Fmeta-llama%2Fllama-models%2Fissues&h=AT0qV8W9BFT6NwihiOHRuKYQM_UnkzN_NmHMy91OT55gkLpgi4kQupHUl0ssR4dQsIQ8n3tfd0vtkobvsEvt1l4Ic6GXI2EeuHV8N08OG2WnbAmm0FL4ObkazC6G_256vN0lN9DsykCvCqGZ) | ||
| * Reporting risky content generated by the model: [developers.facebook.com/llama_output_feedback](http://developers.facebook.com/llama_output_feedback) | ||
| * Reporting bugs and security concerns: [facebook.com/whitehat/info](http://facebook.com/whitehat/info) | ||
| * Reporting violations of the Acceptable Use Policy or unlicensed uses of Llama 3.2: LlamaUseReport@meta.com |
Replace the redirected/mismatched GitHub Issues link target; prefer https links.
Line 49’s link text is GitHub, but the href is a l.workplace.com redirect—please make the href the direct GitHub URL (and consider upgrading the http:// links on Lines 50-51 to https:// if supported).
Proposed fix (safer URLs)
-* Reporting issues with the model: [https://github.com/meta-llama/llama-models/issues](https://l.workplace.com/l.php?u=https%3A%2F%2Fgithub.com%2Fmeta-llama%2Fllama-models%2Fissues&h=AT0qV8W9BFT6NwihiOHRuKYQM_UnkzN_NmHMy91OT55gkLpgi4kQupHUl0ssR4dQsIQ8n3tfd0vtkobvsEvt1l4Ic6GXI2EeuHV8N08OG2WnbAmm0FL4ObkazC6G_256vN0lN9DsykCvCqGZ)
-* Reporting risky content generated by the model: [developers.facebook.com/llama_output_feedback](http://developers.facebook.com/llama_output_feedback)
-* Reporting bugs and security concerns: [facebook.com/whitehat/info](http://facebook.com/whitehat/info)
+* Reporting issues with the model: [https://github.com/meta-llama/llama-models/issues](https://github.com/meta-llama/llama-models/issues)
+* Reporting risky content generated by the model: [developers.facebook.com/llama_output_feedback](https://developers.facebook.com/llama_output_feedback)
+* Reporting bugs and security concerns: [facebook.com/whitehat/info](https://facebook.com/whitehat/info)
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| Please report any violation of this Policy, software “bug,” or other problems that could lead to a violation of this Policy through one of the following means: | |
| * Reporting issues with the model: [https://github.com/meta-llama/llama-models/issues](https://l.workplace.com/l.php?u=https%3A%2F%2Fgithub.com%2Fmeta-llama%2Fllama-models%2Fissues&h=AT0qV8W9BFT6NwihiOHRuKYQM_UnkzN_NmHMy91OT55gkLpgi4kQupHUl0ssR4dQsIQ8n3tfd0vtkobvsEvt1l4Ic6GXI2EeuHV8N08OG2WnbAmm0FL4ObkazC6G_256vN0lN9DsykCvCqGZ) | |
| * Reporting risky content generated by the model: [developers.facebook.com/llama_output_feedback](http://developers.facebook.com/llama_output_feedback) | |
| * Reporting bugs and security concerns: [facebook.com/whitehat/info](http://facebook.com/whitehat/info) | |
| * Reporting violations of the Acceptable Use Policy or unlicensed uses of Llama 3.2: LlamaUseReport@meta.com | |
| Please report any violation of this Policy, software "bug," or other problems that could lead to a violation of this Policy through one of the following means: | |
| * Reporting issues with the model: [https://github.com/meta-llama/llama-models/issues](https://github.com/meta-llama/llama-models/issues) | |
| * Reporting risky content generated by the model: [developers.facebook.com/llama_output_feedback](https://developers.facebook.com/llama_output_feedback) | |
| * Reporting bugs and security concerns: [facebook.com/whitehat/info](https://facebook.com/whitehat/info) | |
| * Reporting violations of the Acceptable Use Policy or unlicensed uses of Llama 3.2: LlamaUseReport@meta.com |
🧰 Tools
🪛 markdownlint-cli2 (0.18.1)
52-52: Bare URL used
(MD034, no-bare-urls)
🤖 Prompt for AI Agents
In @artifacts/checkpoints/USE_POLICY.md around lines 45 - 52, Replace the
redirect/incorrect hrefs in the policy links so the visible link text points
directly to the intended secure URLs: update the "Reporting issues with the
model" link to use the direct HTTPS URL
https://github.com/meta-llama/llama-models/issues instead of the l.workplace.com
redirect, and change the other two links to HTTPS (use
https://developers.facebook.com/llama_output_feedback and
https://facebook.com/whitehat/info) while leaving the LlamaUseReport@meta.com
contact as-is; ensure the markdown link targets (the hrefs) are the direct
secure URLs rather than redirects.
research/ccp/notes copy.md
Outdated
| Complete MOSAIC Differentiable VM Architecture | ||
| This plan implements the five missing architectural elements discussed in the meeting notes to transform MOSAIC into a fully autonomous "Differentiable Virtual Machine." | ||
|
|
||
| Current State Analysis | ||
| What exists: | ||
|
|
||
| HomeostaticLoop, IntrinsicDrive, DriveBand in core/homeostasis.py - primitives only, not integrated | ||
| MosaicOpcode ISA in layer/mosaic/isa.py - defines 10 opcodes (NOP, READ_MEM, WRITE_MEM, CLEAR_MEM, IDLE, GATE_UP, GATE_DOWN, SCAN, COMMIT, RESPOND) | ||
| opcode_head in MemoryBlockLayer - emit-only, does not control behavior | ||
| sleep_replay_per_pair in datasets - training data exists, no runtime loop |
❓ Verification inconclusive
Script executed:
# Check if the file exists and read initial content
find . -type f -name "notes copy.md" -o -name "*notes*copy*"
Repository: TheApeMachine/caramba
Exit code: 0
stdout:
./research/ccp/notes copy.md
Script executed:
# Also search more broadly for this path
fd -t f "notes copy" .
Repository: TheApeMachine/caramba
Exit code: 0
stdout:
./research/ccp/notes copy.md
Script executed:
# List files in the research/ccp directory to understand structure
ls -la research/ccp/ 2>/dev/null || echo "Directory not found at research/ccp"
Repository: TheApeMachine/caramba
Exit code: 0
stdout:
total 16792
drwxr-xr-x 2 jailuser jailuser 140 Jan 10 22:36 .
drwxr-xr-x 5 jailuser jailuser 100 Jan 10 22:36 ..
-rw-r--r-- 1 jailuser jailuser 605422 Jan 10 22:36 Cognitive Control Plane.pdf
-rw-r--r-- 1 jailuser jailuser 16495005 Jan 10 22:36 Cognitive_Control_Plane_MOSAIC_Runtime.pdf
-rw-r--r-- 1 jailuser jailuser 8039 Jan 10 22:36 notes copy.md
-rw-r--r-- 1 jailuser jailuser 67873 Jan 10 22:36 paper.tex
-rw-r--r-- 1 jailuser jailuser 10941 Jan 10 22:36 references.bib
Script executed:
# Search for fp32 and Metal-related content in the repository
rg -l "fp32|FP32" --type md --max-count 20Repository: TheApeMachine/caramba
Repository: TheApeMachine/caramba
Exit code: 0
stdout:
docs/inference.md
CLAUDE.md
Script executed:
# Search for Metal-related mentions
rg -l "metal|Metal" --type md --max-count 20Repository: TheApeMachine/caramba
Repository: TheApeMachine/caramba
Exit code: 0
stdout:
docs/optimization.md
docs/getting-started.md
docker/OPTIMIZATION.md
README.md
artifacts/ai/meeting_notes/temp.md
CLAUDE.md
Script executed:
# Read the actual file content to verify scope
cat -n ./research/ccp/notes\ copy.md | head -80
Repository: TheApeMachine/caramba
Exit code: 0
stdout:
1 Complete MOSAIC Differentiable VM Architecture
2 This plan implements the five missing architectural elements discussed in the meeting notes to transform MOSAIC into a fully autonomous "Differentiable Virtual Machine."
3
4 Current State Analysis
5 What exists:
6
7 HomeostaticLoop, IntrinsicDrive, DriveBand in core/homeostasis.py - primitives only, not integrated
8 MosaicOpcode ISA in layer/mosaic/isa.py - defines 10 opcodes (NOP, READ_MEM, WRITE_MEM, CLEAR_MEM, IDLE, GATE_UP, GATE_DOWN, SCAN, COMMIT, RESPOND)
9 opcode_head in MemoryBlockLayer - emit-only, does not control behavior
10 sleep_replay_per_pair in datasets - training data exists, no runtime loop
11 EventResponder + ModelHandler in infer/event_runtime.py - reactive only (prompt-triggered)
12 mosaic_idle.yml preset - agent process exists but is a research loop, not dVM idle
13 What's missing:
14
15 Opcodes don't control registers/memory (just logged)
16 No impulse-driven awakening (only prompt-triggered)
17 No idle-time compute loop (model waits passively)
18 No medium/slow timescale learning (only fast writes)
19 No tool creation mechanism (only tool usage)
20 ---
21
22 Phase 1: Wire Opcodes to Control Behavior
23 Currently opcodes are emit-only. Wire them to actually gate subsystem operations.
24
25 Files to modify:
26
27 layer/mosaic/block.py - Add opcode dispatch logic
28 Implementation:
29
30 # In _process_token or forward loop:
31 if self.opcodes_enabled:
32 op_logits = self.opcode_head(u_t)
33 op_id = op_logits.argmax(dim=-1) # Hard decode
34
35 # Use STE for training gradients
36 op_soft = F.softmax(op_logits, dim=-1)
37 op_hard = F.one_hot(op_id, self.opcode_vocab)
38 op_sel = (op_hard - op_soft).detach() + op_soft
39
40 # Gate operations based on opcode
41 do_read = op_sel[:, MosaicOpcode.READ_MEM]
42 do_write = op_sel[:, MosaicOpcode.WRITE_MEM] * write_gate
43 do_clear = op_sel[:, MosaicOpcode.CLEAR_MEM]
44 Behavior mapping:
45
46 | Opcode | Effect |
47
48 |--------|--------|
49
50 | READ_MEM | Enable memory read path |
51
52 | WRITE_MEM | Gate write operation (AND with existing write_gate) |
53
54 | CLEAR_MEM | Zero target register/memory slot |
55
56 | GATE_UP/DOWN | Boost/suppress fusion gates |
57
58 | IDLE | Suppress output (internal consolidation) |
59
60 | COMMIT | Trigger commitment delta |
61
62 ---
63
64 Phase 2: Integrate Homeostatic Impulses into Runtime
65 Connect HomeostaticLoop to the inference runtime so the model can self-activate.
66
67 Files to modify:
68
69 infer/event_runtime.py - Add impulse handling
70 New file: infer/autonomous_runtime.py - Idle loop with homeostasis
71 Key metrics to expose:
72
73 Memory utilization (write rate, occupancy)
74 Routing entropy
75 Commitment balance (open vs closed)
76 Output confidence (logit entropy)
77 Implementation skeleton:
78
79 @dataclass
80 class AutonomousRuntime:
Script executed:
# Check file size and line count
wc -l ./research/ccp/notes\ copy.md
Repository: TheApeMachine/caramba
Exit code: 0
stdout:
261 ./research/ccp/notes copy.md
Script executed:
# Search for any mentions of fp32, FP32, metal, or Metal in this file
rg -i "fp32|metal" ./research/ccp/notes\ copy.mdRepository: TheApeMachine/caramba
Repository: TheApeMachine/caramba
Exit code: 1
Script executed:
# Check git status and recent commits to understand PR context
git log --oneline -20 2>/dev/null | head -20
Repository: TheApeMachine/caramba
Exit code: 0
stdout:
45e6115 Merge branch 'main' into feature/fp32-support-for-metal
Script executed:
# Look for any other files with fp32 or metal in their names or recent changes
git diff --name-only HEAD~5..HEAD 2>/dev/null | head -20
Repository: TheApeMachine/caramba
Exit code: 0
Script executed:
# Check git status to see all modified/added files in the current branch
git diff --name-only main..HEAD 2>/dev/null | head -30
Repository: TheApeMachine/caramba
Exit code: 0
stdout:
.vscode/c_cpp_properties.json
artifacts/checkpoints/.gitattributes
artifacts/checkpoints/LICENSE.txt
artifacts/checkpoints/README.md
artifacts/checkpoints/USE_POLICY.md
artifacts/checkpoints/config.json
artifacts/checkpoints/generation_config.json
artifacts/checkpoints/original/params.json
artifacts/checkpoints/original/tokenizer.model
artifacts/checkpoints/special_tokens_map.json
artifacts/checkpoints/tokenizer.json
artifacts/checkpoints/tokenizer_config.json
artifacts/mosaic/paper.pdf
config/compute.py
config/compute_test.py
config/defaults.py
config/layer.py
config/presets/mosaic_resonant.yml
config/target.py
core/__init__.py
core/event_codec/__init__.py
core/event_codec/binary_codec.py
core/event_codec/binary_codec_test.py
core/event_codec/capnp_codec.py
core/event_codec/capnp_codec_test.py
core/event_codec/event.capnp
core/event_codec/json_codec.py
core/event_codec/stream_parser.py
examples/lightning_vastai_demo.yml
experiment/runner.py
Script executed:
# Search the entire codebase for actual fp32 or metal implementations
rg -i "fp32" --type py --max-count 10Repository: TheApeMachine/caramba
Repository: TheApeMachine/caramba
Exit code: 0
stdout:
trainer/standard.py: # TF32 is safe and commonly enabled on Ampere+ for faster fp32 matmuls
trainer/standard.py: # (many reductions / softmax stats run in fp32 even when weights are bf16).
trainer/standard.py: "Supported: MPS fp16/fp32, CUDA fp16/bf16."
trainer/steppers/global_stepper.py: # IMPORTANT: compute CE in fp32 for numerical stability (esp. MPS + 128k vocab).
trainer/steppers/global_stepper.py: logger.warning("Disabling fp16 autocast on MPS (no GradScaler); using fp32 math for stability.")
trainer/steppers/global_orchestrated.py: # MPS: fp16 *weights* + AdamW-style updates is a common source of NaNs (no fp32 master weights).
trainer/chunked_cross_entropy.py: # Use fp32 for stability/perf, but preserve fp64 for gradcheck correctness.
trainer/chunked_cross_entropy.py: # Compute LSE in fp32 for stability; matmul uses tensor cores for bf16/fp16.
trainer/chunked_cross_entropy.py: # Recompute LSE in compute_dtype (fp32 fast path, fp64 for gradcheck).
trainer/diffusion_codegen/training.py: if d in {"float32", "fp32"}:
orchestrator/strategy.py: # MPS + fp16 weights + AdamW can poison weights without fp32 master params.
orchestrator/nowcast.py: "predictor_dtype": str(self._pred_dtype) if self._pred_dtype is not None else "fp32",
optimizer/dba_attention_triton.py: self._require(q_sem.dtype in (torch.float16, torch.bfloat16, torch.float32), msg="DBA attention supports fp16/bf16/fp32.")
optimizer/dba_attention_triton_kernels_fwd.py: # (accumulation still happens in fp32).
optimizer/fused_attention.py: """Single source of truth for decoupled score computation in fp32.
optimizer/fused_attention.py: fp32 accumulation avoids mixed-precision drift when validating policies.
optimizer/adamw_master.py:"""AdamW with fp32 master weights for fp16 models (MPS-safe).
optimizer/adamw_master.py:- The standard fix is to keep fp32 "master" weights for the optimizer update,
optimizer/adamw_master.py:but maintains AdamW state + updates in fp32.
optimizer/adamw_master.py: """AdamW optimizer with fp32 master weights stored in state."""
optimizer/adamw_master.py: # fp32 master copy + fp32 moments
optimizer/kernels.py: msg=f"RMSNorm on MPS requires fp16/fp32, got dtype={x.dtype}.",
optimizer/kernels.py: msg=f"RoPE on MPS requires fp16/fp32, got dtype={x.dtype}.",
optimizer/kernels.py: msg=f"LayerNorm on MPS requires fp16/fp32, got dtype={x.dtype}.",
optimizer/kernels.py: msg=f"AdamW step on MPS requires fp16/fp32 params, got dtype={p.dtype}.",
optimizer/kernels.py: msg="AdamW step on CUDA requires fp32 master/exp_avg/exp_avg_sq.",
optimizer/kernels.py: msg=f"Lion step on MPS requires fp16/fp32 params, got dtype={p.dtype}.",
optimizer/metal/adamw.py: raise RuntimeError("Metal AdamWMaster requires fp16/fp32 p/grad (matching)")
optimizer/metal/adamw.py: raise RuntimeError("Metal AdamWMaster requires fp32 master/exp_avg/exp_avg_sq")
optimizer/metal/rmsnorm.py: raise RuntimeError("Metal RMSNorm currently supports fp16/fp32 only")
optimizer/metal/rmsnorm.py: """Fused RMSNorm (MPS/Metal) for fp16/fp32 tensors.
optimizer/metal/rmsnorm.py: x: (..., D) fp16/fp32 tensor on MPS (contiguous required)
optimizer/metal/rmsnorm.py: weight: (D,) fp16/fp32 tensor on MPS, or None for no affine scale
optimizer/metal/rmsnorm.py: raise RuntimeError("Metal RMSNorm currently supports fp16/fp32 only")
optimizer/metal/rope.py: raise RuntimeError("Metal RoPE currently supports fp16/fp32 only")
optimizer/metal/rope.py: """Apply RoPE using the Metal extension (fp16/fp32).
optimizer/metal/rope.py: raise RuntimeError("Metal RoPE currently supports fp16/fp32 only")
optimizer/metal/lion.py: raise RuntimeError("Metal Lion currently supports fp16/fp32 tensors only (matching)")
optimizer/metal/layernorm.py: raise RuntimeError("Metal LayerNorm currently supports fp16/fp32 only")
optimizer/metal/layernorm.py: """Fused LayerNorm (MPS/Metal) for fp16/fp32 tensors.
optimizer/metal/layernorm.py: raise RuntimeError("Metal LayerNorm currently supports fp16/fp32 only")
optimizer/metal/ssm_test.py: # Reference in fp32 for numerics, cast back to fp16 like other Metal tests.
optimizer/metal/rope_test.py: # Reference in fp32 for numerics.
optimizer/metal/rmsnorm_test.py: # Reference in fp32 for better numerics.
optimizer/metal/layernorm_test.py: # Reference in fp32 for numerics.
optimizer/metal/dba_decode_test.py: # Reference (PyTorch) in fp32 for numerics.
optimizer/flash_attention_triton_kernels_fwd.py: # (accumulation still happens in fp32).
optimizer/flash_attention_triton.py: self._require(q.dtype in (torch.float16, torch.bfloat16, torch.float32), msg="FlashAttention supports fp16/bf16/fp32.")
optimizer/adamw_triton_kernels.py:- State: fp32 master weights + fp32 exp_avg + fp32 exp_avg_sq (updated)
optimizer/adamw_triton_kernels.py: # Load grad + state. Grad/param are fp16 or bf16; compute in fp32.
optimizer/adamw_triton.py: _require(master.dtype == torch.float32, msg="AdamW master must be fp32.")
optimizer/adamw_triton.py: _require(exp_avg.dtype == torch.float32 and exp_avg_sq.dtype == torch.float32, msg="AdamW moments must be fp32.")
infer/cache_policy.py: if kind in (KVCacheKind.FP16, KVCacheKind.FP32):
data/transforms/cast_dtype.py: Parses common dtype name variations (like "fp32" or "float32") into PyTorch
data/transforms/cast_dtype.py: if t in ("float32", "fp32"):
config/kvcache.py: FP16/FP32: Full precision (highest quality, most memory)
config/kvcache.py: FP32 = "fp32"
cache/tensor.py: - fp16/fp32: Full precision storage
cache/tensor.py: if self.kind in ("fp16", "fp32"):
cache/tensor.py: return self.kind not in ("fp16", "fp32")
cache/tensor.py: if self.kind in ("fp16", "fp32"):
cache/tensor.py: raise RuntimeError("Expected fp buffer for fp16/fp32 cache")
cache/tensor.py: if self.kind in ("fp16", "fp32"):
cache/tensor.py: raise RuntimeError("Expected fp buffer for fp16/fp32 cache")
cache/tensor.py: if self.kind in ("fp16", "fp32"):
cache/tensor.py: raise RuntimeError("Expected fp buffer for fp16/fp32 cache")
benchmark/memory.py: "fp32": 4.0,
cache/layer.py: synchronized. Supports both fp16/fp32 storage and quantized formats.
carmath/carmath_test.py: assert bytes_per_kind("fp32") == 4.0
carmath/bytes.py: - fp32, fp16
carmath/bytes.py: if k == "fp32":
Script executed:
# Search for metal-related code in Python files
rg -i "metal" --type py --max-count 10Repository: TheApeMachine/caramba
Repository: TheApeMachine/caramba
Exit code: 0
stdout:
runtime/readiness_test.py: monkeypatch.setattr(readiness, "metal_supported", False)
runtime/readiness_test.py: monkeypatch.setattr(readiness, "metal_supported", False)
runtime/readiness_test.py: monkeypatch.setattr(readiness, "metal_build_tools_available", True)
runtime/readiness_test.py:def test_metal_build_tools_missing_is_error_or_warning_based_on_best_effort(monkeypatch) -> None:
runtime/readiness_test.py: monkeypatch.setattr(readiness, "metal_supported", True)
runtime/readiness_test.py: monkeypatch.setattr(readiness, "metal_build_tools_available", False)
runtime/readiness_test.py: assert any(i.code == "metal_build_tools_missing" for i in rep_err.errors)
runtime/readiness_test.py: assert any(i.code == "metal_build_tools_missing" for i in rep_warn.warnings)
runtime/readiness.py:- Missing *performance* backends (Metal/Triton) are **errors** by default.
runtime/readiness.py: metal_build_tools_available,
runtime/readiness.py: metal_supported,
runtime/readiness.py: # Metal (MPS) fused decode is used for fp16 caches on MPS.
runtime/readiness.py: if not _probe_flag(metal_supported):
runtime/readiness.py: elif not _probe_flag(metal_build_tools_available):
runtime/readiness.py: code="metal_build_tools_missing",
runtime/readiness.py: "Metal build tools are not available (missing `metal`/`metallib` in the active Xcode toolchain); "
runtime/readiness.py: " - `xcrun -sdk macosx --find metal`\n"
runtime/readiness.py: " - `xcrun -sdk macosx --find metallib`"
layer/ssm.py: from caramba.optimizer.metal import MetalSSMSelectiveScan
layer/ssm.py: scan = MetalSSMSelectiveScan()
layer/attention/decoupled/layer.py: allowed = {"auto", "triton", "sdpa", "metal"}
layer/attention/decoupled/layer.py: # "metal" is a manifest-friendly alias for the SDPA-style path on MPS.
layer/attention/decoupled/layer.py: force_sdpa = (dba_backend == "sdpa" or dba_backend == "metal")
layer/attention/decoupled/decode.py: from caramba.optimizer.metal import dba_decode_fp16, metal_dba_decode_available
layer/attention/decoupled/decode.py: if not metal_dba_decode_available():
layer/attention/decoupled/decode.py: raise RuntimeError("Metal DBA decode kernel is unavailable on this runtime.")
layer/attention/decoupled/decode.py: "Metal DBA decode requires fp16 KV caches on MPS.\n"
experiment/runner.py: if best_effort and w.code in {"metal_build_tools_missing", "triton_missing"}:
core/platform.py: "metal": "mps",
config/layer.py: # Note: "metal" is accepted for MPS/Metal runs and maps to an SDPA-style path
config/layer.py: dba_train_backend: Literal["auto", "triton", "sdpa", "metal"] = "auto"
config/defaults.py: # If true, allow "best effort" fallbacks for optional perf backends (Metal/Triton),
optimizer/kernels.py:_METAL_IMPORT_ERROR: Exception | None = None
optimizer/kernels.py:_MetalSSMSelectiveScan: Any | None = None
optimizer/kernels.py: from caramba.optimizer.metal import (
optimizer/kernels.py: MetalSSMSelectiveScan as _MetalSSMSelectiveScan,
optimizer/kernels.py: _METAL_IMPORT_ERROR = e
optimizer/kernels.py: bool(KERNELS.mps_available and KERNELS.metal_ops_loaded),
optimizer/kernels.py: msg="RMSNorm on MPS requires the Metal extension to be available and loaded at startup.",
optimizer/kernels.py: raise RuntimeError(f"Metal RMSNorm import failed: {_METAL_IMPORT_ERROR!r}")
optimizer/kernels.py: bool(KERNELS.mps_available and KERNELS.metal_ops_loaded),
optimizer/kernels.py: msg="RoPE on MPS requires the Metal extension to be available and loaded at startup.",
optimizer/adamw_master.py: # Fast path: fused HAL kernel (Metal on MPS, Triton on CUDA).
optimizer/triton_runtime.py:The Metal/MPS fused DBA work requires backend-agnostic detection, which now lives in
optimizer/runtime.py:"""Backend availability detection (Triton + Metal/MPS).
optimizer/runtime.py:- Metal (MPS) for Apple Silicon fused DBA decode (custom MSL kernel + ObjC++ bridge)
optimizer/runtime.py: "metal_supported",
optimizer/runtime.py:def metal_build_tools_available() -> bool:
optimizer/runtime.py: """Whether the host can compile Metal shaders via Xcode toolchain.
optimizer/runtime.py: must contain the `metal` and `metallib` tools.
optimizer/runtime.py: # Do not call metal_supported() here: metal_supported() is a *runtime* check,
optimizer/runtime.py: subprocess.check_output(["xcrun", "-sdk", "macosx", "--find", "metal"], stderr=subprocess.STDOUT)
optimizer/runtime.py: subprocess.check_output(["xcrun", "-sdk", "macosx", "--find", "metallib"], stderr=subprocess.STDOUT)
optimizer/runtime.py:def metal_supported() -> bool:
optimizer/metal/layernorm_test.py:from caramba.optimizer.runtime import metal_supported
optimizer/metal/layernorm_test.py:def metal_ops() -> object:
optimizer/metal/layernorm_test.py: """Build/load the Metal extension or skip the suite."""
optimizer/metal/layernorm_test.py: from caramba.optimizer.metal.jit import load_caramba_metal_ops
optimizer/metal/layernorm_test.py: return load_caramba_metal_ops(verbose=False)
optimizer/metal/layernorm_test.py: pytest.skip(f"caramba metal extension unavailable: {e}")
optimizer/metal/layernorm_test.py:@pytest.mark.skipif(not metal_supported(), reason="Metal/MPS not supported on this platform")
optimizer/metal/layernorm_test.py:def test_metal_layernorm_matches_reference(metal_ops: object) -> None:
optimizer/metal/layernorm_test.py: _ = metal_ops
optimizer/metal/layernorm_test.py: from caramba.optimizer.metal import layernorm_fp16
optimizer/metal/rope.py:"""Fused RoPE wrapper for the Metal extension."""
optimizer/metal/rope.py:from caramba.optimizer.runtime import metal_supported
optimizer/metal/rope.py:from .jit import load_caramba_metal_ops
optimizer/metal/rope.py:def metal_rope_available() -> bool:
optimizer/metal/rope.py: """Whether the runtime is capable of using the Metal RoPE path."""
optimizer/metal/rope.py: return metal_supported()
optimizer/metal/rope.py:class _MetalRoPEFn(torch.autograd.Function):
optimizer/metal/rope.py: raise RuntimeError("Metal RoPE requires device.type == 'mps'")
optimizer/metal/rope.py: raise RuntimeError("Metal RoPE currently supports fp16/fp32 only")
optimizer/metal/rope.py: ops = load_caramba_metal_ops(verbose=bool(verbose_build))
optimizer/metal/rmsnorm.py:"""Fused RMSNorm wrapper for the Metal extension."""
optimizer/metal/rmsnorm.py:from caramba.optimizer.runtime import metal_supported
optimizer/metal/rmsnorm.py:from .jit import load_caramba_metal_ops
optimizer/metal/rmsnorm.py:def metal_rmsnorm_available() -> bool:
optimizer/metal/rmsnorm.py: """Whether the runtime is capable of using the Metal RMSNorm path."""
optimizer/metal/rmsnorm.py: return metal_supported()
optimizer/metal/rmsnorm.py:class _MetalRMSNormFn(torch.autograd.Function):
optimizer/metal/rmsnorm.py: raise RuntimeError("Metal RMSNorm requires device.type == 'mps'")
optimizer/metal/rmsnorm.py: raise RuntimeError("Metal RMSNorm currently supports fp16/fp32 only")
optimizer/metal/rmsnorm.py: ops = load_caramba_metal_ops(verbose=bool(verbose_build))
optimizer/metal/lion.py:"""Fused Lion optimizer update wrapper for the Metal extension."""
optimizer/metal/lion.py:from caramba.optimizer.runtime import metal_supported
optimizer/metal/lion.py:from .jit import load_caramba_metal_ops
optimizer/metal/lion.py:def metal_lion_available() -> bool:
optimizer/metal/lion.py: return metal_supported()
optimizer/metal/lion.py: raise RuntimeError("Metal Lion requires device.type == 'mps'")
optimizer/metal/lion.py: raise RuntimeError("Metal Lion currently supports fp16/fp32 tensors only (matching)")
optimizer/metal/lion.py: f"Metal Lion requires matching shapes for p/grad/m, got p={tuple(p.shape)}, grad={tuple(grad.shape)}, m={tuple(m.shape)}"
optimizer/metal/lion.py: ops = load_caramba_metal_ops(verbose=bool(verbose_build))
optimizer/metal/layernorm.py:"""Fused LayerNorm wrapper for the Metal extension."""
optimizer/metal/layernorm.py:from caramba.optimizer.runtime import metal_supported
optimizer/metal/layernorm.py:from .jit import load_caramba_metal_ops
optimizer/metal/layernorm.py:def metal_layernorm_available() -> bool:
optimizer/metal/layernorm.py: """Whether the runtime is capable of using the Metal LayerNorm path."""
optimizer/metal/layernorm.py: return metal_supported()
optimizer/metal/layernorm.py:class _MetalLayerNormFn(torch.autograd.Function):
optimizer/metal/layernorm.py: raise RuntimeError("Metal LayerNorm requires device.type == 'mps'")
optimizer/metal/layernorm.py: raise RuntimeError("Metal LayerNorm currently supports fp16/fp32 only")
optimizer/metal/layernorm.py: raise RuntimeError("Metal LayerNorm expects x.dim() >= 1")
optimizer/metal/ssm.py:from caramba.optimizer.runtime import metal_supported
optimizer/metal/ssm.py:from .jit import load_caramba_metal_ops
optimizer/metal/ssm.py:class MetalSSMScanAvailability:
optimizer/metal/ssm.py: return metal_supported()
optimizer/metal/ssm.py:class MetalSSMSelectiveScanFn(torch.autograd.Function):
optimizer/metal/ssm.py: raise RuntimeError("Metal SSM selective scan requires device.type == 'mps'")
optimizer/metal/ssm.py: raise RuntimeError("Metal SSM selective scan currently supports fp16 only")
optimizer/metal/ssm.py: ops = load_caramba_metal_ops(verbose=bool(verbose_build))
optimizer/metal/ssm.py: raise RuntimeError("Metal SSM selective scan backward requires grad_y")
optimizer/metal/ssm.py: raise RuntimeError("Metal SSM selective scan backward requires MPS grad_y")
optimizer/metal/rope_test.py:from caramba.optimizer.runtime import metal_supported
optimizer/metal/rope_test.py:@pytest.mark.skipif(not metal_supported(), reason="Metal/MPS not supported on this platform")
optimizer/metal/rope_test.py:def test_metal_rope_matches_reference() -> None:
optimizer/metal/rope_test.py: from caramba.optimizer.metal.jit import load_caramba_metal_ops
optimizer/metal/rope_test.py: _ = load_caramba_metal_ops(verbose=False)
optimizer/metal/rope_test.py: pytest.skip(f"caramba metal extension unavailable: {e}")
optimizer/metal/rope_test.py: from caramba.optimizer.metal import rope_fp16
optimizer/metal/rope_test.py: out_metal = rope_fp16(x=x, cos=cos, sin=sin, rot_dim=rot)
optimizer/metal/rope_test.py: assert out_metal.shape == out_ref.shape
optimizer/metal/rope_test.py: assert torch.allclose(out_metal, out_ref, atol=3e-2, rtol=3e-2)
optimizer/metal/ssm_test.py:from caramba.optimizer.runtime import metal_supported
optimizer/metal/ssm_test.py:def _skip_if_no_metal_extension() -> None:
optimizer/metal/ssm_test.py: from caramba.optimizer.metal.jit import load_caramba_metal_ops
optimizer/metal/ssm_test.py: _ = load_caramba_metal_ops(verbose=False)
optimizer/metal/ssm_test.py: pytest.skip(f"caramba metal extension unavailable: {e}")
optimizer/metal/ssm_test.py:@pytest.mark.skipif(not metal_supported(), reason="Metal/MPS not supported on this platform")
optimizer/metal/ssm_test.py:def test_metal_ssm_scan_forward_matches_reference() -> None:
optimizer/metal/ssm_test.py: _skip_if_no_metal_extension()
optimizer/metal/ssm_test.py: from caramba.optimizer.metal import MetalSSMSelectiveScan
optimizer/metal/ssm_test.py: y_metal = MetalSSMSelectiveScan().run(x=x, dt=dt, A=A, B=B, C=C, D=D, verbose_build=False)
optimizer/metal/rmsnorm_test.py:from caramba.optimizer.runtime import metal_supported
optimizer/metal/rmsnorm_test.py:@pytest.mark.skipif(not metal_supported(), reason="Metal/MPS not supported on this platform")
optimizer/metal/rmsnorm_test.py:def test_metal_rmsnorm_matches_reference() -> None:
optimizer/metal/rmsnorm_test.py: from caramba.optimizer.metal.jit import load_caramba_metal_ops
optimizer/metal/rmsnorm_test.py: _ = load_caramba_metal_ops(verbose=False)
optimizer/metal/rmsnorm_test.py: pytest.skip(f"caramba metal extension unavailable: {e}")
optimizer/metal/rmsnorm_test.py: from caramba.optimizer.metal import rmsnorm_fp16
optimizer/metal/rmsnorm_test.py: out_metal = rmsnorm_fp16(x=x, weight=w, eps=eps)
optimizer/metal/rmsnorm_test.py: assert out_metal.shape == out_ref.shape
optimizer/metal/rmsnorm_test.py: assert torch.allclose(out_metal, out_ref, atol=2e-2, rtol=2e-2)
optimizer/metal/setup.py:def compile_metal(out_dir: Path) -> None:
optimizer/metal/setup.py: HERE / "dba_decode.metal",
optimizer/metal/setup.py: HERE / "rmsnorm.metal",
optimizer/metal/setup.py: HERE / "layernorm.metal",
optimizer/metal/setup.py: HERE / "rope.metal",
optimizer/metal/setup.py: HERE / "lion.metal",
optimizer/metal/setup.py: HERE / "adamw.metal",
optimizer/metal/setup.py: HERE / "ssm_scan.metal",
optimizer/metal/setup.py: raise FileNotFoundError(f"Missing Metal source: {src}")
optimizer/metal/setup.py: metallib = out_dir / "caramba_ops.metallib"
optimizer/metal/adamw.py:from caramba.optimizer.runtime import metal_supported
optimizer/metal/adamw.py:from .jit import load_caramba_metal_ops
optimizer/metal/adamw.py:def metal_adamw_available() -> bool:
optimizer/metal/adamw.py: return metal_supported()
optimizer/metal/adamw.py: raise RuntimeError("Metal AdamWMaster requires device.type == 'mps'")
optimizer/metal/adamw.py: raise RuntimeError("Metal AdamWMaster requires fp16/fp32 p/grad (matching)")
optimizer/metal/adamw.py: raise RuntimeError("Metal AdamWMaster requires fp32 master/exp_avg/exp_avg_sq")
optimizer/metal/adamw.py: raise RuntimeError("Metal AdamWMaster requires p and grad shapes to match")
optimizer/metal/adamw.py: raise RuntimeError("Metal AdamWMaster requires state tensors to match param shape")
optimizer/metal/adamw.py: ops = load_caramba_metal_ops(verbose=bool(verbose_build))
optimizer/metal/lion_test.py:from caramba.optimizer.runtime import metal_supported
optimizer/metal/lion_test.py:@pytest.mark.skipif(not metal_supported(), reason="Metal/MPS not supported on this platform")
optimizer/metal/lion_test.py:def test_metal_lion_step_matches_reference() -> None:
optimizer/metal/lion_test.py: from caramba.optimizer.metal.jit import load_caramba_metal_ops
optimizer/metal/lion_test.py: _ = load_caramba_metal_ops(verbose=False)
optimizer/metal/lion_test.py: pytest.skip(f"caramba metal extension unavailable: {e}")
optimizer/metal/lion_test.py: from caramba.optimizer.metal import lion_fp16
optimizer/metal/dba_decode.py:"""Fused DBA decode wrapper for the Metal extension."""
optimizer/metal/dba_decode.py:from caramba.optimizer.runtime import metal_supported
optimizer/metal/dba_decode.py:from .jit import load_caramba_metal_ops
optimizer/metal/dba_decode.py:def metal_dba_decode_available() -> bool:
optimizer/metal/dba_decode.py: """Whether the runtime is capable of using the Metal DBA decode path.
optimizer/metal/dba_decode.py: return metal_supported()
optimizer/metal/dba_decode.py: """Fused DBA decode (MPS/Metal) for fp16 KV caches.
optimizer/metal/dba_decode.py: raise RuntimeError("Metal DBA decode requires device.type == 'mps'")
optimizer/metal/dba_decode.py: raise ValueError(f"Metal DBA decode requires v_hd <= 256 (got v_hd={v_hd})")
optimizer/metal/dba_decode.py: ops = load_caramba_metal_ops(verbose=bool(verbose_build))
optimizer/metal/attention_training_test.py:from caramba.optimizer.runtime import metal_supported
optimizer/metal/attention_training_test.py:def _skip_if_no_metal_attention_extension() -> None:
optimizer/metal/attention_training_test.py: from caramba.optimizer.metal.attention_jit import load_caramba_metal_attention_ops
optimizer/metal/attention_training_test.py: _ = load_caramba_metal_attention_ops(verbose=False)
optimizer/metal/attention_training_test.py: pytest.skip(f"caramba metal attention extension unavailable: {e}")
optimizer/metal/attention_training_test.py:@pytest.mark.skipif(not metal_supported(), reason="Metal/MPS not supported on this platform")
optimizer/metal/attention_training_test.py:def test_metal_attention_training_forward_matches_sdpa(causal: bool) -> None:
optimizer/metal/attention_training_test.py: _skip_if_no_metal_attention_extension()
optimizer/metal/attention_training_test.py: from caramba.optimizer.metal.attention_training import MetalAttentionTraining
optimizer/metal/attention_training_test.py: y = MetalAttentionTraining().run(q=q, k=k, v=v, causal=causal, scale=scale, dropout_p=0.0)
optimizer/metal/attention_jit.py:"""JIT build + load the Metal attention training extension.
optimizer/metal/attention_jit.py:This is intentionally separate from `optimizer/metal/jit.py` to keep the
optimizer/metal/attention_jit.py:from caramba.optimizer.runtime import metal_supported, metal_build_tools_available
optimizer/metal/attention_jit.py: " `xcrun -sdk macosx --find metal`\n"
optimizer/metal/attention_jit.py: " `xcrun -sdk macosx --find metallib`\n"
optimizer/metal/attention_jit.py:def _compile_attention_metallib(*, out_dir: Path, verbose: bool) -> Path:
optimizer/metal/attention_jit.py: """Compile attention training Metal shaders -> metallib in `out_dir`."""
optimizer/metal/attention_jit.py: src = _this_dir() / "attention_train.metal"
optimizer/metal/attention_jit.py: metallib = out_dir / "caramba_attention_ops.metallib"
optimizer/metal/attention_jit.py: metal = _xcrun_find("metal")
optimizer/metal/rmsnorm_autograd_test.py:class TestMetalRMSNormAutograd:
optimizer/metal/rmsnorm_autograd_test.py: # Metal
optimizer/metal/rmsnorm_autograd_test.py: from caramba.optimizer.metal.rmsnorm import rmsnorm_fp16
optimizer/metal/dba_decode_test.py:from caramba.optimizer.runtime import metal_supported
optimizer/metal/dba_decode_test.py:@pytest.mark.skipif(not metal_supported(), reason="Metal/MPS not supported on this platform")
optimizer/metal/dba_decode_test.py:def test_metal_dba_decode_matches_reference() -> None:
optimizer/metal/dba_decode_test.py: from caramba.optimizer.metal.jit import load_caramba_metal_ops
optimizer/metal/dba_decode_test.py: _ = load_caramba_metal_ops(verbose=False)
optimizer/metal/dba_decode_test.py: pytest.skip(f"caramba metal extension unavailable: {e}")
optimizer/metal/dba_decode_test.py: from caramba.optimizer.metal import dba_decode_fp16
optimizer/metal/dba_decode_test.py: out_metal = dba_decode_fp16(
optimizer/metal/dba_decode_test.py: assert out_metal.shape == out_ref.shape
optimizer/metal/dba_decode_test.py: assert torch.allclose(out_metal, out_ref, atol=3e-2, rtol=3e-2)
optimizer/lion.py:"""Lion optimizer (with optional Metal fast path).
optimizer/lion.py:for MPS/Metal fused update experiments.
optimizer/lion.py: # Metal fused path (fp16 on MPS).
optimizer/metal/layernorm_autograd_test.py:from caramba.optimizer.runtime import metal_supported
optimizer/metal/layernorm_autograd_test.py:def _skip_if_no_metal_extension() -> None:
optimizer/metal/layernorm_autograd_test.py: from caramba.optimizer.metal.jit import load_caramba_metal_ops
optimizer/metal/layernorm_autograd_test.py: _ = load_caramba_metal_ops(verbose=False)
optimizer/metal/layernorm_autograd_test.py: pytest.skip(f"caramba metal extension unavailable: {e}")
optimizer/metal/layernorm_autograd_test.py:@pytest.mark.skipif(not metal_supported(), reason="Metal/MPS not supported on this platform")
optimizer/metal/layernorm_autograd_test.py: _skip_if_no_metal_extension()
optimizer/metal/layernorm_autograd_test.py: from caramba.optimizer.metal.layernorm import layernorm_fp16
optimizer/metal/layernorm_autograd_test.py:@pytest.mark.skipif(not metal_supported(), reason="Metal/MPS not supported on this platform")
optimizer/metal/layernorm_autograd_test.py: _skip_if_no_metal_extension()
optimizer/kernel_registry.py: metal_build_tools_available,
optimizer/kernel_registry.py: metal_supported,
optimizer/kernel_registry.py: metal_supported: bool
optimizer/kernel_registry.py: metal_build_tools_available: bool
optimizer/kernel_registry.py: metal_ops_loaded: bool
optimizer/kernel_registry.py: metal_supported=bool(metal_supported()),
optimizer/kernel_registry.py: metal_build_tools_available=bool(metal_build_tools_available()),
optimizer/kernel_registry.py: metal_ops_loaded=False,
optimizer/kernel_registry.py: # ---- Metal/MPS validation (compile+load extension at startup) ----
optimizer/kernel_registry.py: metal_ops_loaded = False
optimizer/metal/__init__.py:"""Metal (MPS) fused kernels for Apple Silicon.
optimizer/metal/__init__.py:attention *decode* path for MPS, backed by a custom Metal Shading Language kernel
optimizer/metal/__init__.py:softmax/value matmuls. The Metal kernel performs a numerically-stable, two-pass
optimizer/metal/__init__.py:from .dba_decode import dba_decode_fp16, metal_dba_decode_available
optimizer/metal/__init__.py:from .attention_training import MetalAttentionTraining, metal_attention_training_available
optimizer/metal/__init__.py:from .adamw import AdamWMasterStep, metal_adamw_available
optimizer/metal/__init__.py:from .rmsnorm import metal_rmsnorm_available, rmsnorm_fp16
optimizer/metal/__init__.py:from .layernorm import layernorm_fp16, metal_layernorm_available
optimizer/metal/__init__.py:from .rope import metal_rope_available, rope_fp16
optimizer/metal/__init__.py:from .lion import lion_fp16, metal_lion_available
optimizer/attention.py:- MPS training uses custom Metal fused attention (forward+backward).
optimizer/attention.py:from caramba.optimizer.metal.attention_training import MetalAttentionTraining
optimizer/attention.py: return MetalAttentionTraining().run(
optimizer/metal/jit.py:"""JIT build + load the Metal extension.
optimizer/metal/jit.py:from caramba.optimizer.runtime import metal_build_tools_available, metal_supported
optimizer/metal/jit.py: " `xcrun -sdk macosx --find metal`\n"
optimizer/metal/jit.py: " `xcrun -sdk macosx --find metallib`\n"
optimizer/metal/jit.py:def _compile_metallib(*, out_dir: Path, verbose: bool) -> Path:
optimizer/metal/jit.py: """Compile Metal shaders -> `caramba_ops.metallib` in `out_dir`."""
optimizer/metal/jit.py: _this_dir() / "dba_decode.metal",
optimizer/metal/jit.py: _this_dir() / "rmsnorm.metal",
optimizer/metal/jit.py: _this_dir() / "layernorm.metal",
optimizer/metal/jit.py: _this_dir() / "rope.metal",
optimizer/adamw_triton_test.py: # Matches Metal kernel semantics exactly.
optimizer/metal/attention_training.py:"""Metal fused attention training (forward + backward).
optimizer/metal/attention_training.py:from caramba.optimizer.runtime import metal_supported
optimizer/metal/attention_training.py:from .attention_jit import load_caramba_metal_attention_ops
optimizer/metal/attention_training.py:def metal_attention_training_available() -> bool:
optimizer/metal/attention_training.py: """Whether the runtime is capable of using the Metal attention training path."""
optimizer/metal/attention_training.py: return metal_supported()
optimizer/metal/attention_training.py:class _MetalAttnTrainFn(torch.autograd.Function):
optimizer/metal/attention_training.py: raise RuntimeError("MetalAttentionTraining requires device.type == 'mps'")
optimizer/metal/attention_training.py: raise RuntimeError(f"MetalAttentionTraining requires fp16 inputs on MPS (got dtype={q.dtype})")
optimizer/metal/attention_training.py: raise RuntimeError("MetalAttentionTraining requires q/k/v shapes to match (B,H,T,D)")
optimizer/adamw_triton_kernels.py:Implements the same update semantics as the Metal `adamw_master_step_fp16` kernel:
Script executed:
# Get the full PR diff summary to understand scope
git diff --stat main..HEAD 2>/dev/null | head -30
Repository: TheApeMachine/caramba
Exit code: 0
stdout:
.vscode/c_cpp_properties.json | 26 +
artifacts/checkpoints/.gitattributes | 35 +
artifacts/checkpoints/LICENSE.txt | 111 +
artifacts/checkpoints/README.md | 473 +
artifacts/checkpoints/USE_POLICY.md | 52 +
artifacts/checkpoints/config.json | 35 +
artifacts/checkpoints/generation_config.json | 9 +
artifacts/checkpoints/original/params.json | 12 +
artifacts/checkpoints/original/tokenizer.model | 128000 ++++++
artifacts/checkpoints/special_tokens_map.json | 16 +
artifacts/checkpoints/tokenizer.json | 410563 ++++++++++++++++++
artifacts/checkpoints/tokenizer_config.json | 2061 +
artifacts/mosaic/paper.pdf | Bin 589308 -> 598293 bytes
config/compute.py | 51 +
config/compute_test.py | 54 +
config/defaults.py | 9 +-
config/layer.py | 9 +-
config/presets/mosaic_resonant.yml | 122 +
config/target.py | 2 +
core/__init__.py | 6 +-
core/event_codec/__init__.py | 27 +-
core/event_codec/binary_codec.py | 81 -
core/event_codec/binary_codec_test.py | 33 -
core/event_codec/capnp_codec.py | 164 +
core/event_codec/capnp_codec_test.py | 70 +
core/event_codec/event.capnp | 38 +
core/event_codec/json_codec.py | 103 -
core/event_codec/stream_parser.py | 41 -
examples/lightning_vastai_demo.yml | 53 +
experiment/runner.py | 28 +-
Script executed:
# Check if there are actual implementation files for MOSAIC in this PR
git diff --name-only main..HEAD | grep -i mosaic | head -20
Repository: TheApeMachine/caramba
Exit code: 0
stdout:
artifacts/mosaic/paper.pdf
config/presets/mosaic_resonant.yml
research/ccp/Cognitive_Control_Plane_MOSAIC_Runtime.pdf
Script executed:
# Read more of the notes file to see if it documents the PRs actual changes
cat -n ./research/ccp/notes\ copy.md | tail -100
Repository: TheApeMachine/caramba
Exit code: 0
stdout:
163 optimizer: torch.optim.Optimizer
164 update_interval: int = 1000 # tokens between updates
165 replay_buffer_size: int = 10000
166
167 def maybe_step(self, loss: Tensor, token_count: int):
168 self.buffer.append((loss.detach(), token_count))
169
170 if token_count % self.update_interval == 0:
171 # Compute gradient on adapter params only
172 for p in self.model.parameters():
173 p.requires_grad = p in self.adapter_params
174
175 # Aggregate recent losses
176 recent_loss = torch.stack([l for l, _ in self.buffer[-100:]])
177 recent_loss.mean().backward()
178 self.optimizer.step()
179 self.optimizer.zero_grad()
180 ---
181
182 Phase 5: Native Tool Building
183 Implement the mechanism for the model to create new tools, not just use predefined ones.
184
185 Approach: The model emits a structured "tool definition" event that gets validated and registered.
186
187 Files to create/modify:
188
189 New file: ai/tools/builder.py - Tool builder/registry
190 core/event.py - Add ToolDefinition event type
191 infer/event_runtime.py - Handle tool creation events
192 Tool Definition Schema:
193
194 @dataclass
195 class ToolDefinition:
196 name: str
197 description: str
198 parameters: dict[str, ParameterSpec]
199 implementation: str # Python code or MCP endpoint
200 sandbox: bool = True # Run in sandbox by default
201
202 def validate(self) -> list[str]:
203 """Return list of validation errors, empty if valid."""
204 errors = []
205 if not self.name.isidentifier():
206 errors.append(f"Invalid tool name: {self.name}")
207 # ... more validation
208 return errors
209 Event flow:
210
211 ToolRegistry
212 ToolBuilder
213 EventBus
214 Model
215 ToolRegistry
216 ToolBuilder
217 EventBus
218 Model
219 alt
220 [Valid]
221 [Invalid]
222 Emit ToolDefinition event
223 Handle event
224 Validate definition
225 Register new tool
226 Emit ToolRegistered event
227 Emit ToolRejected event
228 Safety constraints:
229
230 Sandboxed execution (no filesystem/network by default)
231 Rate limiting on tool creation
232 Human approval for privileged tools
233 Automatic deprecation of unused tools
234 ---
235
236 Architecture Diagram
237 Tool System
238 Continuous Learning
239 MOSAIC Model
240 Autonomous Runtime
241 ToolBuilder
242 ToolRegistry
243 Fast: Memory Writes
244 Medium: Adapter Updates
245 Slow: Full Training
246 OpcodeHead
247 MosaicBlock
248 Memory
249 Registers
250 EventBus
251 HomeostaticLoop
252 ImpulseRouter
253 ---
254
255 Testing Strategy
256 Each phase includes verification tests:
257
258 Opcodes: Unit test that READ_MEM opcode gates memory read contribution
259 Homeostasis: Test that high-entropy state triggers impulse event
260 Idle compute: Test that consolidation runs during no-event periods
261 Continuous learning: Test adapter weights change during inference
262 Tool building: Test that valid ToolDefinition creates usable tool
Script executed:
# Check the specific PR changes to optimizer/metal and related fp32 files
git diff --stat main..HEAD | grep -E "(metal|fp32)" | head -20
Repository: TheApeMachine/caramba
Exit code: 0
stdout:
optimizer/metal/adamw.metal | 46 +-
optimizer/metal/adamw.py | 4 +-
optimizer/metal/layernorm.metal | 670 +-
optimizer/metal/layernorm.py | 26 +-
optimizer/metal/lion.metal | 40 +-
optimizer/metal/lion.py | 4 +-
optimizer/metal/ops.mm | 234 +-
optimizer/metal/rmsnorm.metal | 484 +-
optimizer/metal/rmsnorm.py | 27 +-
optimizer/metal/rope.metal | 108 +-
optimizer/metal/rope.py | 27 +-
Resolve mismatch between PR title and this document's scope.
The PR is titled "Feature/fp32 support for metal" and includes substantial Metal GPU fp32 optimizations (optimizer/metal/ kernels and wrappers). However, this markdown document describes a general MOSAIC Differentiable VM architecture roadmap with no Metal or fp32-specific content.
Either:
- This document should be removed from this PR, or
- Clarify whether it is intended as reference/design material for this PR (if so, note that in a commit message or comment), or
- Move this to a separate documentation or design PR
Also, consider renaming the file to remove "copy" from the filename.
🧰 Tools
🪛 markdownlint-cli2 (0.18.1)
1-1: First line in a file should be a top-level heading
(MD041, first-line-heading, first-line-h1)
5-5: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
🤖 Prompt for AI Agents
In @research/ccp/notes copy.md around lines 1 - 10, This file research/ccp/notes
copy.md is unrelated to the "Feature/fp32 support for metal" PR and should be
either removed or explicitly linked to the PR as design/reference material;
either delete the file from the branch, move it into a dedicated
documentation/design PR, or add a clear note in the commit message and at the
top of the file stating it is reference material for this PR (and rename to
remove "copy" from the filename) so reviewers understand its intent.
This plan implements the five missing architectural elements discussed in the meeting notes to transform MOSAIC into a fully autonomous "Differentiable Virtual Machine."

Current State Analysis
**What exists:**
Convert emphasis to a proper heading.
Line 5 uses bold emphasis (**What exists:**) where a Markdown heading would be more semantically appropriate (MD036).
📝 Proposed fix
-**What exists:**
+## What Exists
Consider converting other section titles similarly:
- Line 13: **What's missing:** → ## What's Missing
- Line 27: **Implementation skeleton:** → (already a comment, skip)
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
-What exists:
+## What Exists
🧰 Tools
🪛 markdownlint-cli2 (0.18.1)
5-5: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
🤖 Prompt for AI Agents
In @research/ccp/notes copy.md at line 5, Replace the bolded section title
"**What exists:**" with a proper Markdown heading (e.g., "## What exists") to
satisfy MD036; also apply the same treatment to similar titles such as changing
"**What's missing:**" to "## What's Missing" while leaving comments/sections
like "Implementation skeleton" unchanged.
---

Phase 1: Wire Opcodes to Control Behavior
Currently opcodes are emit-only. Wire them to actually gate subsystem operations.

Files to modify:
Fix ordered list numbering.
Lines 20–25 have inconsistent list prefixes (2, 3, 4, 5, 6, 7). Markdown list items should be numbered sequentially starting from 1 (MD029). This section appears to be a simple enumeration; use unordered lists (- or *) or restart numbering at 1.
📝 Proposed fix
-2. Opcodes don't control registers/memory (just logged)
-3. No impulse-driven awakening (only prompt-triggered)
-4. No idle-time compute loop (model waits passively)
-5. No medium/slow timescale learning (only fast writes)
-6. No tool creation mechanism (only tool usage)
+1. Opcodes don't control registers/memory (just logged)
+2. No impulse-driven awakening (only prompt-triggered)
+3. No idle-time compute loop (model waits passively)
+4. No medium/slow timescale learning (only fast writes)
+5. No tool creation mechanism (only tool usage)
🧰 Tools
🪛 markdownlint-cli2 (0.18.1)
20-20: Ordered list item prefix
Expected: 1; Actual: 2; Style: 1/1/1
(MD029, ol-prefix)
21-21: Ordered list item prefix
Expected: 1; Actual: 3; Style: 1/1/1
(MD029, ol-prefix)
22-22: Ordered list item prefix
Expected: 1; Actual: 4; Style: 1/1/1
(MD029, ol-prefix)
23-23: Ordered list item prefix
Expected: 1; Actual: 5; Style: 1/1/1
(MD029, ol-prefix)
24-24: Ordered list item prefix
Expected: 1; Actual: 6; Style: 1/1/1
(MD029, ol-prefix)
25-25: Trailing spaces
Expected: 0 or 2; Actual: 1
(MD009, no-trailing-spaces)
25-25: Ordered list item prefix
Expected: 1; Actual: 7; Style: 1/1/1
(MD029, ol-prefix)
🤖 Prompt for AI Agents
In @research/ccp/notes copy.md around lines 20 - 25, The ordered list under the
"Phase 1: Wire Opcodes to Control Behavior" heading has inconsistent numbering
(lines currently show 2–7) and triggers MD029; fix it by either converting the
items to an unordered list (using "-" or "*") or renumbering the list to start
at 1 and be sequential; update the list items referenced in that section so they
are consistently formatted and valid Markdown (e.g., restart numbering at 1 or
replace the numeric prefixes with "-" for each item).
Phase 1: Wire Opcodes to Control Behavior
Currently opcodes are emit-only. Wire them to actually gate subsystem operations.

Files to modify:
Remove trailing spaces.
Lines 25 and 39 contain trailing whitespace (MD009), which is not allowed per Markdown style guidelines.
🧹 Proposed fix
Use a linter or editor with trim-trailing-whitespace enabled to automatically fix these.
Alternatively, manually remove the trailing space on:
- Line 25 (after "modify:")
- Line 39 (after "# Gate operations based on opcode")
Also applies to: 39-39
🧰 Tools
🪛 markdownlint-cli2 (0.18.1)
25-25: Trailing spaces
Expected: 0 or 2; Actual: 1
(MD009, no-trailing-spaces)
25-25: Ordered list item prefix
Expected: 1; Actual: 7; Style: 1/1/1
(MD029, ol-prefix)
🤖 Prompt for AI Agents
In @research/ccp/notes copy.md at line 25, Remove the trailing whitespace
characters on the two affected Markdown lines: delete the extra space after
"modify:" (the text "Files to modify:") and after "# Gate operations based on
opcode" to satisfy MD009; you can enable trim-trailing-whitespace in your editor
or run a markdown linter/formatter to strip trailing spaces automatically and
re-run the linter to confirm the fixes.
runtime/engine/lightning_engine.py
Outdated
dataset_comp = self.registry.build(target.data, backend="torch")
system = self.registry.build(target.system, backend="torch")
objective = self.registry.build(target.objective, backend="torch")

# For Lightning, we typically run one "representative" run or handle multiple?
# Standard Caramba runner iterates over target.runs.
results = {}
for run in target.runs:
    train_cfg = run.train
    if not train_cfg:
        continue

    # Wrap in LightningModule
    lightning_model = CarambaLightningModule(system, objective, train_cfg)
System is shared across runs without re-initialization.
The system is built once (line 78) but used for all runs in the loop (line 90). If runs are meant to be independent experiments, this causes weight state to carry over. Either rebuild the system per run or document that runs are sequential fine-tuning.
🔧 Option A: Rebuild per run
results = {}
for run in target.runs:
    train_cfg = run.train
    if not train_cfg:
        continue

+   # Build fresh system per run for independent experiments
+   system = self.registry.build(target.system, backend="torch")
+   objective = self.registry.build(target.objective, backend="torch")
+
    # Wrap in LightningModule
    lightning_model = CarambaLightningModule(system, objective, train_cfg)
🔧 Option B: Add documentation
- # For Lightning, we typically run one "representative" run or handle multiple?
- # Standard Caramba runner iterates over target.runs.
+ # Note: system and objective are shared across runs, so runs represent
+ # sequential training (weights carry over). For independent runs,
+ # move system/objective building inside the loop.
results = {}
Review continued from previous batch...
core/event_codec/capnp_codec.py
Outdated
def _load_schema():
    """Load the Cap'n Proto schema file."""
    schema_path = Path(__file__).parent / "event.capnp"

    if not schema_path.exists():
        raise FileNotFoundError(f"Cap'n Proto schema not found: {schema_path}")

    return capnp.load(str(schema_path))
🧹 Nitpick | 🔵 Trivial
Optional: Add return type annotations to private functions.
While not critical, adding return type annotations improves IDE support and type checking. The static analyzer flags missing annotations for _load_schema() and _get_schema().
📝 Proposed type annotations
-def _load_schema():
+def _load_schema() -> Any:  # or the specific capnp module type if available
     """Load the Cap'n Proto schema file."""
-def _get_schema():
+def _get_schema() -> Any:  # or the specific capnp module type if available
     global _schema
Note: The exact type depends on the pycapnp library's type stubs. Using Any is acceptable if precise types are unavailable.
Also applies to: 34-38
🧰 Tools
🪛 Ruff (0.14.10)
20-20: Missing return type annotation for private function _load_schema
(ANN202)
25-25: Avoid specifying long messages outside the exception class
(TRY003)
🤖 Prompt for AI Agents
In @core/event_codec/capnp_codec.py around lines 20 - 27, Add return type
annotations for the private helper functions to satisfy the static analyzer:
annotate _load_schema() and _get_schema() with an appropriate return type (use
typing.Any if pycapnp types are unavailable), and add an import for Any from
typing. Ensure the function signatures include the return annotation (e.g., def
_load_schema() -> Any:) and keep behavior unchanged.
core/event_codec/capnp_codec.py
Outdated
if not buf:
    raise ValueError("Cap'n Proto serialization produced empty bytes")

return torch.tensor(list(buf), dtype=torch.long)
🧹 Nitpick | 🔵 Trivial
Consider using torch.uint8 instead of torch.long for byte storage.
The current implementation uses torch.long (int64) to store byte values, which consumes 8x more memory than necessary. Since Cap'n Proto serialization produces bytes (0-255), using torch.uint8 would be more memory-efficient and semantically correct.
♻️ Proposed refactor to use torch.uint8
-    return torch.tensor(list(buf), dtype=torch.long)
+    return torch.tensor(list(buf), dtype=torch.uint8)
Additionally, update the decoder at line 119 to match:
-    vals = ids.detach().cpu().to(dtype=torch.int64).tolist()
+    vals = ids.detach().cpu().to(dtype=torch.uint8).tolist()
And line 93 in encode_padded:
-    ids = torch.full((batch_size, max_len), pad, dtype=torch.long)
+    ids = torch.full((batch_size, max_len), pad, dtype=torch.uint8)
This change reduces memory footprint by 87.5% for encoded events.
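For scale, an isolated check with plain torch tensors (not the codec itself) shows where the 87.5% figure comes from: int64 elements take 8 bytes each, uint8 elements take 1.

import torch

buf = bytes(range(256)) * 4  # 1024 bytes of sample payload
as_long = torch.tensor(list(buf), dtype=torch.long)
as_u8 = torch.tensor(list(buf), dtype=torch.uint8)
print(as_long.nelement() * as_long.element_size())  # 8192 bytes
print(as_u8.nelement() * as_u8.element_size())      # 1024 bytes, i.e. 87.5% smaller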
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
-return torch.tensor(list(buf), dtype=torch.long)
+return torch.tensor(list(buf), dtype=torch.uint8)
🤖 Prompt for AI Agents
In @core/event_codec/capnp_codec.py at line 73, The tensor created from the byte
buffer is using dtype=torch.long and should use dtype=torch.uint8 to store bytes
efficiently: change the return in the function that does "return
torch.tensor(list(buf), dtype=torch.long)" to use torch.uint8; also update the
corresponding decoder function (the decoder at the review note) and the
encode_padded implementation (encode_padded) to produce and consume torch.uint8
tensors, and add explicit casting to larger integer types only where arithmetic
expects int64 so behavior doesn’t change.
encoder: EventEncoder = field(default_factory=EventEncoder)
decoder: EventDecoder = field(default_factory=EventDecoder)
delimiter: int = 0  # Cap'n Proto uses 0-byte as segment delimiter
🧩 Analysis chain
🌐 Web query:
Cap'n Proto message framing format segment delimiter null byte
💡 Result:
Short answer: No — there is no extra "null byte" delimiter between segments. The stream framing is:
- 4 bytes: (segment count − 1), little-endian.
- N × 4 bytes: each segment length in 64-bit words, little-endian.
- 0 or 4 bytes padding to the next 8-byte (word) boundary.
- Then the raw contents of each segment concatenated in order.
Segments are just concatenated back-to-back (with the above header/padding), and packing uses its own tag bytes inside words — there is no separate null-byte segment delimiter. [1][2]
Sources:
[1] Cap'n Proto Encoding / Serialization Over a Stream.
[2] Stack Overflow discussion: Cap'n Proto supports delimited messages.
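As a rough illustration of that layout (a sketch based only on the framing described above, not on pycapnp's API), the stream header can be decoded like this:

import struct

def parse_capnp_stream_header(buf: bytes) -> tuple[list[int], int]:
    # 4 bytes: (segment count - 1), little-endian.
    (count_minus_one,) = struct.unpack_from("<I", buf, 0)
    n = count_minus_one + 1
    # N x 4 bytes: each segment length in 64-bit words, little-endian.
    sizes = list(struct.unpack_from(f"<{n}I", buf, 4))
    header_len = 4 + 4 * n
    if header_len % 8:
        header_len += 4  # 0 or 4 bytes of padding to the next word boundary
    return sizes, header_len  # segment data starts at header_len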
🏁 Script executed:
# Find how delimiter is used in the codebase
rg "delimiter" infer/ -B 2 -A 2Repository: TheApeMachine/caramba
Length of output: 2833
🏁 Script executed:
# Look at the EventStreamCodec class and its usage of delimiter
rg -A 10 "class EventStreamCodec" infer/event_runtime.pyRepository: TheApeMachine/caramba
Length of output: 634
🏁 Script executed:
# Check if there's any framing logic or how messages are actually serialized
rg "EventEncoder|EventDecoder" infer/ -A 3 | head -50Repository: TheApeMachine/caramba
Length of output: 701
🏁 Script executed:
# Find EventEncoder and EventDecoder definitions
fd -e py "event_codec" infer/ --exec head -100 {} \;Repository: TheApeMachine/caramba
Length of output: 47
🏁 Script executed:
# Search for EventEncoder implementation details
rg "class EventEncoder" -A 20Repository: TheApeMachine/caramba
Length of output: 47
🏁 Script executed:
# Check EventDecoder to see how it handles/expects the delimiter
rg "class EventDecoder" -A 20Repository: TheApeMachine/caramba
Length of output: 47
🏁 Script executed:
# Look for token vocabulary or encoder range information
rg "EventEncoder|EventDecoder|vocab" infer/ -B 1 -A 1 | head -60Repository: TheApeMachine/caramba
Length of output: 3546
🏁 Script executed:
# Find event_codec file location
fd "event_codec" --type fRepository: TheApeMachine/caramba
Length of output: 47
🏁 Script executed:
# Check core directory structure
fd --type f -e py caramba/core/ | head -20
Repository: TheApeMachine/caramba
Length of output: 47
🏁 Script executed:
# If file exists, read EventEncoder implementation
find . -name "event_codec.py" -exec head -150 {} \;Repository: TheApeMachine/caramba
Length of output: 47
🏁 Script executed:
# List the actual directory structure
git ls-files | grep -E "caramba|core|event" | head -30Repository: TheApeMachine/caramba
Length of output: 947
🏁 Script executed:
# Check for requirements or setup files that might indicate external dependencies
git ls-files | grep -E "requirements|setup|pyproject|poetry"Repository: TheApeMachine/caramba
Length of output: 384
🏁 Script executed:
# Look at the imports more carefully in event_runtime.py
head -30 infer/event_runtime.py
Repository: TheApeMachine/caramba
Length of output: 1008
🏁 Script executed:
# Check what's exported from event_codec/__init__.py
cat core/event_codec/__init__.py
Repository: TheApeMachine/caramba
Length of output: 611
🏁 Script executed:
# Read the capnp_codec.py file to find EventEncoder and EventDecoder definitions
cat core/event_codec/capnp_codec.py
Repository: TheApeMachine/caramba
Length of output: 5890
The comment about Cap'n Proto using 0-byte delimiters is incorrect, but the delimiter choice is still problematic.
Cap'n Proto uses length-prefixed framing (4-byte segment count, lengths, padding), not null-byte delimiters. However, Cap'n Proto can produce null bytes in its serialized output, which will collide with the delimiter=0 value appended by encode_with_delimiter(). The decoder cannot distinguish between a 0 byte that is part of the Cap'n Proto message and the 0 byte terminator. Either use a different delimiter byte (e.g., 255), implement length-prefixed framing like Cap'n Proto does internally, or document why this collision is safe in your use case.
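If the length-prefixed option is chosen, a minimal sketch (hypothetical frame/iter_frames helpers, not the existing EventStreamCodec API) could look like:

import struct

def frame(payload: bytes) -> bytes:
    # 4-byte little-endian length prefix; payload bytes (including 0x00) stay opaque.
    return struct.pack("<I", len(payload)) + payload

def iter_frames(stream: bytes):
    offset = 0
    while offset < len(stream):
        (length,) = struct.unpack_from("<I", stream, offset)
        offset += 4
        yield stream[offset:offset + length]
        offset += length

msgs = [b"\x00\x01capnp", b"second\x00msg"]
assert list(iter_frames(b"".join(frame(m) for m in msgs))) == msgs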
🤖 Prompt for AI Agents
In @infer/event_runtime.py around lines 54 - 56, The delimiter=0 field and
comment are wrong for Cap'n Proto framing and risk collisions because Cap'n
Proto payloads can contain 0 bytes; change the framing: either replace delimiter
default with a safe unused byte (e.g., delimiter: int = 255) and update
encode_with_delimiter() and the decoder logic in EventEncoder/EventDecoder to
use that byte, or implement proper length-prefixed framing (write a 4-byte
length before each encoded message and update decode logic to read length then
that many bytes) so the decoder can unambiguously parse messages; also remove or
correct the misleading comment about Cap'n Proto using 0-byte delimiters.
| if "resonant_final_sim" in routing: | ||
| res_metrics = ResonantSettlingMetrics( | ||
| final_sim=float(routing["resonant_final_sim"]), | ||
| convergence_steps=int(routing.get("resonant_convergence_steps", 0)), | ||
| energy_drop=0.0, # TODO: implement if needed | ||
| bucket_entropy=float(routing.get("resonant_bucket_entropy", 0.0)), | ||
| state_drift=0.0, | ||
| ) |
🧩 Analysis chain
🏁 Script executed:
# First, find the routing.py file and verify its location
find . -name "routing.py" -type f | head -20Repository: TheApeMachine/caramba
Length of output: 103
🏁 Script executed:
# Search for resonant_convergence_steps in the entire codebase
rg -n "resonant_convergence_steps" --type pyRepository: TheApeMachine/caramba
Length of output: 189
🏁 Script executed:
# Check if ResonantRouter class exists and where it's defined
rg -n "class ResonantRouter" --type pyRepository: TheApeMachine/caramba
Length of output: 138
🏁 Script executed:
# Read the ResonantRouter class, starting from line 179
sed -n '179,350p' layer/memory_block/memory/routing.py
Repository: TheApeMachine/caramba
Length of output: 5593
🏁 Script executed:
# Check the context around line 527 in memory.py to understand the flow
sed -n '510,540p' layer/memory_block/memory/memory.py
Repository: TheApeMachine/caramba
Length of output: 1612
resonant_convergence_steps is never populated by ResonantRouter.
ResonantRouter.route() does not add resonant_convergence_steps to the aux dict, so this will always default to 0. The method tracks energy history internally but never exports it. Either add convergence step tracking to ResonantRouter or remove this field from ResonantSettlingMetrics.
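One way the first option could look, assuming the router keeps a per-call list of settling energies (illustrative names, not the actual ResonantRouter internals):

def convergence_steps(energy_history: list[float], tol: float = 1e-4) -> int:
    # First step at which the energy change falls below tol; otherwise the full length.
    for i in range(1, len(energy_history)):
        if abs(energy_history[i] - energy_history[i - 1]) < tol:
            return i
    return len(energy_history)

# Inside route(), after the settling loop, the aux dict could then be populated with:
# aux["resonant_convergence_steps"] = convergence_steps(energy_history)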
🤖 Prompt for AI Agents
In @layer/memory_block/memory/memory.py around lines 524 - 531,
ResonantSettlingMetrics is being populated with resonant_convergence_steps from
routing but ResonantRouter.route() never sets "resonant_convergence_steps", so
it always defaults to 0; either have ResonantRouter.route() compute and emit
convergence steps into its aux/routing dict (e.g., track when energy history
converges and set "resonant_convergence_steps") or remove the field usage here
and stop reading "resonant_convergence_steps" in memory.py; locate
ResonantRouter.route and the place where routing/aux is returned to add the
convergence counter, or adjust the construction of ResonantSettlingMetrics to
omit/respectably default that metric.
layer/memory_block/memory/tuner.py
Outdated
if self.resonant_sim_ema is not None:
    if self.resonant_sim_ema < 0.4:
        # Weak coupling -> increase target
        self.target_resonant_coupling = min(5.0, self.target_resonant_coupling * 1.05)
    elif self.resonant_sim_ema > 0.9:
        # Strong overlap -> relax coupling
        self.target_resonant_coupling = max(0.2, self.target_resonant_coupling * 0.98)

if self.resonant_steps_ema is not None and self.resonant_steps_ema > 15:
    self.target_resonant_steps_delta = min(20.0, self.target_resonant_steps_delta + 0.1)

# --- Storage/Write Logic ---
if self.utilization_ema is not None:
    if self.utilization_ema < self.target_utilization * 0.5:
        # Starvation -> lower targets to admit more
        self.target_write_threshold = max(0.1, self.target_write_threshold * 0.95)
        self.target_vsa_novelty = max(0.1, self.target_vsa_novelty * 0.95)

if self.conflict_ema is not None and self.conflict_ema > 0.3:
    # Saturation -> increase targets
    self.target_vsa_novelty = min(10.0, self.target_vsa_novelty * 1.05)
    self.target_write_threshold = min(10.0, self.target_write_threshold * 1.05)
🧹 Nitpick | 🔵 Trivial
Document or extract magic numbers as named constants.
The heuristic thresholds (0.4, 0.9, 15, 0.5, 0.3) and adjustment factors (1.05, 0.98, 0.95) appear to be empirically derived but lack documentation. Consider extracting these as class constants with descriptive names or documenting their rationale in comments.
📚 Example refactor with named constants
Add class constants at the beginning of the class:
class UniversalMemoryTuner:
    """Cybernetic controller for memory blocks."""
    # Heuristic thresholds
    WEAK_COUPLING_THRESHOLD = 0.4
    STRONG_OVERLAP_THRESHOLD = 0.9
    SLOW_CONVERGENCE_STEPS = 15
    STARVATION_THRESHOLD = 0.5  # fraction of target_utilization
    HIGH_CONFLICT_THRESHOLD = 0.3
    # Adjustment factors
    INCREASE_FACTOR = 1.05
    DECREASE_FACTOR = 0.98
    AGGRESSIVE_DECREASE_FACTOR = 0.95
Then update the heuristics:
if self.resonant_sim_ema is not None:
-    if self.resonant_sim_ema < 0.4:
+    if self.resonant_sim_ema < self.WEAK_COUPLING_THRESHOLD:
        # Weak coupling -> increase target
-        self.target_resonant_coupling = min(5.0, self.target_resonant_coupling * 1.05)
+        self.target_resonant_coupling = min(5.0, self.target_resonant_coupling * self.INCREASE_FACTOR)
This improves maintainability and makes tuning adjustments easier.
layer/memory_block/memory/tuner.py
Outdated
# 3. Update state
# Convert back to int for discrete levers
if actual_name == "resonant_steps_delta":
    setattr(self, actual_name, int(round(final_val)))
🧹 Nitpick | 🔵 Trivial
Remove redundant int() call.
The round() function already returns an int in Python 3, making the outer int() call unnecessary. While not harmful, removing it improves clarity.
✨ Proposed simplification
- setattr(self, actual_name, int(round(final_val)))
+ setattr(self, actual_name, round(final_val))📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
-setattr(self, actual_name, int(round(final_val)))
+setattr(self, actual_name, round(final_val))
🧰 Tools
🪛 Ruff (0.14.10)
170-170: Value being cast to int is already an integer
Remove unnecessary int call
(RUF046)
🤖 Prompt for AI Agents
In @layer/memory_block/memory/tuner.py at line 170, The setattr call in
MemoryTuner (look for setattr(self, actual_name, int(round(final_val)))) uses an
unnecessary outer int() because round(final_val) already returns an int in
Python 3; remove the redundant int() and set the attribute with round(final_val)
directly to improve clarity.
…ore, added CCL trainer, and improved error handling across various modules.
Actionable comments posted: 81
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (6)
api/app.py (1)
281-294: Narrow theexcept Exceptionaroundos.kill()toOSError(fix Ruff BLE001).
os.kill()raisesOSErrorsubclasses; catchingExceptioncan mask unrelated bugs.Proposed diff
if proc is None: # best-effort by pid try: os.kill(pid, signal.SIGTERM) except ProcessLookupError: return True - except Exception as e: + except OSError as e: logger.warning(f"StopRun: Failed to terminate process {pid} (best-effort): {e}") return False return Trueai/worker.py (1)
28-40: Add error handling to prevent worker crashes.The infinite loop in the
run()method has no error handling around the database operations. Iflist_tasksraises an exception (e.g., database connection failure), the entire worker will crash and stop processing tasks.🛡️ Proposed fix to add error handling
async def run(self) -> None: """Run the background worker.""" logger.trace("Running TaskWorker") while True: - await asyncio.sleep(1) - logger.trace("Polling for pending tasks") - tasks = [t async for t in self.task_store.list_tasks(tenant=self.tenant)] - logger.trace(f"Found {len(tasks)} pending tasks") - - for task in tasks: - logger.trace(f"Enqueuing task {task.id}") - await self.queue.put(task) + try: + await asyncio.sleep(1) + logger.trace("Polling for pending tasks") + tasks = [t async for t in self.task_store.list_tasks(tenant=self.tenant)] + logger.trace(f"Found {len(tasks)} pending tasks") + + for task in tasks: + logger.trace(f"Enqueuing task {task.id}") + await self.queue.put(task) + except Exception as e: + logger.error(f"Error polling tasks: {e}", exc_info=True) + await asyncio.sleep(5) # Back off on errorai/push_notifications.py (1)
11-11: Pin the httpx dependency to a specific version in pyproject.toml.The httpx library is listed in
pyproject.toml(line 64) without any version constraint. This allows any version to be installed, creating non-reproducible builds and risking breaking changes or security vulnerabilities. Specify an explicit version constraint (e.g.,"httpx>=0.24.0,<0.25.0").benchmark/artifacts.py (1)
155-163: Add BLE001 suppression and optionally format exception details more concisely.The
except Exceptionon line 159 is intentional for best-effort behavior (never fail the artifact run). Add# noqa: BLE001 – best-effort artifact generationto suppress the Ruff lint warning and document the justification.Line 162: Consider formatting the exception details as
f"({type(e).__name__}: {e})"for consistency with similar patterns in the codebase (e.g., trainer/ccl.py, trainer/upcycle.py); this reduces verbosity if the exception message is large. The import on line 161 is safe and widely used throughout the codebase with no documented issues.Suggested improvement (optional)
- except Exception as e: + except Exception as e: # noqa: BLE001 – best-effort artifact generation # Best-effort: never fail the run due to LaTeX formatting. from caramba.console import logger - logger.warning(f"ArtifactGenerator: Failed to write LaTeX behavior table: {e}") + logger.warning( + f"ArtifactGenerator: Failed to write LaTeX behavior table " + f"({type(e).__name__}: {e})" + )core/task_queue.py (1)
45-64: Use timezone-aware timestamps (and ideally TIMESTAMPTZ) for consistency across machines.Right now you write naive
datetime.now()intoTIMESTAMPcolumns (Ruff DTZ005 flagged Line 135), which is ambiguous if hosts differ in local timezone/DST. At minimum, write UTC-aware datetimes consistently; ideally, store asTIMESTAMPTZ.Minimal code-side fix (UTC-aware)
@@ -from datetime import datetime +from datetime import datetime, timezone @@ - datetime.now(), + datetime.now(timezone.utc), @@ - datetime.now(), + datetime.now(timezone.utc),Schema improvement (breaking; consider migration)
@@ - created_at TIMESTAMP NOT NULL, - updated_at TIMESTAMP NOT NULL + created_at TIMESTAMPTZ NOT NULL, + updated_at TIMESTAMPTZ NOT NULLAlso applies to: 135-136, 160-163
layer/memory_block/block/layer.py (1)
288-293: Variablestepis re-declared, shadowing the earlier assignment.Line 291 re-declares
stepfromctx.step, butstepwas already computed on lines 240-247 with fallback logic. This shadows the earlier value and could cause inconsistent behavior ifctx.stepdiffers from the fallback-computed step.If the intention is to use
ctx.stepspecifically for the write warmup check, consider using a different variable name.🔧 Suggested fix
if ctx is not None: warm = int(getattr(ctx, "memblock_write_warmup_steps", 0) or 0) - step = int(getattr(ctx, "step", 0) or 0) - if warm > 0 and step > 0 and step <= warm: + ctx_step = int(getattr(ctx, "step", 0) or 0) + if warm > 0 and ctx_step > 0 and ctx_step <= warm: write_mask = torch.zeros((int(B), int(T)), device=x.device, dtype=torch.float32)
🤖 Fix all issues with AI agents
In @.gitignore:
- Line 35: Remove the redundant specific artifact ignore entries now covered by
the broad "artifacts/" pattern: delete the lines containing
"artifacts/llama32_1b_dba_paper_efficiency/" and "artifacts/platform_improve/"
(and any duplicate occurrences such as the one noted at line 43) so only the
single "artifacts/" pattern remains.
In @ai/push_notifications.py:
- Around line 124-138: The credential normalization currently uses unsafe
str(val) conversions; change the logic in the credential handling block
(variables: cred, val, key, headers) to only treat a value as a token if it is
already a string (isinstance(val, str)); if val is bytes try decoding with utf-8
and use that string; otherwise treat it as invalid (leave key None) so you don't
set headers["Authorization"] to a coerced non-string token. Ensure the
if-branches that set key use these validations (replace str(val) calls) so only
legitimate string tokens populate headers.
In @ai/server.py:
- Around line 270-276: The in-loop import of MessageSendParams should be moved
out of the loop and merged with the module-level import from a2a.types (where
RequestContext is already imported); remove the repeated "from a2a.types import
MessageSendParams" inside the while True loop and add MessageSendParams to the
existing top-level import list so the RequestContext(...) call can continue to
use MessageSendParams without runtime repeated imports.
In @ai/worker.py:
- Line 12: Several files still import from the old module path
`a2a.server.tasks`; update those imports to the new internal path under
`caramba.ai.task` (e.g., replace `a2a.server.tasks` with `caramba.ai.task` in
ai/push_notifications.py, ai/server.py (all three import sites),
ai/task_store.py, ai/executor.py, and ai/task/store/__init__.py), ensuring each
referenced symbol such as TaskStore and any task-related classes/functions keep
their original names but point to the new module; after changes, run the test
suite or import checks to verify there are no broken references.
In @ccl/context_counts.py:
- Around line 119-124: The assignment uses a redundant int() around len(uniq);
replace "num_classes = int(len(uniq))" with "num_classes = len(uniq)" to avoid
the unnecessary cast while preserving behavior (uniq and len(uniq) are already
ints); keep the subsequent validation "if int(num_classes) <= 0" as-is or
optionally remove its cast if you want consistency.
- Line 11: Replace legacy typing imports on the top-level import line in
ccl/context_counts.py by removing Dict, List, Tuple, Optional and instead import
Iterable and Sequence from collections.abc; then update all type annotations in
this module: use built-in generic types (dict[int, int], list[ClassCountsModel],
tuple[int, int], etc.) and replace Optional[SparseCounts] with the union syntax
SparseCounts | None; ensure any remaining Iterable/Sequence references are using
the imported names from collections.abc rather than typing.
In @ccl/patch_vq.py:
- Around line 76-82: The unpacked but unused variables should be prefixed with
an underscore to satisfy linters: change the unpack of x.shape from "n, d =
x.shape" to "n, _d = x.shape" (or "n, _ = x.shape") and similarly rename the
unused unpacked "k" at the later unpack site to "_k" (or "_") so the variables
are clearly marked as intentionally unused; update any references if they were
accidentally relying on those names.
- Around line 164-177: The _decode_grid function currently accepts out_h and
out_w but never uses them; either remove these parameters from the _decode_grid
signature and docstring and update all call sites that pass out_h/out_w to stop
providing them (search for _decode_grid usages), or keep the parameters and add
validation inside _decode_grid that computed h and w (from ht,wt,stride,patch)
match out_h and out_w (raise a clear ValueError with context if they differ);
update callers only if you choose to remove the parameters.
In @ccl/system.py:
- Around line 1-8: The module docstring in ccl.system is placed after the from
__future__ import annotations statement so it isn't recognized as the module
__doc__; move the triple-quoted module docstring to be the very first statement
in the file (before from __future__ import annotations) so it becomes the module
docstring while keeping the existing content and preserving the future import.
- Around line 82-90: The local import of loglik_grid inside the forward
implementation causes unnecessary per-call lookup overhead; move the import to
module scope by adding "from caramba.ccl.context_counts import loglik_grid" at
the top of ccl/system.py and remove the in-method import on line containing
"from caramba.ccl.context_counts import loglik_grid", leaving the loop that
calls loglik_grid(m, grid) unchanged.
In @codegraph/parser.py:
- Line 271: The warning log currently uses an f-string which eagerly formats
even when the message will be suppressed; change the call to use logging's
parameterized form by passing a format string and arguments to logger.warning
(replace logger.warning(f"Codegraph: Failed to read {file_path}: {e}") with a
parameterized call that references file_path and e), ensuring lazy evaluation
and preserving the same message content and variables.
In @config/presets/ccl_mnist.yml:
- Around line 85-86: The file ends with a trailing blank line after the YAML key
"dtype: float32"; remove the extra empty line at the end of
config/presets/ccl_mnist.yml so the file terminates immediately after the
"dtype: float32" line (no final blank line) to satisfy YAML linting.
In @console/logger.py:
- Around line 352-360: The conditional sets speed_style to an undefined theme
key "amber" (in the velocity < 0 branch), causing fallback; update the velocity
branch in the logger where speed_style and arrow are set (the block assigning
speed_style/arrow for velocity >0, <0, ==0) to use the theme's correct key
"warning" instead of "amber" so negative velocities use the defined warning
style.
- Around line 536-543: The finally block cleans _live_display and
_live_renderable but leaves panel state intact, so clear both panels on exit by
resetting self._tuner_panel and self._health_panel to None in the same cleanup
path; locate the context manager where self._live_display is stopped and add
nullification of self._tuner_panel and self._health_panel (or call their
appropriate cleanup methods if present) so no stale panel data persists when the
logger is reused.
- Around line 404-406: Remove the redundant inner imports of Table, Panel, and
Text in console/logger.py: delete the lines "from rich.table import Table",
"from rich.panel import Panel", and "from rich.text import Text" that appear
inside the function/block (they duplicate the top-of-file imports for Table,
Panel, Text); ensure no local shadowing remains and run lint/tests to confirm
nothing else relies on those inner imports.
- Around line 55-59: The __init__ currently sets self.console,
self._live_display and self._live_renderable but never initializes
self._tuner_panel and self._health_panel used by tuner_status() and
health_bars(); update the __init__ of the Console class to set self._tuner_panel
= None and self._health_panel = None (alongside existing initializations) so
those attributes exist predictably and avoid hasattr() reliance when
tuner_status() or health_bars() run.
In @core/task_queue.py:
- Around line 78-83: push() mutates the caller's Task by assigning
task.status.state = TaskState.submitted and uses naive datetime.now(); instead,
avoid mutating the input Task by creating a local copy (e.g., deep copy or new
Task/TaskStatus instance) and set the state on that local copy before
persisting, and change all naive datetime.now() calls in push() and update()
(and the other occurrences at the noted sites) to timezone-aware calls such as
datetime.now(timezone.utc) and ensure timezone is imported from datetime. Keep
modifications scoped to push(), update(), and the datetime usages so callers’
Task objects remain untouched and timestamps are UTC-aware.
In @data/hf_image_classification.py:
- Line 10: The import statement brings in Callable and Optional which are
unused; remove those two names from the typing import so only actually used
symbols (e.g., Any) remain. Update the import line that currently reads "from
typing import Any, Callable, Optional" to import only the symbols used in this
module (remove Callable and Optional).
- Around line 53-60: The current branch that checks isinstance(img, Image.Image)
then calls Image.open(img) should also detect dict-style HuggingFace image
entries (e.g., {"bytes": ..., "path": ...}) and handle them explicitly: if img
is a dict and contains "bytes", wrap the bytes in io.BytesIO and call Image.open
on that stream; if it contains "path" use that path with Image.open; preserve
the existing TypeError fallback for unsupported types and ensure any caught
exception is raised from the original for traceback clarity (refer to the
variables img, im and the Image.open call).
In @data/icl_rule_test.py:
- Around line 52-57: The code sets V = int(256) unnecessarily; change it to V =
256 in the test so V is a plain integer literal. Update the assignment in
data/icl_rule_test.py where V is defined (near the logits construction that uses
V) to remove the redundant int() call.
In @infer/event_runtime_test.py:
- Line 49: Remove the unused local variable prompt_len from the test (it’s
assigned but never used); update the code in infer/event_runtime_test.py to
delete the line setting prompt_len = 71 so the test relies on self._prompt_len
as intended and no unused-variable remains.
- Around line 83-84: Move the mid-file import "from caramba.core.event_codec
import EventEncoder" into the module's top import block with the other imports
and remove the duplicate from between class definitions; update import ordering
to match project style (group stdlib, third-party, local) if needed and ensure
any references to EventEncoder in the classes still work after the move (no
local shadowing or conditional imports).
In @infer/event_runtime.py:
- Around line 219-234: The except Exception: pass in the replay recording block
silently swallows errors from building or adding the replay seq (involving
self.replay_max_len, prompt, out_ids, delim, self.replay.add), losing visibility
into failures; change it to catch Exception as e and log the failure (using the
module/class logger or process logger) with contextual details (e.g., values or
shapes of prompt/out_ids, max_len, delim and the exception message/stack) while
preserving best-effort behavior (do not re-raise so inference continues).
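A hedged example of the logging change; _build_replay_seq is a hypothetical stand-in for however the replay sequence is assembled, logger is whatever module/process logger event_runtime.py already uses, and exc_info assumes a stdlib-style logger:
try:
    seq = _build_replay_seq(prompt, out_ids, delim, max_len=self.replay_max_len)
    self.replay.add(seq)
except Exception as e:
    # Best-effort: inference keeps running, but the failure is now visible.
    logger.warning(
        "Replay recording failed (prompt=%s, out_ids=%s, max_len=%s, delim=%r): %s",
        getattr(prompt, "shape", None), getattr(out_ids, "shape", None),
        self.replay_max_len, delim, e, exc_info=True,
    )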
In @instrumentation/run_logger.py:
- Around line 67-72: Move the inline import of console_logger out of the
exception handler and into the module top-level (e.g., add "from caramba.console
import logger as console_logger" at the top of instrumentation/run_logger.py),
then remove the local import inside the except block that logs the failure to
create self.out_dir; keep the same warning call and setting self.enabled = False
so behavior of RunLogger remains unchanged.
In @instrumentation/training_metrics.py:
- Around line 36-49: The module claims "zero-overhead" yet the property getters
loss, accuracy, and step acquire self._lock, so update the module docstring and
the singleton comment to reflect that reads are protected and incur minimal
locking overhead (not literally zero), and add a short note advising callers
that very frequent reads (per-step) should be batched or sampled to avoid
unnecessary lock contention; locate and edit the module-level docstring and any
comment around the singleton and the properties loss, accuracy, step to make
this clarification.
- Around line 12-25: The class TrainingMetrics is annotated with @dataclass but
defines a manual __init__ and uses object.__setattr__ unnecessarily; replace the
custom initializer by keeping @dataclass and declare _lock =
field(default_factory=Lock) with _loss: float | None = None, _accuracy: float |
None = None, _step: int = 0, remove the __init__ method entirely, and import
field from dataclasses, or alternatively remove @dataclass and keep the existing
__init__; update references to _lock, _loss, _accuracy, and _step accordingly.
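A minimal sketch of the dataclass variant suggested above (Python 3.10+ union syntax; only one property shown):
from dataclasses import dataclass, field
from threading import Lock

@dataclass
class TrainingMetrics:
    _lock: Lock = field(default_factory=Lock, repr=False)
    _loss: float | None = None
    _accuracy: float | None = None
    _step: int = 0

    @property
    def loss(self) -> float | None:
        with self._lock:  # reads are protected, so "zero-overhead" no longer applies literally
            return self._loss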
In @instrumentation/wandb_writer.py:
- Around line 79-81: Import console_logger once at module level and remove the
repeated inline imports in each exception handler: add "from caramba.console
import logger as console_logger" to the top of instrumentation/wandb_writer.py,
then replace occurrences of "from caramba.console import logger as
console_logger" inside exception blocks (e.g., in the except handling within
WandBWriter methods that log "WandBWriter: Failed to dump config" and similar
messages) with direct uses of console_logger.warning(...). Keep the broad
"except Exception" handlers as-is per the module's design.
In @layer/attention/standard/layer.py:
- Around line 101-102: Add a unit/integration test that exercises the standard
attention path when qh.dtype is torch.float32 and qh.device.type == "mps" to
assert the code falls back to SDPA rather than attempting the
AttentionTraining/Triton kernel; specifically, create a test that constructs
tensors resembling qh, invokes the standard attention entry point (the branch
that checks the guard containing qh.dtype and qh.device.type), and verifies the
SDPA implementation is selected (and runs) while ensuring
DecoupledAttentionTraining is not used.
In @layer/memory_block/block/viz_test.py:
- Around line 39-42: The test accesses mock_routing.call_args without confirming
the mock was invoked; update each test in this class that patches
self.layer.memory.compute_routing (e.g., the block using "with
patch.object(self.layer.memory, 'compute_routing', ... ) as mock_routing:") to
assert the mock was called before unpacking call_args—use
mock_routing.assert_called() or self.assertTrue(mock_routing.called) immediately
after invoking self.layer(x, ctx=ctx) and before reading mock_routing.call_args
to avoid AttributeError when compute_routing wasn't called.
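A short sketch of the assert-before-unpack pattern; the input helpers (_make_inputs, _fake_routing) are hypothetical placeholders for whatever the test already builds:
from unittest.mock import patch

def test_routing_inputs(self):
    x, ctx = self._make_inputs()
    with patch.object(self.layer.memory, "compute_routing", return_value=self._fake_routing()) as mock_routing:
        self.layer(x, ctx=ctx)
        mock_routing.assert_called()            # fail with a clear message if routing never ran
        args, kwargs = mock_routing.call_args   # only unpack once the call is confirmed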
In @layer/memory_block/memory/memory.py:
- Around line 638-643: The code imports the statistics module inside the method
when computing loss_variance from self.loss_history (the import appears before
calculating loss_variance and the variable loss_variance is set from
statistics.variance); move the import to the top of the module (module-level
import statistics) and remove the in-function import so the method simply calls
statistics.variance(self.loss_history) to avoid repeated imports and minor
runtime overhead.
- Around line 522-557: Remove the duplicated comment lines around the post-write
telemetry block and avoid re-importing modules on every hot-path call: delete
the repeated "After write..." line, and move or cache the conditional imports
for should_update_tuner, get_shared_tuner and get_training_metrics so they
aren’t executed on every call when routing.collect_aux is true (e.g., import
once at module load or assign them to local/static variables on first use). Keep
the rest of the logic (checks of self.mem_autotune, routing.get("collect_aux"),
global_step logic, tuner.update, mem_autotune_viz block calling
tuner.get_viz_data/get_health_metrics) unchanged.
- Around line 625-636: The loop that converts routing values to a float
(iterating keys ("read_teacher_agree_free", "read_teacher_agree",
"vq_read_group_acc", "vq_write_group_acc") and using
v.detach().float().mean().item()) currently swallows all exceptions; instead
restrict the except to likely conversion errors (e.g., RuntimeError, ValueError,
TypeError) or validate v (isinstance(v, torch.Tensor) and v.numel() > 0 and
torch.isfinite(v).all()) before converting, and on failure emit a warning or
debug log that includes the routing key and a brief representation of v so
malformed values are visible; keep falling back to metrics.accuracy only after
these checks and logging.
- Around line 656-662: The bare "except Exception: pass" around the telemetry
assignments (setting tel.aux["table_utilization"], tel.aux["write_fire_frac"]
and tel.aux["write_threshold_eff"] when routing["write_threshold_eff"] is a
torch.Tensor) silently swallows errors; replace it by catching only expected
errors (e.g., ValueError, TypeError, AttributeError) and log the exception
details instead of ignoring them (use the module/class logger or raise after
logging), so failures in tel.aux assignment are visible and include the
exception message and stack trace.
In @layer/memory_block/memory/reader.py:
- Around line 184-193: The duplicated pattern that fetches the shared tuner when
self.tuner_mode != "off" should be extracted into a helper on the class to avoid
repetition; add a method (e.g., _get_tuner or _shared_tuner) that checks
self.tuner_mode, imports and calls get_shared_tuner(mode=self.tuner_mode) and
returns the tuner or None, then update both places that currently mutate
vsa_weight (vsa weight scaling) and read_temp (read temp scaling) to call this
helper and apply getattr(tuner, "...", 1.0) only if the helper returned a tuner.
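One way the shared helper could look; the import path and the multiplier attribute names read via getattr are assumptions based on the surrounding review items:
def _get_tuner(self):
    """Return the shared tuner, or None when tuning is disabled."""
    if self.tuner_mode == "off":
        return None
    from caramba.layer.memory_block.memory.tuner import get_shared_tuner
    return get_shared_tuner(mode=self.tuner_mode)

# Both call sites then become:
tuner = self._get_tuner()
if tuner is not None:
    vsa_weight = vsa_weight * getattr(tuner, "vsa_novelty_mult", 1.0)
    read_temp = read_temp * getattr(tuner, "read_temp_mult", 1.0)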
In @layer/memory_block/memory/routing.py:
- Around line 306-309: The variable energy_history is collected but never
returned; either remove its collection or attach it to the aux output. To fix,
update the function that defines energy_history (variable energy_history) and
either (a) delete the energy_history list and the append at the accumulation
site, or (b) add energy_history to the aux dict returned (e.g., ensure aux or
aux_requested handling adds aux["energy_history"] = energy_history before the
function returns), so telemetry/convergence data is exposed; adjust any
conditional that only populates aux so energy_history is included when aux is
requested.
- Around line 326-339: Remove the unused local variables u_r and u_i that are
assigned from u_r_t.transpose... and u_i_t.transpose... (they are never used
later); keep the computations that use u_r_t and u_i_t and the subsequent
v_r_t/v_i_t and transposes, and simply delete the two lines that create u_r and
u_i to resolve the unused-assignment warning.
- Around line 222-224: Replace the print call in method _check_nan of
ResonantRouter with the project logger: add "from caramba.console import logger"
at the top of the file and change the print(...) to logger.warning(...)
(including the same message and the name variable) so warnings go through the
standard logging pipeline; ensure the message text remains clear and use
logger.warning for non-fatal NaN/Inf detection.
In @layer/memory_block/memory/tuner_test.py:
- Around line 47-57: The test modifies a frozen dataclass field
(MemoryHealthTelemetry.step) causing FrozenInstanceError; change the loop in
test_warmup_period_no_exploration to create a new MemoryHealthTelemetry instance
each iteration (e.g., tel = MemoryHealthTelemetry(step=i,
utilization=previous_utilization)) before calling tuner.update(tel) so the
UniversalMemoryTuner.update sees the incremented step without mutating a frozen
object and the assertion on tuner.resonant_coupling_mult remains valid.
- Around line 103-108: The test's change detection is redundant because it
compares tuner.resonant_damping_mult to tuner.explorers["damping"].value (and
similarly for novelty), which will always be equal; update the test to capture
initial values for all three tracked params (e.g., initial_coupling =
tuner.resonant_coupling_mult, initial_damping = tuner.resonant_damping_mult,
initial_novelty = tuner.vsa_novelty_mult) before warmup and then compute changed
by comparing current values to those saved initials
(tuner.resonant_coupling_mult != initial_coupling or tuner.resonant_damping_mult
!= initial_damping or tuner.vsa_novelty_mult != initial_novelty) so the
assertion actually detects exploration.
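A sketch combining both test fixes above: a fresh frozen telemetry instance per step, and change detection against captured initial values (num_steps and the utilization value are placeholders):
def test_exploration_changes_parameters(self):
    tuner = UniversalMemoryTuner()
    initial_coupling = tuner.resonant_coupling_mult
    initial_damping = tuner.resonant_damping_mult
    initial_novelty = tuner.vsa_novelty_mult

    for i in range(num_steps):
        # MemoryHealthTelemetry is frozen, so build a new instance each iteration.
        tel = MemoryHealthTelemetry(step=i, utilization=0.5)
        tuner.update(tel)

    changed = (
        tuner.resonant_coupling_mult != initial_coupling
        or tuner.resonant_damping_mult != initial_damping
        or tuner.vsa_novelty_mult != initial_novelty
    )
    self.assertTrue(changed)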
In @layer/memory_block/memory/tuner.py:
- Around line 259-307: The _compute_objective method embeds hardcoded "magic"
weights; extract these into configurable class-level constants or constructor
parameters (e.g., UTIL_CRITICAL_PENALTY, UTIL_OVER_PENALTY, UTIL_HEALTHY_BONUS,
ACCURACY_WEIGHT, LOSS_VARIANCE_WEIGHT, CONFLICT_WEIGHT, RESONANT_MAX_SCORE) and
replace the numeric literals in _compute_objective with those symbols, providing
sensible defaults and allowing overrides in __init__ so experiments can tune
weights without editing the method.
- Around line 189-190: The method resonant_steps_delta currently returns
int(round(self.explorers["steps"].value)) which redundantly wraps round(...)
with int(); remove the outer int() and return
round(self.explorers["steps"].value) directly so the return type remains an int;
update the resonant_steps_delta implementation accordingly, referencing the
explorers["steps"].value expression.
- Around line 409-437: The module-level singleton and step-tracking globals
(_shared_tuner, _shared_tuner_last_step) are not thread-safe; update the
docstrings/comments for get_shared_tuner, should_update_tuner, and
reset_shared_tuner to state the single-threaded assumption and potential race on
concurrent calls, and suggest remedies (e.g., protecting access with a
threading.Lock around reads/writes or using a thread-safe singleton pattern) so
callers know to serialize access or change the implementation if used across
threads.
- Line 178: Replace the dict comprehension used to initialize self.deltas with
dict.fromkeys for brevity and style: in the Tuner (or the class where
self.deltas is set) change the initialization of self.deltas from the
comprehension over self.explorers.keys() to using
dict.fromkeys(self.explorers.keys(), 0.0) (or dict.fromkeys(self.explorers,
0.0)) so the intent is clearer and the code is more concise.
- Around line 12-13: The file imports unused symbols `torch` and `nn`; remove
these imports from the top of the file so the tuner only uses plain
floats/telemetry types—delete or replace the lines "import torch" and "from
torch import nn" and run linters/tests to verify no other references to `torch`
or `nn` remain in the module.
- Around line 45-111: Add two unit tests for ParameterExplorer.step(): one that
forces a braking-edge collision by setting up an explorer with momentum=1,
braking=True, direction pointing toward a boundary, and a non-improving step so
momentum falls to 0 and triggers the braking-reversal logic; assert that after
the step the direction has flipped, momentum is reset to >=1, value is clamped
inside [min_value, max_value], and braking is cleared. The second test should
cover the cooldown-to-active reactivation: initialize explorer.active=False,
steps_since_deactivation set to cooldown-1, then call step and assert it returns
0.0 until cooldown is reached, after which explorer.active becomes True,
steps_since_deactivation and steps_without_improvement are zero, momentum==1,
and direction is set to 1.0 or -1.0 based on value vs (min_value+max_value)/2;
use the ParameterExplorer.step, active, cooldown, steps_since_deactivation,
steps_without_improvement, momentum, direction, base_step_size, min_value and
max_value attributes to drive and verify behavior.
In @optimizer/metal/resonant_jit.py:
- Around line 120-134: The code uses the private API ce._get_build_directory to
construct build_dir; replace it by calling the public
torch.utils.cpp_extension.get_default_build_root() (import it) and set build_dir
= Path(get_default_build_root()) / name, or simply omit manual build_dir
construction and pass no build_directory to ce.load so PyTorch uses its
default/TORCH_EXTENSIONS_DIR; update references to ce._get_build_directory to
use get_default_build_root and ensure the variable name (name) is appended when
forming build_dir before passing build_directory to ce.load.
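A sketch of the public-API replacement; the extension name and the sources list are assumptions, while get_default_build_root and the build_directory argument are part of torch.utils.cpp_extension:
from pathlib import Path

import torch.utils.cpp_extension as ce
from torch.utils.cpp_extension import get_default_build_root

name = "caramba_metal_resonant_ops"               # assumed extension name
build_dir = Path(get_default_build_root()) / name
build_dir.mkdir(parents=True, exist_ok=True)

module = ce.load(
    name=name,
    sources=sources,                              # .mm/.cpp sources assembled elsewhere in the module
    build_directory=str(build_dir),
    verbose=False,
)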
In @optimizer/metal/resonant_update_autograd_test.py:
- Around line 33-37: The compound assertion checking all gradients at once
(assert x.grad is not None and y.grad is not None and vr.grad is not None and
vi.grad is not None) should be split into individual assertions so failures
identify which gradient is missing; add four separate assertions (e.g., assert
x.grad is not None, "x.grad is None") for x.grad, y.grad, vr.grad, and vi.grad
before the detach/clone lines (gx_m, gy_m, gvr_m, gvi_m) so the test fails with
a clear diagnostic pointing to the missing gradient.
- Around line 12-14: The test function test_backward_matches_pytorch_reference
currently returns when MPS isn't available which marks the test as passed;
replace the bare return with pytest.skip("MPS backend unavailable") and ensure
pytest is imported at the top of the file (add import pytest if missing) so the
test is reported as skipped rather than silently succeeding.
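For reference, a minimal version of the skip pattern:
import pytest
import torch

def test_backward_matches_pytorch_reference():
    if not torch.backends.mps.is_available():
        pytest.skip("MPS backend unavailable")
    # ... rest of the test unchanged ...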
In @optimizer/metal/resonant_update.py:
- Around line 74-78: The code uses verbose getattr calls to read attributes from
the saved context; replace getattr(ctx, "scale"), getattr(ctx, "damping"), and
getattr(ctx, "zero_diag") with direct attribute access (ctx.scale, ctx.damping,
ctx.zero_diag) and keep the existing float()/bool() casts for scale, damping,
and zero_diag respectively; update the block that unpacks x, y, diag, a, b,
inv_r and calls load_caramba_metal_resonant_ops to use the direct attribute
accesses.
- Around line 48-66: The forward saves the original x, y, diag but calls the
Metal op with x.contiguous(), y.contiguous(), diag.contiguous(), so change the
saved tensors to the contiguous versions used in the op: save the contiguous
copies (the same objects passed to resonant_update_forward_fp32) via
ctx.save_for_backward instead of the original x, y, diag; keep storing
ctx.scale, ctx.damping, ctx.zero_diag as before and ensure xo, yo, a, b, inv_r
assignment and return are unchanged.
In @optimizer/resonant_update_triton_kernels.py:
- Around line 111-142: The loads for x and y (the tl.load calls assigning to
variables x and y from x_ptr and y_ptr) are dead code and should be removed to
avoid wasted memory bandwidth; delete the two lines that load x and y and their
.to(tl.float32) conversions inside the backward kernel loop, but keep the kernel
signature parameters x_ptr and y_ptr intact for API compatibility (do not change
other computations that use offs, d, h, diag or the stores to
grad_x_ptr/grad_y_ptr/grad_vr_ptr/grad_vi_ptr).
In @optimizer/resonant_update_triton_test.py:
- Around line 34-38: Split the single compound assertion into separate checks so
failures indicate which tensor lacks grad: replace the single assert that checks
x.grad, y.grad, vr.grad and vi.grad together with four distinct assertions (or
individual if statements) that validate x.grad is not None, y.grad is not None,
vr.grad is not None and vi.grad is not None before cloning to gx_m, gy_m, gvr_m
and gvi_m; reference the existing symbols x.grad, y.grad, vr.grad, vi.grad and
the subsequent clones gx_m, gy_m, gvr_m, gvi_m to ensure each grad is present
and provide clearer diagnostics on failure.
- Around line 11-15: Replace the early returns in
test_backward_matches_pytorch_reference with pytest.skip calls: import pytest if
not already, then use pytest.skip("CUDA not available") when
torch.cuda.is_available() is False and pytest.skip("Triton not supported") when
triton_supported() is False so the test reports as skipped rather than silently
passing; reference the existing torch.cuda.is_available() and triton_supported()
checks and update them accordingly.
In @optimizer/resonant_update_triton.py:
- Around line 129-133: The code currently only checks grad_xo's device in
ResonantPhaseUpdateTriton; add a similar CUDA device validation for grad_yo
using the same _require pattern (e.g., _require(grad_yo.device.type == "cuda",
msg="grad_yo must be CUDA for ResonantPhaseUpdateTriton.")) so both gradients
are verified before creating gx/gy/gvr/gvi.
In @pyproject.toml:
- Around line 24-73: The dependency list in pyproject.toml has been unpinned
(e.g., "torch", "numpy", "transformers", etc.), removing reproducibility and
risking silent breakages; restore explicit version constraints by pinning each
dependency to the tested version or using bounded ranges (e.g.,
torch==<tested_version> or torch>=<min_version>,< <upper_bound>) for all listed
packages (torch, numpy, tiktoken, datasets, transformers, huggingface_hub,
pydantic, fastapi, uvicorn[standard], pytorch-lightning, wandb, etc.), ensuring
the chosen versions reflect what CI/local tests currently validate and applying
the same pattern to optional extras like "uvicorn[standard]" and "mcp[cli]".
In @research/ccl/ccl_mnist.py:
- Line 295: The zip() usages in the training and test loops (the tqdm loops that
iterate over zip(tokens, labels) and zip(test_tokens, y_test)) can silently
truncate on length mismatch; update both zip(...) calls to zip(..., strict=True)
so mismatched lengths raise an error, and keep the existing total=... and desc
arguments unchanged (refer to the loop variables grid, y in the training loop
and the analogous variables in the test loop).
- Around line 59-63: Replace the overly broad "except Exception as e" in the
import try/except blocks with "except ImportError as e" so only import failures
are caught; specifically update the try/except that wraps "import torch; from
torchvision import datasets" and the other two analogous import blocks
referenced in the review (the blocks around lines 117-120 and 629-632) to use
"except ImportError as e" and re-raise the same RuntimeError with the original
exception as the cause.
- Line 273: The assert num_classes == 10 is unsafe because assertions can be
disabled; replace it with an explicit runtime check such as: if num_classes !=
10: raise ValueError("This script requires num_classes == 10 (MNIST labels
0..9)"), updating the check near the existing reference to num_classes in
ccl_mnist.py so the script always enforces the requirement.
- Line 151: Replace the bare assert with an explicit runtime check: verify that
D == patch * patch using an if statement and raise a ValueError with a clear
message (e.g., f"Expected D == patch*patch, got D={D}, patch={patch}") so the
validation always runs even when Python is optimized; update the check
surrounding the variables D and patch in ccl_mnist.py accordingly.
- Around line 483-490: The variable lam_uni2 is assigned but never used; remove
lam_uni2 from the renormalization branch (and any other assignment sites) and
keep using the original lam_uni in the sampling logic, or alternatively update
the sampling code to use lam_uni2 consistently; specifically edit the
model.use_full conditional that sets lam_full2, lam_mid2, lam_uni2 so it only
defines lam_full2 and lam_mid2 (or ensure downstream sampling thresholds
reference lam_uni2 instead of implicit unigram), and update any comments to
reflect the chosen approach.
- Around line 227-258: SparseCounts, _update_sparse and _prob_sparse are
duplicated here and in ccl/context_counts.py; remove the local definitions and
import the shared implementations instead. Replace the local dataclass and
functions SparseCounts, _update_sparse, and _prob_sparse with from
ccl.context_counts import SparseCounts, _update_sparse, _prob_sparse (or the
public names used there), and update any local references (e.g., usages in
ClassModel.mid/full and calls to _update_sparse/_prob_sparse) to use the
imported symbols so behavior remains identical and duplication is eliminated.
- Line 36: The file currently imports legacy typing names (Dict, Tuple,
Optional, List); remove those from the "from typing ..." import and update all
annotations that use them to Python 3.9+ built-in generics and union syntax
(e.g., Dict -> dict[…], Tuple -> tuple[…], List -> list[…], and Optional[X] -> X
| None where appropriate). Locate uses of the symbols Dict, Tuple,
Optional, and List in the module and replace each with the corresponding
built-in generic or union form so annotations use modern typing syntax.
- Around line 407-436: The function _sample_from_sparse_dirichlet_smoothed
declares an unused parameter alpha; remove alpha from the function signature and
from all call sites so the function only accepts (ent, rng, alphaK, K), or
alternatively if API must remain, add a short docstring note explaining alpha is
unused and keep it—but preferred fix is to delete the alpha parameter and update
every caller that passes alpha to pass only the remaining parameters (search for
_sample_from_sparse_dirichlet_smoothed usages to update).
In @research/ccl/requirements.txt:
- Around line 1-6: The requirements file lists unpinned packages (numpy,
matplotlib, tqdm, torch, torchvision, scikit-learn); pin each to explicit,
tested versions in the requirements.txt so installs are reproducible and
torch/torchvision remain compatible; choose specific version numbers consistent
with your target Metal/Triton kernel and any parent/CI manifests, verify
torch/torchvision pairs against the PyTorch compatibility matrix, and update
related manifests/CI to match the chosen pins.
In @runtime/engine/lightning_engine.py:
- Around line 50-58: The Lightning module's configure_optimizers currently
hardcodes AdamW which diverges from Caramba's optimizer construction; replace
the fixed optimizer with Caramba's builder by delegating to
StandardTrainer._build_optimizer (or the same internal factory) using the
module's training config (self.train_cfg) and model parameters so options like
AdamWMaster, Lion, SGD and schedulers are preserved; update configure_optimizers
to call that builder and return whatever optimizer/scheduler tuple it produces,
ensuring any required flags or device/amp settings from the Lightning context
are forwarded.
- Around line 104-110: The Trainer instantiation sets devices and accelerator
incorrectly; update the L.Trainer call in lightning_engine.py (the trainer =
L.Trainer(...) block) so that CPU training uses devices=1 (not 0) and remove the
inverted explicit CPU branch—prefer accelerator="auto" and set devices to 1 when
train_cfg.device == "cpu" or to "auto"/appropriate GPU count when not, and keep
precision as "16-mixed" if train_cfg.use_amp else "32-true"; ensure the
devices/accelerator logic reflects that devices=0 is invalid and that
accelerator="auto" can handle CPU fallback.
In @trainer/ccl.py:
- Around line 47-106: The _collect_images_labels function can consume huge
memory when max_items is None because it preallocates all images; add a brief
docstring note describing memory implications and also emit a runtime
warning/log when max_items is None and the dataset length (len(cast(Sized, ds)))
exceeds a safe threshold (e.g., >10_000 or make threshold a small constant),
using the module logger or warnings.warn to advise callers to pass max_items or
use a streaming approach; update the _collect_images_labels docstring and insert
the warning check early (before allocation) so users see it before large memory
is reserved.
- Around line 376-406: The try/except around "from caramba.ccl.context_counts
import sample_grid" and "import matplotlib.pyplot as plt" is catching all
Exception; narrow it to only import-related errors by catching ImportError and
ModuleNotFoundError (e.g., replace "except Exception as e" with "except
(ImportError, ModuleNotFoundError) as e") so runtime errors in the generation
logic (sample_grid, codec.decode, plotting) still surface; keep the existing
logger.warning message and the rest of the else block unchanged.
- Around line 109-116: The current _split_train_val implementation slices the
dataset in order, which can bias splits if data is ordered; instead, generate a
list of indices (0..n-1), shuffle them (e.g., via random.shuffle or a passed RNG
for reproducibility), then take the first n_train indices for train and the rest
for val and construct Subset(ds, train_indices) and Subset(ds, val_indices);
update _split_train_val to perform this shuffled-index split so train/val are
randomized.
- Around line 403-406: After saving the figure with plt.savefig(out_path) in the
CCL sampling code, explicitly close the Matplotlib figure to avoid resource
leaks; update the block around plt.tight_layout(pad=0.2), plt.savefig(out_path),
logger.info(...) to call plt.close(fig) (or plt.close() if no fig variable is
used) immediately after logger.info so the figure is released.
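A sketch of the shuffled split suggested above (val_fraction and the seed handling are assumptions); the same trainer should also call plt.close(fig) right after plt.savefig(out_path) so the sampling figure is released:
import random

from torch.utils.data import Subset

def _split_train_val(ds, val_fraction: float = 0.1, seed: int = 0):
    n = len(ds)
    indices = list(range(n))
    random.Random(seed).shuffle(indices)               # reproducible shuffle
    n_train = max(1, int(n * (1.0 - val_fraction)))
    return Subset(ds, indices[:n_train]), Subset(ds, indices[n_train:])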
In @trainer/gradient_isolation.py:
- Around line 420-421: The warning message in the except block using
logger.warning(...) incorrectly states "failed to merge MOSAIC aux outputs" even
though this try/except is counting keys that start with "mosaic_" rather than
performing a merge; update the log text in trainer/gradient_isolation.py (the
except block catching Exception as e) to accurately reflect the operation (e.g.,
"failed to count MOSAIC aux output keys" or "error while counting mosaic_ keys")
and keep the exception variable (e) included in the message so the original
behavior and diagnostics remain unchanged.
- Around line 197-199: The current except block in GradientIsolationTrainer that
catches Exception e and only logs a warning should not silently degrade to
single-device mode; change it to log an error-level message that clearly states
distributed wrapping failed and distributed training has been disabled, then
re-raise the exception (or raise a new RuntimeError) so callers can fail fast
and handle the failure; update the handler around the system module wrapping
(the try/except around the distributed wrapping code in the
GradientIsolationTrainer class/method) to use logger.error and then raise the
original exception (or an explicit RuntimeError) instead of continuing.
In @trainer/standard.py:
- Around line 343-346: Keep the inline import of reset_shared_tuner where it’s
called rather than moving it to module level to preserve the established pattern
used across the codebase; ensure the code still calls reset_shared_tuner() in
the same place and add a brief comment above the inline import explaining the
consistency rationale (the function name to locate is reset_shared_tuner).
In @trainer/swap_manager.py:
- Around line 32-34: The current broad "except Exception as e" in the
SwapManager swap/optimizer load path should be narrowed to only the expected
failure types to avoid hiding unexpected bugs; replace the blanket except with a
targeted catch (e.g., IOError/OSError for file access, pickle.UnpicklingError
for deserialization, KeyError/ValueError for missing or malformed state, and any
framework-specific load errors) while keeping the
console_logger.warning(f"SwapManager: Failed to load optimizer state: {e}")
behavior; locate the except block in SwapManager (the optimizer-state load
method) and update the exception tuple accordingly so unexpected exceptions
still propagate.
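A short sketch of the narrowed handler; the exception tuple is one plausible choice (RuntimeError covers most torch.load failures) and the load calls are placeholders for the existing code:
import pickle

import torch

try:
    state = torch.load(path, map_location="cpu")   # placeholder for the existing load path
    optimizer.load_state_dict(state)
except (OSError, pickle.UnpicklingError, KeyError, ValueError, RuntimeError) as e:
    from caramba.console import logger as console_logger
    console_logger.warning(f"SwapManager: Failed to load optimizer state: {e}")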
| *.synctex.gz | ||
| artifacts/mosaic/CONTEXT.md | ||
| artifacts/ai/brainstorm.jsonl | ||
| artifacts/ |
🛠️ Refactor suggestion | 🟠 Major
Remove redundant artifact-specific patterns.
Adding a broad artifacts/ ignore pattern (line 35) is appropriate given the PR's extensive artifact additions. However, it makes the specific artifact directory patterns at lines 6-7 (artifacts/llama32_1b_dba_paper_efficiency/ and artifacts/platform_improve/) redundant; they should be removed to avoid maintenance confusion.
🧹 Proposed cleanup
Remove lines 6-7, as they are now covered by the broad artifacts/ pattern:
__pycache__/
.DS_Store
.venv/
.pytest_cache/
.pycache/
- artifacts/llama32_1b_dba_paper_efficiency/
- artifacts/platform_improve/
internal/
caramba.egg-info/
WARP.md
.env
wandb/
runs/
.caramba/
train.h5
x.jsonl
fineweb_100m.npy
fineweb_100m.npy.meta
fineweb_20b.npy
fineweb_20b.npy.meta
*.crdownload
train.h5
.coverage
fineweb_edu_v29_100m.npy
NOTES.md
*.fdb_latexmk
*.aux
*.blg
*.bbl
*.fls
*.log
*.out
*.toc
*.synctex.gz
+ artifacts/
.cache/
.tmp/
*.pt
tmp/
*.npy
*.npy.meta
ADK.txt
+ research/dba/runs100k.zip
Also applies to: 43-43
🤖 Prompt for AI Agents
In @.gitignore at line 35, Remove the redundant specific artifact ignore entries
now covered by the broad "artifacts/" pattern: delete the lines containing
"artifacts/llama32_1b_dba_paper_efficiency/" and "artifacts/platform_improve/"
(and any duplicate occurrences such as the one noted at line 43) so only the
single "artifacts/" pattern remains.
ai/push_notifications.py
Outdated
| # A2A SDKs differ in how credentials are represented (string vs object). | ||
| # Normalize to a string token if present. | ||
| key: str | None = None | ||
| if isinstance(cred, str): | ||
| key = cred | ||
| elif isinstance(cred, dict): | ||
| val = cred.get("key") or cred.get("token") | ||
| key = str(val) if val else None | ||
| else: | ||
| val = getattr(cred, "key", None) or getattr(cred, "token", None) | ||
| key = str(val) if val else None | ||
|
|
||
| if key: | ||
| headers["Authorization"] = f"Bearer {key}" | ||
| break |
Add type validation before converting credentials to strings.
The str(val) conversions at lines 131 and 134 are unsafe. If val is not a string (e.g., a boolean, number, or complex object), the conversion could produce invalid authentication tokens. For example:
- str(False) → "False"
- str(0) → "0"
- str(SomeObject()) → potentially arbitrary string representation
This could lead to authentication failures or mask configuration errors.
🔒 Proposed fix with type validation
# Add authentication if configured
if config.authentication:
for cred in config.authentication.credentials or []:
# A2A SDKs differ in how credentials are represented (string vs object).
# Normalize to a string token if present.
key: str | None = None
if isinstance(cred, str):
key = cred
elif isinstance(cred, dict):
val = cred.get("key") or cred.get("token")
- key = str(val) if val else None
+ key = val if isinstance(val, str) else None
+ if val and not isinstance(val, str):
+ _logger.warning(
+ f"Invalid credential type {type(val).__name__} for task {task.id}, expected string"
+ )
else:
val = getattr(cred, "key", None) or getattr(cred, "token", None)
- key = str(val) if val else None
+ key = val if isinstance(val, str) else None
+ if val and not isinstance(val, str):
+ _logger.warning(
+ f"Invalid credential type {type(val).__name__} for task {task.id}, expected string"
+ )
if key:
headers["Authorization"] = f"Bearer {key}"
break
🤖 Prompt for AI Agents
In @ai/push_notifications.py around lines 124 - 138, The credential
normalization currently uses unsafe str(val) conversions; change the logic in
the credential handling block (variables: cred, val, key, headers) to only treat
a value as a token if it is already a string (isinstance(val, str)); if val is
bytes try decoding with utf-8 and use that string; otherwise treat it as invalid
(leave key None) so you don't set headers["Authorization"] to a coerced
non-string token. Ensure the if-branches that set key use these validations
(replace str(val) calls) so only legitimate string tokens populate headers.
| from a2a.types import MessageSendParams | ||
| context = RequestContext( | ||
| message=message, | ||
| current_task=task, | ||
| request=MessageSendParams(message=message), | ||
| task_id=task.id, | ||
| context_id=task.context_id, | ||
| task=task, | ||
| ) |
🧩 Analysis chain
🏁 Script executed:
# First, let's examine the file structure and locate the imports section
head -n 20 ai/server.py
Repository: TheApeMachine/caramba
Length of output: 591
🏁 Script executed:
# Check the imports and context around lines 180-290
sed -n '180,290p' ai/server.py | cat -n
Repository: TheApeMachine/caramba
Length of output: 7279
🏁 Script executed:
# Search for RequestContext usage patterns in the codebase
rg -n 'RequestContext' --type py -B 2 -A 2
Repository: TheApeMachine/caramba
Length of output: 5137
🏁 Script executed:
# Check if MessageSendParams exists in the codebase and where it's used
rg -n 'MessageSendParams' --type py
Repository: TheApeMachine/caramba
Length of output: 378
🏁 Script executed:
# Look at the file structure to understand the loop context
sed -n '260,280p' ai/server.py | cat -n
Repository: TheApeMachine/caramba
Length of output: 1151
🏁 Script executed:
# Check the function structure and loop context around these imports
sed -n '150,200p' ai/server.py | cat -n
Repository: TheApeMachine/caramba
Length of output: 3142
🏁 Script executed:
# Check if there's a while True loop and confirm the import placement
sed -n '230,280p' ai/server.py | cat -n
Repository: TheApeMachine/caramba
Length of output: 3156
🏁 Script executed:
# Verify RequestContext signature in a2a.server.agent_execution
rg -n 'class RequestContext|def __init__' --type py -B 1 -A 5 | grep -A 10 'RequestContext'
Repository: TheApeMachine/caramba
Length of output: 668
🏁 Script executed:
# Check imports in other files to see how MessageSendParams is typically imported
rg -n 'from a2a.types import.*MessageSendParams' --type py -B 2 -A 2
Repository: TheApeMachine/caramba
Length of output: 366
🏁 Script executed:
# Search for RequestContext class definition to verify constructor signature
rg -n 'class RequestContext' --type py -A 10
Repository: TheApeMachine/caramba
Length of output: 47
🏁 Script executed:
# Check the actual a2a-sdk package info if available
find . -name 'pyproject.toml' -o -name 'requirements.txt' -o -name 'setup.py' | head -5
Repository: TheApeMachine/caramba
Length of output: 255
🏁 Script executed:
# Look for any constraints on the RequestContext usage
rg -n 'RequestContext(' --type py -B 2 -A 5
Repository: TheApeMachine/caramba
Length of output: 145
🏁 Script executed:
# Search for RequestContext construction with escaped parenthesis
rg -n 'RequestContext\(' --type py -B 2 -A 5
Repository: TheApeMachine/caramba
Length of output: 545
🏁 Script executed:
# Check if RequestContext is defined in the codebase or from a2a-sdk
rg -n 'class RequestContext' --type py
Repository: TheApeMachine/caramba
Length of output: 47
🏁 Script executed:
# Look at the imports at the top of ai/server.py to understand dependencies
head -n 50 ai/server.py
Repository: TheApeMachine/caramba
Length of output: 1914
🏁 Script executed:
# Check the main project requirements to understand a2a-sdk version
cat pyproject.toml 2>/dev/null | grep -A 50 'dependencies' || cat setup.py 2>/dev/null || find . -maxdepth 2 -name 'requirements*.txt' -exec cat {} \;
Repository: TheApeMachine/caramba
Length of output: 980
🏁 Script executed:
# Check what's actually in a2a.types module (if available in the repo)
find . -path '*/a2a/types*' -type f 2>/dev/null
Repository: TheApeMachine/caramba
Length of output: 47
🏁 Script executed:
# Verify there are no other conditional imports of MessageSendParams in the loop
sed -n '180,300p' ai/server.py | grep -n 'import.*MessageSendParams'
Repository: TheApeMachine/caramba
Length of output: 127
🏁 Script executed:
# Confirm the exact line numbers where imports occur within the loop
sed -n '170,280p' ai/server.py | grep -n 'from a2a\|while True' | head -20
Repository: TheApeMachine/caramba
Length of output: 366
Move MessageSendParams import outside the loop.
The MessageSendParams import on line 270 is inside the while True loop and should be consolidated with the existing imports at line 186. Both import from a2a.types, so they can be combined into a single import statement. While Python caches imports, eliminating the repeated lookup on each loop iteration improves efficiency and code organization.
Move line 270's import to line 186:
-from a2a.types import Message, Part, TextPart, Role
+from a2a.types import Message, MessageSendParams, Part, TextPart, Role
Then remove line 270:
- from a2a.types import MessageSendParams
context = RequestContext(
The RequestContext constructor signature using request, task_id, context_id, and task parameters is consistent with the a2a-sdk API as observed across the codebase.
🤖 Prompt for AI Agents
In @ai/server.py around lines 270 - 276, The in-loop import of MessageSendParams
should be moved out of the loop and merged with the module-level import from
a2a.types (where RequestContext is already imported); remove the repeated "from
a2a.types import MessageSendParams" inside the while True loop and add
MessageSendParams to the existing top-level import list so the
RequestContext(...) call can continue to use MessageSendParams without runtime
repeated imports.
| import logging | ||
|
|
||
| from a2a.server.tasks import TaskStore | ||
| from caramba.ai.task.store import TaskStore |
🧩 Analysis chain
🏁 Script executed:
# Check if the new module exists
find . -path "*caramba/ai/task/store*" -type f
# Check ai/worker.py content
head -40 ai/worker.py
# Search for remaining references to old import path
rg 'a2a\.server\.tasks' --type py
Repository: TheApeMachine/caramba
Length of output: 1946
Update remaining imports from old module path.
While the import in ai/worker.py has been correctly refactored to caramba.ai.task.store, multiple other files still reference the old a2a.server.tasks module path:
- ai/push_notifications.py
- ai/server.py (3 separate imports)
- ai/task_store.py
- ai/executor.py
- ai/task/store/__init__.py
These imports should be updated to use the new internal module path as part of the same refactoring effort to complete the migration.
🤖 Prompt for AI Agents
In @ai/worker.py at line 12, Several files still import from the old module path
`a2a.server.tasks`; update those imports to the new internal path under
`caramba.ai.task` (e.g., replace `a2a.server.tasks` with `caramba.ai.task` in
ai/push_notifications.py, ai/server.py (all three import sites),
ai/task_store.py, ai/executor.py, and ai/task/store/__init__.py), ensuring each
referenced symbol such as TaskStore and any task-related classes/functions keep
their original names but point to the new module; after changes, run the test
suite or import checks to verify there are no broken references.
| await asyncio.sleep(1) | ||
| logger.trace("Polling for pending tasks") | ||
| tasks = await self.task_store.list_tasks(tenant=self.tenant) | ||
| tasks = [t async for t in self.task_store.list_tasks(tenant=self.tenant)] |
🧹 Nitpick | 🔵 Trivial
Consider refactoring to avoid holding database connections during collection.
The async comprehension immediately materializes all tasks from the generator, which keeps the database connection open throughout the iteration. Since the list_tasks implementation in ai/task/store/__init__.py already loads all tasks into memory with .all() before yielding, the generator pattern doesn't provide streaming benefits here.
♻️ Suggested approaches
Option 1 (Preferred): Refactor list_tasks to return List[Task] directly
In ai/task/store/__init__.py, change the signature and close the connection immediately:
async def list_tasks(self, tenant: str) -> list[Task]:
"""List tasks for a tenant"""
async with self.engine.connect() as connect:
result = await connect.execute(
select(self.task_model).where(self.task_model.context_id == tenant)
)
return [self._from_orm(task) for task in result.scalars().all()]
Then in ai/worker.py:
-tasks = [t async for t in self.task_store.list_tasks(tenant=self.tenant)]
+tasks = await self.task_store.list_tasks(tenant=self.tenant)
Option 2: Process tasks without materialization
If streaming is desired in the future, consume the generator without building an intermediate list:
-tasks = [t async for t in self.task_store.list_tasks(tenant=self.tenant)]
-logger.trace(f"Found {len(tasks)} pending tasks")
-
-for task in tasks:
+task_count = 0
+async for task in self.task_store.list_tasks(tenant=self.tenant):
+ task_count += 1
logger.trace(f"Enqueuing task {task.id}")
await self.queue.put(task)
+logger.trace(f"Enqueued {task_count} pending tasks")📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| tasks = [t async for t in self.task_store.list_tasks(tenant=self.tenant)] | |
| task_count = 0 | |
| async for task in self.task_store.list_tasks(tenant=self.tenant): | |
| task_count += 1 | |
| logger.trace(f"Enqueuing task {task.id}") | |
| await self.queue.put(task) | |
| logger.trace(f"Enqueued {task_count} pending tasks") |
| def configure_optimizers(self): | ||
| # This is a bit tricky because Caramba has its own optimizer building logic. | ||
| # For now, we'll implement a simple version or try to reuse Caramba's builders. | ||
| optimizer = torch.optim.AdamW( | ||
| self.model.parameters(), | ||
| lr=self.train_cfg.lr, | ||
| weight_decay=getattr(self.train_cfg, "weight_decay", 0.0) | ||
| ) | ||
| return optimizer |
🧹 Nitpick | 🔵 Trivial
Consider reusing Caramba's optimizer building logic.
The current implementation uses a simple AdamW, but Caramba's StandardTrainer._build_optimizer supports:
- AdamWMaster with fused kernels (the main feature of this PR)
- SGD
- Learning rate schedulers
This may cause behavioral differences between Lightning and standard training paths.
Would you like me to help integrate Caramba's optimizer building logic into the Lightning module?
🤖 Prompt for AI Agents
In @runtime/engine/lightning_engine.py around lines 50 - 58, The Lightning
module's configure_optimizers currently hardcodes AdamW which diverges from
Caramba's optimizer construction; replace the fixed optimizer with Caramba's
builder by delegating to StandardTrainer._build_optimizer (or the same internal
factory) using the module's training config (self.train_cfg) and model
parameters so options like AdamWMaster, Lion, SGD and schedulers are preserved;
update configure_optimizers to call that builder and return whatever
optimizer/scheduler tuple it produces, ensuring any required flags or device/amp
settings from the Lightning context are forwarded.
trainer/gradient_isolation.py
Outdated
| except Exception as e: | ||
| logger.warning(f"GradientIsolationTrainer: [best-effort] failed to wrap system module for distributed training: {e}") | ||
| # Best-effort: continue without distributed wrapping if it fails. |
Silently degrading to single-device training may mask critical distributed setup issues.
When distributed wrapping fails, continuing execution without it means the user-requested distributed training strategy is ignored. This could lead to:
- Unexpected single-device training with potential OOM or drastically slower training
- Training results that differ from expectations (e.g., effective batch size, gradient synchronization)
- Difficult-to-diagnose issues when users assume distributed training is active
Consider re-raising the exception or at minimum elevating this to an error-level log with a prominent notice that distributed training has been disabled.
🔧 Proposed fix: Re-raise the exception
- except Exception as e:
- logger.warning(f"GradientIsolationTrainer: [best-effort] failed to wrap system module for distributed training: {e}")
- # Best-effort: continue without distributed wrapping if it fails.
+ except Exception as e:
+ raise RuntimeError(f"Failed to wrap system module for distributed training: {e}") from e📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| except Exception as e: | |
| logger.warning(f"GradientIsolationTrainer: [best-effort] failed to wrap system module for distributed training: {e}") | |
| # Best-effort: continue without distributed wrapping if it fails. | |
| except Exception as e: | |
| raise RuntimeError(f"Failed to wrap system module for distributed training: {e}") from e |
🧰 Tools
🪛 Ruff (0.14.10)
197-197: Do not catch blind exception: Exception
(BLE001)
🤖 Prompt for AI Agents
In @trainer/gradient_isolation.py around lines 197 - 199, The current except
block in GradientIsolationTrainer that catches Exception e and only logs a
warning should not silently degrade to single-device mode; change it to log an
error-level message that clearly states distributed wrapping failed and
distributed training has been disabled, then re-raise the exception (or raise a
new RuntimeError) so callers can fail fast and handle the failure; update the
handler around the system module wrapping (the try/except around the distributed
wrapping code in the GradientIsolationTrainer class/method) to use logger.error
and then raise the original exception (or an explicit RuntimeError) instead of
continuing.
| except Exception as e: | ||
| from caramba.console import logger as console_logger | ||
| console_logger.warning(f"SwapManager: Failed to load optimizer state: {e}") |
🧹 Nitpick | 🔵 Trivial
Good: Logging improves observability of swap failures.
The change from silent exception swallowing to explicit warning logging aligns with the broader PR pattern of improving error visibility.
♻️ Optional: Consider narrowing exception types
While the broad Exception catch aligns with the "best-effort" design (documented in the docstring), you could improve specificity:
- except Exception as e:
+ except (RuntimeError, IOError, torch.cuda.OutOfMemoryError) as e:
from caramba.console import logger as console_logger
console_logger.warning(f"SwapManager: Failed to load optimizer state: {e}")This would catch the most likely failure modes while still allowing truly unexpected exceptions to propagate. However, given the "best-effort and safe" design goal, the current broad catch may be intentional.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| except Exception as e: | |
| from caramba.console import logger as console_logger | |
| console_logger.warning(f"SwapManager: Failed to load optimizer state: {e}") | |
| except (RuntimeError, IOError, torch.cuda.OutOfMemoryError) as e: | |
| from caramba.console import logger as console_logger | |
| console_logger.warning(f"SwapManager: Failed to load optimizer state: {e}") |
🧰 Tools
🪛 Ruff (0.14.10)
32-32: Do not catch blind exception: Exception
(BLE001)
🤖 Prompt for AI Agents
In @trainer/swap_manager.py around lines 32 - 34, The current broad "except
Exception as e" in the SwapManager swap/optimizer load path should be narrowed
to only the expected failure types to avoid hiding unexpected bugs; replace the
blanket except with a targeted catch (e.g., IOError/OSError for file access,
pickle.UnpicklingError for deserialization, KeyError/ValueError for missing or
malformed state, and any framework-specific load errors) while keeping the
console_logger.warning(f"SwapManager: Failed to load optimizer state: {e}")
behavior; locate the except block in SwapManager (the optimizer-state load
method) and update the exception tuple accordingly so unexpected exceptions
still propagate.
Review continued from previous batch...
|
|
||
| uniq = sorted({int(v) for v in y.tolist()}) | ||
| if num_classes is None: | ||
| num_classes = int(len(uniq)) | ||
| if int(num_classes) <= 0: | ||
| raise ValueError("num_classes must be > 0") |
🧹 Nitpick | 🔵 Trivial
Redundant int() cast.
Line 122: len(uniq) is already an int. The cast is unnecessary.
♻️ Remove redundant cast
if num_classes is None:
- num_classes = int(len(uniq))
+ num_classes = len(uniq)
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| uniq = sorted({int(v) for v in y.tolist()}) | |
| if num_classes is None: | |
| num_classes = int(len(uniq)) | |
| if int(num_classes) <= 0: | |
| raise ValueError("num_classes must be > 0") | |
| uniq = sorted({int(v) for v in y.tolist()}) | |
| if num_classes is None: | |
| num_classes = len(uniq) | |
| if int(num_classes) <= 0: | |
| raise ValueError("num_classes must be > 0") |
🧰 Tools
🪛 Ruff (0.14.10)
122-122: Value being cast to int is already an integer
Remove unnecessary int call
(RUF046)
124-124: Avoid specifying long messages outside the exception class
(TRY003)
🤖 Prompt for AI Agents
In @ccl/context_counts.py around lines 119 - 124, The assignment uses a
redundant int() around len(uniq); replace "num_classes = int(len(uniq))" with
"num_classes = len(uniq)" to avoid the unnecessary cast while preserving
behavior (uniq and len(uniq) are already ints); keep the subsequent validation
"if int(num_classes) <= 0" as-is or optionally remove its cast if you want
consistency.
| from __future__ import annotations | ||
|
|
||
| """CCL system wrapper. | ||
|
|
||
| This is a small nn.Module so it can participate in the same orchestration path | ||
| as other Caramba systems (e.g., can be returned from a trainer and inspected). | ||
| """ | ||
|
|
Module docstring should come before from __future__ import.
PEP 257 specifies that module docstrings should be the first statement in a module file. The current placement after the future import means Python won't recognize it as the module's __doc__.
Proposed fix
+"""CCL system wrapper.
+
+This is a small nn.Module so it can participate in the same orchestration path
+as other Caramba systems (e.g., can be returned from a trainer and inspected).
+"""
+
from __future__ import annotations
-"""CCL system wrapper.
-
-This is a small nn.Module so it can participate in the same orchestration path
-as other Caramba systems (e.g., can be returned from a trainer and inspected).
-"""📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| from __future__ import annotations | |
| """CCL system wrapper. | |
| This is a small nn.Module so it can participate in the same orchestration path | |
| as other Caramba systems (e.g., can be returned from a trainer and inspected). | |
| """ | |
| """CCL system wrapper. | |
| This is a small nn.Module so it can participate in the same orchestration path | |
| as other Caramba systems (e.g., can be returned from a trainer and inspected). | |
| """ | |
| from __future__ import annotations | |
🤖 Prompt for AI Agents
In @ccl/system.py around lines 1 - 8, The module docstring in ccl.system is
placed after the from __future__ import annotations statement so it isn't
recognized as the module __doc__; move the triple-quoted module docstring to be
the very first statement in the file (before from __future__ import annotations)
so it becomes the module docstring while keeping the existing content and
preserving the future import.
ccl/system.py
Outdated
| b = int(tokens.shape[0]) | ||
| c = int(len(self.models)) | ||
| logits = np.empty((b, c), dtype=np.float32) | ||
| from caramba.ccl.context_counts import loglik_grid | ||
| for i in range(b): | ||
| grid = tokens[i] | ||
| for j, m in enumerate(self.models): | ||
| logits[i, j] = float(loglik_grid(m, grid)) | ||
| return {"logits": torch.from_numpy(logits)} |
🧹 Nitpick | 🔵 Trivial
Move import out of forward method to avoid per-call overhead.
The loglik_grid import on line 85 executes on every forward() call. While Python caches imports, there's still lookup overhead. Since loglik_grid is from the same package, move it to module-level imports.
Proposed fix
from caramba.ccl.context_counts import ClassCountsModel, ContextTemplate, predict_class
+from caramba.ccl.context_counts import loglik_grid
from caramba.ccl.patch_vq import PatchKMeansVQ
Then remove line 85:
logits = np.empty((b, c), dtype=np.float32)
- from caramba.ccl.context_counts import loglik_grid
for i in range(b):
🧰 Tools
🪛 Ruff (0.14.10)
83-83: Value being cast to int is already an integer
Remove unnecessary int call
(RUF046)
🤖 Prompt for AI Agents
In @ccl/system.py around lines 82 - 90, The local import of loglik_grid inside
the forward implementation causes unnecessary per-call lookup overhead; move the
import to module scope by adding "from caramba.ccl.context_counts import
loglik_grid" at the top of ccl/system.py and remove the in-method import on line
containing "from caramba.ccl.context_counts import loglik_grid", leaving the
loop that calls loglik_grid(m, grid) unchanged.
core/event_codec/capnp_codec.py
Outdated
def encode(self, event: EventEnvelope) -> Tensor:
    if not isinstance(event, EventEnvelope):
        raise TypeError(f"Expected EventEnvelope, got {type(event).__name__}")

    schema = _get_schema()

    # Create message
    msg = schema.EventEnvelope.new_message()
    msg.id = str(event.id)
    msg.ts = float(event.ts)
    msg.type = str(event.type)
    msg.sender = str(event.sender)
    msg.priority = int(event.priority)
    msg.budgetMs = int(event.budget_ms) if event.budget_ms is not None else -1
    msg.commitmentDelta = int(event.commitment_delta)
    msg.commitmentId = str(event.commitment_id) if event.commitment_id else ""

    # Serialize payload to JSON bytes
    payload_bytes = json.dumps(event.payload, ensure_ascii=False).encode("utf-8")
    msg.payload = payload_bytes

    # Serialize to bytes
    buf = msg.to_bytes()
    if not buf:
        raise ValueError("Cap'n Proto serialization produced empty bytes")

    return torch.tensor(list(buf), dtype=torch.long)
🧹 Nitpick | 🔵 Trivial
Memory inefficiency: encoding bytes as an int64 tensor is wasteful.
Each byte (0-255) is stored as a 64-bit integer, resulting in 8x memory overhead. Consider using torch.uint8 or torch.int8 for efficiency.
♻️ Proposed optimization to use uint8
- return torch.tensor(list(buf), dtype=torch.long)
+ return torch.tensor(list(buf), dtype=torch.uint8)

Note: This would require updating the decoder and any downstream consumers that expect torch.long.
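For reference, a minimal sketch of what the byte/tensor round-trip could look like once uint8 is adopted; the helper names below are illustrative and not the codec's actual API:

import torch

def bytes_to_tensor(buf: bytes) -> torch.Tensor:
    # One element per byte (uint8) instead of 8 bytes per element (int64).
    return torch.frombuffer(bytearray(buf), dtype=torch.uint8)

def tensor_to_bytes(t: torch.Tensor) -> bytes:
    # Accept either the new uint8 layout or a legacy int64 tensor.
    if t.dtype != torch.uint8:
        t = t.to(torch.uint8)
    return bytes(t.tolist())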
🧰 Tools
🪛 Ruff (0.14.10)
52-52: Avoid specifying long messages outside the exception class
(TRY003)
74-74: Avoid specifying long messages outside the exception class
(TRY003)
def _collect_images_labels(
    ds: Dataset[Any],
    *,
    input_key: str,
    target_key: str,
    max_items: int | None,
) -> tuple[np.ndarray, np.ndarray]:
    n = int(len(cast(Sized, ds)))
    if max_items is not None:
        n = min(n, int(max_items))
    if n <= 0:
        raise ValueError("Dataset is empty")

    # Discover shape from first sample.
    ex0 = _as_dict(ds[0])
    x0 = ex0.get(str(input_key), None)
    y0 = ex0.get(str(target_key), None)
    if x0 is None or y0 is None:
        raise KeyError(f"Dataset items must contain keys {input_key!r} and {target_key!r}")

    def to_np_img(x: object) -> np.ndarray:
        if isinstance(x, torch.Tensor):
            t = x.detach().cpu()
            if t.dtype != torch.float32:
                t = t.float()
            return t.numpy()
        if isinstance(x, np.ndarray):
            return x.astype(np.float32, copy=False)
        raise TypeError(f"Unsupported image type {type(x).__name__}")

    def to_int(y: Any) -> int:
        if isinstance(y, torch.Tensor):
            return int(y.detach().cpu().item())
        return int(y)

    # Collect images into (N,C,H,W) float32
    x0n = to_np_img(x0)
    if x0n.ndim == 2:
        x0n = x0n[None, :, :]  # 1HW
    if x0n.ndim != 3:
        raise ValueError(f"Expected image as (H,W) or (C,H,W), got {x0n.shape}")
    c, h, w = int(x0n.shape[0]), int(x0n.shape[1]), int(x0n.shape[2])

    images = np.empty((n, c, h, w), dtype=np.float32)
    labels = np.empty((n,), dtype=np.int64)
    images[0] = x0n.astype(np.float32, copy=False)
    labels[0] = np.int64(to_int(y0))

    for i in range(1, n):
        ex = _as_dict(ds[i])
        xi = to_np_img(ex[str(input_key)])
        yi = to_int(ex[str(target_key)])
        if xi.ndim == 2:
            xi = xi[None, :, :]
        if xi.shape != (c, h, w):
            raise ValueError(f"Inconsistent image shapes: expected {(c,h,w)}, got {xi.shape}")
        images[i] = xi.astype(np.float32, copy=False)
        labels[i] = np.int64(yi)

    return images, labels
🧹 Nitpick | 🔵 Trivial
Consider potential memory pressure with large datasets.
The _collect_images_labels function pre-allocates and fills arrays sequentially. For very large datasets with max_items=None, this could cause memory pressure since all images are loaded into memory at once.
The implementation is correct, but consider adding a warning or documentation note about memory requirements for large datasets.
🧰 Tools
🪛 Ruff (0.14.10)
54-54: Value being cast to int is already an integer
Remove unnecessary int call
(RUF046)
58-58: Avoid specifying long messages outside the exception class
(TRY003)
65-65: Avoid specifying long messages outside the exception class
(TRY003)
75-75: Avoid specifying long messages outside the exception class
(TRY003)
77-77: Dynamically typed expressions (typing.Any) are disallowed in y
(ANN401)
87-87: Avoid specifying long messages outside the exception class
(TRY003)
102-102: Avoid specifying long messages outside the exception class
(TRY003)
🤖 Prompt for AI Agents
In @trainer/ccl.py around lines 47 - 106, The _collect_images_labels function
can consume huge memory when max_items is None because it preallocates all
images; add a brief docstring note describing memory implications and also emit
a runtime warning/log when max_items is None and the dataset length
(len(cast(Sized, ds))) exceeds a safe threshold (e.g., >10_000 or make threshold
a small constant), using the module logger or warnings.warn to advise callers to
pass max_items or use a streaming approach; update the _collect_images_labels
docstring and insert the warning check early (before allocation) so users see it
before large memory is reserved.
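A minimal sketch of such a guard, placed before the allocation; the 10_000 threshold, warning category, and helper name are illustrative assumptions rather than existing code:

import warnings
from typing import Any, Sized, cast

_MEMORY_WARN_THRESHOLD = 10_000  # illustrative cut-off, tune per deployment

def _warn_if_large(ds: Any, max_items: int | None) -> None:
    # Emit a heads-up before the full (N, C, H, W) float32 buffer is allocated.
    n = int(len(cast(Sized, ds)))
    if max_items is None and n > _MEMORY_WARN_THRESHOLD:
        warnings.warn(
            f"_collect_images_labels will load all {n} images into memory; "
            "pass max_items or switch to a streaming approach for large datasets.",
            ResourceWarning,
            stacklevel=2,
        )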
def _split_train_val(ds: Dataset[Any], *, val_frac: float) -> tuple[Subset[Any], Subset[Any]]:
    n = int(len(cast(Sized, ds)))
    n_val = int(round(float(n) * float(val_frac)))
    n_val = max(1, min(n - 1, n_val)) if n > 1 else 0
    n_train = int(n - n_val)
    train = Subset(ds, range(0, n_train))
    val = Subset(ds, range(n_train, n))
    return train, val
Validation split doesn't shuffle - potential data ordering bias.
The train/val split takes the first n_train items for training and the remainder for validation. If the dataset has any ordering (e.g., sorted by class), this could introduce bias.
🔧 Proposed fix to shuffle indices before splitting
-def _split_train_val(ds: Dataset[Any], *, val_frac: float) -> tuple[Subset[Any], Subset[Any]]:
+def _split_train_val(ds: Dataset[Any], *, val_frac: float, seed: int = 0) -> tuple[Subset[Any], Subset[Any]]:
n = int(len(cast(Sized, ds)))
n_val = int(round(float(n) * float(val_frac)))
n_val = max(1, min(n - 1, n_val)) if n > 1 else 0
n_train = int(n - n_val)
- train = Subset(ds, range(0, n_train))
- val = Subset(ds, range(n_train, n))
+ rng = np.random.default_rng(seed)
+ indices = rng.permutation(n).tolist()
+ train = Subset(ds, indices[:n_train])
+ val = Subset(ds, indices[n_train:])
return train, val

🧰 Tools
🪛 Ruff (0.14.10)
110-110: Value being cast to int is already an integer
Remove unnecessary int call
(RUF046)
111-111: Value being cast to int is already an integer
Remove unnecessary int call
(RUF046)
114-114: Unnecessary start argument in range
Remove start argument
(PIE808)
🤖 Prompt for AI Agents
In @trainer/ccl.py around lines 109 - 116, The current _split_train_val
implementation slices the dataset in order, which can bias splits if data is
ordered; instead, generate a list of indices (0..n-1), shuffle them (e.g., via
random.shuffle or a passed RNG for reproducibility), then take the first n_train
indices for train and the rest for val and construct Subset(ds, train_indices)
and Subset(ds, val_indices); update _split_train_val to perform this
shuffled-index split so train/val are randomized.
try:
    from caramba.ccl.context_counts import sample_grid
    import matplotlib.pyplot as plt  # type: ignore
except Exception as e:
    logger.warning(f"CCL generation skipped (missing deps): {type(e).__name__}: {e}")
else:
    ht, wt = int(train_tokens.shape[1]), int(train_tokens.shape[2])
    # Assume channels inferred from data.
    channels = int(train_images.shape[1])
    n_per = int(max(1, self.n_gen_per_class))
    c = int(len(models))
    fig = plt.figure(figsize=(max(6, int(n_per * 1.2)), max(3, int(c * 0.8))), dpi=150)
    plot_idx = 1
    for ci in range(c):
        for j in range(n_per):
            g = sample_grid(models[ci], ht=ht, wt=wt, seed=int(self.seed + 1000 + ci * 100 + j))
            img = codec.decode(g, centers=centers, channels=int(channels))
            ax = fig.add_subplot(c, n_per, plot_idx)
            if channels == 1:
                ax.imshow(img[0], cmap="gray", vmin=0.0, vmax=1.0)
            else:
                ax.imshow(np.transpose(img, (1, 2, 0)))
            ax.set_xticks([])
            ax.set_yticks([])
            if j == 0:
                ax.set_ylabel(str(class_to_label.get(ci, ci)), rotation=0, labelpad=10)
            plot_idx += 1
    plt.tight_layout(pad=0.2)
    out_path = base_out / "generated_grid.png"
    plt.savefig(out_path)
    logger.info(f"Saved CCL samples to {out_path}")
Broad exception catch suppresses potentially important errors.
Catching bare Exception at line 379 may suppress important errors beyond just missing dependencies. Consider catching more specific exceptions.
🔧 Proposed fix to catch specific exceptions
if bool(self.generate):
try:
from caramba.ccl.context_counts import sample_grid
import matplotlib.pyplot as plt # type: ignore
- except Exception as e:
- logger.warning(f"CCL generation skipped (missing deps): {type(e).__name__}: {e}")
+ except (ImportError, ModuleNotFoundError) as e:
+ logger.warning(f"CCL generation skipped (missing deps): {type(e).__name__}: {e}")
else:

🧰 Tools
🪛 Ruff (0.14.10)
379-379: Do not catch blind exception: Exception
(BLE001)
386-386: Value being cast to int is already an integer
Remove unnecessary int call
(RUF046)
🤖 Prompt for AI Agents
In @trainer/ccl.py around lines 376 - 406, The try/except around "from
caramba.ccl.context_counts import sample_grid" and "import matplotlib.pyplot as
plt" is catching all Exception; narrow it to only import-related errors by
catching ImportError and ModuleNotFoundError (e.g., replace "except Exception as
e" with "except (ImportError, ModuleNotFoundError) as e") so runtime errors in
the generation logic (sample_grid, codec.decode, plotting) still surface; keep
the existing logger.warning message and the rest of the else block unchanged.
plt.tight_layout(pad=0.2)
out_path = base_out / "generated_grid.png"
plt.savefig(out_path)
logger.info(f"Saved CCL samples to {out_path}")
Missing plt.close(fig) may cause resource leak.
The matplotlib figure is created but never explicitly closed. In long-running processes or repeated calls, this can lead to memory accumulation.
🔧 Proposed fix to close the figure after saving
plt.tight_layout(pad=0.2)
out_path = base_out / "generated_grid.png"
plt.savefig(out_path)
+ plt.close(fig)
logger.info(f"Saved CCL samples to {out_path}")📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
Original:
plt.tight_layout(pad=0.2)
out_path = base_out / "generated_grid.png"
plt.savefig(out_path)
logger.info(f"Saved CCL samples to {out_path}")

Suggested:
plt.tight_layout(pad=0.2)
out_path = base_out / "generated_grid.png"
plt.savefig(out_path)
plt.close(fig)
logger.info(f"Saved CCL samples to {out_path}")
🤖 Prompt for AI Agents
In @trainer/ccl.py around lines 403 - 406, after saving the figure with
plt.savefig(out_path) in the CCL sampling code, explicitly close the Matplotlib
figure to avoid resource leaks; update the block around
plt.tight_layout(pad=0.2), plt.savefig(out_path), logger.info(...) to call
plt.close(fig) (or plt.close() if no fig variable is used) right after
plt.savefig so the figure is released.
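If the figure should be released even when plotting or saving fails, a try/finally variant is another option. This is only a sketch of the pattern, with the plotting body elided and the figure size reused from the existing code:

fig = plt.figure(figsize=(8, 4), dpi=150)  # figure setup as in the existing code
try:
    # ... add subplots and images here, as in the loop above ...
    plt.tight_layout(pad=0.2)
    out_path = base_out / "generated_grid.png"
    fig.savefig(out_path)
finally:
    plt.close(fig)  # always release the figure, even if savefig raises
logger.info(f"Saved CCL samples to {out_path}")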
# Reset shared tuner for new training runs
from caramba.layer.memory_block.memory.tuner import reset_shared_tuner
reset_shared_tuner()
🧹 Nitpick | 🔵 Trivial
🧩 Analysis chain
🏁 Script executed:
# First, let's explore the repository structure
git ls-files | grep -E "(trainer|caramba)" | head -20
Repository: TheApeMachine/caramba
Length of output: 644
🏁 Script executed:
# Find and read the trainer/standard.py file to understand context
find . -name "standard.py" -path "*/trainer/*" | head -5Repository: TheApeMachine/caramba
Length of output: 86
🏁 Script executed:
# Get the file size first to determine how to read it
wc -l trainer/standard.py 2>/dev/null || find . -name "standard.py" -path "*/trainer/*" -exec wc -l {} \;
Repository: TheApeMachine/caramba
Length of output: 89
🏁 Script executed:
# Read the context around lines 343-346
sed -n '340,350p' trainer/standard.py
Repository: TheApeMachine/caramba
Length of output: 499
🏁 Script executed:
# Read the module-level imports at the top of trainer/standard.py
head -50 trainer/standard.py
Repository: TheApeMachine/caramba
Length of output: 1664
🏁 Script executed:
# Check if the tuner module exists and examine its imports
find . -path "*/memory_block/memory/tuner.py" | head -1
Repository: TheApeMachine/caramba
Length of output: 101
🏁 Script executed:
# Read the tuner.py file to check for circular dependencies
head -50 ./layer/memory_block/memory/tuner.py
Repository: TheApeMachine/caramba
Length of output: 1788
🏁 Script executed:
# Check if reset_shared_tuner is ever imported at module level
rg "from caramba.layer.memory_block.memory.tuner import|import.*reset_shared_tuner" --type pyRepository: TheApeMachine/caramba
Length of output: 1490
🏁 Script executed:
# Find where reset_shared_tuner is defined and understand it
rg "def reset_shared_tuner" -A 5Repository: TheApeMachine/caramba
Length of output: 445
🏁 Script executed:
# Check if trainer/standard.py imports from caramba.layer modules at the top
sed -n '1,60p' trainer/standard.py | grep "caramba.layer"
Repository: TheApeMachine/caramba
Length of output: 177
🏁 Script executed:
# Check if there's any circular dependency by looking at what tuner imports
cat ./layer/memory_block/memory/tuner.py | head -100
Repository: TheApeMachine/caramba
Length of output: 4000
🏁 Script executed:
# Check the frequency of _run_single being called to understand overhead impact
rg "_run_single" --type py -B 2 -A 2Repository: TheApeMachine/caramba
Length of output: 3119
🏁 Script executed:
# Let's see the broader context of the method to understand when/how often this is called
sed -n '330,360p' trainer/standard.py
Repository: TheApeMachine/caramba
Length of output: 1213
🏁 Script executed:
# Check if trainer/standard.py is imported by tuner or any of its dependencies
rg "from caramba.trainer|import.*trainer" ./layer/memory_block/memory/tuner.py ./layer/memory_block/memory/telemetry.pyRepository: TheApeMachine/caramba
Length of output: 47
🏁 Script executed:
# Check if any layer modules that trainer imports also import from trainer (circular)
grep -n "from caramba.trainer" ./layer/memory_block/block.py ./layer/memory_block/memory/memory.py ./layer/memory_block/memory/routing.py 2>/dev/null | head -20Repository: TheApeMachine/caramba
Length of output: 47
🏁 Script executed:
# Let's see the full method signature and understand how many times _run_single is called
sed -n '330,340p' trainer/standard.py
Repository: TheApeMachine/caramba
Length of output: 346
Moving the inline import to module level is optional: no circular dependency exists, but the pattern is intentionally consistent across the codebase.
There is no circular import issue with moving from caramba.layer.memory_block.memory.tuner import reset_shared_tuner to the module level. The tuner module only imports from torch, math, typing, and internal telemetry—it has no dependencies on the trainer module. However, the inline import pattern is used consistently throughout the codebase (routing.py, writer.py, reader.py, memory.py), suggesting a deliberate architectural choice. Since the function is called once per training run (not in a tight loop), the overhead is negligible. If you prefer consistency with the rest of the codebase, keep it inline; if you prefer cleanliness at the module level, moving it is safe.
🤖 Prompt for AI Agents
In @trainer/standard.py around lines 343 - 346, Keep the inline import of
reset_shared_tuner where it’s called rather than moving it to module level to
preserve the established pattern used across the codebase; ensure the code still
calls reset_shared_tuner() in the same place and add a brief comment above the
inline import explaining the consistency rationale (the function name to locate
is reset_shared_tuner).
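For what it's worth, keeping the import inline with a one-line rationale could look like the sketch below; the comment wording is only a suggestion, the import and call are taken from the existing code:

# Deliberate local import: matches the inline-import pattern used in
# routing.py / writer.py / reader.py / memory.py, and this runs once per
# training run, so the lookup overhead is negligible.
from caramba.layer.memory_block.memory.tuner import reset_shared_tuner
reset_shared_tuner()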
Review continued from previous batch...
def step(self, improved: bool) -> float:
    """Take a step based on whether last move improved objective.

    Momentum increments/decrements by 1 each step, acting as multiplier.

    Args:
        improved: True if objective improved, False if degraded

    Returns:
        Delta applied to value
    """
    if not self.active:
        # Check if cooldown period has passed
        self.steps_since_deactivation += 1
        if self.steps_since_deactivation >= self.cooldown:
            # Reactivate
            self.active = True
            self.steps_without_improvement = 0
            self.steps_since_deactivation = 0
            self.momentum = 1  # Start with small momentum
            self.direction = 1.0 if self.value < (self.min_value + self.max_value) / 2 else -1.0
        else:
            return 0.0

    if improved:
        # Accelerate: increase momentum by 1
        self.momentum = min(self.momentum + 1, self.max_momentum)
        self.steps_without_improvement = 0
        self.braking = False  # Not braking if improving
    else:
        # Decelerate: decrease momentum by 1
        self.momentum = max(self.momentum - 1, 0)
        self.steps_without_improvement += 1

    # If momentum reached zero, decide what to do
    if self.momentum == 0:
        if self.braking:
            # We were braking and have now fully stopped - reverse direction
            self.direction *= -1.0
            self.momentum = 1  # Start moving in new direction
            self.braking = False
            self.steps_without_improvement = 0
        elif self.steps_without_improvement >= self.patience:
            # Stop exploring after patience exhausted
            self.active = False
            self.steps_since_deactivation = 0
            return 0.0
        else:
            # Start braking phase (will reverse after fully stopped)
            self.braking = True
            self.momentum = 1  # Continue decelerating

    # Apply step: momentum * base_step_size * direction
    delta = self.direction * self.momentum * self.base_step_size
    new_value = self.value + delta

    # Bounce off boundaries
    if new_value < self.min_value or new_value > self.max_value:
        # Hit boundary - reverse and reduce momentum
        self.direction *= -1.0
        self.momentum = max(1, self.momentum // 2)
        new_value = max(self.min_value, min(self.max_value, new_value))

    actual_delta = new_value - self.value
    self.value = new_value

    return actual_delta
🧩 Analysis chain
🏁 Script executed:
# Check for existing unit tests for ParameterExplorer
rg -n "ParameterExplorer" --type=py -g "*test*"Repository: TheApeMachine/caramba
Length of output: 698
🏁 Script executed:
# First, let's locate the file and understand its structure
find . -name "tuner.py" -type f | head -20Repository: TheApeMachine/caramba
Length of output: 101
🏁 Script executed:
# Get the full class definition to understand ParameterExplorer
rg -A 200 "class ParameterExplorer" --type=pyRepository: TheApeMachine/caramba
Length of output: 15059
🏁 Script executed:
# Now let's read the test file to see what edge cases are actually covered
cat -n layer/memory_block/memory/tuner_test.py
Repository: TheApeMachine/caramba
Length of output: 6814
Add unit tests for two specific edge cases in ParameterExplorer.step().
Tests exist but don't cover: (1) boundary collision while in braking state (when momentum reaches 0 and direction reverses), and (2) full cooldown-to-active reactivation cycle. The state machine logic itself appears sound, but these transitions should be explicitly verified with test cases.
🧰 Tools
🪛 Ruff (0.14.10)
45-45: Boolean-typed positional argument in function definition
(FBT001)
🤖 Prompt for AI Agents
In @layer/memory_block/memory/tuner.py around lines 45 - 111, Add two unit tests
for ParameterExplorer.step(): one that forces a braking-edge collision by
setting up an explorer with momentum=1, braking=True, direction pointing toward
a boundary, and a non-improving step so momentum falls to 0 and triggers the
braking-reversal logic; assert that after the step the direction has flipped,
momentum is reset to >=1, value is clamped inside [min_value, max_value], and
braking is cleared. The second test should cover the cooldown-to-active
reactivation: initialize explorer.active=False, steps_since_deactivation set to
cooldown-1, then call step and assert it returns 0.0 until cooldown is reached,
after which explorer.active becomes True, steps_since_deactivation and
steps_without_improvement are zero, momentum==1, and direction is set to 1.0 or
-1.0 based on value vs (min_value+max_value)/2; use the ParameterExplorer.step,
active, cooldown, steps_since_deactivation, steps_without_improvement, momentum,
direction, base_step_size, min_value and max_value attributes to drive and
verify behavior.
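Building on that prompt, a minimal pytest sketch for the two cases. The ParameterExplorer constructor keywords below are assumptions (the real signature may differ), and the tests set internal state directly to force each transition:

import pytest

from caramba.layer.memory_block.memory.tuner import ParameterExplorer


def test_braking_reversal_when_momentum_hits_zero():
    # Assumed constructor kwargs; adjust to the real signature.
    explorer = ParameterExplorer(value=0.9, min_value=0.0, max_value=1.0, base_step_size=0.1)
    explorer.active = True
    explorer.braking = True
    explorer.momentum = 1
    explorer.direction = 1.0  # pointing toward the upper boundary

    delta = explorer.step(improved=False)  # momentum -> 0, triggers braking reversal

    assert explorer.direction == -1.0        # direction flipped
    assert explorer.momentum >= 1            # restarted with small momentum
    assert explorer.braking is False         # braking cleared
    assert 0.0 <= explorer.value <= 1.0      # value stays inside bounds
    assert delta == pytest.approx(explorer.value - 0.9)


def test_cooldown_reactivation_cycle():
    explorer = ParameterExplorer(value=0.2, min_value=0.0, max_value=1.0, base_step_size=0.1)
    explorer.active = False
    explorer.cooldown = 3
    explorer.steps_since_deactivation = 1

    assert explorer.step(improved=False) == 0.0  # still cooling down (2 < 3)

    explorer.step(improved=True)  # reaches the cooldown threshold and reactivates
    assert explorer.active is True
    assert explorer.steps_since_deactivation == 0
    assert explorer.steps_without_improvement == 0
    assert explorer.direction == 1.0  # value below midpoint -> explore upward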
…or handling, and updating documentation for clarity. Adjusted various modules to improve consistency and readability.
No description provided.