feat/new anneal shard #683

Merged
joellidin merged 4 commits into dev from feat/new-anneal-shard
Jan 17, 2026

joellidin commented Jan 17, 2026

  • (neurons) Switch anneal mode to shard 4
  • (hparams) Revert to pre-anneal checkpoint
  • (hparams) Adjust anneal schedule for new shard
  • Bump run version

Description

Related Issue(s)

  • Closes #[issue number]

Type of Change

  • Feature (adding new functionality)
  • Fix (resolving a bug or issue)
  • Docs (documentation updates)
  • Refactor (code changes that don't affect functionality)
  • Maintenance (dependency updates or other maintenance)
  • Tests (adding or improving tests)
  • Breaking change (fix or feature with incompatible API changes)
  • Other: _____

Branch Naming

  • My branch follows the project's naming convention (e.g., feature/add-new-capability)

Commit Messages

  • My commits are small, atomic, and have proper commit messages
  • Commit messages are in imperative mood with a capitalized summary under 50 chars

Code Quality

  • I've performed a self-review of my code
  • I've added appropriate docstrings following the project's conventions
  • I've added proper logging where necessary (without trailing periods)
  • I've applied linting and formatting with Ruff
  • My code generates no new warnings

Testing

  • I've added tests for new functionality or bug fixes
  • All tests pass locally with my changes
  • Test coverage has not decreased

Documentation

  • I've updated documentation to reflect my changes
  • I've updated comments in hard-to-understand areas

If this is a breaking change

Screenshots/Examples

Additional Notes

Summary by CodeRabbit

  • Chores

    • Version bumped to 2.1.26
  • Configuration Updates

    • Updated annealing schedule parameters (warmup and decay steps adjusted)
    • Updated checkpoint initialization settings
  • Updates

    • Modified shard assignment logic in anneal mode
    • Updated documentation examples for partial data migration

Update anneal mode to use shard 4 instead of shard 2 for both miner and
validator. This change ensures consistency across the network as the
anneal schedule progresses.

- Update miner.py to use shard 4 in anneal mode
- Update validator.py to use shard 4 in anneal mode
- Update documentation with shard 4 migration examples

Roll back checkpoint_init_version to 2.1.18 and checkpoint_init_window
to 62834 to restore the bootstrapping configuration from before the
annealing phase began.

Update anneal mode parameters for the 22.8B token shard:

- warmup_inner_steps: 200 → 100
- decay_outer_steps: 550 → 120
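
The parameter changes described in these commit messages can be summarized as a before/after comparison. The sketch below is illustrative only: the four key names and their values come from this PR, but the flat dictionary layout and the `diff_hparams` helper are assumptions and may not match how `hparams/hparams.json` is actually nested or loaded.

```python
# Values are taken from this PR. The flat key layout is an assumption and
# may not reflect the real structure of hparams/hparams.json.
OLD_HPARAMS = {
    "warmup_inner_steps": 200,
    "decay_outer_steps": 550,
    "checkpoint_init_version": "2.1.24",
    "checkpoint_init_window": 63571,
}

NEW_HPARAMS = {
    "warmup_inner_steps": 100,
    "decay_outer_steps": 120,
    "checkpoint_init_version": "2.1.18",
    "checkpoint_init_window": 62834,
}


def diff_hparams(old: dict, new: dict) -> dict:
    """Hypothetical helper: return {key: (old, new)} for every changed key."""
    keys = sorted(old.keys() | new.keys())
    return {k: (old.get(k), new.get(k)) for k in keys if old.get(k) != new.get(k)}


changes = diff_hparams(OLD_HPARAMS, NEW_HPARAMS)
for key, (before, after) in changes.items():
    print(f"{key}: {before} -> {after}")
```

A comparison like this makes it easy to confirm that a rollback touches only the intended keys before a run version is bumped.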

coderabbitai bot commented Jan 17, 2026

Walkthrough

Package version bumped to 2.1.26. Shard assignment in anneal mode changed from shard 2 to shard 4 across miner and validator neurons. Annealing schedule hyperparameters updated (warmup_inner_steps: 200→100, decay_outer_steps: 550→120). Checkpoint initialization parameters and documentation example updated to match new shard configuration.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Version & Core Package: src/tplr/__init__.py | Version string bumped from 2.1.25 to 2.1.26 |
| Hyperparameters: hparams/hparams.json | Updated annealing schedule (warmup_inner_steps: 200→100, decay_outer_steps: 550→120) and checkpoint initialization (checkpoint_init_version: 2.1.24→2.1.18, checkpoint_init_window: 63571→62834) |
| Shard Assignment Logic: neurons/miner.py, neurons/validator.py | Anneal mode shard assignment changed from shard 2 to shard 4; current_shard/shard_epoch forced to 4/0 with updated comments |
| Documentation: docs/shared_sharded_dataset.md | Partial Migration example updated to reference anneal shard 4 instead of shard 2 (rclone commands updated) |
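
As a rough illustration of the shard assignment change summarized above, the following sketch forces shard 4 and epoch 0 whenever anneal mode is active. The function name `select_shard`, its parameters, and the `ANNEAL_SHARD` constant are hypothetical; the actual logic in `neurons/miner.py` and `neurons/validator.py` is not shown in this conversation and may differ.

```python
# Shard index used in anneal mode. The value 4 comes from this PR
# (it was 2 before); the constant name itself is an assumption.
ANNEAL_SHARD = 4


def select_shard(
    anneal_mode: bool, computed_shard: int, computed_epoch: int
) -> tuple[int, int]:
    """Hypothetical helper returning (current_shard, shard_epoch).

    In anneal mode the normally computed values are ignored and the pair
    is pinned to (ANNEAL_SHARD, 0), so every caller reads the same shard.
    """
    if anneal_mode:
        return ANNEAL_SHARD, 0
    return computed_shard, computed_epoch


# Miner and validator would call the same helper with the same flag.
print(select_shard(True, computed_shard=1, computed_epoch=3))
print(select_shard(False, computed_shard=1, computed_epoch=3))
```

Keeping the override in one shared helper is one way to make sure miner and validator cannot drift apart on the anneal shard, which is the consistency goal the commit message describes.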

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~4 minutes

Possibly related PRs

  • v2.1.24 #680 — Modifies the same anneal-mode shard selection in neurons/miner.py and neurons/validator.py (changed shard to 2; this PR changes it to 4).
  • v2.1.23 #678 — Updates the same hyperparameters in hparams/hparams.json (warmup_inner_steps modified in opposite direction).
  • feat/new anneal shard #679 — Modifies the same shard assignment logic in neurons/miner.py and neurons/validator.py during anneal mode (forced shard 2 in same code path).

Suggested reviewers

  • shivam-MBZUAI
  • amiiir-sarfi

Poem

🐰 From shard two to shard four we hop,
Hyperparameters tuned—no need to stop!
Version 2.1.26 takes the stage,
A well-rehearsed and coordinated page! ✨

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Description check | ❓ Inconclusive | The PR description provides a bulleted summary of key changes but lacks the detailed 'Description' section explaining the what/why, and all template checkboxes remain unchecked, indicating incomplete form completion. | Add a detailed description section explaining the purpose and rationale for switching to shard 4, and complete the relevant checkboxes (Type of Change, Branch Naming, Code Quality, etc.) to reflect actual work performed. |

✅ Passed checks (1 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly identifies the main change: switching to a new anneal shard (shard 4), which is the primary focus of the PR across miner, validator, hparams, and docs. |

codecov bot commented Jan 17, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

❌ Your project status has failed because the head coverage (57.69%) is below the target coverage (85.00%). You can increase the head coverage or adjust the target coverage.

@@           Coverage Diff           @@
##              dev     #683   +/-   ##
=======================================
  Coverage   57.69%   57.69%           
=======================================
  Files          27       27           
  Lines        4990     4990           
=======================================
  Hits         2879     2879           
  Misses       2111     2111           
| Files with missing lines | Coverage Δ |
|---|---|
| src/tplr/__init__.py | 100.00% <100.00%> (ø) |
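
The figures in the coverage diff can be checked by hand: hits plus misses should equal total lines, and hits divided by lines should give the reported head coverage. The sketch below does that arithmetic with the numbers from the report. The variable names are illustrative, and truncating (rather than rounding) to two decimals is an assumption, chosen because it reproduces the 57.69% shown above.

```python
# Numbers copied from the coverage diff in this report.
hits, misses, lines = 2879, 2111, 4990
target_percent = 85.00

# Sanity check: every tracked line is either a hit or a miss.
assert hits + misses == lines

# Integer arithmetic in hundredths of a percent, truncated (assumption).
coverage_basis_points = hits * 10_000 // lines
coverage_percent = coverage_basis_points / 100

below_target = coverage_percent < target_percent
print(
    f"coverage: {coverage_percent:.2f}% "
    f"(target {target_percent:.2f}%), below target: {below_target}"
)
```

Because this PR changes no hit or miss counts, the project status fails on the pre-existing gap to the target rather than on anything introduced here.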

joellidin merged commit c104a64 into dev on Jan 17, 2026
7 of 8 checks passed