Add Tinker SDK LLM router tutorial for Nemotron Nano 30B by vikalluru · Pull Request #155 · NVIDIA-NeMo/Nemotron

vikalluru · 2026-04-18T17:10:18Z

Summary

Adds a Jupyter notebook tutorial: DPO + RLVR fine-tuning of Nemotron Nano 30B as a production LLM router using the Tinker SDK
Based on Glean's production deployment: LoRA rank 32, RouterSearchEnv, reward = 0.7·R_search + 0.3·R_termination, vLLM on Vertex AI
Uses open-source training data: Salesforce/xlam-function-calling-60k
Covers the full workflow: architecture overview → DPO fine-tuning → RLVR fine-tuning → vLLM serving config → offline eval
Reported production latency: P50=250ms, P95=2.5s on a single H100 replica
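
For reviewers, the reward weighting listed above can be sketched as a plain weighted sum. This is a minimal illustration, not the notebook's RouterSearchEnv code: the function name `router_reward`, its parameters, and the assumption that both components are already normalized to [0, 1] are hypothetical.

```python
def router_reward(
    r_search: float,
    r_termination: float,
    w_search: float = 0.7,
    w_termination: float = 0.3,
) -> float:
    """Combine the two reward components as 0.7*R_search + 0.3*R_termination.

    Assumption (not stated in the PR): each component lies in [0, 1].
    """
    for name, value in (("r_search", r_search), ("r_termination", r_termination)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return w_search * r_search + w_termination * r_termination


if __name__ == "__main__":
    # A rollout that searched well but terminated at the wrong step.
    print(router_reward(r_search=1.0, r_termination=0.0))
```

With these weights, search quality dominates the signal, while a correct termination decision still contributes just under a third of the maximum reward.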

Location

usage-cookbook/Nemotron-3-Nano/tinker_llm_router_tutorial.ipynb

Test plan

Notebook runs end-to-end on H100 with Tinker SDK installed
DPO training cell completes with LoRA rank 32 config
RLVR training cell completes with RouterSearchEnv reward function
vLLM serving config verified against production params
Offline eval cell produces router accuracy metrics
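
As a rough picture of what the offline eval step could check, the sketch below computes router accuracy as the fraction of examples where the predicted route matches the expected one. The function name `router_accuracy` and the exact-match criterion on route names are assumptions for illustration. The notebook's actual metric may differ.

```python
from typing import Sequence


def router_accuracy(predicted: Sequence[str], expected: Sequence[str]) -> float:
    """Fraction of examples whose predicted route equals the expected route.

    Assumption: each route is identified by a single string (e.g. a tool name)
    and correctness is exact string equality.
    """
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        raise ValueError("cannot compute accuracy on an empty evaluation set")
    correct = sum(p == e for p, e in zip(predicted, expected))
    return correct / len(expected)


if __name__ == "__main__":
    # Hypothetical route labels, for illustration only.
    preds = ["search", "answer", "search"]
    gold = ["search", "search", "search"]
    print(f"router accuracy: {router_accuracy(preds, gold):.3f}")
```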

🤖 Generated with Claude Code

Joint NVIDIA × Glean tutorial demonstrating two-phase DPO + RLVR fine-tuning of Nemotron Nano 30B as a production LLM router using Tinker SDK. Based on Glean's production deployment (Salesforce/xlam dataset, vLLM on Vertex AI, 250ms P50 / 2.5s P95 latency).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

vikalluru wants to merge 1 commit into NVIDIA-NeMo:main from vikalluru:llm-router-tinker-tutorial
