A Tinker workspace for experimenting with iterative writing.
🚧 Early development
Done:
- Tinker healthcheck
- REPL to access Qwen3 base models and checkpoints
- Training loop with checkpoint save/resume
Example datasets live in `example_data/` (JSONL format; each line has `title` and `content`):
- `tiny_poems.jsonl` — small dataset for prototyping
- `more_poems.jsonl` — larger public domain collection
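Each JSONL line is one self-contained JSON object. A minimal sketch of parsing that layout (the sample records below are made up for illustration, not taken from the actual datasets):

```python
import json

# Two fabricated records mirroring the example_data/ layout:
# one JSON object per line with "title" and "content" fields.
sample = "\n".join([
    json.dumps({"title": "Ozymandias", "content": "I met a traveller..."}),
    json.dumps({"title": "The Tyger", "content": "Tyger Tyger, burning bright..."}),
])

# Parse line by line, skipping blanks.
poems = [json.loads(line) for line in sample.splitlines() if line.strip()]
print(poems[0]["title"])  # -> Ozymandias
```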
- Python 3.13
- PDM
```bash
pdm install
```

Create a `.env` file at the project root with your Tinker API key:
```
TINKER_API_KEY=your_api_key_here
```
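The project presumably reads this key at startup (for example via a dotenv-style loader). A minimal stdlib-only sketch of that pattern; `load_env` is a hypothetical helper for illustration, not part of this repo:

```python
import os
from pathlib import Path

def load_env(path: str = ".env") -> None:
    """Minimal .env loader: KEY=value lines, no quoting or expansion."""
    env_file = Path(path)
    if not env_file.exists():
        return
    for line in env_file.read_text().splitlines():
        line = line.strip()
        # Skip blanks, comments, and malformed lines.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Do not clobber variables already set in the environment.
        os.environ.setdefault(key.strip(), value.strip())

load_env()
api_key = os.environ.get("TINKER_API_KEY")
```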
Verify your Tinker connection:
```bash
pdm run tinkercheck
```

Start the interactive REPL:

```bash
pdm run repl                           # interactive model selection
pdm run repl checkpoint=tinker://...   # load from checkpoint
```

Supports Qwen3-8B and Qwen3-32B with thinking mode. Commands: `/clear`, `/debug`, `/exit`.
Run the training loop:
```bash
pdm run train          # uses defaults from train_config.py
pdm run train --help   # see all options
```

Development commands:

```bash
pdm run format     # Format code with ruff
pdm run lint       # Lint and fix with ruff
pdm run typecheck  # Type check with mypy
```

Use the REPL (`pdm run repl`) to chat with base models or fine-tuned checkpoints. The model uses Qwen3's native thinking mode—you'll see `<think>` reasoning blocks before responses.
Key flags:
- `checkpoint=tinker://...` — load a trained checkpoint (base model is inferred automatically)
- `model_name=Qwen/Qwen3-8B` — override model selection
The REPL is useful for spot-checking checkpoint quality, testing prompts, and interactive experimentation.
Use `pdm run train` to run the self-improving training loop. Each iteration generates candidate descriptions for poems, scores them by how well they help predict the target poem (lower loss = better), and trains on the top-K winners.
Key flags:
- `dataset_path=...` — path to poems JSONL
- `resume_from=tinker://...` — resume from a checkpoint (includes optimizer state)
- `max_iterations=N` — number of passes through the dataset
Checkpoints are saved to `logs/poetry-train/<run-name>/` with auto-generated run names. See `docs/training.md` for the full algorithm description.
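The scoring/selection step described above can be sketched as follows. `generate_candidates` and `description_loss` are hypothetical stubs standing in for the real sampler and per-poem loss; only the top-K selection logic mirrors the loop itself:

```python
import heapq

def generate_candidates(poem: dict, n: int) -> list[str]:
    # Stub: in the real loop these are sampled from the current model.
    return [f"candidate {i} for {poem['title']}" for i in range(n)]

def description_loss(description: str, poem: dict) -> float:
    # Stub: real scoring would be the loss of the target poem
    # conditioned on the description (lower = better).
    return float(len(description))

def select_top_k(poem: dict, n_candidates: int = 8, k: int = 2) -> list[tuple[float, str]]:
    """One scoring/selection step: keep the k lowest-loss descriptions."""
    scored = [(description_loss(d, poem), d) for d in generate_candidates(poem, n_candidates)]
    return heapq.nsmallest(k, scored)

winners = select_top_k({"title": "The Tyger", "content": "..."})
```

The winners (description, poem) pairs would then form the training batch for the next update.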
The docs/ directory contains design notes and training writeups.
- `docs/training.md`: The current architecture for self-improving poetry generation. This is where we track the end-to-end training loop (candidate description generation, scoring/selection, data construction, and how we train the final `description -> (title, poem)` behavior).
License: MIT