Problem
Multiple large example files with overlapping functionality:
examples/chat_cli.py (990 lines)
examples/chat_cli_thinking.py (714 lines)
examples/chat_cli_moe.py (572 lines)
examples/demo_v02_full.py (633 lines)
examples/demo_v0210.py (593 lines)
Issues
- Duplicated code across examples
- Version-specific demos become stale
- Hard to maintain
Proposed Structure
examples/
├── chat/
│ ├── basic.py (simple chat)
│ ├── streaming.py (streaming output)
│ ├── thinking.py (thinking mode)
│ └── moe.py (MoE models)
├── inference/
│ ├── benchmark.py (performance testing)
│ └── batch.py (batch inference)
├── audio/
│ ├── stt.py (speech-to-text)
│ └── realtime.py (real-time processing)
└── common/
└── utils.py (shared utilities)
Benefits
- No duplicated code
- Clear categorization
- Easier to maintain
- Remove stale version-specific demos
Problem
Multiple large example files with overlapping functionality:
examples/chat_cli.py(990 lines)examples/chat_cli_thinking.py(714 lines)examples/chat_cli_moe.py(572 lines)examples/demo_v02_full.py(633 lines)examples/demo_v0210.py(593 lines)Issues
Proposed Structure
Benefits