Version: 2.0.0 → 3.0.0 Updated: October 17, 2025
This roadmap outlines the evolution of azlin from a single-VM provisioning tool to a comprehensive Azure development fleet platform with shared storage, distributed workflows, and advanced coordination features.
Primary Strategic Initiative: Enable multi-VM distributed development workflows through shared network storage (Azure Files NFS).
Goal: Enable multiple VMs to share home directories via Azure Files NFS.
Epic: Basic Azure Files NFS integration Effort: 3 weeks Value: HIGH - Enables all subsequent distributed features
Commands:
azlin storage create <name> [--tier Premium|Standard] [--size 1TB]
azlin storage list
azlin storage status <name>
azlin storage delete <name>Implementation:
- Create Storage Account with NFS v4.1
- Configure VNet integration
- Generate mount scripts
- Add to cloud-init templates
- Update cost tracking
Deliverables:
storage_manager.pymodule- Storage command group in CLI
- Unit and integration tests
- Documentation with examples
Epic: Provision VMs with shared storage Effort: 2 weeks Value: HIGH - Core user-facing feature
Commands:
azlin new --shared-home <storage-name>
azlin storage attach <storage-name> --vm <vm-name>
azlin storage detach --vm <vm-name>Implementation:
- Modify
vm_provisioning.pycloud-init - Add NFS mount to
/home/azureuser - Handle mount failures gracefully
- Ensure UID/GID consistency
Deliverables:
- Updated provisioning flow
- Mount configuration templates
- E2E tests for multi-VM scenarios
- Troubleshooting guide
Epic: Selective shared/local mounts Effort: 2 weeks Value: MEDIUM - Performance optimization
Commands:
azlin new --shared-mounts projects:shared-dev,data:shared-data
azlin storage mount <storage-name> --vm <vm> --path ~/projects
azlin storage unmount --vm <vm> --path ~/projectsImplementation:
- Support multiple mount points
- Keep .cache, build/ local for performance
- Add mount_manager.py module
- Document best practices
Deliverables:
- Flexible mount configuration
- Performance guidelines
- Example configurations
- Migration guide (full shared → hybrid)
Epic: Pre-configured storage layouts Effort: 1 week Value: MEDIUM - Improved UX
Built-in Templates:
ml-training: Large data mount, local cacheci-runner: Shared build cache, local workspacedev-team: Shared projects, local configtesting: Shared test data, local results
Commands:
azlin new --storage-template ml-training
azlin storage templates
azlin storage template create <name> --shared <paths> --local <paths>Deliverables:
- Template definitions
- Template documentation
- Custom template support
- Migration between templates
Phase 1 Total: 8 weeks Release: azlin v2.1.0 with Shared Storage
Goal: Advanced multi-VM orchestration and state management.
Epic: Coordinated fleet operations Effort: 2 weeks Value: HIGH - Enables distributed workflows
Commands:
azlin fleet exec --sync "python train.py" # Start together
azlin fleet exec --sequential "npm test" # One at a time
azlin fleet exec --aggregate "pytest --json" # Collect results
azlin fleet exec --with-leader "python coordinator.py"Implementation:
- Synchronization primitives (using shared storage locks)
- Leader election mechanism
- Result aggregation framework
- Progress tracking across fleet
Deliverables:
fleet_coordinator.pymodule- Distributed locking utilities
- Examples: distributed testing, parallel training
Epic: Manage VMs as cohesive fleets Effort: 2 weeks Value: HIGH - Organizational improvement
Commands:
azlin fleet create <name> --size 5
azlin fleet list
azlin fleet start/stop/restart <name>
azlin fleet env set <name> VAR=value
azlin fleet snapshot <name>Implementation:
- Fleet metadata storage (in config)
- Bulk operations on fleet members
- Fleet-wide environment variables
- Coordinated start/stop
Deliverables:
fleet_manager.pymodule- Fleet configuration schema
- Fleet tagging and discovery
- Documentation
Epic: Expose fleet as load-balanced service Effort: 2 weeks Value: MEDIUM - Web service support
Commands:
azlin fleet create web-servers --size 3 --load-balancer
azlin lb status <fleet-name>
azlin lb health-check <fleet> --path /health --port 8080
azlin lb add-backend <fleet> <vm-name>
azlin lb remove-backend <fleet> <vm-name>Implementation:
- Azure Load Balancer creation
- Backend pool management
- Health probe configuration
- Auto-registration of VMs
Deliverables:
load_balancer.pymodule- Health check templates
- Web service examples
- Load testing guide
Phase 2 Total: 6 weeks Release: azlin v2.2.0 with Fleet Management
Goal: Enhanced developer workflows and specialized VM types.
Epic: First-class GPU VM support Effort: 2 weeks Value: HIGH - ML/AI workflows
Commands:
azlin new --gpu [--vm-size Standard_NC6s_v3]
azlin gpu status <vm-name>
azlin gpu monitor <vm-name> [--interval 5s]
azlin gpu cost <vm-name>Implementation:
- NVIDIA driver installation
- CUDA toolkit setup
- GPU metrics collection
- Cost optimization (auto-stop)
Deliverables:
- GPU provisioning templates
- Driver installation scripts
- Monitoring dashboard
- ML framework examples (PyTorch, TensorFlow)
Epic: Automated Jupyter setup Effort: 1 week Value: MEDIUM - Data science workflows
Commands:
azlin new --jupyter [--port 8888]
azlin jupyter connect <vm-name> # Opens browser
azlin jupyter config <vm-name> --password
azlin jupyter token <vm-name>Implementation:
- JupyterLab installation
- HTTPS with self-signed cert
- SSH tunnel automation
- Browser auto-launch
Deliverables:
- Jupyter provisioning module
- Connection utilities
- Security configuration
- Example notebooks
Epic: VS Code dev containers on Azure VMs Effort: 2 weeks Value: MEDIUM - Modern dev workflows
Commands:
azlin new --dev-container --repo <github-url>
azlin dev-container list <vm-name>
azlin dev-container rebuild <vm-name>Implementation:
- Parse devcontainer.json
- Docker/Podman container setup
- Mount shared storage into containers
- VS Code Remote-SSH integration
Deliverables:
- Dev container provisioning
- Volume mount configuration
- VS Code integration guide
- Example dev containers
Phase 3 Total: 5 weeks Release: azlin v2.3.0 with Developer Features
Goal: Production-grade monitoring, logging, and backup.
Epic: Fleet-wide log aggregation Effort: 2 weeks Value: HIGH - Operations visibility
Commands:
azlin logs enable --fleet <name> --workspace <log-analytics>
azlin logs search "error" --fleet <name> [--since 1h]
azlin logs tail --fleet <name> [--follow]
azlin logs alert "out of memory" --email admin@example.comImplementation:
- Azure Log Analytics integration
- Log forwarding agents
- Query interface
- Alert configuration
Deliverables:
- Logging infrastructure setup
- Query examples
- Alert templates
- Troubleshooting playbooks
Epic: VM and application monitoring Effort: 2 weeks Value: MEDIUM - Performance insights
Commands:
azlin metrics <vm-name> [--since 1h]
azlin metrics --fleet <name>
azlin metrics export <vm-name> metrics.csv
azlin analyze <vm-name> # Get recommendationsImplementation:
- Azure Monitor integration
- Custom metrics collection
- Performance baselines
- Recommendation engine
Deliverables:
- Metrics collection setup
- Visualization examples
- Performance analysis tools
- Optimization guide
Epic: Automated backup workflows Effort: 2 weeks Value: HIGH - Data protection
Commands:
azlin backup schedule <vm> --daily --retain 7
azlin backup list <vm>
azlin backup restore <vm> --from <backup-id>
azlin backup replicate <vm> --to <region>Implementation:
- Scheduled snapshot creation
- Backup retention policies
- Cross-region replication
- One-click recovery
Deliverables:
- Backup automation
- Recovery procedures
- DR testing guide
- Compliance documentation
Phase 4 Total: 6 weeks Release: azlin v2.4.0 with Ops Features
Goal: Intelligent cost management and scaling.
Epic: Dynamic fleet scaling Effort: 3 weeks Value: HIGH - Cost savings
Commands:
azlin autoscale enable <fleet> --min 2 --max 10 --target-cpu 70
azlin autoscale schedule <fleet> --weekdays-only --hours 9-17
azlin autoscale disable <fleet>Implementation:
- Metric-based scaling
- Schedule-based scaling
- Azure VM Scale Sets integration
- Cost/performance tradeoffs
Deliverables:
- Auto-scaling policies
- Scheduling templates
- Cost impact analysis
- Best practices guide
Epic: Low-cost VMs with Spot instances Effort: 2 weeks Value: MEDIUM - 70-90% cost savings
Commands:
azlin new --spot --max-price 0.10
azlin spot status <vm-name>
azlin spot history --region westus2Implementation:
- Azure Spot VM integration
- Eviction handling
- Checkpoint/restart logic
- Price monitoring
Deliverables:
- Spot provisioning
- Eviction handlers
- Pricing analytics
- Fault-tolerant workflows
Epic: Proactive cost management Effort: 1 week Value: MEDIUM - Budget control
Commands:
azlin budget set 500 --per-month --alert-at 80%
azlin cost trends --last-30-days
azlin cost optimize # Get suggestionsImplementation:
- Azure Cost Management API
- Budget tracking
- Anomaly detection
- Optimization recommendations
Deliverables:
- Budget configuration
- Cost dashboards
- Optimization reports
- Savings calculator
Phase 5 Total: 6 weeks Release: azlin v2.5.0 with Cost Management
Goal: Enterprise security and compliance features.
Epic: Automated security baseline Effort: 2 weeks Value: HIGH - Security posture
Commands:
azlin security harden <vm-name>
azlin security audit <vm-name>
azlin security status <vm-name>
azlin security report --fleet <name>Implementation:
- CIS benchmark compliance
- Automatic security updates
- Vulnerability scanning
- Security configuration
Deliverables:
- Hardening playbooks
- Audit reports
- Compliance templates
- Security best practices
Epic: Azure Key Vault integration Effort: 2 weeks Value: MEDIUM - Credential security
Commands:
azlin secret set <vm> DATABASE_URL "postgres://..."
azlin secret list <vm>
azlin secret rotate <vm> API_KEY
azlin secret audit <vm>Implementation:
- Key Vault integration
- Managed identity authentication
- Secret injection as env vars
- Automatic rotation
Deliverables:
- Secret management module
- Rotation policies
- Audit logging
- Integration examples
Epic: Automated compliance checks Effort: 1 week Value: LOW - Enterprise requirements
Commands:
azlin compliance check <vm> --standard SOC2
azlin compliance report --fleet <name> --format pdf
azlin compliance export --to <path>Implementation:
- Compliance frameworks (SOC2, HIPAA, PCI-DSS)
- Evidence collection
- Report generation
- Continuous monitoring
Deliverables:
- Compliance checker
- Report templates
- Evidence archive
- Certification guide
Phase 6 Total: 5 weeks Release: azlin v3.0.0 - Enterprise Edition
| Feature | User Value | Technical Complexity | Strategic Importance | Priority |
|---|---|---|---|---|
| Shared Storage | 🔥 Critical | ⭐⭐⭐ Medium | 🎯 High | P0 |
| Fleet Management | 🔥 High | ⭐⭐⭐ Medium | 🎯 High | P1 |
| GPU Support | 🔥 High | ⭐⭐ Low | 🎯 Medium | P1 |
| Centralized Logging | 🔥 High | ⭐⭐⭐⭐ High | 🎯 Medium | P2 |
| Auto-Scaling | 🔥 High | ⭐⭐⭐⭐ High | 🎯 High | P2 |
| Jupyter Integration | 🔥 Medium | ⭐⭐ Low | 🎯 Low | P2 |
| Dev Containers | 🔥 Medium | ⭐⭐⭐ Medium | 🎯 Medium | P2 |
| Load Balancer | 🔥 Medium | ⭐⭐⭐ Medium | 🎯 Low | P3 |
| Backup/DR | 🔥 High | ⭐⭐⭐ Medium | 🎯 Medium | P3 |
| Security Hardening | 🔥 High | ⭐⭐⭐ Medium | 🎯 High | P3 |
| Spot Instances | 🔥 Medium | ⭐⭐⭐ Medium | 🎯 Medium | P4 |
| Secrets Management | 🔥 Medium | ⭐⭐ Low | 🎯 Medium | P4 |
Oct 2025 ├─ Start Phase 1: Shared Storage
│
Dec 2025 ├─ v2.1.0 Release: Shared Storage
│
Jan 2026 ├─ Start Phase 2: Fleet Management
│
Feb 2026 ├─ v2.2.0 Release: Fleet Management
│
Mar 2026 ├─ Start Phase 3: Developer Features
│
May 2026 ├─ v2.3.0 Release: GPU, Jupyter, Dev Containers
│
Jun 2026 ├─ Start Phase 4: Ops & Observability
│
Aug 2026 ├─ v2.4.0 Release: Logging, Metrics, Backup
│
Sep 2026 ├─ Start Phase 5: Cost Optimization
│
Nov 2026 ├─ v2.5.0 Release: Auto-Scaling, Spot, Budgets
│
Dec 2026 ├─ Start Phase 6: Security & Compliance
│
Feb 2027 └─ v3.0.0 Release: Enterprise Edition
- ✅ 90% of users can provision shared storage in <5 minutes
- ✅ Multi-VM workflows show 50% reduction in setup time
- ✅ NFS performance meets 90% of local disk performance
- ✅ Fleet operations support 10+ VMs without manual intervention
- ✅ Coordinated execution reduces testing time by 60%
- ✅ Load balancer setup automated in <2 minutes
- ✅ GPU VMs provision with drivers in <8 minutes
- ✅ Jupyter access available in <3 minutes
- ✅ Dev container adoption by 30% of users
- ✅ Log aggregation covers 100% of fleet
- ✅ Backup recovery time <15 minutes
- ✅ Performance recommendations improve VM efficiency by 20%
- ✅ Auto-scaling reduces costs by 40% on average
- ✅ Spot instances adopted by 50% of non-critical workloads
- ✅ Budget alerts prevent 90% of cost overruns
- ✅ 100% of VMs pass security audit baseline
- ✅ Secrets management eliminates exposed credentials
- ✅ Compliance reports automated for enterprise users
- Design Document: Write detailed design (similar to DESIGN.md)
- Tests First: Unit tests before implementation (TDD)
- Documentation: Update README, add examples
- Integration: Test with existing features
- Performance: Benchmark before/after
- Security Review: Check for vulnerabilities
- User Testing: Beta test with real workflows
- ✅ Type hints on all functions
- ✅ Unit test coverage >80%
- ✅ Integration tests for user flows
- ✅ Pre-commit hooks pass
- ✅ Documentation includes examples
- ✅ Error handling comprehensive
- All tests passing
- Documentation updated
- Changelog written
- Migration guide (if breaking changes)
- Performance benchmarks
- Security review complete
- Beta testing feedback incorporated
- Release notes published
-
NFS Performance: May not meet user expectations
- Mitigation: Benchmark early, offer hybrid mode
-
Azure API Rate Limits: Fleet operations may hit limits
- Mitigation: Implement exponential backoff, batch operations
-
Complexity Growth: Adding features increases maintenance burden
- Mitigation: Keep modules independent, comprehensive testing
-
Azure Pricing Changes: Storage costs could increase
- Mitigation: Support multiple tiers, cost optimization features
-
Competing Solutions: Microsoft may release similar tool
- Mitigation: Focus on developer UX, differentiate with fleet management
-
Support Load: More features = more support requests
- Mitigation: Comprehensive documentation, troubleshooting guides
-
Backwards Compatibility: Breaking changes frustrate users
- Mitigation: Semantic versioning, deprecation warnings, migration guides
This roadmap transforms azlin from a single-VM provisioning tool into a comprehensive Azure development fleet platform. The shared storage features (Phase 1) are the critical foundation enabling all subsequent distributed workflow capabilities.
Recommended Action: Begin Phase 1 implementation immediately. The shared storage feature is highly feasible (using Azure Files NFS), provides high user value, and enables the strategic vision for azlin as a fleet management platform.
Estimated Timeline: 18 months to complete all phases Total Effort: ~36 weeks of development Expected Outcome: azlin v3.0.0 Enterprise Edition
Roadmap created on October 17, 2025 Next Review: January 2026