feat: Enhance service management scripts and add Strix Halo support #249
Conversation
- Updated restart.sh, start.sh, and stop.sh to accept optional arguments for service management commands.
- Enhanced services.py to support the new compute mode 'strixhalo' for service deployment.
- Modified wizard.py and related configuration files to include Strix Halo as a valid compute option.
- Updated Docker Compose files to define services for Strix Halo, ensuring compatibility with AMD ROCm.
- Improved documentation and comments for clarity on new configurations and service options.
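The optional-argument behavior described for the wrapper scripts can be sketched in Python. This is a hypothetical mirror of the pattern, not code from the repository; the `services.py stop` command line below is assumed for illustration:

```python
def build_stop_command(args: list[str]) -> list[str]:
    # Hypothetical mirror of stop.sh's logic: with no arguments, default to
    # stopping everything (--all); otherwise forward the caller's arguments
    # unchanged, the way "$@" preserves each argument (spaces included).
    forwarded = args if args else ["--all"]
    return ["uv", "run", "services.py", "stop", *forwarded]
```

The same default-or-forward shape applies to the other wrapper scripts, only the default arguments differ.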
- Simplified the installation of system dependencies by removing conditional package manager checks and directly using `dnf` for installation.
- Cleaned up the Dockerfile to improve readability and maintainability.
will test this

@coderabbitai review

✅ Actions performed: review triggered.
📝 Walkthrough

This PR extends the system to support AMD Strix Halo (NPU) as a new accelerator variant alongside the existing CPU and NVIDIA GPU options. Changes span configuration files, build recipes, dependency management, service orchestration, and startup workflows across the ASR services and speaker recognition components.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Pre-merge checks: ✅ 3 checks passed.
Actionable comments posted: 1
Fix all issues with AI Agents 🤖
In @extras/asr-services/pyproject.toml:
- Around line 89-92: The index entry named "rocm-gfx1151" currently points to
AMD nightlies and should be changed to a stable AMD PyPI index: update the URL
in the [[tool.uv.index]] block for name "rocm-gfx1151" to use the official
stable endpoint (e.g. https://pypi.amd.com/simple or a versioned path like
https://pypi.amd.com/rocm-7.0.2/simple) and, if you intentionally need
nightlies, add a brief comment nearby explaining that rationale and why nightly
packages are required for this project.
🧹 Nitpick comments (7)
extras/speaker-recognition/src/simple_speaker_recognition/api/routers/speakers.py (1)
79-81: Good workaround for ROCm OpenMP conflicts.

The local import approach is reasonable for avoiding OpenMP/BLAS threading conflicts between PyTorch (ROCm) and scikit-learn/UMAP libraries. Python's import caching ensures minimal performance overhead.
However, please verify this actually prevents segmentation faults in your ROCm environment, as conflicts can still occur if both libraries are used in the same process.
For additional robustness, consider also setting these environment variables at service startup:
```python
import os

os.environ["OMP_NUM_THREADS"] = "1"
os.environ["MKL_NUM_THREADS"] = "1"
```

This can further reduce OpenMP conflicts when multiple libraries are linked against different OpenMP runtimes.
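The local-import pattern itself can be illustrated with a standard-library stand-in. The `analyze` helper below is hypothetical; in the real routers the deferred imports are the scikit-learn/UMAP modules:

```python
import os
import sys

# Constrain OpenMP-style threading before any heavy imports. Which variables
# matter depends on the BLAS/OpenMP runtimes actually linked in (assumption).
os.environ.setdefault("OMP_NUM_THREADS", "1")
os.environ.setdefault("MKL_NUM_THREADS", "1")

def analyze(values):
    # Deferred (local) import: the module loads on first call rather than at
    # process startup; Python caches it in sys.modules, so repeat calls pay
    # no re-import cost.
    import statistics
    return statistics.mean(values)
```

Until `analyze` is first called, the deferred module stays entirely out of the process, which is the property the review relies on.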
extras/asr-services/.env.template (1)
5-7: Comment could be misleading for AMD users.

The instruction to check the CUDA version with `nvidia-smi` doesn't apply when selecting `strixhalo` for AMD hardware.

🔎 Suggested improvement:
```diff
 # Options: cu121 (CUDA 12.1), cu126 (CUDA 12.6), cu128 (CUDA 12.8), strixhalo (AMD strixhalo apu)
-# Should match your system's CUDA version (check with: nvidia-smi)
+# For NVIDIA: should match your system's CUDA version (check with: nvidia-smi)
+# For AMD Strix Halo: use 'strixhalo'
 PYTORCH_CUDA_VERSION=cu126
```

extras/speaker-recognition/Dockerfile.strixhalo (1)
14-15: Pin the uv version for reproducibility.

While the base image is pinned to a specific SHA, the uv installation uses `:latest`, which could lead to inconsistent builds over time. Consider pinning to a specific version like `0.9.21`.

🔎 Suggested fix:

```diff
-COPY --from=ghcr.io/astral-sh/uv:latest /uv /bin/uv
+COPY --from=ghcr.io/astral-sh/uv:0.9.21 /uv /bin/uv
```

wizard.sh (1)
1-1: Missing shebang line.

The script lacks a shebang (`#!/bin/bash` or `#!/bin/sh`), which could cause unexpected behavior depending on how it's executed. Other scripts in this PR (e.g., `stop.sh`) include a shebang for consistency.

🔎 Suggested fix:

```diff
+#!/bin/bash
 uv run --with-requirements setup-requirements.txt wizard.py "$@"
```

services.py (1)
154-174: Consider adding AMD device detection for Strix Halo.

The logic correctly validates NVIDIA device nodes (`/dev/nvidiactl`, `/dev/nvidia0`) before starting the NVIDIA variant, falling back to CPU if they are absent. For symmetry, consider checking for the AMD ROCm device node (`/dev/kfd`) when `strixhalo` is configured, to provide a similar safeguard and user-friendly warning.

🔎 Suggested enhancement:

```diff
 if command == 'up':
     if pytorch_variant == 'strixhalo':
-        cmd.extend(['--profile', 'strixhalo', 'up', '-d', 'parakeet-asr-strixhalo'])
+        # Check for AMD ROCm device nodes
+        if Path("/dev/kfd").exists():
+            cmd.extend(['--profile', 'strixhalo', 'up', '-d', 'parakeet-asr-strixhalo'])
+        else:
+            console.print("[yellow]⚠️ asr-services configured for Strix Halo (AMD ROCm), but /dev/kfd not found; starting CPU variant instead.[/yellow]")
+            cmd.extend(['up', '-d', 'parakeet-asr'])
     elif pytorch_variant.startswith('cu'):
```

extras/speaker-recognition/pyproject.toml (2)
61-83: Consider documenting the `sys_platform == 'never'` pattern more prominently.

The marker hack to prevent uv from installing base-image-provided packages is clever but unconventional. Future maintainers may be confused by this pattern.
Also, note the OpenTelemetry version constraints: `opentelemetry-exporter-otlp>=1.34.0` has a lower minimum than `opentelemetry-api`/`sdk>=1.39.0`. While pip/uv will resolve to compatible versions, explicitly aligning these minimums (all to `>=1.39.0`) would be cleaner.

🔎 Suggested alignment for the OpenTelemetry versions:

```diff
 # OpenTelemetry dependencies for pyannote.audio
 "opentelemetry-api>=1.39.0",
 "opentelemetry-sdk>=1.39.0",
-"opentelemetry-exporter-otlp>=1.34.0",
+"opentelemetry-exporter-otlp>=1.39.0",
```
110-110: Pin the git dependency to a specific commit or tag for reproducible builds.

The pyannote.audio git source will pull from the default branch (main), which could change at any time. This makes builds non-reproducible and could introduce unexpected breaking changes.
🔎 Suggested fix to pin to a specific revision
```diff
-"pyannote.audio" = { git = "https://github.com/pyannote/pyannote-audio.git", extra = "strixhalo" }
+"pyannote.audio" = { git = "https://github.com/pyannote/pyannote-audio.git", rev = "v3.3.2", extra = "strixhalo" }
```

Replace `v3.3.2` with the appropriate tag or commit hash that has been tested with the Strix Halo/ROCm setup.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`extras/speaker-recognition/uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (18)
- Docs/init-system.md
- extras/asr-services/.env.template
- extras/asr-services/Dockerfile_Parakeet
- extras/asr-services/docker-compose.yml
- extras/asr-services/init.py
- extras/asr-services/pyproject.toml
- extras/speaker-recognition/Dockerfile.strixhalo
- extras/speaker-recognition/docker-compose.yml
- extras/speaker-recognition/init.py
- extras/speaker-recognition/pyproject.toml
- extras/speaker-recognition/src/simple_speaker_recognition/api/routers/identification.py
- extras/speaker-recognition/src/simple_speaker_recognition/api/routers/speakers.py
- restart.sh
- services.py
- start.sh
- stop.sh
- wizard.py
- wizard.sh
🧰 Additional context used
🧬 Code graph analysis (1)
stop.sh (2)

- extras/asr-services/init.py (1): run (226-253)
- extras/speaker-recognition/init.py (1): run (416-447)
🔇 Additional comments (27)
extras/speaker-recognition/src/simple_speaker_recognition/api/routers/identification.py (1)
575-577: Consistent application of the OpenMP conflict workaround.

The local import pattern is applied consistently in both `analyze_annotation_segments` and `analyze_segments_with_enrolled_speakers`. This aligns well with the same approach used in `speakers.py`.

The comments clearly document the rationale, making it easy for future maintainers to understand why imports are deferred in these specific functions.
Also applies to: 710-712
Docs/init-system.md (1)
195-246: LGTM! Clear and comprehensive documentation for the startup flow.

The Mermaid diagrams effectively illustrate the two-layer architecture (Setup vs. Run) and the compute mode selection logic. The documentation accurately reflects the new `strixhalo` option alongside the existing `cpu` and `gpu` modes.

extras/speaker-recognition/Dockerfile.strixhalo (2)
28-42: Well-structured dependency management for ROCm compatibility.

The approach of exporting requirements while pruning torch-related packages preserves the base image's ROCm torch stack. Installing `pyannote.audio` from git with `--no-deps` correctly avoids pulling in incompatible PyPI dependencies.
56-65: LGTM!

Environment configuration is appropriate for ROCm, with a comprehensive `LD_LIBRARY_PATH` setup. The CMD correctly uses the absolute path to the entrypoint in the base image's venv.

extras/asr-services/Dockerfile_Parakeet (2)
20-21: LGTM!

The `--frozen` flag ensures strict adherence to the lockfile, preventing unexpected dependency updates during builds.
14-16: No action needed.

The ffmpeg installation is correctly placed only in the builder stage. Runtime audio processing relies on `soundfile` and `libsndfile1` (already installed in the runtime stage at line 30), not ffmpeg. The application code contains no runtime usage of ffmpeg.

start.sh (1)
1-6: LGTM!

Clean implementation that preserves backward compatibility while enabling selective service management. Using `"$@"` correctly handles arguments containing spaces.

restart.sh (1)
2-6: LGTM!

Correctly implements the same argument-forwarding pattern as `start.sh`. The absence of `--build` in the default case aligns with the documented behavior that restart should not rebuild images.

wizard.py (3)
60-61: Verify the need for `--prerelease=allow`.

Adding `--prerelease=allow` to uv commands enables installation of prerelease package versions, which could introduce instability if unintentional. Is this flag required for specific dependencies in the setup process, or is it a precautionary measure for future compatibility?
221-223: LGTM!

Correctly extends the compute mode validation to include `strixhalo`, enabling configuration reuse for AMD Strix Halo setups.
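A validator reflecting the extended choice set might look like the following. This is a hypothetical sketch; `wizard.py`'s actual implementation may differ, only the set of valid modes is taken from the review:

```python
VALID_COMPUTE_MODES = {"cpu", "gpu", "strixhalo"}

def validate_compute_mode(mode: str) -> str:
    # Normalize, then reject anything outside the supported backends so a
    # stale or mistyped config value fails fast instead of mis-deploying.
    normalized = mode.strip().lower()
    if normalized not in VALID_COMPUTE_MODES:
        raise ValueError(f"unsupported compute mode: {mode!r}")
    return normalized
```

Failing fast here keeps an invalid mode from silently falling through to a default service selection later in the flow.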
229-231: LGTM!

The CUDA version validation correctly includes `strixhalo` alongside the existing NVIDIA CUDA versions, maintaining consistency with the `.env.template` and `init.py` options.

stop.sh (1)
1-6: LGTM!

The script correctly implements the conditional wrapper pattern: it defaults to `--all` when no arguments are provided and forwards all arguments otherwise. The use of `"$@"` properly preserves argument quoting.

extras/asr-services/pyproject.toml (1)
39-42: LGTM!

The `strixhalo` optional dependency group follows the established pattern of the existing CUDA variants, maintaining consistency in the dependency configuration.

extras/asr-services/init.py (1)
155-172: LGTM!

The Strix Halo option is correctly integrated into both the interactive prompt and CLI argument handling. The choice mappings are consistent, and the label "AMD Strix Halo (NPU)" clearly identifies the platform.
extras/asr-services/docker-compose.yml (2)
78-79: Security relaxation with `seccomp:unconfined`.

Disabling seccomp filtering reduces container isolation. While this is often required for ROCm to function properly, it increases the attack surface. Consider documenting this security trade-off in the setup documentation so operators are aware.
59-88: LGTM - AMD ROCm service configuration is well-structured.

The `parakeet-asr-strixhalo` service correctly configures:

- ROCm device passthrough (`/dev/kfd`, `/dev/dri`)
- Video group membership
- Appropriate profile labels
Note: all variants share the `parakeet-asr:latest` image tag. Ensure users rebuild when switching between CPU/NVIDIA/AMD variants to avoid running mismatched images.

services.py (1)
126-145: LGTM!

The speaker-recognition service handling correctly routes to `speaker-service-strixhalo` when `COMPUTE_MODE=strixhalo`, maintaining consistency with the existing GPU/CPU selection pattern.

extras/speaker-recognition/docker-compose.yml (2)
85-85: LGTM!

Profile extension ensures the web UI starts alongside the Strix Halo backend variant.
58-77: LGTM - Strix Halo service configuration is correct.

The `speaker-service-strixhalo` service correctly:

- Extends the base configuration with YAML anchors
- References the existing `Dockerfile.strixhalo`
- Configures AMD ROCm device passthrough (`/dev/kfd`, `/dev/dri`) and security options appropriately

extras/speaker-recognition/init.py (3)
196-208: LGTM! Compute mode selection logic is well-structured.

The new "strixhalo" option is cleanly integrated into the existing choice flow. The if/elif/else chain correctly maps choices 1→cpu, 2→gpu, 3→strixhalo.
213-216: LGTM! CUDA version handling correctly routes strixhalo to its own pathway.

The conditional chain properly separates strixhalo from the GPU/CUDA detection flow, which aligns with the distinct ROCm-based build path for Strix Halo.
455-457: LGTM! CLI arguments are consistent with interactive mode.

The `--compute-mode` argument now accepts "strixhalo", and the help text accurately reflects all three options.

extras/speaker-recognition/pyproject.toml (5)
5-5: Verify the Python version bump doesn't break existing deployments.

The minimum Python version increased from 3.10 to 3.11. Ensure all deployment environments and CI pipelines support Python 3.11+, as this is a breaking change for users on Python 3.10.
11-11: LGTM! Conditional dependency correctly routes the pyannote.audio source.

The `extra != 'strixhalo'` marker ensures the PyPI version is used for standard builds while allowing the git source (line 110) to provide it for the strixhalo pathway.
86-86: Verify the `required-environments` constraint doesn't block local development.

This restricts dependency resolution to Linux x86_64 only. If contributors develop on macOS or ARM Linux, they'll need a workaround (e.g., Docker). Ensure this is intentional and documented.
93-93: LGTM! Strixhalo correctly added to the mutually exclusive extras.

The conflicts list ensures only one compute backend extra (cpu, cu121, cu126, cu128, strixhalo) can be active at a time, preventing dependency conflicts.
154-154: LGTM! isort profile aligned with Black.

Setting `profile = "black"` ensures import sorting is compatible with Black's formatting.
```toml
[[tool.uv.index]]
name = "rocm-gfx1151"
url = "https://rocm.nightlies.amd.com/v2/gfx1151/"
explicit = true
```
🧩 Analysis chain
🌐 Web query:
AMD ROCm PyTorch gfx1151 stable release availability
💡 Result:
Short answer: Yes — AMD's ROCm/PyTorch builds include support for gfx1151 (Ryzen AI / MI300-series class) in the ROCm 7.x line, but some components (e.g., AOTriton/other newer toolchains) are listed as experimental or under investigation and there are kernel/workaround notes. (Status verified as of January 6, 2026.) [1][2][3]
Sources
- ROCm PyTorch compatibility (ROCm docs — shows gfx1151 listed and notes about experimental AOTriton support). [1]
- ROCm install / PyTorch install doc (calls out gfx1151 / Ryzen AI Max 300 kernel requirements and workarounds). [2]
- ROCm public issues (community reports/questions about gfx1151 support/build issues). [3]
🌐 Web query:
AMD ROCm official stable releases gfx1151 PyPI index
💡 Result:
Short answer:
- AMD's official ROCm release notes show gfx1151 (RDNA3.5 / Ryzen AI Max series) enabled in ROCm 7.x (e.g., 7.1.x). [1]
- AMD publishes an official PyPI package index for ROCm-hosted Python wheels (pypi.amd.com) and ROCm docs instruct using it via pip with --extra-index-url (examples: https://pypi.amd.com/simple or versioned paths like /rocm-7.0.2/simple). [2][3]
Sources:
[1] ROCm 7.1.0 / release notes — gfx1150 / gfx1151 enabled.
[2] hipDF installation (shows AMD PyPI and example --extra-index-url=https://pypi.amd.com/rocm-7.0.2/simple).
[3] Multiple ROCm docs (e.g., MONAI / CuPy) showing pip install --extra-index-url=https://pypi.amd.com/simple.
🌐 Web query:
rocm.nightlies.amd.com PyTorch nightly builds index
💡 Result:
The ROCm-hosted PyTorch nightly wheel index is available at https://rocm.nightlies.amd.com/ (per-variant subpaths like /v2/gfx120X-all/). You can install nightlies with pip using the index URL, for example:
python -m pip install --index-url https://rocm.nightlies.amd.com/v2/gfx120X-all/ --pre torch torchvision torchaudio
Sources: [1], [2].
Consider switching to AMD's official stable PyPI index instead of nightlies.
The rocm-gfx1151 index points to AMD's nightly builds (rocm.nightlies.amd.com), which may introduce unstable or breaking changes. Official stable PyPI releases for ROCm 7.x include gfx1151 support and are available at https://pypi.amd.com/simple or versioned paths like https://pypi.amd.com/rocm-7.0.2/simple. Use stable releases unless nightlies are specifically required for this project, and document the rationale if nightly builds are intentional.