dra: cel filter Performance Optimization #359
Open
wangqianqianjun wants to merge 38 commits into NexusGPU:dev-dra from wangqianqianjun:dra
…contain permissions (NexusGPU#349)
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

* support dedicated gpus
* support dedicated GPU
* support dedicated GPU
* fix test issue

…sGPU#350)
* fix: skip gpu limiter not working issue
* fix: avoid k8s QoS side effect for inject lib init container
* fix: potential panic issues
* fix: remove unused event

* support dedicated gpus
* support dedicated GPU
* support dedicated GPU
* fix test issue
* fix init pricing override vran
* Revert "fix init pricing override vran" This reverts commit d0bea18.
* fix init pricing override vram

* chore: lint issue
* fix: kubernetes upgrade, fix scheduler deps issue
* fix: upgrade k8s version to 1.34, use fixed operator version in helm chart
* fix: update shm path
* chore: comment & wording
* fix: connection naming
* fix: upgrade github action
* fix: add test for dedicated gpu allocation mode

…k domain name, virtual cap calculation (NexusGPU#357)
* fix: virtual tflops/vram not calculated bug
* fix: extract GPU map update logic into separate method and fix webhook domain name
* fix: nvidia device plugin compatible mode state consistent issue
* fix: nvidia device plugin compatible mode issue

* fix: gpu info update
* feat: preempt scheduling, fix metrics scheduling bugs, add evict protection
* fix: unit test issue
* fix: preempt unit testing
* fix: lint issue, add qos to priorityClassName converting

…exusGPU#365)
- Add double-check for TFLOPs and VRAM availability before allocation
ff9efd2 to 4fc9dc9
…ild and dra request build in the same logic

- Implemented DRA CEL filters in GPU allocation requests
- Added benchmarks for basic and complex expressions
- Updated the resource slice controller to support Kubernetes hostname labels
Early Filtering Optimization
Pre-filters GPUs by phase and model before expensive CEL evaluation to reduce processing overhead.
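The pre-filtering idea can be sketched as follows. This is only an illustration under stated assumptions: the `GPU` struct, the `filterGPUs` function, and the `expensive` callback (a stand-in for the CEL program evaluation) are hypothetical names, not the types or functions used in this PR.

```go
package main

import "fmt"

// GPU is a hypothetical, simplified stand-in for the project's GPU resource.
type GPU struct {
	Name  string
	Phase string // e.g. "Running", "Pending"
	Model string // e.g. "A100"
}

// filterGPUs runs cheap string comparisons first and calls the expensive
// predicate (standing in for CEL evaluation) only on GPUs that survive them.
// An empty model means "any model". It also returns how many GPUs reached
// the expensive step, to make the saving visible.
func filterGPUs(gpus []GPU, phase, model string, expensive func(GPU) bool) (matched []GPU, evaluated int) {
	for _, g := range gpus {
		if g.Phase != phase {
			continue // rejected without touching the expensive predicate
		}
		if model != "" && g.Model != model {
			continue
		}
		evaluated++
		if expensive(g) {
			matched = append(matched, g)
		}
	}
	return matched, evaluated
}

func main() {
	gpus := []GPU{
		{Name: "gpu-0", Phase: "Running", Model: "A100"},
		{Name: "gpu-1", Phase: "Pending", Model: "A100"},
		{Name: "gpu-2", Phase: "Running", Model: "H100"},
		{Name: "gpu-3", Phase: "Running", Model: "A100"},
	}
	// The closure is a placeholder for a compiled CEL expression.
	matched, evaluated := filterGPUs(gpus, "Running", "A100", func(g GPU) bool {
		return g.Name != "gpu-3"
	})
	fmt.Printf("matched=%d, expensive evaluations=%d of %d GPUs\n",
		len(matched), evaluated, len(gpus))
}
```

In this sketch only two of the four GPUs reach the expensive predicate; the other two are rejected by plain string comparisons.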
CEL Parallel Processing
Automatically enables parallel evaluation for large datasets (≥2000 GPUs), using worker goroutines with dynamic chunking.
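A minimal sketch of threshold-gated parallel filtering is shown below. The 2000 threshold comes from the description above; everything else is an assumption made for illustration: the names `filterParallel` and `filterSeq`, the use of `int` items in place of GPUs, and the choice of one chunk per CPU as the "dynamic" chunk size. The actual implementation in this PR may chunk differently.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// parallelThreshold mirrors the ">= 2000 GPUs" figure from the description.
const parallelThreshold = 2000

// filterSeq is the plain sequential path.
func filterSeq(items []int, pred func(int) bool) []int {
	var out []int
	for _, it := range items {
		if pred(it) {
			out = append(out, it)
		}
	}
	return out
}

// filterParallel falls back to filterSeq for small inputs. For large inputs
// it derives the chunk size from the input length and the CPU count, filters
// each chunk in its own goroutine, and concatenates the per-chunk results in
// order, so the output order matches the sequential path.
func filterParallel(items []int, pred func(int) bool) []int {
	if len(items) < parallelThreshold {
		return filterSeq(items, pred)
	}
	workers := runtime.NumCPU()
	chunk := (len(items) + workers - 1) / workers // ceiling division
	numChunks := (len(items) + chunk - 1) / chunk
	results := make([][]int, numChunks)

	var wg sync.WaitGroup
	for i := 0; i < numChunks; i++ {
		start := i * chunk
		end := start + chunk
		if end > len(items) {
			end = len(items)
		}
		wg.Add(1)
		go func(idx int, part []int) {
			defer wg.Done()
			results[idx] = filterSeq(part, pred) // each goroutine owns one slot
		}(i, items[start:end])
	}
	wg.Wait()

	var out []int
	for _, r := range results {
		out = append(out, r...)
	}
	return out
}

func main() {
	items := make([]int, 5000)
	for i := range items {
		items[i] = i
	}
	even := filterParallel(items, func(v int) bool { return v%2 == 0 })
	fmt.Println("kept", len(even), "of", len(items))
}
```

Writing each chunk's result into its own slice slot avoids a mutex around a shared output slice; the predicate must be safe to call from multiple goroutines.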
Zero Memory Allocation
Eliminates map allocations through ZeroAllocActivation and lazy caching of GPU field values to minimize GC pressure.
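The internals of `ZeroAllocActivation` are not shown on this page, so the following is only a sketch of the general technique: resolving variable names directly from struct fields instead of building a `map[string]any` per GPU, and caching a derived value the first time it is requested. The `GPU` struct, the `lazyActivation` type, the variable names, and the `boxings` counter are all hypothetical. The `ResolveName` method mirrors the shape of the method of the same name on cel-go's `interpreter.Activation` interface, which additionally requires a `Parent()` method that is omitted here to keep the example free of external imports.

```go
package main

import "fmt"

// GPU is a hypothetical, simplified stand-in for the project's GPU resource.
type GPU struct {
	Name   string
	Model  string
	TFlops float64
}

// lazyActivation resolves CEL-style variable names straight from a GPU
// instead of copying its fields into a freshly allocated map. The boxed
// tflops value is created on first access and reused afterwards.
type lazyActivation struct {
	gpu     *GPU
	tflops  any // cached interface value; nil until first requested
	boxings int // illustration only: counts how often the cache was filled
}

// reset points the activation at another GPU and clears the cache, so a
// single activation value can be reused across a whole batch of GPUs.
func (a *lazyActivation) reset(g *GPU) {
	a.gpu = g
	a.tflops = nil
}

// ResolveName returns the value bound to name, or false if it is unknown.
func (a *lazyActivation) ResolveName(name string) (any, bool) {
	switch name {
	case "name":
		return a.gpu.Name, true
	case "model":
		return a.gpu.Model, true
	case "tflops":
		if a.tflops == nil {
			a.tflops = a.gpu.TFlops // converted to an interface value once
			a.boxings++
		}
		return a.tflops, true
	}
	return nil, false
}

func main() {
	act := &lazyActivation{}
	act.reset(&GPU{Name: "gpu-0", Model: "A100", TFlops: 312})

	// Repeated lookups of the same variable reuse the cached value.
	for i := 0; i < 3; i++ {
		v, _ := act.ResolveName("tflops")
		fmt.Println(v)
	}
	fmt.Println("cache fills:", act.boxings)
}
```

This sketch does not claim to be allocation-free in every case; it only shows where the per-GPU map and repeated conversions can be avoided.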
goos: linux
goarch: amd64
pkg: github.com/NexusGPU/tensor-fusion/internal/gpuallocator/filter/cel_filter
cpu: 13th Gen Intel(R) Core(TM) i7-13700KF
BenchmarkFilterPerformance/OriginalFilters-24 39 30419082 ns/op 15205268 B/op 20 allocs/op
BenchmarkFilterPerformance/CELFilter_Basic-24 51 23382077 ns/op 8003896 B/op 8 allocs/op
BenchmarkFilterPerformance/CELFilter_Complex-24 12 94239518 ns/op 57643526 B/op 2471372 allocs/op
BenchmarkFilterPerformance/CELFilter_CacheMiss-24 10 112866842 ns/op 82081142 B/op 3528066 allocs/op
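To make the numbers above easier to compare, the sketch below parses the four result lines and prints each one relative to `OriginalFilters`. The `result` type and `parseLine` helper are hypothetical and written only for this illustration; they handle just the `ns/op`, `B/op`, and `allocs/op` columns shown here.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// result holds the metrics of one benchmark line.
type result struct {
	Name        string
	NsPerOp     float64
	BytesPerOp  float64
	AllocsPerOp float64
}

// parseLine parses one line of `go test -bench -benchmem` output:
// the benchmark name, the iteration count, then value/unit pairs.
func parseLine(line string) (result, bool) {
	f := strings.Fields(line)
	if len(f) < 4 || !strings.HasPrefix(f[0], "Benchmark") {
		return result{}, false
	}
	r := result{Name: f[0]}
	for i := 2; i+1 < len(f); i += 2 {
		v, err := strconv.ParseFloat(f[i], 64)
		if err != nil {
			return result{}, false
		}
		switch f[i+1] {
		case "ns/op":
			r.NsPerOp = v
		case "B/op":
			r.BytesPerOp = v
		case "allocs/op":
			r.AllocsPerOp = v
		}
	}
	return r, true
}

func main() {
	lines := []string{
		"BenchmarkFilterPerformance/OriginalFilters-24 39 30419082 ns/op 15205268 B/op 20 allocs/op",
		"BenchmarkFilterPerformance/CELFilter_Basic-24 51 23382077 ns/op 8003896 B/op 8 allocs/op",
		"BenchmarkFilterPerformance/CELFilter_Complex-24 12 94239518 ns/op 57643526 B/op 2471372 allocs/op",
		"BenchmarkFilterPerformance/CELFilter_CacheMiss-24 10 112866842 ns/op 82081142 B/op 3528066 allocs/op",
	}
	base, _ := parseLine(lines[0])
	for _, l := range lines {
		r, ok := parseLine(l)
		if !ok {
			continue
		}
		fmt.Printf("%-50s time %.2fx  memory %.2fx\n",
			r.Name, r.NsPerOp/base.NsPerOp, r.BytesPerOp/base.BytesPerOp)
	}
}
```

Read this way, `CELFilter_Basic` takes roughly 0.77x the time and 0.53x the memory per operation of `OriginalFilters`, while `CELFilter_Complex` and `CELFilter_CacheMiss` take roughly 3.1x and 3.7x the time.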