[WIP][Feat][Sched] Support Balance Scheduling #29721
base: main
Conversation
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs would not trigger a full CI run by default; they only run the fastcheck CI, a small and essential subset of tests that quickly catches errors. You can ask your reviewers to trigger select CI tests on top of fastcheck CI. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge. If you have any questions, please reach out to us on Slack at https://slack.vllm.ai. 🚀
Code Review
This pull request introduces an experimental 'balance scheduling' feature. My review identified a couple of critical issues that would cause runtime errors, preventing the feature from working as intended. Specifically, there's a method name mismatch between the caller and callee (running_gather vs. balance_gather) and an incorrect attempt to iterate over a method instead of a list (self.balance_gather vs. self.balance_queue). I have provided code suggestions to fix these critical bugs.
vllm/v1/core/sched/scheduler.py
Outdated
```python
    break

if self.vllm_config.scheduler_config.balance_scheduling:
    balance_flag = max(t.item() for t in self.balance_gather) == self.max_num_running_reqs
```
This line will raise a TypeError at runtime because self.balance_gather is a method, not an iterable. The balance_gather method is designed to populate self.balance_queue, which is the list you should be iterating over here.
```diff
- balance_flag = max(t.item() for t in self.balance_gather) == self.max_num_running_reqs
+ balance_flag = max(t.item() for t in self.balance_queue) == self.max_num_running_reqs
```
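For context, here is a minimal sketch of how `balance_gather` might populate `self.balance_queue` so that this comparison works. This is a hypothetical reconstruction from the diffs in this review (assuming a CPU, gloo-backed `dp_group` and the attribute names visible below), not the PR's exact implementation:

```python
import torch
import torch.distributed as dist

def balance_gather(self, dp_group):
    # Number of requests currently running on this data-parallel rank.
    running_tensor = torch.tensor([len(self.running)], dtype=torch.int, device="cpu")
    # One slot per rank; all_gather fills each slot with that rank's count.
    self.balance_queue = [
        torch.zeros_like(running_tensor)
        for _ in range(dist.get_world_size(dp_group))
    ]
    dist.all_gather(self.balance_queue, running_tensor, group=dp_group)
```

After the gather, `max(t.item() for t in self.balance_queue)` yields the largest running count across ranks, so comparing it against `max_num_running_reqs` detects whether any rank is saturated.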
vllm/v1/engine/core.py
Outdated
```python
)

if self.vllm_config.scheduler_config.balance_scheduling:
    self.scheduler.running_gather(self.dp_group)
```
This call will raise an AttributeError at runtime: the scheduler defines `balance_gather` (see below), but the caller invokes `running_gather`. The method name here must match the one defined on the scheduler.
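A suggestion in the same form as the other comments, assuming `balance_gather` (defined below) is the intended callee:

```diff
- self.scheduler.running_gather(self.dp_group)
+ self.scheduler.balance_gather(self.dp_group)
```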
```python
]

def balance_gather(self, dp_group):
    runing_tensor = torch.tensor([len(self.running)], dtype=torch.int, device="cpu")
```
There is a typo in the variable name runing_tensor. It should be running_tensor to improve code clarity and maintainability.
```diff
- runing_tensor = torch.tensor([len(self.running)], dtype=torch.int, device="cpu")
+ running_tensor = torch.tensor([len(self.running)], dtype=torch.int, device="cpu")
```
Signed-off-by: GDzhu01 <809721801@qq.com>
Force-pushed from 8600cf3 to 8272787 (Compare)
Purpose
Test Plan
Test Result
Essential Elements of an Effective PR Description Checklist
(Optional) The necessary documentation update, such as updating `supported_models.md` and `examples` for a new model.