Update CPU resourceProfile to use 4Gi mem and fix cpu model profiles #419

Open
wants to merge 2 commits into
base: main
from

Conversation

nstogner
Contributor

Tested the CPU models with `kubectl top pods` to see what their memory requirements are. Updated the cpu resourceProfile to request 4Gi of memory per core. This matches common cloud machine shapes and avoids the need to specify an extremely high cpu:N resourceProfile just so a model's memory requirements are met.
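
To illustrate the sizing arithmetic behind that change, here is a rough Python sketch. It assumes that a `cpu:N` resourceProfile requests N times the per-core memory of the `cpu` profile; the helper name `min_profile_count` and the sample figures (a 6 GiB model, a 1 GiB-per-core comparison profile) are hypothetical and do not come from this PR.

```python
import math

# Memory requested per CPU core by the updated profile (4Gi, per the PR description).
GIB_PER_CORE = 4


def min_profile_count(required_mem_gib: float, gib_per_core: float = GIB_PER_CORE) -> int:
    """Return the smallest N for which cpu:N requests at least required_mem_gib.

    Assumes cpu:N scales the profile's memory request linearly with N.
    """
    if required_mem_gib <= 0 or gib_per_core <= 0:
        raise ValueError("memory values must be positive")
    return max(1, math.ceil(required_mem_gib / gib_per_core))


# Hypothetical model that needs about 6 GiB according to `kubectl top pods`.
print(min_profile_count(6))                  # N needed at 4 GiB per core
print(min_profile_count(6, gib_per_core=1))  # N needed at a hypothetical 1 GiB per core
```

The smaller the per-core memory request, the larger N has to be for the same model, which is the situation the 4Gi setting is meant to avoid.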

Fixes #418, which turned out to be an OOM issue (the error appears to have been suppressed).

nstogner requested a review from samos123 on February 21, 2025, 00:33
Contributor

samos123 commented Mar 1, 2025

@nstogner can you rebase and re-trigger?

Contributor Author

nstogner commented Mar 2, 2025

Looks like the tests are failing with OOM; will debug soon.

Development

Successfully merging this pull request may close these issues.

Fix CPU vLLM image
2 participants