feat: make model metrics endpoints configurable #1000
base: main
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: nayihz. The full list of commands accepted by this bot can be found here. Needs approval from an approver in each of these files.
I have some doubts about adding additional fields to InferencePool. cc @kfswain @ahg-g @robscott @danehans @elevran

/hold for others to comment.
Force-pushed ffad486 to f57478d.
@nirrozenbaum, thanks for your advice. I think you are right.
@nayihz I would start with command-line args with default values (the existing ones).
Force-pushed f57478d to 9fa17f7.
@nayihz @nirrozenbaum this introduces a fixed endpoint for all model servers in the pool.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/path: "/metrics"
    prometheus.io/port: "8080"
```
@elevran we already have a fixed endpoint, so this PR is not introducing it :). The intention was to make that endpoint configurable.

/unhold
Force-pushed 9fa17f7 to 7466a28.

Force-pushed 528d53f to 937f686.

Force-pushed 937f686 to d86effa.
```diff
@@ -110,6 +110,10 @@ var (
 		"vllm:lora_requests_info",
 		"Prometheus metric for the LoRA info metrics (must be in vLLM label format).")

+	modelServerMetricsPort = flag.Int("modelServerMetricsPort", 0, "Port to scrape metrics from pods. "+
+		"Default value will be set to InferencePool.Spec.TargetPortNumber if not set.")
+	modelServerMetricsPath = flag.String("modelServerMetricsPath", "/metrics", "Path to scrape metrics from pods")
```
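The fallback behavior described by the flag help text ("default to `InferencePool.Spec.TargetPortNumber` if not set") can be sketched as a small helper. This is an illustrative sketch, not the PR's actual implementation; the function name `metricsPort` is hypothetical:

```go
package main

import "fmt"

// metricsPort sketches the documented fallback: the flag's zero
// default means "unset", in which case the pool's TargetPortNumber
// is used instead. (Hypothetical helper, names are illustrative.)
func metricsPort(flagPort, targetPortNumber int32) int32 {
	if flagPort != 0 {
		return flagPort // explicit -modelServerMetricsPort wins
	}
	return targetPortNumber // fall back to the port the pool routes to
}

func main() {
	fmt.Println(metricsPort(0, 8000))    // flag unset → 8000
	fmt.Println(metricsPort(9090, 8000)) // flag set → 9090
}
```

Using `0` as the sentinel works here because 0 is not a valid scrape port, so the zero value can safely mean "not configured".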
Sorry to be a pain, but we may want to consider making these env vars so that they can work more cleanly with our helm charts via https://github.com/kubernetes-sigs/gateway-api-inference-extension/tree/main/config/charts/inferencepool#install-with-custom-environment-variables
We could add support for flags as well, but this is an already established path.
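Following the linked chart docs, such settings could then be supplied through the chart's custom environment variables. The values-key layout and variable names below are assumptions for illustration, not taken from the PR or the chart:

```yaml
# Hypothetical values.yaml fragment; names are illustrative.
inferenceExtension:
  env:
    - name: MODEL_SERVER_METRICS_PORT
      value: "9090"
    - name: MODEL_SERVER_METRICS_PATH
      value: "/metrics"
```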
When a user sets both the env variables and the flags at the same time, which one has higher priority?
When both command-line flags and environment variables are used to configure the same setting, the standard behavior is to prioritize command-line flags. Flags represent explicit, per-invocation intent, while environment variables often carry more general or persistent configuration. Viper is a good reference implementation of this precedence.
PTAL at https://gist.github.com/danehans/7eae063e205141323ffae5428acb736a for an example implementation snippet.
Looks good for the most part; I added a comment about switching to env vars for better interfacing with our helm chart.
fix: #16