This repository was archived by the owner on Oct 15, 2025. It is now read-only.

Inference pool name in HTTPRoute should be truncated to 63 characters #313

@mnmehta

Description


Component

I don't know

Describe the bug

In templates/sample-application/httproutes.yaml we construct a reference to the inference pool name:

name: "{{ include "sampleApplication.sanitizedModelName" . }}-inference-pool"

However, the actual inference pool name has elsewhere been truncated to 63 characters, because the name must be a valid DNS label. The net outcome is that the HTTPRoute object cannot find its backend, so the gateway returns cluster_not_found for every request.
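One possible fix, sketched here under the assumption that the pool name is truncated with the usual Helm idiom (`trunc 63 | trimSuffix "-"`), is to apply the same truncation to the backend reference in templates/sample-application/httproutes.yaml so the two names always match:

```yaml
# Sketch, not the chart's actual code: build the full name first,
# then truncate it exactly as the InferencePool template does.
name: {{ printf "%s-inference-pool" (include "sampleApplication.sanitizedModelName" .) | trunc 63 | trimSuffix "-" | quote }}
```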

Steps to reproduce

  1. Deploy llm-d with a longish model name, e.g. using these overrides:
sampleApplication:
  baseConfigMapRefName: basic-gpu-preset
  model:
    modelArtifactURI: pvc://model-pvc/neuralmagic/DeepSeek-R1-Distill-Llama-70B-FP8-dynamic
    modelName: "neuralmagic/DeepSeek-R1-Distill-Llama-70B-FP8-dynamic"
  2. Check the HTTPRoute object; it will show this error:
Message:               InferencePool "neuralmagic-deepseek-r1-distill-llama-70b-fp8-dynamic-inference-pool" not found
  3. Check the InferencePool; you will see the name has been truncated to 63 characters:
kubectl get inferencepool
NAME                                                              AGE
neuralmagic-deepseek-r1-distill-llama-70b-fp8-dynamic-inference   3h21m
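A more robust approach, again only a sketch with a hypothetical helper name, is to compute the truncated pool name once in a named template and reference it from both the InferencePool manifest and the HTTPRoute backendRef, so the two can never drift apart:

```yaml
{{/* _helpers.tpl — hypothetical shared helper; applies the DNS-label
     63-character limit in exactly one place. */}}
{{- define "sampleApplication.inferencePoolName" -}}
{{- printf "%s-inference-pool" (include "sampleApplication.sanitizedModelName" .) | trunc 63 | trimSuffix "-" -}}
{{- end -}}
```

Both templates would then use `{{ include "sampleApplication.inferencePoolName" . }}` instead of constructing the name inline.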

Additional context or screenshots

No response
