Skip to content

Fix Hardcoded localhost URLs#11

Merged
MiguelCarpio merged 1 commit intorhos-vaf:mainfrom
MiguelCarpio:vllm_api_url
Jan 15, 2026
Merged

Fix Hardcoded localhost URLs#11
MiguelCarpio merged 1 commit intorhos-vaf:mainfrom
MiguelCarpio:vllm_api_url

Conversation

@MiguelCarpio
Copy link
Contributor

@MiguelCarpio MiguelCarpio commented Jan 14, 2026

What does this PR do?
Fix Hardcoded localhost URLs (IPv6/IPv4 Problem).

  • TASK [gpu-validation : Wait for vllm API to appear]
  • TASK [gpu-validation : Wait for vllm metrics endpoint to appear]
  • TASK [gpu-validation : Run the model performance check script]

Why do we need this PR?
The playbook had hardcoded URLs scattered across multiple files using http://localhost:8000, which caused the following problems:

  • localhost vs 127.0.0.1 resolution issue: localhost was resolving to IPv6 ::1, but vLLM was only listening on IPv4
  • Different tasks used different URL formats
  • Users couldn't easily change the API URL without editing multiple files

I added a new centralized variable gpu_validation_vllm_api_url

@MiguelCarpio MiguelCarpio force-pushed the vllm_api_url branch 2 times, most recently from 8c42f9e to 27c6f8f Compare January 14, 2026 19:03
Copy link
Contributor

@csibbitt csibbitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like the s/uri/command/ change to be reverted unless you can show that ansible.builtin.uri actually runs from the controller as claimed.

After investigating this issue manually in the live environment, I believe the only meaningful change is s/localhost/127.0.0.1/ due to localhost resolving to ::1 and vllm only listening on IPV4.

Thanks for finding that!

# [string] Additional environment variables
gpu_validation_workload_additional_env: "--env=HF_HUB_OFFLINE=0 --env=VLLM_NO_USAGE_STATS=1"
# [string] vLLM API URL
gpu_validation_vllm_api_url: "http://127.0.0.1:8000"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

ansible.builtin.uri:
url: http://localhost:8000/health
- name: Wait for vllm API to appear # noqa: command-instead-of-module
# Use curl command instead of uri module because uri runs on the controller,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICT This is not true, so I do not approve of this change from ansible.builtin.uri to ansible.builtin.command

Using the new variable is fine.

Copy link
Contributor

@bogdando bogdando Jan 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

both will run on the same target host.
If you want another host, you should have it in the inventory, and use delegate_to

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@csibbitt Thanks for the review, ansible.builtin.uri works fine and runs on the target VM. The issue was always the localhost resolving to ::1 and vllm only listening on IPV4. I updated the PR.

@bogdando setting gpu_validation_vllm_api_url we can check any URI reachable from the target VM.

- name: Wait for vllm metrics endpoint to appear
ansible.builtin.uri:
url: http://localhost:8000/metrics/
- name: Wait for vllm metrics endpoint to appear # noqa: command-instead-of-module
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above

@MiguelCarpio MiguelCarpio changed the title Check vllm health and metrics endpoints from the remote VM with the variable gpu_validation_vllm_api_url Fix Hardcoded localhost URLs Jan 15, 2026
Copy link
Contributor

@bogdando bogdando left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@MiguelCarpio MiguelCarpio merged commit dc6ff86 into rhos-vaf:main Jan 15, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants