F #-: Improves llm validation and AI ready k8s documentation #466
Conversation
Signed-off-by: Aleix Ramírez <aramirez@opennebula.io>
mkomac
left a comment
Proposing to have the same network name as on the Scaleway deployment: admin_net
Hey @mkomac, please review the changes. This is already done; let me know if I forgot to change it somewhere.
content/solutions/deployment_blueprints/ai-ready_opennebula/ai_ready_k8s.md
Signed-off-by: Aleix Ramírez <aramirez@opennebula.io>
…ia dynamo docs Signed-off-by: Aleix Ramírez <aramirez@opennebula.io>
Signed-off-by: Aleix Ramírez <aramirez@opennebula.io> (cherry picked from commit da99372)
cmoralopennebula
left a comment
Added some minor changes in the documentation for the Cloud Deployment step and for the AI-ready K8s.
# sysctl -w net.ipv4.ip_forward=1
# iptables -t nat -A POSTROUTING -s 192.168.100.0/24 -o enp129s0f0np0 -j MASQUERADE
# iptables-save | uniq | iptables-restore
netplan apply
There is no copy/paste button for this section here; it is being shown as output, not as an input that can be copy/pasted.
To access the Kubernetes API from your localhost, use the kubeconfig file that is located in the `/etc/rancher/rke2/rke2.yaml` file of the workload cluster control plane node. Also, modify the local kubeconfig file to point to localhost, where `host_ip` is the OpenNebula frontend IP address and `control_plane_vm_ip` is the workload cluster control plane VM IP address.
To access the Kubernetes API from your localhost, use the kubeconfig file that is located in the `/etc/rancher/rke2/rke2.yaml` file of the workload cluster control plane node.
This command is not showing the connection from your localhost; it is from the OpenNebula frontend host.
We have to either change the phrase to 'OpenNebula frontend host' or add an extra jump on the ssh to connect first to the frontend, then to the router, and finally to the control plane.
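A possible sketch of the multi-hop variant (the host placeholders and the stand-in file are assumptions, not taken from the docs): fetch the kubeconfig through the extra jumps, then rewrite its server address so `kubectl` can reach the cluster over a local tunnel. The rewrite step is demonstrated on a stand-in file:

```shell
# Reaching the control plane via the frontend and the router would look roughly like:
#   scp -o 'ProxyJump=root@<frontend_ip>,root@<router_ip>' \
#       root@<control_plane_vm_ip>:/etc/rancher/rke2/rke2.yaml kubeconfig.local
# Rewriting the server address, shown here on a stand-in kubeconfig:
cat > kubeconfig.local <<'EOF'
clusters:
- cluster:
    server: https://127.0.0.1:6443
  name: default
EOF
sed -i 's#https://127.0.0.1:6443#https://localhost:6443#' kubeconfig.local
grep 'server:' kubeconfig.local
```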
pods: 110
6. Finally, to use the PCI GPUs on the specific pod, add the `spec.runtimeClassName: nvidia` parameter in the pod/deploy manifest and set `nvidia.com/gpu: 1` as a requested resource.
There is no command to help or support this part.
Can we add one?
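A minimal sketch of what such a supporting snippet could look like (the pod name and CUDA image tag are assumptions, not from the PR): a pod that sets `runtimeClassName: nvidia` and requests one `nvidia.com/gpu`:

```shell
# Write an illustrative pod manifest that uses the nvidia RuntimeClass
# and requests a single GPU.
cat <<'EOF' > gpu-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-test
spec:
  restartPolicy: Never
  runtimeClassName: nvidia
  containers:
  - name: cuda
    image: nvidia/cuda:12.4.1-base-ubuntu22.04
    command: ["nvidia-smi"]
    resources:
      limits:
        nvidia.com/gpu: 1
EOF
# Apply it and check the nvidia-smi output (requires a GPU-enabled node):
# kubectl apply -f gpu-pod.yaml && kubectl logs gpu-test
```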
cmoralopennebula
left a comment
More messages and suggestions on the Dynamo implementation.
laptop$ cat <<EOF > storageClass.yaml
cat <<EOF > storageClass.yaml
Step 3 was not needed in my case.
The 'local-path' storage was already created from the previous step.
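One way to make step 3 conditional, per this observation (the manifest fields follow the Rancher local-path provisioner and are assumptions here): write the manifest, then apply it only if the class is missing:

```shell
# Illustrative StorageClass manifest for the local-path provisioner.
cat <<'EOF' > storageClass.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-path
provisioner: rancher.io/local-path
volumeBindingMode: WaitForFirstConsumer
EOF
# Apply only when 'local-path' is not already present (needs a cluster):
# kubectl get storageclass local-path >/dev/null 2>&1 || kubectl apply -f storageClass.yaml
```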
EOF
laptop$ kubectl apply -f hf-secret.yaml
kubectl apply -f hf-secret.yaml
Should we use 'token' for the variable name, or 'hf-token-secret' as explained in the step below?
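For consistency, the secret manifest could use the `hf-token-secret` name mentioned in the step below (the key name and the placeholder token are assumptions):

```shell
# Illustrative secret manifest with the consistent 'hf-token-secret' name.
cat <<'EOF' > hf-secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: hf-token-secret
type: Opaque
stringData:
  HF_TOKEN: "<your_huggingface_token>"
EOF
# kubectl apply -f hf-secret.yaml
```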
- -c
args:
- python3 -m dynamo.vllm --model Qwen/Qwen3-0.6B --is-prefill-worker 2>&1 | tee /tmp/vllm.log
EOF
The command `kubectl -n dynamo-cloud get pods,svc` does not appear as a command that can be copy/pasted.
It is included along with the results that are generated from the command.
Line 319
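To make it copy/paste-friendly, the command could be pulled out of the output block and shown on its own, e.g.:

```shell
# Command on its own line, separated from its sample output,
# so the docs tooling renders a copy button for it.
kubectl -n dynamo-cloud get pods,svc
```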
laptop$ kubectl port-forward svc/<frontend_service> <local_port>:8000 &
kubectl port-forward svc/<frontend_service> <local_port>:8000 &
This command won't work; we need to specify the namespace, as this was not deployed in the default one.
Change to:
kubectl port-forward -n dynamo-cloud service/vllm-v1-disagg-router-frontend 9000:8000 &
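A sketch of the namespaced command with a quick check once the tunnel is up (the `/v1/models` path assumes the frontend exposes an OpenAI-style API, which is an assumption here; it requires a running cluster):

```shell
# Forward local port 9000 to the frontend service in the dynamo-cloud namespace.
kubectl port-forward -n dynamo-cloud service/vllm-v1-disagg-router-frontend 9000:8000 &
# Once the tunnel is up, the frontend should answer on the local port:
curl -s http://localhost:9000/v1/models
```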
"stream": true,
"max_tokens": 300
}
}'
Lines from 418 onward are the output; we do not need a copy/paste button on them.
The same happens on lines 374-393.
Description
Improves LLM validation and the AI-ready Kubernetes documentation with suggestions.
Branches to which this PR applies