Add multi-host GPU tests to PyTorch/XLA #1011
Draft
vanbasten23 wants to merge 7 commits into GoogleCloudPlatform:master
vanbasten23
commented
Nov 8, 2023
],
},
},
subdomain: 'headless-svc-$(JOB_NAME)', # xw32: need to verify.
Collaborator
Author
Hi @will-cromar, currently when I run the test it fails with this error:
`The Job "pt-nightly-resnet50-mp-func-v100-x4-699rm" is invalid: spec.template.spec.subdomain: Invalid value: "headless-svc-$(JOB_NAME)": a lowercase RFC 1123 label must consist of lower case alphanumeric characters or '-', and must start and end with an alphanumeric character (e.g. 'my-name', or '123-abc', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?')`
The subdomain name (`headless-svc-$(JOB_NAME)`) is the Kubernetes service name. Do you know the correct way to make it depend on the job name?
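For reference, here is a minimal sketch of the check behind that error. It uses only the regex quoted in the message plus the 63-character DNS label limit. The helper name `is_rfc1123_label` and the "resolved" service name are illustrative assumptions, not code from this PR. It only shows why the literal placeholder is rejected; it does not answer how to inject the job name.

```python
import re

# Regex copied from the Kubernetes error message above.
RFC1123_LABEL = re.compile(r"[a-z0-9]([-a-z0-9]*[a-z0-9])?")
MAX_LABEL_LEN = 63  # DNS labels are limited to 63 characters


def is_rfc1123_label(value: str) -> bool:
    """Return True if value is a valid lowercase RFC 1123 label (hypothetical helper)."""
    return len(value) <= MAX_LABEL_LEN and RFC1123_LABEL.fullmatch(value) is not None


# The value as it reached the API server: the placeholder was not expanded.
literal = "headless-svc-$(JOB_NAME)"

# What the value would look like if the job name were substituted
# (assumed here purely for illustration).
job_name = "pt-nightly-resnet50-mp-func-v100-x4-699rm"
resolved = "headless-svc-" + job_name

print(is_rfc1123_label(literal))   # False: '$', '(', ')', '_' and uppercase are not allowed
print(is_rfc1123_label(resolved))  # True: lowercase alphanumerics and '-' only, 54 characters
```

So the field itself would accept a name derived from the job name; the failure comes from `$(JOB_NAME)` being passed through verbatim in `spec.template.spec.subdomain`.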
Description
Please include a summary of relevant context/issue and your changes.
Tests
Please describe the tests that you ran on TPUs to verify changes.
Instruction and/or command lines to reproduce your tests: ...
List links for your tests (use go/shortn-gen for any internal link): ...
Checklist
Before submitting this PR, please make sure (put X in square brackets):