@vfdev-5
Contributor

@vfdev-5 vfdev-5 commented Jul 24, 2025

This is a draft of a multi-host tutorial, based on this gist: https://gist.github.com/vfdev-5/70f695e462443685a0922e79ce0ee899 and Chris Jones' mnist_xla.py code.

cc @melissawm

@vfdev-5 vfdev-5 force-pushed the docs-learn-multi-host-tpu branch from 4527a9f to 8c0d217 Compare July 24, 2025 09:54
Contributor

@melissawm melissawm left a comment

Thank you, @vfdev-5! A few very straightforward comments and one question (should we use TensorBoard or XProf for profiling?).

0 Training finished!
```

#### Profiler logs in TensorBoard
Contributor

I believe we want to use XProf instead of TensorBoard, but we should confirm.

@melissawm
Contributor

Hello @pgmoka @bhavya01 - would you mind taking a look for correctness and scope of this tutorial? If you are happy with the general idea, we can remove this from draft and address any other feedback. Thank you!

@melissawm
Contributor

Hi folks - gentle ping. If you have any feedback, we're happy to address. Thanks!

@vfdev-5 vfdev-5 force-pushed the docs-learn-multi-host-tpu branch from 8c0d217 to aac84ec Compare August 28, 2025 08:18
@vfdev-5 vfdev-5 force-pushed the docs-learn-multi-host-tpu branch from aac84ec to 9f8c996 Compare August 28, 2025 08:21
@vfdev-5 vfdev-5 marked this pull request as ready for review August 28, 2025 08:21
@melissawm
Contributor

Hi all - is this something you are still interested in? I'm happy to help bring it over the finish line if so. Thanks!

@zhanyong-wan
Collaborator

@bhavya01, could you take a look and recommend whether we should proceed with this? Thanks!

3 participants