From 433e0f98278af371541b20756f387f8d84db888c Mon Sep 17 00:00:00 2001
From: Daniel King <43149077+dakinggg@users.noreply.github.com>
Date: Thu, 29 May 2025 16:20:30 -0700
Subject: [PATCH] Update README.md

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index a3013d00..bc970539 100644
--- a/README.md
+++ b/README.md
@@ -12,7 +12,7 @@ MegaBlocks dMoEs outperform MoEs trained with [Tutel](https://github.com/microso
 
 # :building_construction: Installation
 
-NOTE: This assumes you have `numpy` and `torch` installed.
+OTE: This assumes you have `numpy` and `torch` installed.
 
 **Training models with Megatron-LM:** We recommend using NGC's [`nvcr.io/nvidia/pytorch:23.09-py3`](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch/tags) PyTorch container. The [Dockerfile](Dockerfile) builds on this image with additional dependencies. To build the image, run `docker build . -t megablocks-dev` and then `bash docker.sh` to launch the container. Once inside the container, install MegaBlocks with `pip install .`. See [Usage](#steam_locomotive-usage) for instructions on training MoEs with MegaBlocks + Megatron-LM.