diff --git a/README.md b/README.md
index a3013d00..bc970539 100644
--- a/README.md
+++ b/README.md
@@ -12,7 +12,7 @@ MegaBlocks dMoEs outperform MoEs trained with [Tutel](https://github.com/microsoft/tutel).
 
 # :building_construction: Installation
 
-NOTE: This assumes you have `numpy` and `torch` installed.
+NOTE: This assumes you have `numpy` and `torch` installed.
 
 **Training models with Megatron-LM:** We recommend using NGC's [`nvcr.io/nvidia/pytorch:23.09-py3`](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch/tags) PyTorch container. The [Dockerfile](Dockerfile) builds on this image with additional dependencies. To build the image, run `docker build . -t megablocks-dev` and then `bash docker.sh` to launch the container. Once inside the container, install MegaBlocks with `pip install .`. See [Usage](#steam_locomotive-usage) for instructions on training MoEs with MegaBlocks + Megatron-LM.