This repository (eric-c-cyr/torchbraid, a fork of Multilevel-NN/torchbraid) was archived by the owner on Dec 26, 2023 and is now read-only.

A layer-parallel neural network technology built on PyTorch


torchbraid

An XBraid interface to PyTorch

Requirements:

  • Python libraries: cython, mpi4py, pytorch
  • a build of xbraid
  • an MPI compiler

Conda environment files can be found in the 'torchbraid/env' directories. These can be used to get a consistent conda environment for using torchbraid. The one caveat is that mpi4py should be installed consistently with the MPI compiler. In some cases a 'pip install mpi4py' is preferable to installing it through conda, because conda installs an alternate MPI compiler and library, and you may want mpi4py to use the native one on your platform.
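Before installing mpi4py, it can be worth confirming that a native MPI compiler wrapper is actually on your PATH, so pip builds mpi4py against the MPI library you intend. A minimal stdlib-only sketch, assuming the wrapper uses the common name 'mpicc':

```python
import shutil

# Sanity check before 'pip install mpi4py': confirm an MPI compiler
# wrapper is on PATH so pip builds mpi4py against the native MPI.
# (Assumption: the wrapper is named 'mpicc', the common default.)
mpicc = shutil.which("mpicc")
if mpicc:
    print("mpicc found at:", mpicc)
else:
    print("mpicc NOT FOUND -- install or load an MPI toolchain first")
```

If the wrapper found here is conda's rather than your platform's, that is the mismatch described above.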

Note that the Cython version is important, particularly if torch layers are shipped directly by braid.

Build xbraid:

  1. Download from git@github.com:XBraid/xbraid.git
  2. The master branch should work fine
  3. From the xbraid directory run make debug=no braid

Build torchbraid:

  1. Copy makefile.inc.example to makefile.inc
  2. Modify makefile.inc to include your build specifics
  3. Type make
  4. You will need to add export PYTHONPATH=${TORCHBRAID_DIR}/torchbraid:${TORCHBRAID_DIR}:${PYTHONPATH} to your environment; this ensures that the Python search path for modules is set up correctly
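The PYTHONPATH export in step 4 simply prepends the checkout and its torchbraid subdirectory to Python's module search path. A small sketch of the equivalent effect from inside Python (TORCHBRAID_DIR here is a hypothetical checkout location; adjust it to your clone):

```python
import os
import sys

# Hypothetical checkout location; substitute your own clone path.
TORCHBRAID_DIR = os.path.expanduser("~/src/torchbraid")

# Mirrors: export PYTHONPATH=${TORCHBRAID_DIR}/torchbraid:${TORCHBRAID_DIR}:${PYTHONPATH}
# Inserting TORCHBRAID_DIR first, then the subdirectory, leaves the
# subdirectory in front, matching the colon-separated export order.
for path in (TORCHBRAID_DIR, os.path.join(TORCHBRAID_DIR, "torchbraid")):
    if path not in sys.path:
        sys.path.insert(0, path)

print(sys.path[:2])
```

With the shell export, the same two entries appear at the front of sys.path for every Python process started from that environment.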

Take a look at the code in the examples directory.

To test:

  1. make tests
  2. make tests-serial

To clean the directory:

make clean

To uninstall the Python package (this may be a no-op):

make uninstall

Publications

  1. Cyr, Eric C., Stefanie Günther, and Jacob B. Schroder. "Multilevel Initialization for Layer-Parallel Deep Neural Network Training." arXiv preprint arXiv:1912.08974 (2019).
  2. Günther, Stefanie, Lars Ruthotto, Jacob B. Schroder, Eric C. Cyr, and Nicolas R. Gauger. "Layer-parallel training of deep residual neural networks." SIAM Journal on Mathematics of Data Science 2, no. 1 (2020): 1-23.
