Code for testing the repo is in main.cu; use the Makefile to compile.
For detailed documentation, open documentation/html/index.html in a browser.
Workflow of main.cu:
The MNIST Dataset class is initialized and the data is parsed.
A DataLoader object is created and initialized with the MNIST Dataset.
The network specification is given as a vector of strings.
A seqNetwork object is created from the network specification (with options to specify a memory budget and choose a sub-batch selection algorithm).
The virtual memory manager is initialized with the specified memory size and reserves that chunk of memory on the GPU.
The trainer function is called and training runs.
Current features:
Forward- and backward-pass implementations for convolution (cuDNN), fully connected layers (cuBLAS), softmax (cuDNN), ReLU (cuDNN), pooling, and flatten.
Support for offloading and prefetching buffers.
Four modes of training: minimum memory, total memory, and partial memory usage with offloading and prefetching heuristics (two kinds of prefetching heuristic).
DataLoader and Dataset classes for convenient data access and use.
Virtual memory manager for timely defragmentation of device memory.
MNIST dataset class.
Clear documentation.
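To make the defragmentation idea concrete, here is a toy host-side compacting arena: live buffers are slid toward the start of one pre-reserved chunk so freed holes merge into a single contiguous free region. The repo's VMM applies this idea to device memory; none of the names below come from the repo:

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Toy compacting arena (host memory stands in for the GPU chunk).
class Arena {
public:
    explicit Arena(std::size_t size) : pool(size) {}

    // Bump-allocate from the tail; returns an offset, or -1 if no room.
    long alloc(std::size_t size) {
        if (tail + size > pool.size()) return -1;
        long off = static_cast<long>(tail);
        blocks.push_back({off, size, true});
        tail += size;
        return off;
    }

    void free_block(long off) {
        for (auto& b : blocks)
            if (b.off == off) b.live = false;
    }

    // Slide live blocks left over the holes left by freed ones.
    void defragment() {
        std::size_t dst = 0;
        std::vector<Block> kept;
        for (auto& b : blocks) {
            if (!b.live) continue;
            if (static_cast<std::size_t>(b.off) != dst)
                std::memmove(pool.data() + dst, pool.data() + b.off, b.size);
            b.off = static_cast<long>(dst);
            dst += b.size;
            kept.push_back(b);
        }
        blocks = kept;
        tail = dst;  // everything past `tail` is now one free region
    }

    std::size_t used() const { return tail; }

private:
    struct Block { long off; std::size_t size; bool live; };
    std::vector<unsigned char> pool;
    std::vector<Block> blocks;
    std::size_t tail = 0;
};
```

After `defragment()`, an allocation that previously failed due to fragmentation can succeed, which is the point of running it before large layer buffers are requested.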
Setting batch size and learning rate:
Edit the values defined on lines 32 and 33 of layers/layer.h.
Changing the trainer algorithm:
See line 81 of main.cu.
Setting the network's memory budget:
See line 71 of main.cu.
Setting the VMM's memory size:
See line 77 of main.cu.
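For reference, the two defines in layers/layer.h look something like the following; the exact macro names and values are assumptions here, so check lines 32 and 33 of the file itself:

```cpp
// layers/layer.h (illustrative; actual names/values are on lines 32 and 33)
#define BATCH_SIZE 64
#define LEARNING_RATE 0.01
```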
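As a rough illustration of the two memory knobs, the budgets are plain byte counts; the variable names below are hypothetical, and the real values are set on the cited lines of main.cu:

```cpp
// Hypothetical byte-count settings; see lines 71 and 77 of main.cu for
// the actual variables these feed into.
size_t network_budget = 2ull * 1024 * 1024 * 1024;  // 2 GiB for the network
size_t vmm_pool_size  = 4ull * 1024 * 1024 * 1024;  // 4 GiB chunk for the VMM
```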
About
Implementation of algorithms for memory-optimized deep neural network training.