This document briefly describes how to produce ahead-of-time compiled executable bundles. The motivation for this work is to remove the cost of compile time by allowing users of Glow to compile their network models ahead of time.
A bundle is a self-contained compiled network model that can be used to execute the model in a standalone mode.
It is possible to use the Glow library to produce bundles. On the CPU, the bundles are object files that can be linked with some executable. On other architectures, the bundle may look completely different.
This document demonstrates how to produce a bundle for the host CPU using the
'loader' tool. We use the flag -emit-bundle to specify the output directory.
loader image_file -image_mode=0to1 -m network_model_name -cpu -emit-bundle output_directory_name
The command above compiles the neural network model read from
network_model_name (the value passed to -m) and generates a bundle consisting
of two files in the directory output_directory_name.
The first file is named network_model_name.o and contains the compiled code of
the network model. It is a regular object file that can be linked with other files in
your project. The second file is named network_model_name.weights and
contains the weights required to run the compiled model.
This section describes the APIs that the CPU bundle exposes. Other targets may expose a completely different API.
Each bundle exposes two symbols named network_model_name and
network_model_name_config. The network_model_name is the name of the
auto-generated function that implements the network model. This symbol always
has the following signature:
extern "C" void network_model_name(uint8_t *constantWeightVars,
                                   uint8_t *mutableWeightVars,
                                   uint8_t *activations);
The parameters of this function are the base addresses of the memory areas for constant weights variables, mutable weights variables (i.e. inputs and outputs) and activations.
The network_model_name_config is a symbol that contains the configuration of
the compiled network. The type of this symbol is always the following struct:
struct BundleConfig {
// Size of the constant weight variables memory area.
size_t constantWeightVarsMemSize;
// Size of the mutable weight variables memory area.
size_t mutableWeightVarsMemSize;
// Size of the activations memory area.
size_t activationsMemSize;
// Alignment to be used for weights and activations.
size_t alignment;
// Number of symbols in the symbol table.
size_t numSymbols;
// Symbol table.
const SymbolTableEntry *symbolTable;
};
Client code uses this configuration to allocate the required amount of memory
for each of the memory areas before invoking the network_model_name function
to run the network.
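The allocation step can be sketched as follows. This is a minimal illustration, not code from the Glow repository: the MemArea and BundleMemory classes are hypothetical helpers, and the sketch assumes that the alignment field is a non-zero power of two and that the sizes are given in bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

struct SymbolTableEntry; // Only used through a pointer in this sketch.

// Mirrors the BundleConfig layout described above.
struct BundleConfig {
  size_t constantWeightVarsMemSize;
  size_t mutableWeightVarsMemSize;
  size_t activationsMemSize;
  size_t alignment;
  size_t numSymbols;
  const SymbolTableEntry *symbolTable;
};

// Hypothetical helper: owns one memory area and exposes an aligned base
// address. The backing buffer is over-allocated by `alignment` bytes so that
// an aligned address of `size` bytes always fits inside it.
class MemArea {
public:
  MemArea(size_t size, size_t alignment) : storage_(size + alignment) {
    void *p = storage_.data();
    size_t space = storage_.size();
    base_ = static_cast<uint8_t *>(std::align(alignment, size, p, space));
  }
  MemArea(const MemArea &) = delete;
  MemArea &operator=(const MemArea &) = delete;

  uint8_t *base() const { return base_; }

private:
  std::vector<uint8_t> storage_;
  uint8_t *base_ = nullptr;
};

// Hypothetical helper: the three memory areas a CPU bundle expects,
// sized and aligned according to the bundle configuration.
struct BundleMemory {
  MemArea constantWeights;
  MemArea mutableWeights;
  MemArea activations;

  explicit BundleMemory(const BundleConfig &config)
      : constantWeights(config.constantWeightVarsMemSize, config.alignment),
        mutableWeights(config.mutableWeightVarsMemSize, config.alignment),
        activations(config.activationsMemSize, config.alignment) {}
};
```

With a real bundle, the configuration would come from the network_model_name_config symbol, and the three base() pointers would be passed to the network_model_name function.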
Clients also use BundleConfig to perform the symbol table lookups when they
need to find information about an input or output variable.
The SymbolTableEntry always has the following structure:
struct SymbolTableEntry {
// Name of a variable.
const char *name;
// Offset of the variable inside the memory area.
size_t offset;
// The number of elements inside this variable.
size_t size;
// The kind of the variable. 1 if it is a mutable variable, 0 otherwise.
char kind;
};
Offsets of constants are offsets inside the memory area for constant weights. Offsets of mutable variables are offsets inside the memory area for mutable weights.
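A symbol table lookup might look like the sketch below. The findSymbol and mutableVarAddress helpers are hypothetical names introduced for illustration, and the sketch assumes that the offset field is a byte offset from the base of the corresponding memory area.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Mirrors the SymbolTableEntry layout described above.
struct SymbolTableEntry {
  const char *name;
  size_t offset;
  size_t size;
  char kind; // 1 for a mutable variable, 0 otherwise.
};

// Mirrors the BundleConfig layout described above.
struct BundleConfig {
  size_t constantWeightVarsMemSize;
  size_t mutableWeightVarsMemSize;
  size_t activationsMemSize;
  size_t alignment;
  size_t numSymbols;
  const SymbolTableEntry *symbolTable;
};

// Hypothetical helper: returns the entry called `name`, or nullptr if the
// symbol table has no such entry.
const SymbolTableEntry *findSymbol(const BundleConfig &config,
                                   const char *name) {
  for (size_t i = 0; i < config.numSymbols; ++i) {
    if (std::strcmp(config.symbolTable[i].name, name) == 0) {
      return &config.symbolTable[i];
    }
  }
  return nullptr;
}

// Hypothetical helper: resolves the address of a mutable variable (an input
// or an output) inside the mutable weights area. Returns nullptr if the
// symbol is missing or is not mutable.
// Assumption: `offset` is expressed in bytes.
uint8_t *mutableVarAddress(const BundleConfig &config,
                           uint8_t *mutableWeightVars, const char *name) {
  const SymbolTableEntry *entry = findSymbol(config, name);
  if (entry == nullptr || entry->kind != 1) {
    return nullptr;
  }
  return mutableWeightVars + entry->offset;
}
```

Client code could use such a helper to find where to copy the input data before the call and where to read the output after it. The variable names to look up depend on the compiled model.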
This section describes the use of the CPU bundle. Other targets may have different interfaces.
To integrate the artifacts generated by the loader into your project, you generally need to do the following:
- Link with the generated object file network_model_name.o.
- Allocate the memory for constant weights variables, mutable weights variables (i.e. inputs and outputs) and activations, based on the memory area sizes provided by network_model_name_config.
- Load the content of the auto-generated network_model_name.weights file into the constant weights variables memory area.
- Initialize the mutable weights area with inputs (e.g. image data).
- Invoke the network_model_name function with 3 parameters that are the base addresses of the memory areas for constant weights variables, mutable weights variables, and activations.
- After network_model_name has returned, read the results from the mutable weights variables area.
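The steps above can be sketched end to end as follows. This is an illustration under stated assumptions rather than the code used by Glow: alignedBase, loadWeights and runBundle are hypothetical helpers, the entry point is passed as a function pointer so the sketch does not depend on a particular bundle, and the weights file is assumed to hold exactly constantWeightVarsMemSize bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <fstream>
#include <memory>
#include <string>
#include <vector>

struct SymbolTableEntry; // Only used through a pointer in this sketch.

// Mirrors the BundleConfig layout described above.
struct BundleConfig {
  size_t constantWeightVarsMemSize;
  size_t mutableWeightVarsMemSize;
  size_t activationsMemSize;
  size_t alignment;
  size_t numSymbols;
  const SymbolTableEntry *symbolTable;
};

// Signature of the auto-generated network_model_name function.
using BundleEntry = void (*)(uint8_t *constantWeightVars,
                             uint8_t *mutableWeightVars,
                             uint8_t *activations);

// Hypothetical helper: sizes `storage` so that an aligned block of `size`
// bytes fits, and returns the aligned base address inside it.
// Assumption: `alignment` is a non-zero power of two.
uint8_t *alignedBase(std::vector<uint8_t> &storage, size_t size,
                     size_t alignment) {
  storage.assign(size + alignment, 0);
  void *p = storage.data();
  size_t space = storage.size();
  return static_cast<uint8_t *>(std::align(alignment, size, p, space));
}

// Hypothetical helper: reads exactly `size` bytes from `path` into `dst`.
bool loadWeights(const std::string &path, uint8_t *dst, size_t size) {
  std::ifstream in(path, std::ios::binary);
  if (!in) {
    return false;
  }
  in.read(reinterpret_cast<char *>(dst), static_cast<std::streamsize>(size));
  return static_cast<size_t>(in.gcount()) == size;
}

// Hypothetical helper: allocates the three areas, loads the weights, copies
// `input` to `inputOffset` (in bytes) inside the mutable area, runs the
// network, and copies the whole mutable area into `result`.
bool runBundle(BundleEntry entry, const BundleConfig &config,
               const std::string &weightsPath,
               const std::vector<uint8_t> &input, size_t inputOffset,
               std::vector<uint8_t> &result) {
  std::vector<uint8_t> constStore, mutStore, actStore;
  uint8_t *constants = alignedBase(
      constStore, config.constantWeightVarsMemSize, config.alignment);
  uint8_t *mutables = alignedBase(mutStore, config.mutableWeightVarsMemSize,
                                  config.alignment);
  uint8_t *activations =
      alignedBase(actStore, config.activationsMemSize, config.alignment);

  if (!loadWeights(weightsPath, constants, config.constantWeightVarsMemSize)) {
    return false;
  }
  if (inputOffset + input.size() > config.mutableWeightVarsMemSize) {
    return false;
  }
  if (!input.empty()) {
    std::memcpy(mutables + inputOffset, input.data(), input.size());
  }

  entry(constants, mutables, activations);

  result.assign(mutables, mutables + config.mutableWeightVarsMemSize);
  return true;
}
```

In a real project, entry would be the network_model_name function declared with the extern "C" signature shown earlier, config would be the network_model_name_config symbol, and the input offset would typically come from a symbol table lookup.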
A concrete example of integrating a network model with a project can be found
in the examples/compile_resnet50 directory in the Glow repository. A Makefile
with appropriate targets is provided for your convenience.
To build and run the example, you just need to execute:
QUANTIZE=NO make run
You may need to adjust the variables at the top of the Makefile to match your setup, primarily LOADER and GLOW_SRC.
The makefile provides the following targets:
- download_weights: it downloads the Resnet50 network model in the Caffe2 format.
- build/resnet50.o: it generates the bundle files using the Glow loader as described above. The concrete command line looks like this:
  loader tests/images/imagenet/cat_285.png -image_mode=0to1 -m resnet50 -cpu -emit-bundle build
  It reads the network model from resnet50 and generates the resnet50.o and resnet50.weights files into the build directory.
- build/main.o: it compiles the resnet50_standalone.cpp file, which is the main file of the project. This source file gives a good idea of how to interface with an auto-generated bundle:
  - It allocates the memory areas based on the memory sizes provided in resnet50_config.
  - Then it loads the weights from the auto-generated resnet50.weights file.
  - It loads the input image, pre-processes it and puts it into the mutable weight variables memory area.
  - Once everything is set up, it invokes the compiled network model by calling the resnet50 function from the resnet50.o object file.
- resnet50: it links the user-defined resnet50_standalone.o and the auto-generated resnet50.o into a standalone executable file called resnet50_standalone.
- run: it runs this standalone executable with imagenet images as inputs and outputs the results of the network model execution.
To build and run the example, you just need to execute:
make run
By default, the quantized Resnet50 example is executed.
This run performs almost the same steps as the non-quantized Resnet50 version,
except that it emits the bundle based on the quantization profile:
loader tests/images/imagenet/cat_285.png -image_mode=0to1 -m resnet50 -load_profile=profile.yml -cpu -emit-bundle build
The profile.yml itself is captured in a prior step by executing the loader with the -dump_profile option:
loader tests/images/imagenet/*.png -image_mode=0to1 -m resnet50 -dump_profile=profile.yml
See the makefile for details.