entropy is a C++ tokenisation library that uses information theory to generate efficient tokenisation schemes. It currently provides Huffman and Shannon Byte-Pair encoding algorithms.
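As a rough illustration of the information-theoretic idea behind Huffman coding, the sketch below computes how many bits each symbol would receive, so that frequent symbols get shorter codes. It is not entropy's API: `huffman_code_lengths` is a hypothetical helper written only for this example, and it assumes a C++17 compiler.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper, NOT part of entropy's API: returns the Huffman code
// length (in bits) for each distinct byte in `text`.
std::map<char, std::size_t> huffman_code_lengths(const std::string& text) {
    std::map<char, std::size_t> freq;
    for (char c : text) ++freq[c];

    std::map<char, std::size_t> lengths;
    if (freq.empty()) return lengths;
    if (freq.size() == 1) {
        // A lone symbol still needs one bit to be written out.
        lengths[freq.begin()->first] = 1;
        return lengths;
    }

    // Each heap entry is (subtree weight, symbols contained in that subtree).
    using Node = std::pair<std::size_t, std::vector<char>>;
    std::priority_queue<Node, std::vector<Node>, std::greater<Node>> heap;
    for (const auto& [symbol, count] : freq) {
        heap.emplace(count, std::vector<char>{symbol});
        lengths[symbol] = 0;
    }

    // Repeatedly merge the two lightest subtrees. Every symbol inside a
    // merged subtree moves one level deeper, so its code grows by one bit.
    while (heap.size() > 1) {
        Node a = heap.top(); heap.pop();
        Node b = heap.top(); heap.pop();
        for (char c : a.second) ++lengths[c];
        for (char c : b.second) ++lengths[c];
        a.second.insert(a.second.end(), b.second.begin(), b.second.end());
        heap.emplace(a.first + b.first, std::move(a.second));
    }
    return lengths;
}
```

For the input `"aaaabbc"` this assigns 1 bit to `a` and 2 bits each to `b` and `c`, so the 7 bytes encode in 10 bits.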
Requires CMake >= 3.10.0.
```
mkdir entropy/build
cd entropy/build
cmake ..
make
```