A toy project implementing Huffman encoding in C++. It can encode and decode files of any format, with lossless decompression (100% fidelity to the original).
The backend is written entirely in C++ using Drogon.
Since this is just a toy project, the initial version may not follow Drogon best practices. Over time, the code structure will be improved to follow a complete MVC architecture.
Why not? C++ was my first language, and I have always wanted to explore it further. What better way than writing a backend in C++ itself?
I also wanted to explore the Drogon framework, and I already had the Huffman CLI (here), so I thought I would add a backend and frontend so others can use it easily.
End users cannot see what goes on behind the scenes: even if they download the .huf file, they won't be able to read it, because it is compressed binary data rather than text. In the terminal, users can see everything as it happens (the progress bar, the Huffman tree nodes with their values), but a terminal (or CLI) is not a convenient way to explore this. So I am planning to generate a downloadable report that shows the encoded output along with the Huffman code for each character in the uploaded file.
Sample terminal output:
Progress Bar:
Encoding file...
[======================================================================] 100.00 %
Decoding file...
[======================================================================] 100.00 %
Generated Huffman Codes:
Notice that the codes have different lengths. This comes from the Huffman prefix tree: frequent characters sit closer to the root and get shorter codes, and no code is a prefix of another.
--- Huffman Codes Generated ---
/ - 01000011
o - 0000
5 - 11100011000
u - 001010
e - 0001
d - 00100
.
.
.
) - 001011
i - 0011
> - 010000010
F - 01000000
x - 111000100
8 - 010000011
D - 01000010
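To illustrate where the varying code lengths come from, here is a minimal sketch of Huffman code generation. It is not the project's actual implementation: the `Node` struct and the `buildHuffmanCodes` function are hypothetical names for this example, and it assumes C++17. The two least frequent nodes are merged repeatedly, so rare characters end up deeper in the tree and get longer codes.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node layout for this sketch; children are indices into a vector.
struct Node {
    std::size_t freq;
    int symbol;  // 0..255 for a leaf, -1 for an internal node
    int left;    // index of the left child, -1 if none
    int right;   // index of the right child, -1 if none
};

// Builds a symbol -> bit-string table ('0'/'1' characters) for the given data.
std::map<unsigned char, std::string> buildHuffmanCodes(const std::string& data) {
    std::map<unsigned char, std::size_t> freq;
    for (unsigned char c : data) ++freq[c];

    std::vector<Node> nodes;
    using Entry = std::pair<std::size_t, int>;  // (frequency, node index)
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> heap;
    for (const auto& [sym, f] : freq) {
        nodes.push_back({f, sym, -1, -1});
        heap.push({f, static_cast<int>(nodes.size()) - 1});
    }

    std::map<unsigned char, std::string> codes;
    if (nodes.empty()) return codes;

    // Repeatedly merge the two least frequent nodes into a new parent.
    while (heap.size() > 1) {
        Entry a = heap.top(); heap.pop();
        Entry b = heap.top(); heap.pop();
        nodes.push_back({a.first + b.first, -1, a.second, b.second});
        heap.push({a.first + b.first, static_cast<int>(nodes.size()) - 1});
    }

    // Walk the tree: a left edge appends '0', a right edge appends '1'.
    std::vector<std::pair<int, std::string>> stack;
    stack.emplace_back(heap.top().second, "");
    while (!stack.empty()) {
        auto [idx, prefix] = stack.back();
        stack.pop_back();
        const Node& n = nodes[idx];
        if (n.symbol >= 0) {
            // A single-symbol input still needs a one-bit code.
            codes[static_cast<unsigned char>(n.symbol)] = prefix.empty() ? "0" : prefix;
        } else {
            stack.emplace_back(n.left, prefix + "0");
            stack.emplace_back(n.right, prefix + "1");
        }
    }
    return codes;
}
```

For example, calling `buildHuffmanCodes("aaaabbc")` gives 'a' a 1-bit code and 'b' and 'c' 2-bit codes, since 'a' is the most frequent character.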
Frequency Distribution:
-----------------------------------
(char) <frequency> x <bit length>
-----------------------------------
'R' 1 x 11
'z' 2 x 10
':' 60 x 5
'4' 25 x 7
't' 30 x 6
'S' 1 x 11
'K' 1 x 11
.
.
.
'M' 1 x 12
(10) 115 x 5
'J' 1 x 12
'C' 1 x 12
'W' 1 x 12
'r' 30 x 7
'2' 29 x 7
'N' 14 x 8
']' 8 x 9
' ' 3330 x 1
------------------
Compression Stats:
Total bits in compressed data: 11847 (1480.88 bytes)
Original size: 4755 bytes
Compression Ratio: 31.14% of original size
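The arithmetic behind these stats can be sketched as follows. This is an illustration, not the project's code: `CompressionStats`, `statsFromBits`, and `computeStats` are hypothetical names, and the sketch assumes the total counts only the encoded payload bits (each character's frequency times its code length), not any header stored in the .huf file.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical result type for this sketch.
struct CompressionStats {
    std::size_t totalBits;
    double compressedBytes;
    double percentOfOriginal;
};

// Converts a bit count into bytes and a percentage of the original size.
CompressionStats statsFromBits(std::size_t totalBits, std::size_t originalBytes) {
    const double compressedBytes = static_cast<double>(totalBits) / 8.0;
    const double percent =
        originalBytes == 0
            ? 0.0
            : 100.0 * compressedBytes / static_cast<double>(originalBytes);
    return {totalBits, compressedBytes, percent};
}

// rows holds (frequency, bit length) pairs, as in the frequency distribution table.
CompressionStats computeStats(
    const std::vector<std::pair<std::size_t, std::size_t>>& rows,
    std::size_t originalBytes) {
    std::size_t totalBits = 0;
    for (const auto& row : rows) {
        totalBits += row.first * row.second;  // frequency x bit length
    }
    return statsFromBits(totalBits, originalBytes);
}
```

With the numbers above, `statsFromBits(11847, 4755)` gives 11847 / 8 = 1480.875 bytes, which is about 31.14% of the original 4755 bytes.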
- Drogon (I installed it using vcpkg because it was simple to configure and use)
- CMake
- GCC
These steps may vary from machine to machine, depending on the architecture and setup.
Run these strictly inside the x64 Native Tools Command Prompt for VS 2022:
cmake -B build -S . -DCMAKE_TOOLCHAIN_FILE=C:/vcpkg/vcpkg/scripts/buildsystems/vcpkg.cmake -DVCPKG_TARGET_TRIPLET=x64-windows
cmake --build build --config Release
.\build\Release\compression.exe
Note:
- If you are using vcpkg, you have to clone the vcpkg repo, install the required packages, and then adjust the toolchain file path.
- Also, check CMakeLists.txt if you want to adjust the source file compilation configuration.
The files in the WebFiles folder, main.html and sample.html, are just boilerplate made to test the POST endpoint written in main.cc.
Additionally, I am planning to accept the .huf file as an upload and generate the original file from it. The flow will be:
- Upload original file
- Download .huf file
- Upload .huf file
- Download original file (decoded-<filename>)
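The round trip in this flow depends on the codes being prefix-free, which is what lets the decoder split a bit stream back into characters without separators. The following is a minimal sketch of that idea, not the project's actual .huf format: it keeps bits as '0'/'1' characters in a string for readability, and `CodeTable`, `encodeBits`, and `decodeBits` are hypothetical names. It assumes C++17.

```cpp
#include <map>
#include <string>

// Hypothetical table type: symbol -> code written as '0'/'1' characters.
using CodeTable = std::map<unsigned char, std::string>;

// Concatenates the code of every input byte. Throws std::out_of_range
// if a byte has no entry in the table.
std::string encodeBits(const std::string& data, const CodeTable& codes) {
    std::string bits;
    for (unsigned char c : data) bits += codes.at(c);
    return bits;
}

// Reads bits one at a time and emits a symbol as soon as the accumulated
// bits match a code. This works only because no code is a prefix of another.
std::string decodeBits(const std::string& bits, const CodeTable& codes) {
    std::map<std::string, unsigned char> reverse;
    for (const auto& [sym, code] : codes) reverse[code] = sym;

    std::string out;
    std::string current;
    for (char bit : bits) {
        current += bit;
        auto it = reverse.find(current);
        if (it != reverse.end()) {
            out += static_cast<char>(it->second);
            current.clear();
        }
    }
    return out;
}
```

With a table such as a = 1, b = 01, c = 00, encoding "abcab" yields the bits 10100101, and decoding those bits returns "abcab" unchanged.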
As of today, the user is given two options:
- Download .huf file
- Download decoded file
Users cannot open the .huf file directly because it is not a widely recognized format.