anuragmuley09/File-Compression

File Compression Tool


A toy project implementing Huffman encoding. Written in C++, it can encode and decode any file format losslessly: the decoded file is byte-for-byte identical to the original.

The backend is written entirely in C++ using Drogon.

Since this is just a toy project, the initial version may not follow Drogon best practices. Over time, the code structure will be improved to follow a complete MVC architecture.

Why C++ for backend?

Why not? C++ was my first language, and I have always wanted to explore it further. What better way than writing the backend in C++ itself?

Also, I wanted to explore the Drogon framework, and I already had the Huffman CLI (here), so I thought I would add a backend and a frontend so others can use it easily.

Upcoming plans

End users cannot see what happens behind the scenes: even if they download the .huf file, they won't be able to read it, because its contents are binary-encoded. In the terminal, users can see everything that happens (the progress bar, the Huffman tree nodes with their values), but a terminal (CLI) is not a convenient way to explore that output. So the plan is to generate a downloadable report showing the encoded data along with the Huffman code for each character in the uploaded file.

Sample terminal output:

Progress Bar:

Encoding file...
[======================================================================] 100.00 %
Decoding file...
[======================================================================] 100.00 %

Generated Huffman Codes:

Notice that the codes have different lengths. This comes from the Huffman prefix tree: frequent characters get shorter codes, rare characters get longer ones.

--- Huffman Codes Generated ---
/ - 01000011
o - 0000
5 - 11100011000
u - 001010
e - 0001
d - 00100

.
.
.

) - 001011
i - 0011
> - 010000010
F - 01000000
x - 111000100
8 - 010000011
D - 01000010
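
To illustrate where variable-length codes like these come from, here is a minimal sketch of Huffman code construction. It is not the code from this repository: `Node`, `CodeTable`, `buildHuffmanCodes`, and `isPrefixFree` are hypothetical names, and codes are kept as strings of '0' and '1' for readability rather than packed bits.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node layout for this sketch (children are indices into a vector).
struct Node {
    std::uint64_t freq;
    int left;              // index of the 0-branch child, -1 for a leaf
    int right;             // index of the 1-branch child, -1 for a leaf
    unsigned char symbol;  // meaningful only for leaves
};

using CodeTable = std::map<unsigned char, std::string>;

// Builds a Huffman code table for the bytes in `data`.
CodeTable buildHuffmanCodes(const std::string& data) {
    CodeTable codes;
    if (data.empty()) return codes;

    // 1. Count how often each byte occurs.
    std::map<unsigned char, std::uint64_t> freq;
    for (unsigned char c : data) ++freq[c];

    // 2. Create one leaf per byte and put it in a min-heap keyed by frequency.
    std::vector<Node> nodes;
    using Entry = std::pair<std::uint64_t, int>;  // (frequency, node index)
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> heap;
    for (const auto& kv : freq) {
        nodes.push_back({kv.second, -1, -1, kv.first});
        heap.push({kv.second, static_cast<int>(nodes.size()) - 1});
    }

    // Edge case: a single distinct byte still needs a 1-bit code.
    if (heap.size() == 1) {
        codes[nodes[0].symbol] = "0";
        return codes;
    }

    // 3. Repeatedly merge the two least frequent nodes under a new parent.
    while (heap.size() > 1) {
        Entry a = heap.top(); heap.pop();
        Entry b = heap.top(); heap.pop();
        nodes.push_back({a.first + b.first, a.second, b.second, 0});
        heap.push({a.first + b.first, static_cast<int>(nodes.size()) - 1});
    }
    assert(heap.size() == 1);  // exactly one root remains

    // 4. Walk the tree: going left appends '0', going right appends '1'.
    std::vector<std::pair<int, std::string>> stack;
    stack.push_back({heap.top().second, ""});
    while (!stack.empty()) {
        std::pair<int, std::string> item = stack.back();
        stack.pop_back();
        const Node& n = nodes[item.first];
        if (n.left < 0) {
            codes[n.symbol] = item.second;  // leaf: the path is the code
            continue;
        }
        stack.push_back({n.left, item.second + "0"});
        stack.push_back({n.right, item.second + "1"});
    }
    return codes;
}

// Returns true if no code is a prefix of another symbol's code.
bool isPrefixFree(const CodeTable& codes) {
    for (const auto& a : codes)
        for (const auto& b : codes)
            if (a.first != b.first &&
                b.second.compare(0, a.second.size(), a.second) == 0)
                return false;
    return true;
}
```

For the input "aaaaaaaabbbc", this sketch gives 'a' a 1-bit code and 'b' and 'c' 2-bit codes, and `isPrefixFree` returns true for the resulting table. Because symbols sit only at the leaves, no code can be the prefix of another, which is what lets a decoder read the bit stream without separators.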


Frequency Distribution:

-----------------------------------
(char) <frequency> x <bit length>
-----------------------------------
'R'     1 x 11
'z'     2 x 10
':'     60 x 5
'4'     25 x 7
't'     30 x 6
'S'     1 x 11
'K'     1 x 11

.
.
.

'M'     1 x 12
(10)    115 x 5
'J'     1 x 12
'C'     1 x 12
'W'     1 x 12
'r'     30 x 7
'2'     29 x 7
'N'     14 x 8
']'     8 x 9
' '     3330 x 1
------------------

Compression Stats:

Total bits in compressed data: 11847 (1480.88 bytes)
Original size: 4755 bytes
Compression Ratio: 31.14% of original size
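
These figures are consistent with each other: 11847 bits / 8 = 1480.875 bytes, and 1480.875 / 4755 is about 31.14%. A minimal sketch of that arithmetic follows, assuming the total bit count is the sum of frequency x bit length over the frequency table and that any header or tree storage is not counted. `SymbolStat`, `CompressionStats`, and `computeStats` are hypothetical names, not taken from the project source.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One row of the frequency table: how often a symbol occurs and its code length.
struct SymbolStat {
    std::uint64_t frequency;
    unsigned bitLength;
};

struct CompressionStats {
    std::uint64_t totalBits;  // size of the encoded payload in bits
    double compressedBytes;   // totalBits / 8, may be fractional
    double ratioPercent;      // compressed size as a percentage of the original
};

// Assumption: only the encoded payload is counted, not any stored header or tree.
CompressionStats computeStats(const std::vector<SymbolStat>& symbols,
                              std::uint64_t originalBytes) {
    assert(originalBytes > 0);  // avoid dividing by zero
    std::uint64_t bits = 0;
    for (const SymbolStat& s : symbols) bits += s.frequency * s.bitLength;
    double bytes = bits / 8.0;
    return {bits, bytes, 100.0 * bytes / originalBytes};
}
```

As a small check, a 12-byte input with symbols occurring 8, 3, and 1 times and code lengths 1, 2, and 2 encodes to 16 bits, which is 2 bytes or roughly 16.67% of the original size.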

Requirements

  1. Drogon (I installed it using vcpkg because it was simple to configure and use).
  2. CMake
  3. GCC

Commands

These may vary from machine to machine, depending on the setup and architecture.

Run these strictly inside the x64 Native Tools Command Prompt for VS 2022:

  1. cmake -B build -S . -DCMAKE_TOOLCHAIN_FILE=C:/vcpkg/vcpkg/scripts/buildsystems/vcpkg.cmake -DVCPKG_TARGET_TRIPLET=x64-windows
  2. cmake --build build --config Release
  3. .\build\Release\compression.exe

Note:

  • If you are using vcpkg, you have to clone the vcpkg repo, install the required packages, and then adjust the toolchain file path to match your setup.
  • Also, check CMakeLists.txt if you want to adjust which source files are compiled.

About the webfiles

The files in the WebFiles folder, main.html and sample.html, are just boilerplate made to test the POST endpoint written in main.cc.

Additionally, I am planning to accept an uploaded .huf file and regenerate the original file from it. The flow will be:

  1. Upload original file
  2. Download .huf file
  3. Upload .huf file
  4. Download original file (decoded-<filename>)
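
The round trip in this flow works because the decoder can rebuild the original bytes from the bit stream and the code table alone. Here is a minimal sketch of that idea, assuming a code table is already available. `CodeTable`, `encode`, and `decode` are hypothetical names, and bits are kept as a string of '0' and '1' for readability; this is not the .huf format used by the project.

```cpp
#include <cassert>
#include <map>
#include <string>

using CodeTable = std::map<unsigned char, std::string>;

// Concatenates the code of every byte in `data`.
// Throws std::out_of_range if a byte has no entry in the table.
std::string encode(const std::string& data, const CodeTable& codes) {
    std::string bits;
    for (unsigned char c : data) bits += codes.at(c);
    return bits;
}

// Reads bits one at a time and emits a byte whenever the bits read so far
// match a code. This is unambiguous only because the codes are prefix-free.
std::string decode(const std::string& bits, const CodeTable& codes) {
    std::map<std::string, unsigned char> reverse;
    for (const auto& kv : codes) reverse[kv.second] = kv.first;

    std::string out;
    std::string current;
    for (char bit : bits) {
        current += bit;
        auto it = reverse.find(current);
        if (it != reverse.end()) {
            out += static_cast<char>(it->second);
            current.clear();
        }
    }
    assert(current.empty());  // leftover bits mean corrupt input or a wrong table
    return out;
}
```

With the table a = "1", b = "01", c = "00", the string "abacab" encodes to "101100101" (9 bits instead of 48), and decoding those bits returns "abacab" unchanged.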

As of today, the user is given two options:

  1. Download .huf file
  2. Download decoded file

Users cannot open the .huf file directly because it is not a widely recognized format.

About

File encoding/decoding tool. Backend and core compression algorithm written in core C++.
