Huffman Tree

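Huffman Tree is an implementation of Huffman coding, an algorithm for compressing text. It counts the frequency of each character and, based on the set of characters actually found, encodes each one using fewer bits, with the most frequent characters getting the shortest codes. This reduces the size of the file.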

Given a file, a code is first produced for each unique character, and the file is then encoded using that code. When the encoded file needs to be decoded, the same code is used. For simplicity, this program writes the code to a separate code file. For production use, the code should be stored inside the encoded file itself.
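The steps above can be sketched in a few lines of Java. This is a minimal illustration under stated assumptions, not the project's actual code: the class name HuffmanCodeSketch and all of its methods are hypothetical, bits are represented as a String of '0' and '1' characters for readability, and the codes are assigned by recursion rather than with an explicit stack.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical sketch: names are illustrative and not taken from the project.
class HuffmanCodeSketch {

    // Leaves carry a character; internal nodes only carry two children.
    static final class Node {
        final char symbol;
        final long weight;
        final Node left;
        final Node right;

        Node(char symbol, long weight, Node left, Node right) {
            this.symbol = symbol;
            this.weight = weight;
            this.left = left;
            this.right = right;
        }

        boolean isLeaf() {
            return left == null && right == null;
        }
    }

    // Count how often each character occurs in the input.
    static Map<Character, Long> countFrequencies(String text) {
        Map<Character, Long> freq = new HashMap<>();
        for (char c : text.toCharArray()) {
            freq.merge(c, 1L, Long::sum);
        }
        return freq;
    }

    // Repeatedly combine the two least frequent nodes until one tree remains.
    static Node buildTree(Map<Character, Long> freq) {
        if (freq.isEmpty()) {
            throw new IllegalArgumentException("input has no characters");
        }
        PriorityQueue<Node> queue =
                new PriorityQueue<>((a, b) -> Long.compare(a.weight, b.weight));
        for (Map.Entry<Character, Long> e : freq.entrySet()) {
            queue.add(new Node(e.getKey(), e.getValue(), null, null));
        }
        while (queue.size() > 1) {
            Node a = queue.poll();
            Node b = queue.poll();
            queue.add(new Node('\0', a.weight + b.weight, a, b));
        }
        return queue.poll();
    }

    // Walk the tree: going left appends '0', going right appends '1'.
    static Map<Character, String> buildCodes(Node root) {
        Map<Character, String> codes = new HashMap<>();
        if (root.isLeaf()) {
            codes.put(root.symbol, "0"); // input with a single distinct character
        } else {
            assign(root, "", codes);
        }
        return codes;
    }

    private static void assign(Node node, String prefix, Map<Character, String> codes) {
        if (node.isLeaf()) {
            codes.put(node.symbol, prefix);
            return;
        }
        assign(node.left, prefix + "0", codes);
        assign(node.right, prefix + "1", codes);
    }

    // Replace each character with its bit pattern.
    static String encode(String text, Map<Character, String> codes) {
        StringBuilder bits = new StringBuilder();
        for (char c : text.toCharArray()) {
            bits.append(codes.get(c));
        }
        return bits.toString();
    }

    public static void main(String[] args) {
        String text = "abracadabra";
        Map<Character, String> codes = buildCodes(buildTree(countFrequencies(text)));
        System.out.println(codes);
        System.out.println(encode(text, codes).length() + " bits, versus "
                + (text.length() * 8) + " bits at one byte per character");
    }
}
```

The exact bit patterns depend on how ties between equal frequencies are broken, so two correct implementations may print different codes for the same input while producing the same total encoded length.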

See further details here: https://en.wikipedia.org/w/index.php?title=Huffman_tree

Usage

Run Application.java. By default it passes play.txt, which is included in the project, as the input file. To use a different file, change the default or pass your own file through args.

This is the output it produces:

Step 1 - Generate code file

play.txt, bytes - 196197
play.code, bytes - 964

Step 2 - Use code file to encode the input file

play.short, bytes - 109263

Step 3 - Decode the encoded file using the code file

play.decoded, bytes - 196197

After encoding, the total size is 109263 + 964 = 110227 bytes (the encoded file plus the code file), which is about 56% of the original 196197 bytes.
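The same arithmetic can be checked with a short snippet. The class and method names here are hypothetical and only illustrate the calculation; the byte counts are the ones reported in the output above.

```java
// Hypothetical helper: illustrates the size calculation only.
class CompressionRatioSketch {

    // Fraction of the original size taken by the encoded file plus the code file.
    static double ratio(long encodedBytes, long codeBytes, long originalBytes) {
        return (double) (encodedBytes + codeBytes) / originalBytes;
    }

    public static void main(String[] args) {
        double r = ratio(109263, 964, 196197);
        System.out.println(Math.round(r * 100) + "% of the original size");
    }
}
```

Counting the code file matters because the encoded file cannot be decoded without it.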

Data structures used

Binary Tree: stores the bit pattern for each character and is traversed while decoding
Priority Queue: temporary data structure used to repeatedly take the least frequent characters and combine them
Stack: temporary data structure used while reading the code for each character
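As an illustration of how the binary tree is traversed while decoding, here is a minimal sketch. It assumes the code is available as a map from character to a string of '0' and '1' characters; the actual format of the code file is not described here, and the class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: names and the in-memory code format are assumptions.
class HuffmanDecodeSketch {

    static final class Node {
        Node zero;        // child reached by a '0' bit
        Node one;         // child reached by a '1' bit
        Character symbol; // non-null only on leaves
    }

    // Rebuild the tree by following each bit pattern from the root.
    static Node treeFromCodes(Map<Character, String> codes) {
        Node root = new Node();
        for (Map.Entry<Character, String> e : codes.entrySet()) {
            Node current = root;
            for (char bit : e.getValue().toCharArray()) {
                if (bit == '0') {
                    if (current.zero == null) current.zero = new Node();
                    current = current.zero;
                } else {
                    if (current.one == null) current.one = new Node();
                    current = current.one;
                }
            }
            current.symbol = e.getKey();
        }
        return root;
    }

    // Follow one branch per bit; on reaching a leaf, emit it and restart at the root.
    static String decode(String bits, Node root) {
        StringBuilder out = new StringBuilder();
        Node current = root;
        for (char bit : bits.toCharArray()) {
            current = (bit == '0') ? current.zero : current.one;
            if (current == null) {
                throw new IllegalArgumentException("bit sequence does not match the code");
            }
            if (current.symbol != null) {
                out.append(current.symbol.charValue());
                current = root;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<Character, String> codes = new LinkedHashMap<>();
        codes.put('a', "0");
        codes.put('b', "10");
        codes.put('c', "11");
        // The bits split as 0 | 10 | 11 | 0.
        System.out.println(decode("010110", treeFromCodes(codes))); // prints "abca"
    }
}
```

Because no code is a prefix of another, the decoder never needs a separator between characters: reaching a leaf is what marks the end of each one.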
