Malware classifier based on Imphash and Fuzzyhash
This malware classifier uses Imphash (import hash) and fuzzy hash values and follows a supervised approach.
The program takes a folder of files as a training set and generates hash signatures for them. Each file's Imphash and fuzzy hash are tied together by the file's unique message digest (MD5).
Once the signature DB is created, the classifier can be run on a set of unknown files to check whether they are similar to files in the database.
The program can be used in malware operations to build a system that clusters similar malware or file samples. The clustered files can then be segregated, and string signatures such as YARA rules or AV signatures can be devised for each cluster.
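As a rough illustration of the signature-building step described above, the sketch below computes the three values for a single file. It assumes the pefile and ssdeep modules listed under Dependencies. The function names md5_of_file and build_signature are hypothetical and are not taken from the program itself.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_signature(path):
    """Return one 'md5,imphash,fuzzyhash' line for a PE file (hypothetical helper)."""
    # Imported here so md5_of_file stays usable without the third-party modules.
    import pefile
    import ssdeep

    pe = pefile.PE(path)  # raises pefile.PEFormatError for non-PE input
    try:
        imphash = pe.get_imphash()  # empty string when the file has no imports
    finally:
        pe.close()
    fuzzy = ssdeep.hash_from_file(path)
    return ",".join([md5_of_file(path), imphash, fuzzy])
```

Calling build_signature on every file in the training folder and writing the returned lines to a file would give a signature DB in the md5, imphash, fuzzyhash layout.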
Supports: Unix/Windows, tested on Unix
Dependencies:
Python module: pefile
Python module: ssdeep
Usage: program.py command directory
Commands:
  'build' - build imphash signatures
  'match' - compare files with signature
Signature format (one line per file): md5,imphash,fuzzyhash
ffb0b9b5b610191051a7bdf0806e1e47,3f243f8268f79d4c3bb161fd3cd38b5c,192:jIhG67ccuvuj7CJj5VQOLX0G1Qc/9HdPvlw3+KHsuyB95oTGB3Mm7:lEg1VJLPV/9HVvlwO6s59yTWJ
ffad870f291acccbe148673f579689db,1f5e76572fad36553733428ca3571f53,3072:TtnUNALmVZvvGBeQYejpaIAq2tn2TBfki43y97FozS4Oq1sqH73oGC:p4LvkwejpAqun2TB8i4i0zLOosqHkG
fff8f98add7ceda78ceee6721e4ef6c0,4a9f23a52f3195615c9216f8ad3b09de,1536:km9wpbv8KurqjSLlixpAz4c3a/2eLaNjzcb6ULaDyROnt:km9wpbver2iiy3a/2BtobTLaDcK
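The signature lines above can be read back with a few lines of Python. This is only a sketch: the names Signature, parse_signature and load_signatures are hypothetical, and it assumes each line holds exactly the three comma-separated fields shown.

```python
from typing import NamedTuple


class Signature(NamedTuple):
    md5: str
    imphash: str
    fuzzyhash: str


def parse_signature(line):
    """Split one 'md5,imphash,fuzzyhash' line into its three fields."""
    # ssdeep hashes contain ':', '/' and '+', but no commas,
    # so splitting on the first two commas is enough.
    parts = [part.strip() for part in line.strip().split(",", 2)]
    if len(parts) != 3:
        raise ValueError("expected 3 comma-separated fields: %r" % line)
    return Signature(*parts)


def load_signatures(lines):
    """Build a dict keyed by MD5 from an iterable of signature lines."""
    parsed = (parse_signature(line) for line in lines if line.strip())
    return {sig.md5: sig for sig in parsed}
```

Keying the dictionary by MD5 mirrors the way the digest ties the other two hashes to one file.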
Match: IMP hash for file /root/samples/executables/ffb0b9b5b610191051a7bdf0806e1e47 matched with file having MD5 = ffb0b9b5b610191051a7bdf0806e1e47
Match: IMP hash for file /root/samples/executables/ffad870f291acccbe148673f579689db matched with file having MD5 = ffad870f291acccbe148673f579689db
Fuzzy hash for file /root/samples/executables/new matched with file having MD5 = ffb0b9b5b610191051a7bdf0806e1e47 Similarity 100
Match: IMP hash for file /root/samples/executables/fff8f98add7ceda78ceee6721e4ef6c0 matched with file having MD5 = fff8f98add7ceda78ceee6721e4ef6c0
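The output above suggests two checks per file: an exact Imphash comparison and a fuzzy hash similarity score. A minimal sketch of that logic follows. The function name find_matches, the result tuples, and the threshold of 50 are assumptions made for illustration. ssdeep.compare is the comparison call provided by the ssdeep module and returns a score from 0 to 100.

```python
def find_matches(imphash, fuzzyhash, signatures, compare=None, threshold=50):
    """Compare one file's hashes against (md5, imphash, fuzzyhash) tuples.

    Returns a list of (kind, md5, score) tuples; score is None for
    exact Imphash matches. The threshold value is an arbitrary assumption.
    """
    if compare is None:
        # Imported lazily so the function can be tried with a stand-in
        # comparison function when ssdeep is not installed.
        import ssdeep
        compare = ssdeep.compare

    results = []
    for md5, known_imphash, known_fuzzy in signatures:
        # Imphash is an exact-match test; skip files without imports.
        if imphash and imphash == known_imphash:
            results.append(("imphash", md5, None))
        # Fuzzy hashes are scored, so a cut-off is needed.
        score = compare(fuzzyhash, known_fuzzy)
        if score >= threshold:
            results.append(("fuzzy", md5, score))
    return results
```

A lower threshold groups more loosely related samples together, while a higher one keeps only near-identical files, so the value is worth tuning against the training set.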
Note: The classification depends on the training set, so choose the signature samples wisely. Prefer samples that are unpacked and import multiple APIs, since packed files typically expose only a handful of imports and give a weak Imphash.