I run an experiment where I take binary image files from 25 different malware families, create embeddings using an image transformer, and then generate new embeddings using a conditional WGAN-GP.

efar301/synthetic_malware_generation

Synthetic Malware Generation

Overview

  1. Embedding Generation – A vision transformer is used to create embeddings from binary images of malware samples.
  2. Synthetic Data Generation – A Conditional WGAN-GP is trained to produce high-quality synthetic embeddings.
  3. Classification Experiment – A classifier is trained on the synthetic embeddings and tested on real malware data to assess generalization.
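Step 1 starts from binary images of malware samples. As a minimal sketch of how such an image could be built, assuming each byte of the file becomes one grayscale pixel and rows have a fixed width: the helper name `bytes_to_grayscale_rows` and the width value are illustrative assumptions, not the repository's actual preprocessing.

```python
from typing import List


def bytes_to_grayscale_rows(data: bytes, width: int = 256) -> List[List[int]]:
    """Reshape raw bytes into rows of grayscale pixel values (0-255).

    Hypothetical helper: the final row is zero-padded so that every row
    has exactly `width` pixels.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    padding = (-len(data)) % width  # bytes needed to fill the last row
    padded = data + bytes(padding)  # bytes(n) yields n zero bytes
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


# Usage: ten bytes reshaped into rows of four pixels each.
rows = bytes_to_grayscale_rows(bytes(range(10)), width=4)
```

The resulting rows can be saved as a grayscale image and passed to whichever vision transformer produces the embeddings.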

Conclusion

  1. The experiment aims to determine whether a classifier trained only on synthetic malware embeddings can effectively classify real malware samples.
  2. The classifier achieved 94.9% classification accuracy on real malware data, suggesting that synthetic embeddings can be used to train classifiers for this task.
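The evaluation described above (train on synthetic embeddings, test on real ones) can be sketched as follows. This is a hedged illustration only: the classifier used in the repository is not described here, so a simple nearest-centroid classifier stands in for it, and the 2-D vectors are toy placeholder data rather than real embeddings or results.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]
Sample = Tuple[Vector, int]  # (embedding, malware family label)


def fit_centroids(samples: List[Sample]) -> Dict[int, List[float]]:
    """Average the embeddings of each class to get one centroid per family."""
    grouped = defaultdict(list)
    for vec, label in samples:
        grouped[label].append(vec)
    return {
        label: [sum(col) / len(vecs) for col in zip(*vecs)]
        for label, vecs in grouped.items()
    }


def predict(centroids: Dict[int, List[float]], vec: Vector) -> int:
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    return min(
        centroids,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(vec, centroids[label])),
    )


def accuracy(centroids: Dict[int, List[float]], samples: List[Sample]) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(predict(centroids, vec) == label for vec, label in samples)
    return correct / len(samples)


# Toy placeholder data (NOT real results): 2-D "embeddings" for two families.
synthetic_train = [([0.0, 0.1], 0), ([0.2, 0.0], 0), ([1.0, 0.9], 1), ([0.8, 1.0], 1)]
real_test = [([0.1, 0.2], 0), ([0.9, 0.8], 1), ([0.3, 0.1], 0)]

centroids = fit_centroids(synthetic_train)  # fit on synthetic embeddings only
tstr_accuracy = accuracy(centroids, real_test)  # score on real embeddings only
```

The key design point is that no real sample is seen during fitting, so the accuracy reflects how well the synthetic embeddings capture the real class structure.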
