Add scripts to generate reproducible test data #95

@Schamper

Currently, almost all test data across all Dissect projects is either made by hand or generated by dirty one-time-use scripts that aren't checked into the repository. This makes it difficult to change existing test data, or to produce a consistent set of tests across similar projects (e.g. filesystems).

This issue can serve as a general tracking issue. For now, I've created a list of all projects with some generic notes. To continue, tickets should be made for each project with further notes.

  • dissect.archive
    • VBK, VMA and XVA will be tricky
    • WIM could be made with wimlib or PowerShell scripts
  • dissect.btrfs
    • Easy, but needs alignment with other filesystems
  • dissect.cim
    • Tricky, but doable with PowerShell
  • dissect.clfs
    • Tricky
  • dissect.cramfs
    • Easy, but needs alignment with other filesystems
  • dissect.database
    • ESE would be tricky, maybe with some PowerShell or C#
    • SQLite3 would be easy, can even be done purely in Python
    • Berkeley would be a bit more tricky, but doable (maybe either some Python 2.7, or C code)
  • dissect.etl
    • Tricky
  • dissect.eventlog
    • Doable with PowerShell I believe
  • dissect.evidence
    • Medium/hard - maybe a script that generates a disk image, followed by instructions for FTK?
  • dissect.executable
    • Shouldn't be too hard to include source and Makefiles or Visual Studio projects
  • dissect.extfs
    • Easy, but needs alignment with other filesystems
  • dissect.fat
    • Easy, but needs alignment with other filesystems
  • dissect.ffs
    • Easy, but needs alignment with other filesystems
  • dissect.fve
    • Medium, some PowerShell for BitLocker and some Bash for LUKS
  • dissect.hypervisor
    • Medium/hard as some implementations are mainly GUI based, but maybe we can get a long way with generating disk images with QEMU
  • dissect.jffs
    • Easy, but needs alignment with other filesystems
  • dissect.ntfs
    • Easy, but needs alignment with other filesystems (and preferably PowerShell based)
  • dissect.ole
    • Needs some unit tests to begin with, but probably tricky? I imagine it's easier to just take some files from a default Windows installation
  • dissect.qnxfs
    • Easy, but needs alignment with other filesystems (and specifically needs to be run on a QNX system)
  • dissect.regf
    • Probably very doable, I believe there are ways to create and interact with separate hives
  • dissect.shellitem
    • Probably doable? Maybe a PowerShell script that creates some shortcuts
  • dissect.squashfs
    • Easy, but needs alignment with other filesystems
  • dissect.util
    • Currently only CPIO, should be easy
  • dissect.vmfs
    • Very involved, needs alignment with other filesystems (and specifically needs to be run on an ESXi system)
  • dissect.volume
    • All except Dell RAID should be easy, it'd just be tedious and a lot of work
  • dissect.xfs
    • Easy, but needs alignment with other filesystems
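
As a rough illustration of the "purely in Python" option noted for SQLite3 under dissect.database, a generator script could look something like the sketch below. Everything in it is an assumption made for illustration: the table name `test`, the rows, the page size, and the helper names `build_database` and `logical_digest` are hypothetical and not taken from any existing Dissect test. It only uses the standard library `sqlite3` module. Note that the exact bytes of the resulting file may differ between SQLite library versions, so the sketch compares the logical content (a hash of the SQL dump) rather than the raw file.

```python
import hashlib
import sqlite3
import tempfile
from pathlib import Path

# Hypothetical fixed rows; real test data would be chosen per test case.
ROWS = [
    (1, "testing", b"\x00\x01\x02"),
    (2, "omg", b""),
    (3, "A" * 4100, None),  # long value, intended to spill into overflow pages
]


def build_database(path: Path) -> None:
    """Create a small SQLite3 database with fixed, known content."""
    path.unlink(missing_ok=True)  # always start from scratch
    con = sqlite3.connect(path)
    try:
        # The page size must be set before the first table is created.
        con.execute("PRAGMA page_size = 4096")
        con.execute(
            "CREATE TABLE test (id INTEGER PRIMARY KEY, name TEXT, data BLOB)"
        )
        con.executemany("INSERT INTO test VALUES (?, ?, ?)", ROWS)
        con.commit()
    finally:
        con.close()


def logical_digest(path: Path) -> str:
    """Hash the SQL dump, which does not depend on the on-disk layout."""
    con = sqlite3.connect(path)
    try:
        dump = "\n".join(con.iterdump())
    finally:
        con.close()
    return hashlib.sha256(dump.encode()).hexdigest()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "test.sqlite"
        build_database(target)
        print(target.name, logical_digest(target))
```

Checking a script like this into the repository, next to the file it generates, would let anyone regenerate or extend the test data later instead of editing it by hand.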

Labels: enhancement
