IOCX — Static IOC Extraction for Binaries, Text, and Artifacts

Fast, safe, deterministic IOC extraction for DFIR, SOC automation, and large-scale threat analysis.

IOCX is a lightweight, extensible engine for extracting Indicators of Compromise (IOCs) using pure static analysis. No execution. No sandboxing. No risk.

Built for:

DFIR workflows
SOC automation
Threat-intel pipelines
CI/CD security checks
Large‑scale batch processing

This project is the foundation of the MalX Labs ecosystem for scalable, modern threat‑analysis tooling.

Why IOCX?

IOCX is designed for environments where safety, determinism, and automation matter. Unlike extractors that operate only on raw text, IOCX includes binary‑aware static analysis, a plugin-friendly rule system, and a stable JSON schema.

Key advantages

Static‑only design — never executes untrusted code
Binary parsing — extracts IOCs from Windows PE files in addition to raw text
Deterministic behaviour — stable output and predictable performance, ideal for pipelines
Extensible rule engine — custom detectors, parsers, and plugins
Consistent JSON schema — clean integration with SIEM/SOAR
Low dependency footprint — safe for enterprise environments
Pipeline-ready — fast start‑up, fast throughput

What IOCX Is Not

To avoid confusion:

Not a sandbox
Not a malware emulator
Not a behavioural analysis tool
Not an enrichment engine (that lives in the MalX Cloud platform)

IOCX is static extraction only, by design.

Use Cases

SOC & Incident Response

Extract indicators from emails, alerts, or analyst clipboard text
Parse IOCs from reports into structured JSON
Safely inspect malware samples without execution

Threat Intelligence Processing

Normalize indicators from feeds
Batch‑process unstructured text
Build enrichment pipelines on top of deterministic output

CI/CD & DevSecOps

Scan binaries for embedded indicators before publishing
Integrate IOC extraction into automated checks
Detect accidental inclusion of URLs or addresses in builds

Bulk Automation & Scripting

Pipe logs or artifacts through IOCX
Use the Python API for ETL or batch workflows
Extend with custom detectors for internal patterns

Version Highlights

v0.3.0 — Stronger Architecture, New Crypto IOC Detection

Ethereum & Bitcoin wallet detection
Improved architecture for long-term extensibility
Comparable throughput on MB-scale inputs (see the v0.3.0 benchmarks below)

v0.2.0 — High‑Reliability IP Detection

Significant improvements to IPv4/IPv6 extraction in noisy, malformed, mixed-content environments

Real CLI Output (Chaos Corpus Sample)

$ iocx chaos_corpus.json
{
  "file": "examples/samples/structured/chaos_corpus.json",
  "type": "text",
  "iocs": {
    "urls": [
      "http://[2001:db8::1]:443"
    ],
    "domains": [],
    "ips": [
      "2001:db8::1",
      "2001:db8::1:443",
      "10.0.0.1",
      "192.168.1.10",
      "fe80::dead:beef%eth0",
      "1.2.3.4",
      "fe80::1%eth0",
      "192.168.1.110",
      "fe80::1%eth0fe80",
      "::2%eth1",
      "2001:db8::"
    ],
    "hashes": [],
    "emails": [],
    "filepaths": [],
    "base64": []
  },
  "metadata": {}
}

Chaos Corpus: Input → Extracted Output → Explanation

Input	Extracted Output	Explanation
fe80::dead:beef%eth0/garbage	fe80::dead:beef%eth0	Salvaged valid IPv6, junk ignored.
xxx192.168.1.10yyy	192.168.1.10	IPv4 inside junk text.
DROP:client=10.0.0.1;;;ERR	10.0.0.1	IPv4 from noisy log field.
[2001:db8::1]::::443	2001:db8::1, 2001:db8::1:443	IPv6 and IPv6+port extracted.
GET http://[2001:db8::1]:443/index	http://[2001:db8::1]:443	URL with IPv6 parsed correctly.
udp://[fe80::1%eth0]::::53	fe80::1%eth0	Zone-scoped IPv6 extracted, malformed port ignored.
192.168.1.110.0.0.1	192.168.1.110	Combined IP segment salvaged.
fe80::1%eth0fe80::2%eth1	fe80::1%eth0fe80, ::2%eth1	Concatenated IPv6 split up.
2001:db8::12001:db8::2	2001:db8::	Longest valid IPv6 prefix found.
256.256.256:256	—	Invalid indicator ignored.
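
The IPv4 rows above can be approximated with a two-step "find a candidate, then validate it" approach. The sketch below is an illustration only, not IOCX's actual detector: `salvage_ipv4` and its regex are hypothetical names, and validation is delegated to Python's standard `ipaddress` module.

```python
import ipaddress
import re
from typing import List

# Hypothetical candidate pattern: four dot-separated groups of 1-3 digits.
_IPV4_CANDIDATE = re.compile(r"\d{1,3}(?:\.\d{1,3}){3}")


def salvage_ipv4(text: str) -> List[str]:
    """Return valid IPv4 addresses found inside noisy text, in order, de-duplicated."""
    found: List[str] = []
    for match in _IPV4_CANDIDATE.finditer(text):
        candidate = match.group(0)
        try:
            # Rejects out-of-range octets such as 999.1.1.1.
            ipaddress.IPv4Address(candidate)
        except ValueError:
            continue
        if candidate not in found:
            found.append(candidate)
    return found


for line in ("xxx192.168.1.10yyy", "DROP:client=10.0.0.1;;;ERR",
             "192.168.1.110.0.0.1", "256.256.256:256"):
    print(line, "->", salvage_ipv4(line))
```

With this sketch, `256.256.256:256` yields nothing because it never forms four dotted groups, and `192.168.1.110.0.0.1` yields only `192.168.1.110` because the remaining `.0.0.1` is too short to be a candidate. IPv6 and zone identifiers need considerably more care and are not covered here.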

Performance Benchmarks (v0.2.0)

All measurements from the v0.2.0 performance suite:

Sample Type	Time
1 MB mixed‑content sample	0.0053s
Pathological IPv6 blob	0.0055s
100 KB sample	0.0006s
300 KB sample	0.0017s
600 KB sample	0.0031s
1 MB sample	0.0055s

Throughput: ~200 MB/s
Worst‑case IPv6 blob: ~5.5 ms
Linear scaling: almost perfect from 100 KB → 1 MB

Performance Benchmarks (v0.3.0)

All measurements from the v0.3.0 performance suite:

Detector	Sample Type	Time
IP	1 MB mixed‑content sample	0.0070s
IP	Pathological IPv6 blob	0.0004s
IP	100 KB sample	0.0008s
IP	300 KB sample	0.0021s
IP	600 KB sample	0.0038s
IP	1 MB sample	0.0068s
Filepath	1 MB mixed‑content sample	0.0040s
Filepath	Pathological deep unix path	0.0237s
Filepath	300 KB sample	0.0011s
Filepath	600 KB sample	0.0022s
Filepath	1000 KB sample	0.0038s
Filepath	1500 KB sample	0.0055s
Crypto	1 MB mixed‑content sample	0.0021s
Crypto	Pathological ETH-like blob	0.0012s
Crypto	300 KB sample	0.0006s
Crypto	600 KB sample	0.0012s
Crypto	1000 KB sample	0.0020s
Crypto	1500 KB sample	0.0031s

Throughput: ~200 MB/s
Worst‑case IPv6 blob: ~0.5 ms
Worst‑case filepath blob: ~23 ms
Worst‑case crypto blob: ~1 ms
Linear scaling: almost perfect from 100 KB → 1 MB
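
As a rough cross-check of the throughput figure, the per-detector rate can be derived from the 1 MB mixed-content timings in the v0.3.0 table. This is plain arithmetic on the published numbers, treating "1 MB" as exactly 1.0 MB (an assumption). The function name is illustrative.

```python
# Times in seconds for the 1 MB mixed-content sample, copied from the v0.3.0 table.
ONE_MB_TIMES = {"ip": 0.0070, "filepath": 0.0040, "crypto": 0.0021}


def throughput_mb_per_s(size_mb: float, seconds: float) -> float:
    """Throughput is simply input size divided by elapsed time."""
    return size_mb / seconds


throughput = {
    name: throughput_mb_per_s(1.0, seconds)
    for name, seconds in ONE_MB_TIMES.items()
}

for name, value in throughput.items():
    print(f"{name}: {value:.0f} MB/s")
```

This gives roughly 143 MB/s for IP, 250 MB/s for filepath, and 476 MB/s for crypto detection, so the quoted ~200 MB/s reads as a rounded summary across detectors rather than a per-detector number.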

Features

IOC Extraction

Windows PE files (.exe, .dll)
Raw text
Extracted strings from binaries
Caching for increased performance

Detections

URLs
Domains
IPv4 / IPv6 addresses
File paths
Hashes (MD5 / SHA1 / SHA256 / SHA512 / Generic Hex)
Email addresses
Base64
Crypto wallets (Ethereum / Bitcoin)
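
The hash categories listed above differ mainly in digest length. The following is a minimal sketch of length-based classification, assuming hex-encoded digests. `classify_hash` and the label strings are hypothetical and are not taken from IOCX's code or schema.

```python
import re
from typing import Optional

_HEX = re.compile(r"[0-9a-fA-F]+")

# Hex digest lengths: MD5 = 128 bits, SHA1 = 160, SHA256 = 256, SHA512 = 512.
_LENGTH_TO_TYPE = {32: "md5", 40: "sha1", 64: "sha256", 128: "sha512"}


def classify_hash(value: str) -> Optional[str]:
    """Guess a hash type from its length; return None if the value is not pure hex."""
    if not _HEX.fullmatch(value):
        return None
    return _LENGTH_TO_TYPE.get(len(value), "generic_hex")


print(classify_hash("d41d8cd98f00b204e9800998ecf8427e"))  # 32 hex chars
print(classify_hash("deadbeef"))
print(classify_hash("not-a-hash"))
```

Length alone cannot prove which algorithm produced a digest (other algorithms also emit 32 hex characters, for example), so a label from a sketch like this is a best guess.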

Static PE Parsing

Imports
Sections
Resources
Metadata

Developer‑Friendly

Clean JSON output
CLI + Python API
Modular, extensible rule system
Minimal dependency footprint

Security‑First

Zero malware execution
Safe for untrusted input
Deterministic behaviour

Why Static Only?

Static analysis ensures safety, determinism, and CI‑friendly operation. No sandboxing, no execution, and no risk of triggering malware behaviour.

Quickstart

Install

pip install iocx

Extract IOCs from a file

iocx suspicious.exe

Extract from text

echo "Visit http://bad.example.com" | iocx -

Extract from a log file

iocx alerts.log

Python API

from iocx.engine import Engine

engine = Engine()
results = engine.extract("suspicious.exe")
print(results)

Example JSON Output

{
  "file": "suspicious.exe",
  "type": "PE",
  "iocs": {
    "urls": ["http://malicious.example.com"],
    "domains": ["malicious.example.com"],
    "ips": ["45.77.12.34"],
    "hashes": ["d41d8cd98f00b204e9800998ecf8427e"],
    "emails": ["attacker@example.com"],
    "filepaths": [
      "c:\\windows\\system32\\cmd.exe",
      "d:\\temp\\payload.bin"
    ],
    "base64": []
  },
  "metadata": {
    "file_type": "PE",
    "imports": [
      "KERNEL32.dll",
      "msvcrt.dll"
    ],
    "sections": [
      ".text",
      ".data",
      ".rdata",
      ".pdata",
      ".xdata",
      ".bss",
      ".idata",
      ".CRT",
      ".tls",
      ".reloc"
    ],
    "resource_strings": [
      "C:\\Windows\\System32\\cmd.exe",
      "\\\\SERVER01\\share\\dropper.exe",
      "/home/alice/.config/evil.sh@%APPDATA%\\Microsoft\\Windows\\Start Menu\\Programs\\Startup\\evil.lnk"
    ]
  }
}
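
For SIEM/SOAR ingestion it is often convenient to flatten the nested "iocs" object into (type, value) rows. The sketch below assumes the result is available as a Python dict shaped like the JSON above. Whether `Engine.extract` returns exactly this structure is an assumption, and `iter_indicators` is a hypothetical helper, not part of IOCX.

```python
from typing import Any, Dict, Iterator, Tuple


def iter_indicators(result: Dict[str, Any]) -> Iterator[Tuple[str, str]]:
    """Yield (ioc_type, value) pairs from a result dict; empty lists yield nothing."""
    for ioc_type, values in result.get("iocs", {}).items():
        for value in values:
            yield ioc_type, value


# Trimmed copy of the example output above, used as stand-in data.
sample = {
    "file": "suspicious.exe",
    "type": "PE",
    "iocs": {
        "urls": ["http://malicious.example.com"],
        "ips": ["45.77.12.34"],
        "base64": [],
    },
}

rows = list(iter_indicators(sample))
for ioc_type, value in rows:
    print(f"{sample['file']}\t{ioc_type}\t{value}")
```

Reading only the "iocs" key keeps the helper independent of the PE-specific "metadata" block, which is empty for text inputs.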

Architecture


iocx/
│
├── examples/        # Sample files + generators
├── docs/            # Detector contracts, overlap suppression rules, and plugin authoring guidelines
├── tests/           # Unit, integration, fuzz, robustness, and performance tests
└── iocx/
    ├── detectors/   # Regex-based IOC detectors
    ├── parsers/     # PE parsing, string extraction
    ├── plugins/     # Plugin API and registry
    └── cli/         # Command-line interface

The engine is intentionally modular so components can be extended or replaced easily.

Extending IOCX

See docs/specs/ for:

Detector contracts
Overlap suppression rules
Plugin authoring guidelines
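
To give a feel for what a regex-based detector for an internal pattern might look like, here is a standalone sketch. It does not use IOCX's plugin API. The real detector contract is defined in docs/specs/ and may differ. The `ASSET-NNNNNN` tag format and the `detect_asset_tags` function are invented for illustration.

```python
import re
from typing import Iterator, Tuple

# Hypothetical internal pattern: asset tags such as "ASSET-004217".
ASSET_TAG = re.compile(r"\bASSET-\d{6}\b")


def detect_asset_tags(text: str) -> Iterator[Tuple[int, int, str]]:
    """Yield (start, end, value) for each asset tag, in order of appearance."""
    for match in ASSET_TAG.finditer(text):
        yield match.start(), match.end(), match.group(0)


hits = list(detect_asset_tags("host ASSET-004217 rebooted, see ASSET-12"))
print(hits)
```

Returning character spans alongside each value is one way to let a later stage resolve overlapping matches between detectors. Whether IOCX's overlap suppression works on spans is an assumption here.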

Safe Testing (No Malware Required)

All test samples are:

Synthetic
Benign
Publicly safe (EICAR, GTUBE)
Designed to avoid accidental malware handling

Contributing

We welcome:

New IOC detectors
Parser improvements
Bug reports
Documentation updates
Synthetic test samples

See CONTRIBUTING.md for full guidelines.

Security

If you discover a security issue, do not open a GitHub issue. Please follow the instructions in SECURITY.md.

License

Licensed under the MIT License. See LICENSE for details.
