Merged
17 changes: 17 additions & 0 deletions CLAUDE.md
@@ -2,6 +2,23 @@

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

# CRITICAL RULES

- Scan the existing codebase and reuse existing functions wherever possible.
- Keep all imports within functions unless they must be mocked in a test.
- If a library import is small, performant, and significantly reduces the need for new code, use the library.
- Write short Sphinx docstrings: a single-line description, a single line for each parameter, and no empty lines.
- On the first line of a docstring, use `\n` instead of a line break.
- Variable names must be a `snake_case` sequence of descriptive words, each <=5 letters long.
- Keep labels consistent across the entire project.
- In commit messages: use `+` for code adds, `-` for code subtractions, `~` for refactors/fixes.
- Write full variable names at all times. No abbreviations.
- Use descriptive variable names instead of comments.
- No inline comments.
- No emoji.
- No global variables.
- No semantic commit messages.
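As a hypothetical illustration of the docstring and naming rules above (the function, its name, and its parameters are invented for this example, not part of the codebase):

```python
def scale_pixel(pixel: float, limit: float) -> float:
    """Scale a pixel intensity into the range [0, limit].\n
    :param pixel: Intensity between 0 and 1
    :param limit: Upper bound of the output range
    :returns: Scaled intensity
    """
    return pixel * limit
```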

## Commands

```bash
51 changes: 34 additions & 17 deletions README.md
@@ -1,7 +1,7 @@
---
language:
- en
library_name: nnll
library_name: negate
license_name: MPL-2.0 + Commons Clause 1.0
compatibility:
- macos
@@ -20,6 +20,27 @@ A scanning, training, and research library for detecting the origin of digital i
[<img src="https://img.shields.io/badge/feed_me-__?logo=kofi&logoColor=white&logoSize=auto&label=donate&labelColor=maroon&color=grey&link=https%3A%2F%2Fko-fi.com%2Fdarkshapes">](https://ko-fi.com/darkshapes)<br>
<br>

### About

Negate is a modular system of image-processing and feature-extraction pipelines that measures machine aptitude for differentiating between synthetic and human-origin illustrations.

### Included Methods

| Texture                         | Color                                 | VAE Loss | Residual      | Perturbation       | Noise/Jitter      |
| ------------------------------- | ------------------------------------- | -------- | ------------- | ------------------ | ----------------- |
| local binary pattern            | histogram of oriented gradients (HOG) | L1       | spectral      | haar wavelet       | SNR/noise entropy |
| gray level co-occurrence matrix | variance                              | MSE      | laplacian     | random resize crop | stroke features   |
| energy                          | kurtosis                              | KL       | gaussian diff | patchification     |                   |
| complexity                      | skew                                  | BCE      | sobel         |                    |                   |
| microtexture                    | palette features                      |          |               |                    |                   |
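As a concrete instance of the noise/jitter column, a minimal NumPy sketch of the two scalars involved, mirroring the entropy and SNR computations in `negate/decompose/surface.py` (function names here are illustrative):

```python
import numpy as np


def shannon_entropy(counts: np.ndarray) -> float:
    """Shannon entropy in bits from raw histogram counts."""
    probs = counts / counts.sum()
    probs = probs[probs > 0]
    return float(-np.sum(probs * np.log2(probs)))


def snr_db(gray: np.ndarray, sigma: float) -> float:
    """Signal-to-noise ratio in decibels, treating pixel variance as signal power."""
    power = float(gray.var())
    noise = max(sigma**2, 1e-10)
    return float(10 * np.log10(power / noise))
```

A uniform histogram maximizes entropy; a single-bin histogram gives zero.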

### Feature Processing Options

- Decision Tree + PCA
- SVM (RBF)
- MLP
- Logistic Regression (LR)

## Quick Start

![MacOS](https://darkshapes.org/img/macos.svg)<sup> Terminal</sup>
@@ -55,7 +76,9 @@ Train a new model with the following command:

## Technical Details & Research Results

<details><summary> Expand</summary>
### Abstract

Previous research has demonstrated the possibility of distinguishing deepfakes and synthetic images from genuine illustrations and photographs. Yet generative models have since undergone dramatic improvements, challenging past identification research and calling into question the future efficacy of these techniques. Most prior methods chose images easily discernible as synthetic by the trained eye of an artist, or evaluated their success exclusively against open models. In this work, we create a comprehensive analysis suite for the decomposition and feature extraction of digital images to study the effectiveness of these methods. Then, using an ensemble of previous techniques, we train simple decision trees and SVM models on these features to achieve >70% accuracy in detecting synthetic vs. genuine illustrations. Our methods of training and inference require only consumer-grade hardware, use exclusively consensual datasets provided by artists and Creative-Commons sources, and provide reliable estimates against the modern image products of both open and closed-source black-box models.

### Structure

@@ -71,28 +94,22 @@ Directories are located within `$HOME\.local\bin\uv\tools` or `.local/bin/uv/too

---

| Module | Summary | Purpose |
| ------------ | ------------------- | ------------------------------------------------------------------------------------------------------------------------------ |
| negate | core module | Root source code folder. Creates CLI arguments and interprets commands. |
| →→ decompose | image processing | Random Resize Crop and Haar Wavelet transformations - [arxiv:2511.14030](https://arxiv.org/abs/2511.14030) |
| →→ extract | feature processing | Laplace/Sobel/Spectral analysis, VIT/VAE extraction, cross‑entropy loss - [arxiv:2411.19417](https://arxiv.org/abs/2411.19417) |
| →→ io | load / save / state | Hyperparameters, image datasets, console messages, model serialization and conversion. |
| →→ metrics | evaluation | Graphs, visualizations, model performance metadata, and a variety of heuristics for results interpretation. |
| → inference | predictions | Detector functions to determine origin from trained model predictions. |
| → train | XGBoost | PCA data transforms and gradient-boosted decision tree model training. |

### Research
| Module | Summary | Purpose |
| ------------ | ------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------- |
| negate | core module | Root source code folder. Creates CLI arguments and interprets commands. |
| →→ decompose | image processing | RRC, Wavelet Transform - [arxiv:2511.14030](https://arxiv.org/abs/2511.14030), [arxiv:2504.07078](https://arxiv.org/abs/2504.07078) |
| →→ extract | feature processing | Residual analysis, VIT/VAE extraction, cross‑entropy loss - [arxiv:2411.19417](https://arxiv.org/abs/2411.19417) |
| →→ io | load / save / state | Hyperparameters, image datasets, console messages, model serialization and conversion. |
| →→ metrics | evaluation | Graphs, visualizations, model performance metadata, and a variety of heuristics for results interpretation. |
| → inference | predictions | Detector functions to determine origin from trained model predictions. |
| → train | XGBoost | PCA data transforms and gradient-boosted decision tree model training. |

<div align="center">

<img src="results/tail_plot.png" style="width:50%; max-width:500px;" alt="Visualization of Fourier Image Residual variance for the DinoViTL Model">

<img src="results/vae_plot.png" style="width:50%; max-width:500px;" alt="Visualization of VAE mean loss results for the Flux Klein model"></div>

The ubiquity of online services, connected presence, and generative models, together with the proliferating digital output that accompanies these nascent developments, has yielded a colossal and simultaneous disintegration of trust, judgment, and ecological welfare, exacerbating prevailing struggles across all species of life. While the outcome of these deep-seated issues is beyond the means of a small group of academic researchers to determine, and while remediation efforts will require far more resources than attention alone, we have nevertheless taken pause to reconsider the consequences of our way of life while investigating the prospects of new avenues that may diminish harm.

</details>

```bib
@misc{darkshapes2026,
author={darkshapes},
17 changes: 15 additions & 2 deletions config/config.toml
@@ -17,9 +17,9 @@ feat_ext_path = "" # Path to save the model in or null for default: [$HO

[datasets]
eval_data = ["tellif/ai_vs_real_image_semantically_similar"]
genuine_data = ["KarimSayed/cat-breed-fiass-index"]
genuine_data = ["huggan/wikiart"] #["KarimSayed/cat-breed-fiass-index"]
genuine_local = []
synthetic_data = ["exdysa/nano-banana-pro-generated-1k-clone", "ash12321/seedream-4.5-generated-2k"]
synthetic_data = ["exdysa/nano-banana-pro-generated-1k-clone"] #, "ash12321/seedream-4.5-generated-2k"]
synthetic_local = []

[vae.library]
@@ -57,3 +57,16 @@ n_components = 0.95 # Number of components for dimensionality reduction
num_boost_round = 200 # Number of boosting rounds
test_size = 0.2 # 80/20 training split default
verbose_eval = 20

[ensemble]
sample_size = 100
n_folds = 5
abstain_threshold = 0.3
svm_c = 10.0
mlp_hidden_layers = 100
mlp_activation = "relu"
mlp_max_iter = 1000
cv = 3
method = "sigmoid"
gamma = "scale"
kernel = "rbf"
253 changes: 253 additions & 0 deletions negate/decompose/surface.py
@@ -0,0 +1,253 @@
# SPDX-License-Identifier: MPL-2.0 AND LicenseRef-Commons-Clause-License-Condition-1.0
# <!-- // /* d a r k s h a p e s */ -->

"""Extended frequency analysis branch (FFT/DCT) that captures spectral fingerprints left by generative models.

Features are grouped into 6 categories:
- Brightness (2): mean, entropy
- Color (23): RGB/HSV histogram statistics
- Texture (6): GLCM + LBP
- Shape (6): HOG + edge length
- Noise (2): noise entropy, SNR
- Frequency (10): FFT/DCT spectral analysis
"""

from __future__ import annotations

from typing import Any
import numpy as np
from numpy.typing import NDArray
from PIL.Image import Image
from scipy.stats import skew, kurtosis
from skimage.feature import graycomatrix, graycoprops, local_binary_pattern


class NumericImage:
    """Resized float64 views (grayscale, RGB, HSV) of a PIL image."""

    TARGET_SIZE = (255, 255)

    def __init__(self, image: Image) -> None:
        self._image = image
        self.to_gray()
        self.to_rgb()
        self.rgb2hsv()

    @property
    def gray(self) -> NDArray[np.float64]:
        return self.shade

    @property
    def color(self) -> NDArray[np.float64]:
        return self.rgb

    @property
    def hsv(self) -> NDArray[np.float64]:
        return self._hsv

    def to_gray(self) -> None:
        """Resize and convert to float64 grayscale in [0, 1]."""
        from PIL.Image import Resampling

        img = self._image.convert("L").resize(self.TARGET_SIZE, Resampling.BICUBIC)
        self.shade = np.asarray(img, dtype=np.float64) / 255.0

    def to_rgb(self) -> None:
        """Resize and convert to float64 RGB in [0, 1]."""
        from PIL.Image import Resampling

        img = self._image.convert("RGB").resize(self.TARGET_SIZE, Resampling.BICUBIC)
        self.rgb = np.asarray(img, dtype=np.float64) / 255.0

    def rgb2hsv(self) -> None:
        """Convert the RGB [0, 1] array to HSV [0, 1]."""
        from colorsys import rgb_to_hsv

        rgb = self.rgb
        rgb = rgb / 255.0 if rgb.max() > 1 else rgb
        h, w, _ = rgb.shape
        flat = rgb.reshape(-1, 3)
        result = np.array([rgb_to_hsv(r, g, b) for r, g, b in flat])
        self._hsv = result.reshape(h, w, 3)


class SurfaceFeatures:
    """Extract artwork features for AI detection.

    Usage:
        >>> img = NumericImage(pil_image)
        >>> extractor = SurfaceFeatures(img)
        >>> features = extractor()
        >>> len(features)
    """

    def __init__(self, image: NumericImage):
        self.image = image

    def __call__(self) -> dict[str, float]:
        """Extract all features from the NumericImage.

        :returns: Dictionary of scalar features.
        """
        gray = self.image.gray
        rgb = self.image.color

        features: dict[str, float] = {}
        features |= self.brightness_features(gray)
        features |= self.color_features(rgb)
        features |= self.texture_features(gray)
        features |= self.shape_features(gray)
        features |= self.noise_features(gray)
        features |= self.frequency_features(gray)

        return features

    def entropy(self, counts: NDArray) -> float:
        """Compute Shannon entropy from histogram counts."""
        probs = counts / counts.sum()
        probs = probs[probs > 0]
        return -np.sum(probs * np.log2(probs))

    def brightness_features(self, gray: NDArray) -> dict[str, float]:
        """Mean and entropy of pixel brightness."""
        return {
            "mean_brightness": float(gray.mean()),
            "entropy_brightness": float(self.entropy(np.histogram(gray, bins=256, range=(0, 1))[0] + 1e-10)),
        }

    def color_features(self, rgb: NDArray) -> dict[str, float]:
        """RGB and HSV histogram statistics."""
        features: dict[str, float] = {}

        for i, name in enumerate(("red", "green", "blue")):
            channel = rgb[:, :, i].ravel()
            features[f"{name}_mean"] = float(channel.mean())
            features[f"{name}_variance"] = float(channel.var())
            features[f"{name}_kurtosis"] = float(kurtosis(channel))
            features[f"{name}_skewness"] = float(skew(channel))

        rgb_flat = rgb.reshape(-1, 3)
        rgb_hist = np.histogramdd(rgb_flat, bins=32)[0]
        features["rgb_entropy"] = float(self.entropy(rgb_hist.ravel() + 1e-10))

        hsv = self.image.hsv
        for i, name in enumerate(("hue", "saturation", "value")):
            channel = hsv[:, :, i].ravel()
            features[f"{name}_variance"] = float(channel.var())
            features[f"{name}_kurtosis"] = float(kurtosis(channel))
            features[f"{name}_skewness"] = float(skew(channel))

        hsv_flat = hsv.reshape(-1, 3)
        hsv_hist = np.histogramdd(hsv_flat, bins=32)[0]
        features["hsv_entropy"] = float(self.entropy(hsv_hist.ravel() + 1e-10))

        return features

    def shape_features(self, gray: NDArray) -> dict[str, float]:
        """HOG statistics and edge length."""
        from PIL import Image as PilImage
        from skimage.feature import hog

        hog_features = hog(gray, pixels_per_cell=(16, 16), cells_per_block=(2, 2), feature_vector=True)

        features: dict[str, float] = {
            "hog_mean": float(hog_features.mean()),
            "hog_variance": float(hog_features.var()),
            "hog_kurtosis": float(kurtosis(hog_features)),
            "hog_skewness": float(skew(hog_features)),
            "hog_entropy": float(self.entropy(np.histogram(hog_features, bins=50)[0] + 1e-10)),
        }

        gray_uint8 = (gray * 255).astype(np.uint8)
        edges_array = np.asarray(PilImage.fromarray(gray_uint8).convert("L").point(lambda x: 0 if x < 128 else 255, "1"))
        features["edgelen"] = float(edges_array.sum())

        return features

    def noise_features(self, gray: NDArray) -> dict[str, float]:
        """Noise entropy and signal-to-noise ratio."""
        from skimage.restoration import estimate_sigma

        sigma = estimate_sigma(gray)
        noise = gray - np.clip(gray, gray.mean() - 2 * sigma, gray.mean() + 2 * sigma)

        noise_hist = np.histogram(noise.ravel(), bins=256)[0]
        noise_ent = float(self.entropy(noise_hist + 1e-10))

        signal_power = float(gray.var())
        noise_power = float(sigma**2) if sigma > 0 else 1e-10
        snr = float(10 * np.log10(signal_power / noise_power + 1e-10))

        return {
            "noise_entropy": noise_ent,
            "snr": snr,
        }

    def texture_features(self, gray: NDArray) -> dict[str, float]:
        """GLCM and LBP texture features."""
        gray_uint8 = (gray * 255).astype(np.uint8) if gray.max() <= 1 else gray.astype(np.uint8)

        glcm = graycomatrix(gray_uint8, distances=[1], angles=[0], levels=256, symmetric=True, normed=True)

        features: dict[str, float] = {
            "contrast": float(graycoprops(glcm, "contrast")[0, 0]),
            "correlation": float(graycoprops(glcm, "correlation")[0, 0]),
            "energy": float(graycoprops(glcm, "energy")[0, 0]),
            "homogeneity": float(graycoprops(glcm, "homogeneity")[0, 0]),
        }

        lbp = local_binary_pattern(gray_uint8, P=8, R=1, method="uniform")
        features["lbp_entropy"] = float(self.entropy(np.histogram(lbp, bins=10)[0] + 1e-10))
        features["lbp_variance"] = float(lbp.var())

        return features

    def frequency_features(self, gray: NDArray) -> dict[str, float]:
        """FFT and DCT spectral analysis features meant to capture upsampling layers and attention patterns."""
        from numpy.fft import fftfreq
        from scipy.fft import dctn

        height, width = gray.shape

        fft_2d = np.fft.fft2(gray)
        fft_shift = np.fft.fftshift(fft_2d)
        magnitude = np.abs(fft_shift)
        log_mag = np.log(magnitude + 1e-10)
        phase = np.angle(fft_shift)

        center_h, center_w = height // 2, width // 2

        y, x = np.ogrid[:height, :width]
        radius = np.sqrt((x - center_w) ** 2 + (y - center_h) ** 2)
        max_r = np.sqrt(center_h**2 + center_w**2)

        low_mask = radius < max_r * 0.2
        mid_mask = (radius >= max_r * 0.2) & (radius < max_r * 0.6)
        high_mask = radius >= max_r * 0.6

        total_energy = float((magnitude**2).sum() + 1e-10)
        low_energy = float((magnitude[low_mask] ** 2).sum())
        mid_energy = float((magnitude[mid_mask] ** 2).sum())
        high_energy = float((magnitude[high_mask] ** 2).sum())

        row_freqs = fftfreq(height)[:, None] * np.ones((1, width))
        col_freqs = np.ones((height, 1)) * fftfreq(width)[None, :]
        spectral_centroid = float((np.sum(log_mag * np.abs(row_freqs)) + np.sum(log_mag * np.abs(col_freqs))) / (log_mag.sum() * 2 + 1e-10))

        dct_coeffs = dctn(gray, type=2, norm="ortho")
        dct_mag = np.abs(dct_coeffs)

        flat_dc_energy = float(dct_mag[0, 0] ** 2)
        detail_ac_energy = float((dct_mag**2).sum() - flat_dc_energy)

        phase_coherence = float(phase.std())

        return {
            "fft_low_energy_ratio": low_energy / total_energy,
            "fft_mid_energy_ratio": mid_energy / total_energy,
            "fft_high_energy_ratio": high_energy / total_energy,
            "fft_spectral_centroid": spectral_centroid,
            "fft_log_mag_mean": float(log_mag.mean()),
            "fft_log_mag_std": float(log_mag.std()),
            "fft_phase_std": phase_coherence,
            "dct_ac_dc_ratio": detail_ac_energy / (flat_dc_energy + 1e-10),
            "dct_high_freq_energy": float((dct_mag[height // 2 :, width // 2 :] ** 2).sum() / (dct_mag**2).sum()),
            "dct_sparsity": float((dct_mag < 0.01 * dct_mag.max()).mean()),
        }