Feature request: Add fenn.vision module for image preprocessing #25

@blkdmr

Description

Introduce a new fenn.vision module that provides a small, composable toolbox for image dataset inspection and preprocessing, built on top of NumPy, Pillow and OpenCV. The focus should be ergonomic helpers for common early-stage image EDA (Exploratory Data Analysis) and preprocessing patterns, not re-implementing full computer vision libraries.

Goal

Create an initial version of fenn.vision with a clear, well-documented API to simplify early-stage image exploration and preprocessing for ML workflows. The module should make it easy to quickly inspect, summarize, sanity-check, and standardize image datasets represented as numpy.ndarray tensors or directory-based image collections.

Proposed features

Ideas for a first iteration (not all are mandatory for a single PR):

  • image_summary(array): one-shot overview for a batch of images, combining shape distribution (H, W, C), dtype, value range, per-channel mean/std, and simple NaN/inf checks.
  • image_dir_summary(path): scan a folder (optionally recursive) and summarize file counts, the most common resolutions and formats, and a few representative examples; it should process files incrementally rather than load all pixels into memory at once.
  • check_image_batch(array): fast sanity checks for typical issues (mixed dtypes, unexpected channel counts, extreme outliers, non-finite values) returning a structured report rather than raising immediately.
  • ensure_color_mode(array, mode="RGB"): convert grayscale / RGBA to the desired channel layout (e.g. expand gray to 3 channels, drop alpha) for NumPy images.
  • resize_batch(array, size, interpolation="bilinear"): convenience wrapper to resize a batch of images to a target size, preserving channel order and dtype where possible.
  • normalize_batch(array, mode="0_1" | "minus1_1" | "imagenet_stats" | "zscore"): simple, opinionated normalization helpers with explicit, documented behavior and no hidden global state.
  • standard_preprocess(array, *, size=None, color_mode="RGB", normalize="0_1"): compose common operations (dtype cast, channel fix, optional resize, normalization) into a deterministic, testable pipeline; returns a new array without in-place modification.
  • augment_debug_view(array, augment_fn, n=8): run a user-provided augmentation function on a small sample and return the original/augmented pairs as a tiny batch for quick visual sanity checks (no plotting, just data).
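To make the inspection helpers more concrete, here is a minimal sketch of what image_summary and check_image_batch could look like. It uses NumPy only and assumes a 4-D batch shaped (N, H, W, C). The return keys, the allowed_channels parameter, and the choice to mask non-finite values before computing statistics are assumptions made for illustration, not a settled API.

```python
import numpy as np


def image_summary(array):
    """Sketch only: summarize a batch assumed to be shaped (N, H, W, C)."""
    arr = np.asarray(array)
    if arr.ndim != 4:
        raise ValueError(f"expected shape (N, H, W, C), got {arr.shape}")
    as_float = arr.astype(np.float64)
    # Replace non-finite values with NaN so the nan-aware reductions skip them.
    masked = np.where(np.isfinite(as_float), as_float, np.nan)
    return {
        "shape": arr.shape,
        "dtype": str(arr.dtype),
        "min": float(np.nanmin(masked)),
        "max": float(np.nanmax(masked)),
        "channel_mean": np.nanmean(masked, axis=(0, 1, 2)),
        "channel_std": np.nanstd(masked, axis=(0, 1, 2)),
        "n_nan": int(np.isnan(as_float).sum()),
        "n_inf": int(np.isinf(as_float).sum()),
    }


def check_image_batch(array, allowed_channels=(1, 3, 4)):
    """Sketch only: collect issues in a report instead of raising."""
    arr = np.asarray(array)
    issues = []
    if arr.ndim != 4:
        issues.append(f"expected 4 dimensions (N, H, W, C), got {arr.ndim}")
    elif arr.shape[-1] not in allowed_channels:
        issues.append(f"unexpected channel count: {arr.shape[-1]}")
    if np.issubdtype(arr.dtype, np.floating) and not np.isfinite(arr).all():
        issues.append("non-finite values (NaN or inf) present")
    return {"ok": not issues, "issues": issues}
```

Returning plain dictionaries keeps the sketch short; a small dataclass would work equally well if a more structured report type is preferred.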

These functions should be pure utilities (no global state) and must not alter the inputs in place; all outputs are new arrays or small summary objects.
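As a rough illustration of the "pure utility" requirement, the following sketch composes a channel fix and a normalization step without touching the input. It is NumPy-only and leaves resizing out, since that would need Pillow or OpenCV. Everything here is an assumption for discussion: the sketch handles only mode="RGB", treats float inputs as already lying in [0, 1], assumes integer inputs are unsigned, and uses the commonly cited ImageNet channel statistics.

```python
import numpy as np

# Commonly used ImageNet per-channel statistics (RGB order, inputs in [0, 1]).
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def ensure_color_mode(array, mode="RGB"):
    """Sketch only: convert a gray or RGBA (N, H, W, C) batch to 3 channels."""
    arr = np.asarray(array)
    if mode != "RGB":
        raise ValueError("this sketch only supports mode='RGB'")
    if arr.ndim != 4 or arr.shape[-1] not in (1, 3, 4):
        raise ValueError(f"unsupported shape {arr.shape}")
    if arr.shape[-1] == 4:
        arr = arr[..., :3]                # drop the alpha channel
    elif arr.shape[-1] == 1:
        arr = np.repeat(arr, 3, axis=-1)  # expand gray to 3 channels
    return np.array(arr, copy=True)       # always hand back a new array


def normalize_batch(array, mode="0_1"):
    """Sketch only: return a float32 copy normalized according to mode."""
    arr = np.asarray(array)
    x = arr.astype(np.float32)            # astype copies, input is untouched
    if np.issubdtype(arr.dtype, np.integer):
        x /= np.iinfo(arr.dtype).max      # e.g. 255 for uint8
    if mode == "0_1":
        return x
    if mode == "minus1_1":
        return x * 2.0 - 1.0
    if mode == "imagenet_stats":
        if x.shape[-1] != 3:
            raise ValueError("imagenet_stats expects 3 channels")
        return (x - IMAGENET_MEAN) / IMAGENET_STD
    if mode == "zscore":
        mean = x.mean(axis=(0, 1, 2), keepdims=True)
        std = x.std(axis=(0, 1, 2), keepdims=True)
        return (x - mean) / (std + 1e-8)  # epsilon guards constant channels
    raise ValueError(f"unknown normalization mode: {mode!r}")


def standard_preprocess(array, *, color_mode="RGB", normalize="0_1"):
    """Sketch only: channel fix followed by normalization, no resize step."""
    return normalize_batch(ensure_color_mode(array, color_mode), normalize)
```

For example, standard_preprocess(batch, normalize="minus1_1") on a uint8 grayscale batch shaped (N, H, W, 1) would return a float32 array shaped (N, H, W, 3), while batch itself keeps its original dtype, shape, and values.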

Tasks

  • Create the fenn/vision/__init__.py module and basic package structure, mirroring the style of fenn.tabular but tailored to images.
  • Implement a first subset of utilities (for example: image_summary, check_image_batch, standard_preprocess).
  • Add type hints and docstrings with small usage examples showing typical NumPy batch shapes like (N, H, W, C) and (H, W, C).
  • Add unit tests with small synthetic arrays: tiny RGB/gray images, mixed dtypes, arrays containing NaNs/infs, and non-uniform shapes for directory-based helpers.
  • Integrate the new module into the public API and update the documentation with a short “Image preprocessing and EDA” section, including a minimal end‑to‑end example.
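The unit-test task could start from something like the sketch below, which builds tiny synthetic arrays and checks output range, dtype, non-mutation of the input, and detection of non-finite values. Since fenn.vision does not exist yet, to_unit_range is a hypothetical stand-in for whichever helper ends up under test, and make_rgb_batch is an illustrative fixture name. The test functions use plain assert statements, so pytest can collect them as they are.

```python
import numpy as np


def to_unit_range(batch):
    """Hypothetical stand-in for a fenn.vision helper: scale uint8 to [0, 1]."""
    return np.asarray(batch).astype(np.float32) / 255.0


def make_rgb_batch(n=2, size=4, seed=0):
    """Tiny synthetic uint8 RGB batch with shape (n, size, size, 3)."""
    rng = np.random.default_rng(seed)
    return rng.integers(0, 256, size=(n, size, size, 3), dtype=np.uint8)


def test_output_range_and_dtype():
    out = to_unit_range(make_rgb_batch())
    assert out.dtype == np.float32
    assert out.min() >= 0.0 and out.max() <= 1.0


def test_input_is_not_modified():
    batch = make_rgb_batch()
    before = batch.copy()
    to_unit_range(batch)
    np.testing.assert_array_equal(batch, before)


def test_non_finite_values_are_detectable():
    batch = make_rgb_batch().astype(np.float32)
    batch[0, 0, 0, 0] = np.nan
    batch[1, 0, 0, 0] = np.inf
    assert int((~np.isfinite(batch)).sum()) == 2
```

Seeding the random generator keeps the fixtures deterministic, which makes failures reproducible across runs.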

Contributing

If you want to work on this:

  • Read the project’s CONTRIBUTING guide before starting.
  • Comment on the issue to claim a subset and join the Discord server to coordinate scope and API choices for the first iteration.
