Merged
7 changes: 3 additions & 4 deletions .github/workflows/ci.yml
@@ -18,9 +18,8 @@ jobs:
with:
python-version: '3.12'

- name: Install uv (fast Python installer)
run: |
pip install uv
- name: Install uv
uses: astral-sh/setup-uv@v5

- name: Install dependencies (including dev)
run: |
@@ -29,7 +28,7 @@ jobs:

- name: Lint with ruff
run: |
ruff check src/microfactual
ruff check src/microfactual

- name: Run tests with coverage
run: |
2 changes: 1 addition & 1 deletion .github/workflows/docs.yml
@@ -7,7 +7,7 @@ on:
branches: [main, master]

workflow_dispatch:

permissions:
contents: write

2 changes: 1 addition & 1 deletion .gitignore
@@ -84,4 +84,4 @@ output/

# Keep dataset files
!Dataset/*.txt
!README.md
!README.md
65 changes: 65 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,65 @@
# Pre-commit hooks for code quality and consistency
# Install with: pre-commit install
# Run manually with: pre-commit run --all-files

repos:
# Ruff linting and formatting
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.6.2
hooks:
- id: ruff
name: ruff (linting)
args: [--fix, --exit-non-zero-on-fix]
- id: ruff-format
name: ruff (formatting)

# Type checking with mypy
- repo: https://github.com/pre-commit/mirrors-mypy
rev: v1.11.1
hooks:
- id: mypy
name: mypy (type checking)
additional_dependencies:
- pandas-stubs
- types-setuptools
- numpy
args: [--strict, --ignore-missing-imports]
files: ^src/

# General code quality
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v4.6.0
hooks:
- id: trailing-whitespace
name: trim trailing whitespace
- id: end-of-file-fixer
name: fix end of files
- id: check-yaml
name: check yaml files
- id: check-toml
name: check toml files
- id: check-merge-conflict
name: check for merge conflicts
- id: check-added-large-files
name: check for large files
args: ['--maxkb=1000']
- id: debug-statements
name: check for debug statements

# Security checks
- repo: https://github.com/PyCQA/bandit
rev: 1.7.9
hooks:
- id: bandit
name: bandit (security)
args: ['-r', 'src/', '-f', 'json']
files: ^src/

# Documentation checks
- repo: https://github.com/pycqa/pydocstyle
rev: 6.3.0
hooks:
- id: pydocstyle
name: pydocstyle (docstring style)
files: ^src/
args: [--convention=numpy]
50 changes: 50 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,50 @@
# Changelog

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [0.1.1] - 2026-01-26

### Fixed
- **Sklearn Compatibility**: Fixed `ValueError` in `MicrobiomeClassifier` by correcting inheritance order (MRO) with `ClassifierMixin` and `BaseEstimator`.
- **Dashboard Feature Names**: Fixed a `ValueError` raised by `RandomForestClassifier` when launching the dashboard. The `predict` methods now convert DataFrame inputs to NumPy arrays, which bypasses the strict feature-name checks that fail after the dashboard sanitizes column names.
- **Dashboard Indexing**: Fixed an `IndexingError` in `AbundanceFilter` and `PrevalenceFilter` that occurred when columns had been renamed (e.g., by the dashboard). Both filters now select columns with a positional boolean mask instead of relying on column labels.
- **Notebooks**: Patched `01_Quickstart` and `03_Dashboard` to use `get_feature_names_out()` for feature importance plotting.
- **Transforms**: Added `get_feature_names_out()` to `AbundanceFilter`, `PrevalenceFilter`, and `CLRTransform` for pipeline compatibility.
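
The two dashboard fixes and the `get_feature_names_out()` addition share one idea: decide which columns to keep once, store that decision as a boolean mask, and apply it by position so that later renames cannot break the selection. The sketch below illustrates that idea in plain Python. `PrevalenceFilterSketch` and its `min_prevalence` parameter are hypothetical stand-ins and are not the `microfactual` implementation, which presumably operates on DataFrames or arrays.

```python
class PrevalenceFilterSketch:
    """Hypothetical stand-in showing mask-based filtering (not microfactual code)."""

    def __init__(self, min_prevalence=0.5):
        # Assumed meaning: minimum fraction of samples in which a feature is non-zero.
        self.min_prevalence = min_prevalence

    def fit(self, rows, feature_names):
        n_samples = len(rows)
        self.feature_names_in_ = list(feature_names)
        # One boolean per column: True when the feature is present often enough.
        self.mask_ = [
            sum(1 for row in rows if row[j] > 0) / n_samples >= self.min_prevalence
            for j in range(len(self.feature_names_in_))
        ]
        return self

    def transform(self, rows):
        # Select by position, so renamed columns cannot cause a lookup failure.
        return [[v for v, keep in zip(row, self.mask_) if keep] for row in rows]

    def get_feature_names_out(self, input_features=None):
        # Apply the same mask to whichever names the caller currently uses.
        names = self.feature_names_in_ if input_features is None else list(input_features)
        return [name for name, keep in zip(names, self.mask_) if keep]


rows = [[5, 0, 1], [3, 0, 0], [8, 2, 0]]
flt = PrevalenceFilterSketch(min_prevalence=0.5).fit(rows, ["taxon_a", "taxon_b", "taxon_c"])
filtered = flt.transform(rows)
kept = flt.get_feature_names_out()
renamed = flt.get_feature_names_out(["a", "b", "c"])
```

Because the mask is positional, passing sanitized names to `get_feature_names_out()` still returns the names that line up with the filtered columns.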


## [0.2.0] - 2025-06-25

Major architectural overhaul introducing a modular, sklearn-compatible design.

### Added

#### Core Architecture
- `MicrobiomeDataset`: Central data container with `X`, `y` properties and provenance tracking.
- `MicrobiomeClassifier`: "Batteries-included" classifier wrapper with built-in preprocessing.
- `mf.classify()`: High-level one-liner API for quick classification workflows.

#### Visualization Module (`microfactual.visualization`)
- `plot_roc()`: Generate ROC curves with AUC scores.
- `plot_confusion_matrix()`: Generate confusion matrices with custom class labels.
- `plot_feature_importance()`: Horizontal bar charts for feature importance.
- `launch_dashboard()`: Helper to launch interactive ExplainerDashboard.
- All visualization functions return `matplotlib.Figure` objects for flexibility.

#### Explainability Module (`microfactual.explainability`)
- `BaseExplainer`: Abstract base class for decoupling explainability frameworks.
- `DiCEExplainer`: Adapter for [DiCE](https://github.com/interpretml/DiCE) to generate counterfactual explanations.

#### Preprocessing (`microfactual.preprocessing`)
- `AbundanceFilter`: Sklearn-compatible transformer for abundance filtering.
- `PrevalenceFilter`: Sklearn-compatible transformer for prevalence filtering.
- `CLRTransform`: Sklearn-compatible transformer for Centered Log-Ratio transformation.
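
For readers unfamiliar with the Centered Log-Ratio transformation, here is a minimal sketch of the standard definition: each value's logarithm minus the mean of the logarithms over the sample. The function name `clr_transform` and the `pseudocount` used to handle zero counts are assumptions for illustration; how `CLRTransform` treats zeros is not stated here.

```python
import math


def clr_transform(counts, pseudocount=1.0):
    """Centered log-ratio of one sample: log(x_i) minus the mean of log(x).

    The pseudocount is an assumed zero-handling strategy, added before the
    logarithm so that zero counts do not produce log(0).
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


sample = [10, 0, 5, 85]
transformed = clr_transform(sample)
```

By construction the transformed values of a sample sum to zero (up to floating-point error), which is a quick sanity check for any CLR implementation.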

### Deprecated
- `run_pipeline()`: Deprecated in favor of `mf.classify()`. Will be removed in v1.0.

### Changed
- Refactored project structure to separate concerns (core, models, preprocessing, visualization, explainability).
- Updated documentation with new API references and usage examples.
12 changes: 12 additions & 0 deletions MANIFEST.in
@@ -0,0 +1,12 @@
include README.md
include LICENSE
include CHANGELOG.md
include pyproject.toml
include Makefile

recursive-include src/microfactual *.py
recursive-include datasets *.tsv *.txt

global-exclude *.pyc
global-exclude __pycache__
global-exclude .DS_Store
2 changes: 1 addition & 1 deletion Makefile
@@ -41,4 +41,4 @@ help:
@echo " install - Install dependencies and the package"
@echo " run - Run the package with default or specified parameters"
@echo " test - Run tests"
@echo " clean - Clean the output directory"
@echo " clean - Clean the output directory"