tda-mapper is a Python library built around the Mapper algorithm, a core technique in Topological Data Analysis (TDA) for extracting topological structure from complex data. Designed for computational efficiency and scalability, it leverages optimized spatial search methods to support high-dimensional datasets. The library is well-suited for integration into machine learning pipelines, unsupervised learning tasks, and exploratory data analysis.
Further details are available in the documentation and in the paper.
- Efficient construction

  Leverages optimized spatial search techniques and parallelization to accelerate the construction of Mapper graphs, supporting the analysis of high-dimensional datasets.

- Scikit-learn integration

  Provides custom estimators that are fully compatible with scikit-learn's API, enabling seamless integration into scikit-learn pipelines for tasks such as dimensionality reduction, clustering, and feature extraction.

- Flexible visualization

  Supports multiple visualization backends (Plotly, Matplotlib, PyVis) for generating high-quality Mapper graph representations with adjustable layouts and styling.

- Interactive app

  Provides an interactive web-based interface (via Streamlit) for dynamic exploration of Mapper graph structures, offering real-time adjustments to parameters and visualizations.
The Mapper algorithm extracts topological features from complex datasets, representing them as graphs that highlight clusters, transitions, and key structural patterns. These insights reveal hidden data relationships and are applicable across diverse fields, including social sciences, biology, and machine learning. For an in-depth overview of Mapper, including its mathematical foundations and practical applications, read the original paper.
Step 1 | Step 2 | Step 3 | Step 4 |
---|---|---|---|
![]() | ![]() | ![]() | ![]() |
Choose lens | Cover image | Run clustering | Build graph |
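
The four steps above can be sketched in plain Python. This is only an illustration of the general Mapper idea, not tda-mapper's implementation or API: the function names (`interval_cover`, `cluster`, `mapper`, `count_components`) are hypothetical, the lens is simply the x-coordinate, and clustering is a naive "connect points closer than `eps`" rule rather than DBSCAN.

```python
import math
from itertools import combinations


def interval_cover(values, n_intervals, overlap_frac):
    """Step 2: cover the range of the lens values with overlapping intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_intervals
    half = 0.5 * width * (1.0 + overlap_frac)  # widen each interval so neighbors overlap
    centers = [lo + (i + 0.5) * width for i in range(n_intervals)]
    return [(c - half, c + half) for c in centers]


def cluster(points, ids, eps):
    """Step 3: group point ids into connected components of the eps-neighborhood graph."""
    unvisited = set(ids)
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        component, stack = {seed}, [seed]
        while stack:
            p = stack.pop()
            near = {q for q in unvisited if math.dist(points[p], points[q]) <= eps}
            unvisited -= near
            component |= near
            stack.extend(near)
        clusters.append(component)
    return clusters


def mapper(points, lens, n_intervals=5, overlap_frac=0.3, eps=0.15):
    """Steps 2-4: cover the lens image, cluster each preimage, link overlapping clusters."""
    nodes = []
    for lo, hi in interval_cover(lens, n_intervals, overlap_frac):
        ids = [i for i, v in enumerate(lens) if lo <= v <= hi]
        nodes.extend(cluster(points, ids, eps))
    # Step 4: two nodes are joined by an edge when their clusters share a point.
    edges = [(a, b) for a, b in combinations(range(len(nodes)), 2) if nodes[a] & nodes[b]]
    return nodes, edges


def count_components(n_nodes, edges):
    """Count connected components of the graph with a small union-find."""
    parent = list(range(n_nodes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)
    return len({find(i) for i in range(n_nodes)})


def circle(radius, n=60):
    """Evenly spaced, noise-free points on a circle centered at the origin."""
    return [(radius * math.cos(2 * math.pi * k / n),
             radius * math.sin(2 * math.pi * k / n)) for k in range(n)]


points = circle(1.0) + circle(0.3)   # two concentric circles
lens = [x for x, _ in points]        # Step 1: the lens is the x-coordinate
nodes, edges = mapper(points, lens)
n_components = count_components(len(nodes), edges)
print(len(nodes), "nodes,", len(edges), "edges,", n_components, "components")
```

On this noise-free toy input each circle becomes its own loop in the graph, so the two circles show up as separate connected components. The brute-force neighbor search here is quadratic; it is meant to make the steps readable, not to be fast.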
If you use tda-mapper in your work, please consider citing both the library, archived in a permanent Zenodo record, and the paper, which provides a broader methodological overview. We recommend citing the specific version of the library used in your research, along with the paper. For citation examples, please refer to the documentation.
To install the latest version from PyPI, run:
pip install tda-mapper
Here's a minimal example that applies the Mapper algorithm to a synthetic dataset of two concentric circles, generated with scikit-learn's make_circles. The goal is to extract a topological graph representation using PCA as the lens and DBSCAN for clustering. We proceed as follows:
import matplotlib.pyplot as plt
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_circles
from sklearn.decomposition import PCA
from tdamapper.learn import MapperAlgorithm
from tdamapper.cover import CubicalCover
from tdamapper.plot import MapperPlot
# Generate toy dataset
X, labels = make_circles(n_samples=5000, noise=0.05, factor=0.3, random_state=42)
plt.figure(figsize=(5, 5))
plt.scatter(X[:,0], X[:,1], c=labels, s=0.25, cmap="jet")
plt.axis("off")
plt.show()
# Apply PCA as lens
y = PCA(2, random_state=42).fit_transform(X)
# Mapper pipeline
cover = CubicalCover(n_intervals=10, overlap_frac=0.3)
clust = DBSCAN()
graph = MapperAlgorithm(cover, clust).fit_transform(X, y)
# Visualize the Mapper graph
fig = MapperPlot(graph, dim=2, seed=42, iterations=60).plot_plotly(colors=labels)
fig.show(config={"scrollZoom": True})
Original Dataset | Mapper Graph |
---|---|
![]() | ![]() |
Left: the original dataset consisting of two concentric circles with noise, colored by class label. Right: the resulting Mapper graph, built from the PCA projection and clustered using DBSCAN. The two concentric circles are well identified by the connected components in the Mapper graph.
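
To check the component structure numerically rather than by eye, one option is to count connected components with networkx. The sketch below assumes the Mapper graph is a networkx `Graph` (please verify this against the documentation for your installed version); `component_sizes` is a hypothetical helper, and the toy graph only stands in for a real Mapper graph.

```python
import networkx as nx


def component_sizes(graph):
    """Return the node count of each connected component, largest first."""
    return sorted((len(c) for c in nx.connected_components(graph)), reverse=True)


# Toy stand-in for a Mapper graph: a 4-cycle and a separate 3-cycle.
toy = nx.Graph()
toy.add_edges_from([(0, 1), (1, 2), (2, 3), (3, 0)])
toy.add_edges_from([(4, 5), (5, 6), (6, 4)])

sizes = component_sizes(toy)
print(sizes)  # [4, 3]
```

On a real Mapper graph, noise or an unsuitable clustering threshold can produce extra small components, so the sizes are often more informative than the raw count.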
More examples can be found in the documentation.
Use our Streamlit app to visualize and explore your data without writing code. You can run a live demo directly on Streamlit Cloud, or run the app locally on your machine. The first time you run the app locally, you may need to install the required dependencies from the requirements.txt file by running:
pip install -r app/requirements.txt
Then start the app with:
streamlit run app/streamlit_app.py