Open
34 commits
d2f4369
Addition of a function that cleans cluster population for 2D labels.
ChiaraLionello Jan 13, 2026
6cf5161
cleaning_cluster_population now cleans also 3D labels.
ChiaraLionello Jan 14, 2026
ff137be
Documentation added to cleaning_cluster_population.
ChiaraLionello Jan 22, 2026
2314e3f
removed default for assigned_env.
ChiaraLionello Jan 22, 2026
5e67c2c
better explained function doc.
ChiaraLionello Jan 22, 2026
86646a3
Now get_onion_analysis returns labels.
ChiaraLionello Jan 22, 2026
dcc2fea
Added excluded_env.
ChiaraLionello Jan 22, 2026
f78d066
get_onion_analysis return list_of_labels.
ChiaraLionello Jan 22, 2026
8d78027
Added a warning for excluded_env.
ChiaraLionello Jan 22, 2026
f31af59
Added test for cleaning_cluster_population.
ChiaraLionello Jan 22, 2026
124167c
mypy fixes.
ChiaraLionello Jan 22, 2026
178cfd3
Added warning function.
ChiaraLionello Jan 27, 2026
72db13d
test_cluster uses np.array_equal.
ChiaraLionello Jan 27, 2026
1f857db
Merge branch 'GMPavanLab:main' into devel-cleaning-cluster-pop
ChiaraLionello Feb 3, 2026
4b8a45d
change default of logger auto_recording to false.
ChiaraLionello Feb 5, 2026
cb9149b
fixed variable name in doctest.
ChiaraLionello Feb 9, 2026
d440983
changed format of labels in get_onion_analysis.
ChiaraLionello Feb 9, 2026
917b609
Added tutorial for cleaning_cluster_population in tips_and_tricks sec…
ChiaraLionello Feb 9, 2026
89d45a4
Update docs/source/logs.rst.
ChiaraLionello Feb 10, 2026
e11fd64
Merge branch 'GMPavanLab:main' into devel-cleaning-cluster-pop
ChiaraLionello Feb 10, 2026
3910ce6
Changed logo for tips and tricks tutorial section.
ChiaraLionello Feb 10, 2026
2410328
Small changes.
ChiaraLionello Feb 11, 2026
a50df17
Updates tutorial section removing tips_and_tricks and dividing fundam…
ChiaraLionello Feb 11, 2026
e6f2ff2
Small changes (typo).
ChiaraLionello Feb 11, 2026
6d281cc
changed check in dimensions and added elif.
ChiaraLionello Feb 11, 2026
a4dd2b6
reformatted file.
ChiaraLionello Feb 11, 2026
856bd52
Better explenation of how to obtain cleaned_onion_output.
ChiaraLionello Feb 12, 2026
8ad9263
Update docs/source/logs.rst
ChiaraLionello Feb 13, 2026
cfad777
typo.
ChiaraLionello Feb 13, 2026
7d42933
All the import are in the first python code-block.
ChiaraLionello Feb 13, 2026
6be5091
Added explenation about the colors in cluster_population.png.
ChiaraLionello Feb 13, 2026
258e6a4
Removed usless comment.
ChiaraLionello Feb 13, 2026
ee1b61f
Added general path to load the trajectory.
ChiaraLionello Feb 13, 2026
9dbc76e
Added general path to load the trajectory.
ChiaraLionello Feb 13, 2026
79 changes: 79 additions & 0 deletions docs/source/_static/recipes/cleaning_cluster_population.py
@@ -0,0 +1,79 @@
"""Code from the Cleaning Cluster Population tutorial."""

from pathlib import Path

import numpy as np

import dynsight
from dynsight.data_processing import cleaning_cluster_population
from dynsight.trajectory import Trj


def main() -> None:
    """Code from the Cleaning Cluster Population tutorial."""
    # Loading an example trajectory
    files_path = Path.cwd()
    trj = Trj.init_from_xtc(
        traj_file=files_path / "ice_water_ox.xtc",
        topo_file=files_path / "ice_water_ox.gro",
    )

    # Computing TimeSOAP descriptor
    _, tsoap = trj.get_timesoap(
        r_cut=10,
        n_max=8,
        l_max=8,
        n_jobs=4,  # Adjust n_jobs according to your computer capabilities
    )

    # Applying Spatial Denoising
    sliced_trj = trj.with_slice(slice(0, -1, 1))
    sp_denoised_tsoap = tsoap.spatial_average(
        trj=sliced_trj,
        r_cut=10,
        n_jobs=4,  # Adjust n_jobs according to your computer capabilities
    )

    # Performing Onion Clustering on the descriptor computed
    delta_t_list, n_clust, unclass_frac, labels = (
        sp_denoised_tsoap.get_onion_analysis(
            delta_t_min=2,
            delta_t_num=20,
            fig1_path=files_path / "denoised_onion_analysis.png",
            fig2_path=files_path / "cluster_population.png",
        )
    )

    # Saving Onion output in an array
    onion_output = np.array([delta_t_list, n_clust, unclass_frac]).T

    # Assigning clusters with population <5% to the unclassified environment
    # (label=-1)
    cleaned_labels = cleaning_cluster_population(
        labels,
        threshold=0.05,
        assigned_env=-1,
    )

    # Updating the data and plotting the cleaned number of clusters and
    # unclassified fraction.
    # Since unchanged, windows can be copied from above.
    delta_t_list = onion_output[:, 0]

    n_clust = np.zeros(delta_t_list.shape[0], dtype=np.int64)
    unclass_frac = np.zeros(delta_t_list.shape[0])
    for i in range(delta_t_list.shape[0]):
        n_clust[i] = np.unique(cleaned_labels[:, :, i]).size - 1
        unclass_frac[i] = np.sum(cleaned_labels[:, :, i] == -1) / np.size(
            cleaned_labels[:, :, i]
        )

    cleaned_onion_output = np.array([delta_t_list, n_clust, unclass_frac]).T

    dynsight.onion.plot_smooth.plot_time_res_analysis(
        files_path / "cleaned_onion_analysis.png", cleaned_onion_output
    )


if __name__ == "__main__":
    main()
1 change: 0 additions & 1 deletion docs/source/_static/style.css
Expand Up @@ -44,4 +44,3 @@
font-weight: 600;
text-align: center;
}

8 changes: 8 additions & 0 deletions docs/source/data_processing.rst
Expand Up @@ -33,3 +33,11 @@ Classification
savereferences <_autosummary/dynsight.data_processing.savereferences>
getreferencesfromdataset <_autosummary/dynsight.data_processing.getreferencesfromdataset>
applyclassification <_autosummary/dynsight.data_processing.applyclassification>

Clustering
----------

.. toctree::
:maxdepth: 1

cleaning_cluster_population <_autosummary/dynsight.data_processing.cleaning_cluster_population>
15 changes: 8 additions & 7 deletions docs/source/logs.rst
Expand Up @@ -3,19 +3,20 @@ Logs

dynsight logging system.

- .. warning::
+ .. note::

- A default instance of :class:`Logger` is **automatically created** when importing the ``dynsight`` package.
- This instance is available as ``dynsight.logs.logger``.

- You can configure it, for example to disable the automatic recording of datasets, using:
+ A default :class:`Logger` is used; it writes to file every step processed by ``dynsight`` (e.g. computing a descriptor, performing clustering, etc.).

+ An option of :class:`Logger` that automatically saves and records the dataset
+ can be activated after importing the ``dynsight`` package by using:

  .. code-block:: python

  import dynsight
- dynsight.logs.logger.configure(auto_recording=False)
+ dynsight.logs.logger.configure(auto_recording=True)

- You can also access all its attributes and methods described in the Logs page below.
+ The automatically stored datasets can be extracted using the
+ :meth:`Logger.extract_datasets` method. A complete list of available attributes and methods is provided below.

-----
Usage
215 changes: 215 additions & 0 deletions docs/source/tutorials/cleaning_cluster_population.rst
@@ -0,0 +1,215 @@
Cleaning Cluster Population
===========================

Sometimes, clusters obtained with Onion Clustering analysis can be very small.
To make the results easier to interpret, it can be useful to remove them by assigning their members to
the cluster of the unclassified particles.
This is done with the function :func:`.data_processing.cleaning_cluster_population()`, which
assigns every cluster whose population is below a given threshold to an environment selected by the user.
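
The idea can be sketched in plain NumPy. This is not the ``dynsight`` implementation: the helper name ``reassign_small_clusters``, the choice of evaluating populations separately for each time window (last axis), and the strict ``<`` comparison against the threshold are all assumptions made for illustration.

```python
import numpy as np


def reassign_small_clusters(labels, threshold, assigned_env):
    """Hypothetical helper: relabel clusters whose population fraction is small.

    Assumes labels has shape (n_particles, n_frames, n_windows) and that the
    population of each cluster is evaluated independently for every window.
    """
    cleaned = labels.copy()
    for w in range(labels.shape[-1]):
        window = labels[..., w]
        values, counts = np.unique(window, return_counts=True)
        fractions = counts / window.size
        # Labels whose share of the window is strictly below the threshold
        small = values[fractions < threshold]
        # cleaned[..., w] is a view, so this assignment edits cleaned in place
        cleaned[..., w][np.isin(window, small)] = assigned_env
    return cleaned


# Toy labels: 4 particles, 10 frames, 1 time window
toy = np.zeros((4, 10, 1), dtype=int)
toy[:, 5:, 0] = 1
toy[0, 9, 0] = 2  # a single entry out of 40, i.e. 2.5% of the population

cleaned_toy = reassign_small_clusters(toy, threshold=0.05, assigned_env=-1)
```

In this toy case only label ``2`` falls below the 5% threshold, so it is the only one moved to ``-1``; the input array is left untouched.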

At the end of this page, you will find links to download the full Python script
and the relevant input files.

As an example, we consider the output of the analysis computed in the `spatial denoising tutorial <./spatial_denoising.html>`_.
Briefly, we start from the denoised ``TimeSOAP`` descriptor, which can be obtained as follows:

.. code-block:: python

    import numpy as np
    from pathlib import Path
    import dynsight
    from dynsight.trajectory import Trj
    from dynsight.data_processing import cleaning_cluster_population

    files_path = Path.cwd()
    trj = Trj.init_from_xtc(
        traj_file=files_path / "ice_water_ox.xtc",
        topo_file=files_path / "ice_water_ox.gro",
    )

    _, tsoap = trj.get_timesoap(
        r_cut=10,
        n_max=8,
        l_max=8,
        n_jobs=4,  # Adjust n_jobs according to your computer capabilities
    )

    sliced_trj = trj.with_slice(slice(0, -1, 1))
    sp_denoised_tsoap = tsoap.spatial_average(
        trj=sliced_trj,
        r_cut=10,
        n_jobs=4,  # Adjust n_jobs according to your computer capabilities
    )

    delta_t_list, n_clust, unclass_frac, labels = sp_denoised_tsoap.get_onion_analysis(
        delta_t_min=2,
        delta_t_num=20,
        fig1_path=files_path / "denoised_onion_analysis.png",
        fig2_path=files_path / "cluster_population.png",
    )

.. testcode:: cleaning_cluster_population_test
:hide:

from pathlib import Path
from dynsight.trajectory import Trj

files_path = Path("source/_static/simulations")
trj = Trj.init_from_xtc(
traj_file=files_path / "ice_water_ox.xtc",
topo_file=files_path / "ice_water_ox.gro",
)

assert trj.n_atoms == 2048
assert trj.n_frames == 1001

.. testcode:: cleaning_cluster_population_test
:hide:

import numpy as np

trj_test = trj.with_slice(slice(0, 2, 1))

expected_tests = Path("source/_static/tutorials/spatial_denoising/doctests")

soap_test = trj_test.get_soap(
r_cut=10,
n_max=8,
l_max=8,
n_jobs=1, # Adjust n_jobs according to your computer capabilities
)

_, tsoap_test = trj_test.get_timesoap(
soap_insight=soap_test,
)

assert tsoap_test.meta["r_cut"]==10
assert tsoap_test.meta["n_max"]==8
assert tsoap_test.meta["l_max"]==8

reference_tsoap = np.load(expected_tests / "test_tsoap.npy")
assert np.allclose(tsoap_test.dataset, reference_tsoap, atol=1e-6)

sliced_trj_test = trj.with_slice(slice(0, 1, 1))
sp_denoised_tsoap_test = tsoap_test.spatial_average(
trj=sliced_trj_test,
r_cut=10,
n_jobs=1,
)

reference_denoised_tsoap = np.load(expected_tests / "test_denoised_tsoap.npy")
assert np.allclose(sp_denoised_tsoap_test.dataset, reference_denoised_tsoap, atol=1e-6)

For further details, refer to the `spatial denoising tutorial <./spatial_denoising.html>`_.

The figure ``cluster_population.png`` shows the population of every cluster: each color is a different cluster, and
blue refers to the unclassified fraction:

.. image:: ../_static/tutorials/cleaning_cluster_population/cluster_population.png
:scale: 15%
:align: center

Before cleaning the clusters, we save the output of the Onion analysis in an array:

.. code-block:: python

onion_output = np.array([delta_t_list, n_clust, unclass_frac]).T
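
As a quick illustration of the layout this produces, the sketch below uses made-up values in place of the real output of ``get_onion_analysis`` (the numbers are hypothetical and only serve to show the shape).

```python
import numpy as np

# Hypothetical values standing in for the output of get_onion_analysis
delta_t_list = [2, 5, 10]
n_clust = [3, 2, 2]
unclass_frac = [0.10, 0.05, 0.02]

# Stack the three sequences as rows, then transpose: one row per time window
onion_output = np.array([delta_t_list, n_clust, unclass_frac]).T

# The first column recovers the list of time windows
windows = onion_output[:, 0]
```

Each row of ``onion_output`` holds ``(delta_t, n_clust, unclass_frac)`` for one time window. Because integers and floats are mixed, NumPy stores the whole array as floats.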

The small clusters can be removed and assigned to the unclassified fraction using the
function :func:`.data_processing.cleaning_cluster_population()`:

.. code-block:: python

cleaned_labels = cleaning_cluster_population(labels, threshold=0.05, assigned_env=-1)

where ``cleaned_labels`` has the same dimensions as ``labels``. After reorganizing the data, we can reproduce the
plot of the number of clusters and the unclassified fraction. In particular,
:func:`.onion.plot_smooth.plot_time_res_analysis()`, which produces the plot we want,
requires an array with the list of time windows, the number of clusters at every ∆t, and the unclassified
fraction. Therefore, before plotting, we build this array by copying the list of time windows from
the Onion analysis output and computing the number of clusters and the unclassified fraction from the
cleaned labels:

.. code-block:: python

    delta_t_list = onion_output[:, 0]  # Since unchanged, windows can be copied from above.

    n_clust = np.zeros(delta_t_list.shape[0], dtype=np.int64)
    unclass_frac = np.zeros(delta_t_list.shape[0])
    for i in range(delta_t_list.shape[0]):
        n_clust[i] = np.unique(cleaned_labels[:, :, i]).size - 1
        unclass_frac[i] = np.sum(cleaned_labels[:, :, i] == -1) / np.size(cleaned_labels[:, :, i])

    cleaned_onion_output = np.array([delta_t_list, n_clust, unclass_frac]).T

    dynsight.onion.plot_smooth.plot_time_res_analysis("cleaned_onion_analysis.png", cleaned_onion_output)
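
The expression ``np.unique(...).size - 1`` in the loop above implicitly assumes that the label ``-1`` is present in every time window. If a window could be fully classified, a variant that excludes ``-1`` explicitly may be safer. The sketch below shows one way to do it; ``window_summary`` is a hypothetical helper, not part of ``dynsight``.

```python
import numpy as np


def window_summary(window_labels, unclassified=-1):
    """Hypothetical helper: cluster count and unclassified fraction of one window."""
    unique_labels = np.unique(window_labels)
    # Count only the labels that differ from the unclassified one
    n_clusters = int(np.count_nonzero(unique_labels != unclassified))
    # The mean of a boolean mask is the fraction of True entries
    unclassified_fraction = float(np.mean(window_labels == unclassified))
    return n_clusters, unclassified_fraction


with_unclassified = np.array([[0, 0, 1], [1, -1, -1]])
fully_classified = np.array([[0, 0, 1], [1, 2, 2]])

summary_a = window_summary(with_unclassified)
summary_b = window_summary(fully_classified)
```

For the fully classified window, subtracting one from the number of unique labels would report two clusters instead of three, while the explicit comparison counts all of them.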

.. testcode:: cleaning_cluster_population_test
:hide:

from dynsight.data_processing import cleaning_cluster_population

expected_tests = Path("../tests/data_processing/cluster/test_cluster")

labels = np.zeros((4, 10, 3), dtype=int)

labels[:, :, 0] = np.array(
[
[0, 0, 0, 1, 1, 1, 2, 2, 2, 2],
[0, 0, 0, 1, 1, 1, 2, 2, 2, 2],
[0, 0, 0, 1, 1, 1, 2, 2, 2, 3],
[0, 0, 0, 1, 1, 1, 2, 2, 2, 3],
]
)

labels[:, :, 1] = np.array(
[
[0, 0, 0, 0, 0, 1, 1, 1, 1, 4],
[0, 0, 0, 0, 0, 1, 1, 1, 1, 4],
[0, 0, 0, 0, 0, 1, 1, 1, 4, 4],
[0, 0, 0, 0, 0, 1, 1, 1, 4, 4],
]
)

labels[:, :, 2] = np.array(
[
[0, 0, 9, 9, 0, 1, 1, 1, 1, 9],
[0, 0, 0, 9, 0, 1, 1, 1, 1, 9],
[0, 0, 0, 9, 0, 1, 1, 1, 1, 9],
[0, 9, 0, 0, 0, 1, 1, 1, 1, 9],
]
)

test_clean_pop = cleaning_cluster_population(
labels,
threshold=0.05,
assigned_env=99,
)

exp_clean_pop = np.load(expected_tests / "c0_clean_pop_th5_ass99_exNone.npy")
assert np.array_equal(exp_clean_pop, test_clean_pop)

On the left are the results of Onion clustering on the denoised time-series (``denoised_onion_analysis.png``
from the `spatial denoising tutorial <./spatial_denoising.html>`_), while on the right is the figure
``cleaned_onion_analysis.png``.

.. image:: ../_static/tutorials/spatial_denoising/denoised_onion_analysis.png
:scale: 8%
:align: left

.. image:: ../_static/tutorials/cleaning_cluster_population/cleaned_onion_analysis.png
:scale: 8%
:align: right

.. raw:: html

<div style="clear: both;"></div>

Full scripts and input files
----------------------------

.. raw:: html

<a class="btn-download" href="../_static/simulations/ice_water_ox.gro" download>⬇️ Download the .gro file</a> <br>
<a class="btn-download" href="../_static/simulations/ice_water_ox.xtc" download>⬇️ Download the .xtc file</a> <br>
<a class="btn-download" href="../_static/recipes/cleaning_cluster_population.py" download>⬇️ Download Python Script</a>
2 changes: 1 addition & 1 deletion docs/source/tutorials/spatial_denoising.rst
Expand Up @@ -221,7 +221,7 @@ Full scripts and input files
assert soap_test.meta["l_max"]==8
assert np.allclose(soap_test.dataset, reference_soap, atol=1e-6)

- _, tsoap_test = trj.get_timesoap(
+ _, tsoap_test = trj_test.get_timesoap(
soap_insight=soap_test,
)
