generated from lukasturcani/python-template
Devel cleaning_cluster_population #152
Open
ChiaraLionello
wants to merge
34
commits into
GMPavanLab:main
from
ChiaraLionello:devel-cleaning-cluster-pop
d2f4369
Addition of a function that cleans cluster population for 2D labels.
ChiaraLionello 6cf5161
cleaning_cluster_population now cleans also 3D labels.
ChiaraLionello ff137be
Documentation added to cleaning_cluster_population.
ChiaraLionello 2314e3f
removed default for assigned_env.
ChiaraLionello 5e67c2c
better explained function doc.
ChiaraLionello 86646a3
Now get_onion_analysis returns labels.
ChiaraLionello dcc2fea
Added excluded_env.
ChiaraLionello f78d066
get_onion_analysis return list_of_labels.
ChiaraLionello 8d78027
Added a warning for excluded_env.
ChiaraLionello f31af59
Added test for cleaning_cluster_population.
ChiaraLionello 124167c
mypy fixes.
ChiaraLionello 178cfd3
Added warning function.
ChiaraLionello 72db13d
test_cluster uses np.array_equal.
ChiaraLionello 1f857db
Merge branch 'GMPavanLab:main' into devel-cleaning-cluster-pop
ChiaraLionello 4b8a45d
change default of logger auto_recording to false.
ChiaraLionello cb9149b
fixed variable name in doctest.
ChiaraLionello d440983
changed format of labels in get_onion_analysis.
ChiaraLionello 917b609
Added tutorial for cleaning_cluster_population in tips_and_tricks sec…
ChiaraLionello 89d45a4
Update docs/source/logs.rst.
ChiaraLionello e11fd64
Merge branch 'GMPavanLab:main' into devel-cleaning-cluster-pop
ChiaraLionello 3910ce6
Changed logo for tips and tricks tutorial section.
ChiaraLionello 2410328
Small changes.
ChiaraLionello a50df17
Updates tutorial section removing tips_and_tricks and dividing fundam…
ChiaraLionello e6f2ff2
Small changes (typo).
ChiaraLionello 6d281cc
changed check in dimensions and added elif.
ChiaraLionello a4dd2b6
reformatted file.
ChiaraLionello 856bd52
Better explenation of how to obtain cleaned_onion_output.
ChiaraLionello 8ad9263
Update docs/source/logs.rst
ChiaraLionello cfad777
typo.
ChiaraLionello 7d42933
All the import are in the first python code-block.
ChiaraLionello 6be5091
Added explenation about the colors in cluster_population.png.
ChiaraLionello 258e6a4
Removed usless comment.
ChiaraLionello ee1b61f
Added general path to load the trajectory.
ChiaraLionello 9dbc76e
Added general path to load the trajectory.
ChiaraLionello
79 changes: 79 additions & 0 deletions
docs/source/_static/recipes/cleaning_cluster_population.py
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,79 @@ | ||
| """Code from the Cleaning Cluster Population tutorial.""" | ||
|
|
||
| from pathlib import Path | ||
|
|
||
| import numpy as np | ||
|
|
||
| import dynsight | ||
| from dynsight.data_processing import cleaning_cluster_population | ||
| from dynsight.trajectory import Trj | ||
|
|
||
|
|
||
| def main() -> None: | ||
| """Code from the Cleaning Cluster Population tutorial.""" | ||
| # Loading an example trajectory | ||
| files_path = Path.cwd() | ||
| trj = Trj.init_from_xtc( | ||
| traj_file=files_path / "ice_water_ox.xtc", | ||
| topo_file=files_path / "ice_water_ox.gro", | ||
| ) | ||
|
|
||
| # Computing TimeSOAP descriptor | ||
| _, tsoap = trj.get_timesoap( | ||
| r_cut=10, | ||
| n_max=8, | ||
| l_max=8, | ||
| n_jobs=4, # Adjust n_jobs according to your computer capabilities | ||
| ) | ||
|
|
||
| # Applying Spatial Denoising | ||
| sliced_trj = trj.with_slice(slice(0, -1, 1)) | ||
| sp_denoised_tsoap = tsoap.spatial_average( | ||
| trj=sliced_trj, | ||
| r_cut=10, | ||
| n_jobs=4, # Adjust n_jobs according to your computer capabilities | ||
| ) | ||
|
|
||
| # Performing Onion Clustering on the descriptor computed | ||
| delta_t_list, n_clust, unclass_frac, labels = ( | ||
| sp_denoised_tsoap.get_onion_analysis( | ||
| delta_t_min=2, | ||
| delta_t_num=20, | ||
| fig1_path=files_path / "denoised_onion_analysis.png", | ||
| fig2_path=files_path / "cluster_population.png", | ||
| ) | ||
| ) | ||
|
|
||
| # Saving Onion output in an array | ||
| onion_output = np.array([delta_t_list, n_clust, unclass_frac]).T | ||
|
|
||
| # Assigning clusters with population <5% to the unclassified environment | ||
| # (label=-1) | ||
| cleaned_labels = cleaning_cluster_population( | ||
| labels, | ||
| threshold=0.05, | ||
| assigned_env=-1, | ||
| ) | ||
|
|
||
| # Updating the data and plotting the cleaned number of clusters and | ||
| # unclassified fraction. | ||
| # Since unchanged, windows can be copied from above. | ||
| delta_t_list = onion_output[:, 0] | ||
|
|
||
| n_clust = np.zeros(delta_t_list.shape[0], dtype=np.int64) | ||
| unclass_frac = np.zeros(delta_t_list.shape[0]) | ||
| for i in range(delta_t_list.shape[0]): | ||
| n_clust[i] = np.unique(cleaned_labels[:, :, i]).size - 1 | ||
| unclass_frac[i] = np.sum(cleaned_labels[:, :, i] == -1) / np.size( | ||
| cleaned_labels[:, :, i] | ||
| ) | ||
|
|
||
| cleaned_onion_output = np.array([delta_t_list, n_clust, unclass_frac]).T | ||
|
|
||
| dynsight.onion.plot_smooth.plot_time_res_analysis( | ||
| files_path / "cleaned_onion_analysis.png", cleaned_onion_output | ||
| ) | ||
|
|
||
|
|
||
| if __name__ == "__main__": | ||
| main() |
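For readers of this diff, the following is a minimal sketch of the kind of reassignment the script performs with `cleaning_cluster_population`. It is not dynsight's implementation: the helper name `reassign_small_clusters` is hypothetical, the last axis is assumed to index the time windows (as in the tutorial's `cleaned_labels[:, :, i]`), and "small" is assumed to mean a population fraction strictly below the threshold, following the script's "<5%" comment.

```python
import numpy as np


def reassign_small_clusters(labels, threshold, assigned_env):
    """Hypothetical helper, NOT the dynsight implementation.

    Assumes `labels` is 3D with the time windows on the last axis, and
    that a cluster is "small" when its fraction is strictly below
    `threshold` within a given window.
    """
    labels = np.asarray(labels)
    if labels.ndim != 3:
        msg = "expected a 3D array with time windows on the last axis"
        raise ValueError(msg)

    cleaned = labels.copy()  # leave the input untouched
    for w in range(labels.shape[2]):
        window = labels[:, :, w]
        # Population of every label in this window
        values, counts = np.unique(window, return_counts=True)
        small = values[counts / window.size < threshold]
        # cleaned[:, :, w] is a view, so this writes into `cleaned`
        cleaned[:, :, w][np.isin(window, small)] = assigned_env
    return cleaned
```

Each window is treated independently, so a label that is rare at one ∆t but well populated at another is reassigned only in the window where it falls below the threshold.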
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -44,4 +44,3 @@ | |
| font-weight: 600; | ||
| text-align: center; | ||
| } | ||
|
|
||
Binary file added
BIN
+220 KB
...source/_static/tutorials/cleaning_cluster_population/cleaned_onion_analysis.png
Binary file added
BIN
+125 KB
docs/source/_static/tutorials/cleaning_cluster_population/cluster_population.png
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,215 @@ | ||
| Cleaning Cluster Population | ||
| =========================== | ||
|
|
||
| Sometimes, clusters obtained with Onion Clustering analysis can be very small. | ||
| To better interpret the results, it can be useful to remove them by assigning them to | ||
| the cluster of the unclassified particles. | ||
| This is achieved through the function :func:`.data_processing.cleaning_cluster_population()`, which | ||
| assigns clusters below a given population threshold to a specific cluster selected by the user. | ||
|
|
||
| At the end of every section, you will find links to download the full ``python`` scripts | ||
| and their relevant input files. | ||
|
|
||
| As an example, we consider the output of the analysis computed in the `spatial denoising tutorial <./spatial_denoising.html>`_. | ||
| Briefly, we consider the denoised ``TimeSOAP`` descriptor that can be obtained from: | ||
|
|
||
| .. code-block:: python | ||
|
|
||
| import numpy as np | ||
| from pathlib import Path | ||
| import dynsight | ||
| from dynsight.trajectory import Trj | ||
| from dynsight.data_processing import cleaning_cluster_population | ||
|
|
||
| files_path = Path.cwd() | ||
| trj = Trj.init_from_xtc( | ||
| traj_file=files_path / "ice_water_ox.xtc", | ||
| topo_file=files_path / "ice_water_ox.gro", | ||
| ) | ||
|
|
||
| _, tsoap = trj.get_timesoap( | ||
| r_cut=10, | ||
| n_max=8, | ||
| l_max=8, | ||
| n_jobs=4, # Adjust n_jobs according to your computer capabilities | ||
| ) | ||
|
|
||
| sliced_trj = trj.with_slice(slice(0, -1, 1)) | ||
| sp_denoised_tsoap = tsoap.spatial_average( | ||
| trj=sliced_trj, | ||
| r_cut=10, | ||
| n_jobs=4, # Adjust n_jobs according to your computer capabilities | ||
| ) | ||
|
|
||
| delta_t_list, n_clust, unclass_frac, labels = sp_denoised_tsoap.get_onion_analysis( | ||
| delta_t_min=2, | ||
| delta_t_num=20, | ||
| fig1_path=files_path / "denoised_onion_analysis.png", | ||
| fig2_path=files_path / "cluster_population.png", | ||
| ) | ||
|
|
||
| .. testcode:: cleaning_cluster_population_test | ||
| :hide: | ||
|
|
||
| from pathlib import Path | ||
| from dynsight.trajectory import Trj | ||
|
|
||
| files_path = Path("source/_static/simulations") | ||
| trj = Trj.init_from_xtc( | ||
| traj_file=files_path / "ice_water_ox.xtc", | ||
| topo_file=files_path / "ice_water_ox.gro", | ||
| ) | ||
|
|
||
| assert trj.n_atoms == 2048 | ||
| assert trj.n_frames == 1001 | ||
|
|
||
| .. testcode:: cleaning_cluster_population_test | ||
| :hide: | ||
|
|
||
| import numpy as np | ||
|
|
||
| trj_test = trj.with_slice(slice(0, 2, 1)) | ||
|
|
||
| expected_tests = Path("source/_static/tutorials/spatial_denoising/doctests") | ||
|
|
||
| soap_test = trj_test.get_soap( | ||
| r_cut=10, | ||
| n_max=8, | ||
| l_max=8, | ||
| n_jobs=1, # Adjust n_jobs according to your computer capabilities | ||
| ) | ||
|
|
||
| _, tsoap_test = trj.get_timesoap( | ||
| soap_insight=soap_test, | ||
| ) | ||
|
|
||
| assert tsoap_test.meta["r_cut"]==10 | ||
| assert tsoap_test.meta["n_max"]==8 | ||
| assert tsoap_test.meta["l_max"]==8 | ||
|
|
||
| reference_tsoap = np.load(expected_tests / "test_tsoap.npy") | ||
| assert np.allclose(tsoap_test.dataset, reference_tsoap, atol=1e-6) | ||
|
|
||
| sliced_trj_test = trj.with_slice(slice(0, 1, 1)) | ||
| sp_denoised_tsoap_test = tsoap_test.spatial_average( | ||
| trj=sliced_trj_test, | ||
| r_cut=10, | ||
| n_jobs=1, | ||
| ) | ||
|
|
||
| reference_denoised_tsoap = np.load(expected_tests / "test_denoised_tsoap.npy") | ||
| assert np.allclose(sp_denoised_tsoap_test.dataset, reference_denoised_tsoap, atol=1e-6) | ||
|
|
||
| For further details, users should refer to the `spatial denoising tutorial <./spatial_denoising.html>`_. | ||
|
|
||
| Figure ``cluster_population.png`` shows the population of every cluster. Each color is a different cluster, and | ||
| blue refers to the unclassified fraction: | ||
|
|
||
| .. image:: ../_static/tutorials/cleaning_cluster_population/cluster_population.png | ||
| :scale: 15% | ||
| :align: center | ||
|
|
||
| Before cleaning the clusters, we have to save the output from the Onion analysis in an array: | ||
|
|
||
| .. code-block:: python | ||
|
|
||
| onion_output = np.array([delta_t_list, n_clust, unclass_frac]).T | ||
|
|
||
| The small clusters can be removed and assigned to the unclassified fraction using the | ||
| function :func:`.data_processing.cleaning_cluster_population()`: | ||
|
|
||
| .. code-block:: python | ||
|
|
||
| cleaned_labels = cleaning_cluster_population(labels, threshold=0.05, assigned_env=-1) | ||
|
|
||
| where ``cleaned_labels`` has the same dimensions as ``labels``. Now we can reproduce the plot with the number | ||
| of clusters and the unclassified fraction after re-organizing the data. In particular, | ||
| :func:`.onion.plot_smooth.plot_time_res_analysis()`, which produces the plot that we want to obtain, | ||
| requires an array with the list of the time windows, the number of clusters at every ∆t, and the unclassified | ||
| fraction. Therefore, before plotting the graph, we need to create this array by copying the list of time windows from | ||
| the one given by the Onion analysis, and calculating the number of clusters and the unclassified fraction from the | ||
| cleaned labels: | ||
|
|
||
| .. code-block:: python | ||
|
||
|
|
||
| delta_t_list = onion_output[:, 0] # Since unchanged, windows can be copied from above. | ||
|
|
||
| n_clust = np.zeros(delta_t_list.shape[0],dtype=np.int64) | ||
| unclass_frac = np.zeros(delta_t_list.shape[0]) | ||
| for i in range(delta_t_list.shape[0]): | ||
| n_clust[i] = np.unique(cleaned_labels[:, :, i]).size - 1 | ||
| unclass_frac[i] = np.sum(cleaned_labels[:, :, i] == -1) / np.size(cleaned_labels[:, :, i]) | ||
|
|
||
| cleaned_onion_output = np.array([delta_t_list, n_clust, unclass_frac]).T | ||
|
|
||
| dynsight.onion.plot_smooth.plot_time_res_analysis("cleaned_onion_analysis.png", cleaned_onion_output) | ||
|
|
||
| .. testcode:: cleaning_cluster_population_test | ||
| :hide: | ||
|
|
||
| from dynsight.data_processing import cleaning_cluster_population | ||
|
|
||
| expected_tests = Path("../tests/data_processing/cluster/test_cluster") | ||
|
|
||
| labels = np.zeros((4, 10, 3), dtype=int) | ||
|
|
||
| labels[:, :, 0] = np.array( | ||
| [ | ||
| [0, 0, 0, 1, 1, 1, 2, 2, 2, 2], | ||
| [0, 0, 0, 1, 1, 1, 2, 2, 2, 2], | ||
| [0, 0, 0, 1, 1, 1, 2, 2, 2, 3], | ||
| [0, 0, 0, 1, 1, 1, 2, 2, 2, 3], | ||
| ] | ||
| ) | ||
|
|
||
| labels[:, :, 1] = np.array( | ||
| [ | ||
| [0, 0, 0, 0, 0, 1, 1, 1, 1, 4], | ||
| [0, 0, 0, 0, 0, 1, 1, 1, 1, 4], | ||
| [0, 0, 0, 0, 0, 1, 1, 1, 4, 4], | ||
| [0, 0, 0, 0, 0, 1, 1, 1, 4, 4], | ||
| ] | ||
| ) | ||
|
|
||
| labels[:, :, 2] = np.array( | ||
| [ | ||
| [0, 0, 9, 9, 0, 1, 1, 1, 1, 9], | ||
| [0, 0, 0, 9, 0, 1, 1, 1, 1, 9], | ||
| [0, 0, 0, 9, 0, 1, 1, 1, 1, 9], | ||
| [0, 9, 0, 0, 0, 1, 1, 1, 1, 9], | ||
| ] | ||
| ) | ||
|
|
||
| test_clean_pop = cleaning_cluster_population( | ||
| labels, | ||
| threshold=0.05, | ||
| assigned_env=99, | ||
| ) | ||
|
|
||
| exp_clean_pop = np.load(expected_tests / "c0_clean_pop_th5_ass99_exNone.npy") | ||
| assert np.array_equal(exp_clean_pop, test_clean_pop) | ||
|
|
||
| On the left are the results from Onion clustering on the denoised time-series (``denoised_onion_analysis.png`` | ||
| from the `spatial denoising tutorial <./spatial_denoising.html>`_), while on the right is the figure | ||
| ``cleaned_onion_analysis.png``. | ||
|
|
||
| .. image:: ../_static/tutorials/spatial_denoising/denoised_onion_analysis.png | ||
| :scale: 8% | ||
| :align: left | ||
|
|
||
| .. image:: ../_static/tutorials/cleaning_cluster_population/cleaned_onion_analysis.png | ||
| :scale: 8% | ||
| :align: right | ||
|
|
||
| .. raw:: html | ||
|
|
||
| <div style="clear: both;"></div> | ||
|
|
||
| Full scripts and input files | ||
| ---------------------------- | ||
|
|
||
| .. raw:: html | ||
|
|
||
| <a class="btn-download" href="../_static/simulations/ice_water_ox.gro" download>⬇️ Download the .gro file</a> <br> | ||
| <a class="btn-download" href="../_static/simulations/ice_water_ox.xtc" download>⬇️ Download the .xtc file</a> <br> | ||
| <a class="btn-download" href="../_static/recipes/cleaning_cluster_population.py" download>⬇️ Download Python Script</a> | ||
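One detail of the tutorial's recomputation step may be worth noting: `np.unique(...).size - 1` counts the clusters correctly only when the window contains at least one `-1` label; with no unclassified points it would report one cluster too few. The sketch below is one possible variant that counts the labels different from `-1` directly. The helper name `summarize_windows` and its `unclassified` parameter are hypothetical, and the time windows are assumed to sit on the last axis, as in the tutorial.

```python
import numpy as np


def summarize_windows(cleaned_labels, unclassified=-1):
    """Hypothetical helper: per-window cluster count and unclassified fraction.

    Assumes a 3D label array with the time windows on the last axis.
    """
    n_windows = cleaned_labels.shape[2]
    n_clust = np.zeros(n_windows, dtype=np.int64)
    unclass_frac = np.zeros(n_windows)
    for i in range(n_windows):
        window = cleaned_labels[:, :, i]
        unique = np.unique(window)
        # Count only the real clusters, whether or not -1 is present
        n_clust[i] = np.count_nonzero(unique != unclassified)
        # Fraction of points carrying the unclassified label
        unclass_frac[i] = np.mean(window == unclassified)
    return n_clust, unclass_frac
```

The two returned arrays can be stacked with the time-window list in the same way the tutorial builds `cleaned_onion_output`.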