█████████ █████ ██████ ██████ ██████████
███░░░░░███ ░░███ ░░██████ ██████ ░░███░░░░███
████████ █████ ████ ░███ ░███ ███████ ░███░█████░███ ░███ ░░███
░░███░░███░░███ ░███ ░███████████ ███░░███ ░███░░███ ░███ ░███ ░███
░███ ░███ ░███ ░███ ░███░░░░░███ ░███ ░███ ░███ ░░░ ░███ ░███ ░███
░███ ░███ ░███ ░███ ░███ ░███ ░███ ░███ ░███ ░███ ░███ ███
░███████ ░░███████ █████ █████░░████████ █████ █████ ██████████
░███░░░ ░░░░░███ ░░░░░ ░░░░░ ░░░░░░░░ ░░░░░ ░░░░░ ░░░░░░░░░░
░███ ███ ░███
█████ ░░██████
░░░░░ ░░░░░░
The MDeNM (Molecular Dynamics with excited Normal Modes) method consists of multiple-replica short MD simulations in which motions described by a given subset of low-frequency NMs are kinetically excited. This is achieved by adding additional atomic velocities along several randomly determined linear combinations of NM vectors, thus allowing an efficient coupling between slow and fast motions.
This new approach, Adaptive Molecular Dynamics with excited Normal Modes (aMDeNM), automatically controls the energy injection and takes the natural constraints imposed by the structure and the environment into account during protein conformational sampling, which prevents structural distortions throughout the simulation. Due to the stochasticity of thermal motions, NM eigenvectors move away from their original directions when used to displace the protein, since the structure evolves into other potential energy wells. Therefore, displacement along the modes is valid for small distances, but displacement over greater distances may deform the structure of the protein if no care is taken. The advantage of this methodology is that it adaptively changes the direction used to displace the system, taking into account the structural and energetic constraints imposed by the system itself and the medium, which allows the system to explore new pathways.
This document gives an overview of the aMDeNM method and helps you properly set up and run a simulation.
- Adaptive Molecular Dynamics with Python
- Method Overview
- pyAdMD Applications
- Configuration
- Input Requirements
- Analysis
- Usage Examples
- Dependencies
- Citing
- Contact
aMDeNM is an enhanced sampling molecular dynamics method that uses normal modes as collective variables in order to increase the conformational space explored during an MD simulation. This is done by injecting an incremental energy into the system, thus assigning additional atomic velocities along the direction of a given NM (or a combination of a set of NMs). The combination of the velocities from MD and those provided by the NM vectors properly couples slow and fast motions, allowing one to obtain large time scale movements, such as domain transitions, in a feasible simulation time. The additional energy injection and the excitation direction are constantly evaluated and updated to ensure the sampling validity.
The aMDeNM simulations are done using NAMD 3.0 and the CHARMM36m force field. We implemented the code so that the molecular dynamics are computed exclusively on GPU. We strongly recommend that all system preparation be done with CHARMM-GUI.
This step is a prerequisite for performing aMDeNM simulations. It consists of running a short equilibration MD to store the final atomic velocities and positions.
If using CHARMM-based normal modes, it is also necessary to compute the modes from the last MD coordinates and store the vectors from the low-frequency end of the vibrational spectrum in a binary file.
The program computes a Cα or heavy-atom Elastic Network Model using the same algorithms as the software available at our ENM GitHub repository.
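For readers unfamiliar with elastic network models, the following is a minimal sketch of a generic anisotropic network model, not the code used by pyAdMD. The function name `anm_modes`, the 12 Å cutoff, the unit spring constant, and the random toy coordinates are all assumptions made for illustration.

```python
import numpy as np

def anm_modes(coords, cutoff=12.0, k=1.0):
    """Build a generic ANM Hessian and return (eigenvalues, eigenvectors as columns)."""
    n = len(coords)
    hessian = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(i + 1, n):
            d = coords[j] - coords[i]
            r2 = d @ d
            if r2 > cutoff ** 2:
                continue                              # no spring beyond the cutoff
            block = -k * np.outer(d, d) / r2          # off-diagonal 3x3 block
            hessian[3*i:3*i+3, 3*j:3*j+3] = block
            hessian[3*j:3*j+3, 3*i:3*i+3] = block
            hessian[3*i:3*i+3, 3*i:3*i+3] -= block    # diagonal blocks balance the row
            hessian[3*j:3*j+3, 3*j:3*j+3] -= block
    return np.linalg.eigh(hessian)                    # eigenvalues in ascending order

# Toy "structure": 30 pseudo-atoms at random positions (assumed data, in Å).
rng = np.random.default_rng(0)
coords = rng.uniform(0.0, 15.0, size=(30, 3))
eigvals, eigvecs = anm_modes(coords)

# The first six modes are rigid-body translations and rotations,
# so the lowest internal modes are numbers 7, 8, 9 (indices 6, 7, 8).
low_modes = eigvecs[:, 6:9].T
```

This is why mode numbering for excitation conventionally starts at 7: modes 1 to 6 have zero eigenvalue and carry no internal motion.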
The program generates uniformly distributed points on an N-dimensional hypersphere through a repulsion-based algorithm. It employs a physics-inspired approach in which points behave as charged particles confined to the spherical manifold, interacting through a dimensionally scaled potential function. The generated points are then used as scaling factors in normal mode combinations. The PDIM algorithm was originally built for the same purpose. Here, we present a different design with a faster and more concise implementation.
Given an N-dimensional hypersphere
The algorithm considers the geometric optimization as an energy minimization problem, modeling it as a repulsive potential function
The implementation uses an inverse power law potential scaled with dimensionality:
- Harmonic Properties: In N dimensions, the fundamental solution to Laplace's equation scales as $1/r^{N-2}$.
- Volume Scaling: The surface area of an N-dimensional hypersphere scales approximately as $({2 \pi e}/N)^{N/2}$, requiring stronger repulsion in higher dimensions to overcome concentration of measure phenomena.
- Computational Stability: The exponent ensures numerical stability by preventing excessively large or small force values in high dimensions.
The generated points are then used as scalar factors that multiply each normal mode vector used in the linear combination that makes up the excitation direction vector
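The repulsion idea described above can be sketched as follows. This is an illustrative toy, not the pyAdMD implementation: the function name `repel_on_sphere`, the exponent choice `max(dim - 2, 1)`, the step schedule, and the iteration count are all assumptions.

```python
import numpy as np

def repel_on_sphere(n_points, dim, n_steps=500, step=0.05, seed=0):
    """Spread n_points over the unit sphere in R^dim by pairwise repulsion."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n_points, dim))
    x /= np.linalg.norm(x, axis=1, keepdims=True)      # random start on the sphere
    s = max(dim - 2, 1)                                # assumed exponent of a 1/r^s potential
    for t in range(n_steps):
        diff = x[:, None, :] - x[None, :, :]           # pairwise separation vectors
        dist = np.linalg.norm(diff, axis=2)
        np.fill_diagonal(dist, np.inf)                 # no self-interaction
        force = (diff / dist[..., None] ** (s + 2)).sum(axis=1)
        force -= (force * x).sum(axis=1, keepdims=True) * x  # keep the tangential part
        norm = np.linalg.norm(force, axis=1, keepdims=True)
        lr = step * (1.0 - t / n_steps)                # step length shrinks to zero
        x += lr * force / np.maximum(norm, 1e-12)
        x /= np.linalg.norm(x, axis=1, keepdims=True)  # project back onto the sphere
    return x

# Example: 10 replicas combining 3 modes, so 10 points on the sphere in R^3.
points = repel_on_sphere(n_points=10, dim=3)
```

Each row of `points` is a unit vector whose components can serve as the coefficients of one linear combination of modes.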
The additional kinetic energy injected into the system dissipates quickly. Therefore, the program constantly checks the injected energy level and rescales the velocities along the excited direction whenever necessary. The kinetic energy along the normalized excitation vector
where
With this procedure, the system is kept in a continuous excited state, allowing an effective small, "adiabatic-like" energy injection. The energy injection control is done by projecting the velocities computed during the simulation onto the excited vector, thus obtaining and rescaling the kinetic energy corresponding to it.
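The projection-and-rescale step can be sketched as below. This is a minimal illustration under stated assumptions: the excitation vector `q` is taken to be a unit vector in mass-weighted coordinates, the corrected velocity is assumed to point along `+q`, unit conversion to kcal/mol is left out, and the function name `rescale_along` is hypothetical.

```python
import numpy as np

def rescale_along(vel, masses, q, target_ek):
    """Reset the kinetic energy carried along q to target_ek.

    vel: (n_atoms, 3) velocities; masses: (n_atoms,);
    q: (n_atoms, 3) unit vector in mass-weighted coordinates (assumption).
    Returns the corrected velocities and the energy found before correction.
    """
    sqrt_m = np.sqrt(masses)[:, None]
    p = np.sum(sqrt_m * vel * q)            # mass-weighted velocity projection on q
    ek = 0.5 * p ** 2                       # kinetic energy along q, consistent units
    p_new = np.sqrt(2.0 * target_ek)        # projection that carries target_ek along +q
    new_vel = vel + (p_new - p) * q / sqrt_m  # change only the component along q
    return new_vel, ek

# Toy data (assumed): 5 atoms with random masses, velocities, and direction.
rng = np.random.default_rng(1)
masses = rng.uniform(1.0, 16.0, size=5)
vel = rng.normal(size=(5, 3))
q = rng.normal(size=(5, 3))
q /= np.linalg.norm(q)

new_vel, ek_before = rescale_along(vel, masses, q, target_ek=0.125)
_, ek_after = rescale_along(new_vel, masses, q, target_ek=0.125)
```

Only the velocity component along `q` is modified, so the thermal motion in all orthogonal directions is left untouched.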
Since the excitation vector is obtained from the initial conformation, it depends on this configuration. As the system is displaced along this direction and changes its conformation, the motion loses its directionality, mainly due to anharmonic effects. To prevent the structural distortions produced by displacement along a vector that is no longer valid, the program updates the excitation direction based on the trajectory evolution during the previous excitation steps. This procedure allows the system to adaptively find a relaxed path to follow during the next aMDeNM excitations.
If we consider the nth simulation, the next excitation vector,
The second parameter relates to the relative deviation of the vector
A precise rule is followed to decide whether to modify the excitation vector direction after every short simulation run. The excitation vector is changed as soon as
The default values for
At the end of each replica, a de-excitation molecular dynamics run is submitted in order to recover the equilibrated thermodynamics of the system. This final step aims to remove any residual MDeNM additional energy from the system and provide further structural and dynamical exploration.
The Adaptive MDeNM method takes self-computed Cα or heavy atoms ENM modes or CHARMM-computed normal modes as collective variables to improve Molecular Dynamics sampling.
Uses a simplified force field based on particles and springs, computed automatically by the program. A given normal mode (or a linear combination of several modes) is used to excite the system during the molecular dynamics simulation.
Uses physical force-field-based normal modes computed in CHARMM. A given normal mode (or a linear combination of several modes) is used to excite the system during the molecular dynamics simulation.
pyAdMD is distributed as a Python handler script that computes ENM modes, uniformly distributes linear combinations of modes in the N-dimensional hypersphere space, manages NAMD simulations, computes the projections along the excitation direction, and applies the correction whenever necessary. An additional tools folder includes scripts not used by NAMD, such as the one that writes out the CHARMM normal modes.
One can easily set up and run an Adaptive MDeNM simulation by using this script. The configuration process is straightforward. Some technical aspects are covered in this section to make the method easier to understand.
The excitation time of Adaptive MDeNM is 0.2 ps. This means that every 0.2 ps the system receives the additional amount of energy defined by the user. Therefore, when studying large scale motions, it is advised to inject small amounts of energy in order to avoid structural distortions caused by an excessive energy injection. Usually, an excitation energy of 0.125 kcal/mol is sufficient to achieve a large exploration of the conformational space (0.5 kcal/mol if Cα-only ENM).
The total simulation time may require tuning depending on the system size, the energy injection, and the nature of the motion being excited. For a large scale global motion, there is a trade-off between the energy injection and the total simulation time. Larger amounts of energy allow a shorter simulation time; however, this may not be advisable, as discussed above.
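To get a feeling for how the excitation time and the total simulation time relate, the short calculation below counts the excitation cycles in a run. It only uses the 0.2 ps excitation time stated above; the helper name `n_excitations` is hypothetical.

```python
EXCITATION_PERIOD_PS = 0.2  # excitation time stated in this document

def n_excitations(total_time_ps):
    """Number of excitation cycles that fit in a run of the given length."""
    return round(total_time_ps / EXCITATION_PERIOD_PS)

for total_time in (100, 250):
    print(f"{total_time} ps -> {n_excitations(total_time)} excitation cycles")
```

With the default 250 ps, the energy level along the excitation direction is therefore re-evaluated 1250 times per replica, which is why small per-cycle energies are usually sufficient.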
As described above, the direction is updated after the system has traveled a distance of 0.5 Å along the excitation vector and its real displacement deviates by 60° from the theoretical one. The update can also be affected by the amount of energy injected, since higher energy values lead to larger motions. In addition, after each correction the new vector loses directionality due to anharmonic effects. This means that, at a given point, the new vectors are so diffuse that there is no point in continuing the simulation. When this point is reached, it is necessary to recompute the normal modes and start again. This is one more reason not to inject high energy values and to let the system undergo the changes slowly. Alternatively, one can recompute the ENM modes instead of changing the excitation vector direction (only when the original model type is ENM).
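The update criterion can be illustrated with a small check on two vectors. This is a sketch based only on the wording above: combining the two thresholds with a logical "and", measuring the distance as a plain (not mass-weighted) projection, and the function name `should_update` are all assumptions.

```python
import math

def should_update(displacement, excitation, min_dist=0.5, max_angle_deg=60.0):
    """Decide whether the excitation direction should be refreshed.

    displacement: actual displacement since the last update (flat sequence, in Å)
    excitation:   current excitation vector (flat sequence, same length)
    """
    norm_d = math.sqrt(sum(d * d for d in displacement))
    norm_q = math.sqrt(sum(q * q for q in excitation))
    if norm_d == 0.0 or norm_q == 0.0:
        return False                                   # nothing to compare yet
    dot = sum(d * q for d, q in zip(displacement, excitation))
    travelled = dot / norm_q                           # distance along the excitation vector
    cos_angle = max(-1.0, min(1.0, dot / (norm_d * norm_q)))
    angle = math.degrees(math.acos(cos_angle))         # deviation from the ideal direction
    return travelled >= min_dist and angle >= max_angle_deg

aligned = should_update([1.0, 0.0, 0.0], [1.0, 0.0, 0.0])   # far enough, no deviation
drifted = should_update([1.0, 2.0, 0.0], [1.0, 0.0, 0.0])   # far enough, about 63 degrees off
```

Here `aligned` is false because the motion still follows the excitation vector, while `drifted` is true because both thresholds are exceeded.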
The program computes the excitation direction as a linear combination of the supplied normal modes. This implies that the more modes are provided, the more replicas are needed to cover the hyperspace described by these modes.
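As a minimal sketch of what such a linear combination looks like, the snippet below mixes three toy modes with a set of coefficients and normalizes the result. The function name `excitation_direction`, the random orthonormal "modes", and the coefficient values are assumptions for illustration only.

```python
import numpy as np

def excitation_direction(modes, coeffs):
    """Combine modes (one per row, shape (k, 3N)) with k coefficients into a unit vector."""
    q = np.asarray(coeffs, dtype=float) @ np.asarray(modes, dtype=float)
    return q / np.linalg.norm(q)

# Toy orthonormal "modes" for a 4-atom system (3N = 12), built by QR factorization.
rng = np.random.default_rng(2)
modes = np.linalg.qr(rng.normal(size=(12, 3)))[0].T    # shape (3, 12)

coeffs = np.array([0.6, 0.0, 0.8])                     # one point on the unit sphere in R^3
q = excitation_direction(modes, coeffs)

# Projecting q back onto the modes recovers the coefficients.
recovered = modes @ q
```

Each replica uses a different coefficient vector, so with k modes the replicas sample directions on a sphere in k dimensions, which grows quickly with k.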
Create an atom selection to apply the energy injection using MDAnalysis selection language. Must be written between quotes.
- -psf/--psffile: PSF structure file containing system molecule-specific information (required)
- -pdb/--pdbfile: PDB structure file in Protein Data Bank format (required)
- -xsc/--xscfile: NAMD eXtended System Configuration file (required)
- -coor/--coorfile: NAMD binary coordinates file (required)
- -vel/--velfile: NAMD binary velocities file (required)
- -str/--strfile: CHARMM-style stream file, which contains force field parameters and definitions (required)
- -mod/--modefile: Binary file containing CHARMM normal mode vectors (optional. Required only if -type = CHARMM)
- -type/--modeltype: Normal modes model type: CA ENM, HEAVY atoms ENM, or CHARMM (required. Default: CA)
- -nm/--modes: Normal modes to excite (optional. Default: 7,8,9)
- -ek/--energy: Excitation energy injection (optional. Default: 0.125 kcal/mol)
- -t/--time: Simulation time (optional. Default: 250 ps)
- -sel/--selection: Atom selection to apply the energy injection (optional. Default: "protein")
- -rep/--replicas: Number of replicas to run (optional. Default: 10)
- -n/--no_correc: Disable excitation vector direction correction and compute standard MDeNM
- -f/--fixed: Disable excitation vector correction and keep constant excitation energy injections
- -r/--recalc: Recompute ENM modes instead of correcting the excitation vector direction
-t/--time: Simulation time (optional. Default: 250 ps)
-r/--rough: Perform rough analysis (optional. Analyze every 5 ps instead of every frame.)
The PyAdMD analysis module provides comprehensive analysis capabilities for molecular dynamics simulations performed using the aMDeNM method. This module processes simulation trajectories and generates detailed structural analysis, visualizations, and summary reports.
- Root Mean Square Deviation (RMSD): Measures structural deviation from the initial conformation.
- Radius of Gyration (RoG): Measures the compactness of the protein structure. Useful for identifying folding/unfolding events.
- Solvent Accessible Surface Area (SASA): Calculates the surface area accessible to solvent molecules. Uses the Shrake-Rupley algorithm implemented in Bio.PDB.
- Hydrophobic Exposure: Measures the percentage of hydrophobic residues exposed to solvent. Useful for identifying folding/unfolding events.
- Root Mean Square Fluctuation (RMSF): Calculates per-residue flexibility using Cα atoms. Identifies flexible and rigid regions in the protein structure.
- Secondary Structure Content: Calculates secondary structure elements using DSSP. Tracks helix, sheet, coil, turn, and other structural elements over time and reports the number of residues in each secondary structure type.
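For reference, three of the metrics listed above reduce to short formulas. The snippet below is a plain NumPy illustration of those formulas, not the code of the analysis module; the function names are hypothetical, and the coordinates are assumed to be already superposed onto the reference where that matters.

```python
import numpy as np

def rmsd(coords, ref):
    """Root mean square deviation between two (n_atoms, 3) arrays, without fitting."""
    return np.sqrt(np.mean(np.sum((coords - ref) ** 2, axis=1)))

def radius_of_gyration(coords, masses):
    """Mass-weighted radius of gyration of an (n_atoms, 3) array."""
    com = np.average(coords, axis=0, weights=masses)
    sq_dist = np.sum((coords - com) ** 2, axis=1)
    return np.sqrt(np.average(sq_dist, weights=masses))

def rmsf(traj):
    """Per-atom fluctuation for a (n_frames, n_atoms, 3) trajectory, assumed superposed."""
    mean = traj.mean(axis=0)
    return np.sqrt(np.mean(np.sum((traj - mean) ** 2, axis=2), axis=0))

# Toy data (assumed): two atoms, two frames.
ref = np.array([[-1.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
shifted = ref + np.array([3.0, 4.0, 0.0])
traj = np.array([ref, ref + np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]])])

rmsd_value = rmsd(shifted, ref)                       # every atom moved by 5 Å
rog_value = radius_of_gyration(ref, np.ones(2))       # atoms 1 Å from the center
rmsf_values = rmsf(traj)                              # first atom static, second mobile
```

The RMSD here is computed without superposition, so a rigid shift shows up as a deviation; real trajectory analysis normally aligns each frame to the reference first.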
- Analyzes every frame of the trajectory
- Provides the highest resolution data
- May be computationally intensive
- Analyzes frames at 5ps intervals
- Significantly reduces computation time
- Suitable for quick overviews or large systems
The analysis module reads simulation parameters from the pyAdMD_params.json file, which includes:
- Number of replicas
- Total simulation time
- Atom selection criteria
- Input file paths
analysis/
├── analysis_results.csv # Combined analysis data from all replicas
├── rmsf.csv # Combined RMSF data from all replicas
├── analysis_summary.html # HTML summary report
├── rmsd_plot.png # RMSD plot for all replicas
├── radius_gyration_plot.png # Radius of gyration plot
├── sasa_plot.png # SASA plot
├── hydrophobic_exposure_plot.png # Hydrophobic exposure plot
├── rmsf_average.png # Average RMSF plot
├── secondary_structure_average.png # Average secondary structure plot
└── rep[1-N]/ # Replica-specific directories
    ├── analysis_results.csv # Replica-specific analysis data
    ├── rmsf.csv # Replica-specific RMSF data
    ├── rmsd_plot.png # Replica-specific RMSD plot
    ├── radius_gyration_plot.png # Replica-specific RoG plot
    ├── sasa_plot.png # Replica-specific SASA plot
    ├── hydrophobic_exposure_plot.png # Replica-specific hydrophobic exposure plot
    ├── rmsf_plot.png # Replica-specific RMSF plot
    └── secondary_structure.png # Replica-specific secondary structure plot
- CSV Files
  - analysis_results.csv: Time-series data for RMSD, RoG, SASA, hydrophobic exposure, and secondary structure content
  - rmsf.csv: Per-residue RMSF values for all replicas
- Plot Files
  - Individual property plots for each replica
  - Combined plots showing all replicas
  - Average plots across all replicas
- HTML Summary
  - Interactive summary report with tables and embedded plots
  - Statistics for each replica and averages across all replicas
  - Easy navigation and visualization of results
Furthermore, some basic analyses are written inside each replica folder at the end of the simulation:
- coor-proj.out: projection of the MD coordinates onto the normal mode space described by the excitation vector
- rms-proj.out: the system RMSD displacement along the excitation vectors
- vp-proj.out: projection of the MD velocities onto the normal mode space described by the excitation vector
- ek-proj.out: displays the additional kinetic energy at each MD step
Example files are available in the tutorial folder (T4-lysozyme). We encourage users to try the different usages of pyAdMD with these files to get familiar with the method.
python pyAdMD.py run -type CHARMM -mod tutorial/modes.mod -psf tutorial/setup.psf -pdb tutorial/system.pdb -coor tutorial/system.coor -vel tutorial/system.vel -xsc tutorial/system.xsc -str tutorial/system.str
python pyAdMD.py run -type CA -psf tutorial/setup.psf -pdb tutorial/system.pdb -coor tutorial/system.coor -vel tutorial/system.vel -xsc tutorial/system.xsc -str tutorial/system.str -nm 7,15,20 -ek 0.5 -t 100 -sel "protein and resid 1 to 60" -rep 60
python pyAdMD.py run -type HEAVY -psf tutorial/setup.psf -pdb tutorial/system.pdb -coor tutorial/system.coor -vel tutorial/system.vel -xsc tutorial/system.xsc -str tutorial/system.str --no_correc
python pyAdMD.py restart
python pyAdMD.py append -t 100
python pyAdMD.py analyze -r
python pyAdMD.py clean
A conda environment can be easily set up with the provided pyAdMD.yaml file containing the necessary Python dependencies, which are:
- python=3.12
- numpy=2.2.5
- scipy=1.16.0
- mdanalysis=2.9.0
- matplotlib=3.10.0
- openmm=8.3.1
- numba=0.61.2
- cupy=13.6. (optional, for GPU-accelerated ENM calculation. Note: CuPy requires a matching CUDA toolkit.)
- dssp=4.x (for secondary structure analysis; refer to the DSSP official GitHub repository for build details.)
Type the following command to create the environment:
conda env create -f pyAdMD.yaml
Please cite the following paper if you are using any Adaptive MDeNM application in your work:
If you experience a bug or have any doubt or suggestion, feel free to contact: