Efficiently running MLIPs on Database Structures #3

@mstapelberg

Description

Why this is needed:
For active learning (AL), we need to identify and quantify the uncertainty in our current models. This uncertainty guides where we add new training data.

For a given model, or set of models (ensemble approach), we would like to:

  1. Select structures within the database
  2. Run an MLIP calculation on them (usually a static calculation)

Composition Exploration Workflow:

  1. Identify holes in the composition space within the database
  2. Create structures with those compositions
  3. Run MCMC (using an MLIP) to get a more realistic SRO within the system
  4. Run 1 ps long MD with NVT ensemble using MLIP at 1000K, 2000K, and 3000K
  5. Ignore the first 1-2 structures, then sample every other structure for a total of 5 structures, including the final structure
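
A minimal sketch of the frame selection in step 5, under one possible reading: walk backwards from the final frame with a stride of 2, never touch the skipped leading frames, and stop once 5 frames are collected. The function name `select_frames` and its parameters are hypothetical and not part of forge.

```python
from typing import List


def select_frames(n_frames: int, n_skip: int = 2, stride: int = 2, n_keep: int = 5) -> List[int]:
    """Return ascending frame indices to keep from one MD trajectory.

    Assumption: frames are picked backwards from the last one so that the
    final structure is always included, and indices below n_skip are ignored.
    """
    if n_skip < 0 or stride < 1 or n_keep < 1:
        raise ValueError("n_skip must be >= 0, stride and n_keep must be >= 1")
    if n_frames <= n_skip:
        return []  # nothing left after discarding the leading frames
    # Last index first, stepping back by `stride`, stopping before the skipped frames.
    picked = list(range(n_frames - 1, n_skip - 1, -stride))[:n_keep]
    return sorted(picked)


if __name__ == "__main__":
    # Hypothetical trajectory with 12 saved frames.
    print(select_frames(12))
```

If a trajectory is too short to yield 5 frames, this sketch returns fewer rather than relaxing the stride; whether that is the desired behaviour is an open design choice.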

Adversarial Attack Workflow:

  1. Calculate the force variance on every structure in the dataset using ensemble of models
  2. Select the 25-100 structures with the highest uncertainty
  3. Compute adversarial attacks, running 25 steps or until the variance is maximized
  4. Save a structure every 3-5 steps
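
Steps 1 and 2 above could be sketched as follows. This assumes the ensemble forces for one structure are already available as an array of shape `(n_models, n_atoms, 3)`, and it takes the per-atom variance to be the mean squared deviation from the ensemble mean, summed over Cartesian components. Both that definition and the function names are assumptions for illustration, not the existing implementation; how per-atom values are reduced to one score per structure (here, the maximum) is also a choice.

```python
from typing import List, Sequence

import numpy as np


def force_variance(forces: np.ndarray) -> np.ndarray:
    """Per-atom force variance across an ensemble.

    forces: array of shape (n_models, n_atoms, 3).
    Returns an array of shape (n_atoms,).
    """
    forces = np.asarray(forces, dtype=float)
    if forces.ndim != 3 or forces.shape[-1] != 3:
        raise ValueError("forces must have shape (n_models, n_atoms, 3)")
    # Deviation of each model's prediction from the ensemble mean.
    deviation = forces - forces.mean(axis=0, keepdims=True)
    # Sum squared deviation over x, y, z, then average over models.
    return (deviation ** 2).sum(axis=-1).mean(axis=0)


def structure_score(forces: np.ndarray) -> float:
    """Reduce per-atom variance to one number per structure (assumed: max over atoms)."""
    return float(force_variance(forces).max())


def select_most_uncertain(scores: Sequence[float], k: int) -> List[int]:
    """Indices of the k highest scores, highest first."""
    order = np.argsort(np.asarray(scores, dtype=float))[::-1]
    return order[:k].tolist()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical data: 10 structures, 4 models, 8 atoms each.
    scores = [structure_score(rng.normal(size=(4, 8, 3))) for _ in range(10)]
    print(select_most_uncertain(scores, k=3))
```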

Both of these workflows require loading structures and computing the force variance on them. It is probably best to batch this, so that we are not loading tens of thousands of structures at once.
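
One way to batch the loading is a small generator that yields fixed-size chunks from any iterable of structure IDs, so only one batch is in memory at a time. This is a sketch: `batched` is a hypothetical helper, and the database query and model evaluation are left out because their interfaces are not specified here.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield successive lists of at most batch_size items from items."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


if __name__ == "__main__":
    # Hypothetical structure IDs; in practice these would come from the database.
    structure_ids = range(10)
    for batch in batched(structure_ids, batch_size=4):
        # Load only this batch, compute force variance, store the scores, then move on.
        print(batch)
```

Because the input is consumed lazily, the same helper works with a database cursor or any other iterator without materializing the full ID list.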

The adversarial attack has not been implemented yet, but I have a past implementation that I can port over. Clustering and composition analysis are implemented, and we have MD (forge/forge/workflows/md.py), but the two have not been combined yet.

Metadata

Labels

enhancement: New feature or request