
Add experimental.graph2mat architecture #979

Open
pfebrer wants to merge 5 commits into main from graph2mat

Conversation

pfebrer (Contributor) commented Dec 15, 2025

From the creators of experimental.mace...

This one is quite experimental and I'm not sure it will ever aim to become stable; let's see.

What it is useful for

This architecture predicts sparse matrices in a spherical basis from the output of any model.

Implementation

The idea of the architecture is simple enough: it takes any architecture in metatrain, asks it for a spherical per-atom output, and then applies graph2mat on top of it. This is basically what I had in mind when I developed graph2mat, so I'm very happy that metatrain standardizes everything in a way that makes implementing this "trivial" (you know, once you have spent a whole year in COSMO 😆).

The architecture supports multiple matrix targets, and a separate graph is built for each target, since different matrices can have different sparsity patterns. As a consequence, the graph used to construct each matrix is not shared with the base (featurizer) model: the base model can have a higher, lower or adaptive cutoff for neighbors, or even not use a graph at all.
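
The wiring is roughly the following (a minimal Python sketch with hypothetical names such as Graph2MatWrapper, matrix_readout and matrix_graph; this is not the actual metatrain/graph2mat API):

import torch


class Graph2MatWrapper(torch.nn.Module):
    """Wrap any featurizer that produces spherical per-atom features."""

    def __init__(self, featurizer: torch.nn.Module, matrix_readout: torch.nn.Module):
        super().__init__()
        self.featurizer = featurizer          # any metatrain architecture
        self.matrix_readout = matrix_readout  # graph2mat-style matrix block generator

    def forward(self, systems, matrix_graph):
        # 1) ask the base model for spherical per-atom features
        node_features = self.featurizer(systems)
        # 2) the matrix graph (one per target) encodes the target's sparsity
        #    pattern, independently of whatever graph the featurizer uses
        return self.matrix_readout(node_features, matrix_graph)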

The main point of friction with metatrain is that graph2mat works with a completely flattened array as the batch (due to the sparsity/raggedness of the target), while metatrain is in general more suited to uniform targets (e.g. the supported TargetInfos). For now I solved the problem by using a DiskDataset that makes metatrain happy, and then converting to a graph2mat batch in a callable hooked into the collate function.
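
To make the raggedness concrete, here is a self-contained toy example (illustrative only, not the PR's actual collate code): matrix targets have a different number of entries per system, so they are naturally stored as one flat array plus the bookkeeping needed to split them back per system.

import torch


def regroup_flat_target(flat_values: torch.Tensor, entries_per_system: list) -> list:
    """Split a flattened, ragged target back into one tensor per system."""
    return list(torch.split(flat_values, entries_per_system))


# Toy batch: three systems with 4, 2 and 3 matrix entries each.
flat = torch.arange(9, dtype=torch.float64)
print([t.shape for t in regroup_flat_target(flat, [4, 2, 3])])
# -> [torch.Size([4]), torch.Size([2]), torch.Size([3])]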

I tested this with soap_bpnn, PET and MACE and it is working fine.

Future perspective

There is a case for thinking that the architecture is unnecessary, since one can just add graph2mat as a head in the architectures where it makes sense. However, I think it is good to first test things in this experimental architecture because things are probably going to change fast.

There are also architectures for which other approaches are probably much more efficient, and where using graph2mat for matrices would make things unnecessarily complex. For example, @jwa7 is working on a more native way of doing this in PET.

Still, since graph2mat is very modular and easy to modify, it is nice to have it around to quickly test new approaches before moving on to modifying the other architectures (a non-goal of metatrain, I know haha).

Things missing

For a proof of concept, I made up a target type (basis) that allows me to play with things. This will be changed to match the target type that Joe is using in his PET implementation, since for graph2mat the target type is, after all, just a tool to trick metatrain into letting it run. Therefore, this PR is likely to stay as a draft until Joe finishes his implementation.

Generating inputs

The architecture mainly requires two non-trivial inputs: the disk dataset and the basis specification. Both will be creatable with graph2mat tools, although the disk dataset will be general enough that it could be generated with any other tool.

To test it

[not tested on GPU, will test soon!]

The architecture can be tested with this subset of 100 QM9 structures: https://drive.google.com/file/d/1gV4QP4ZwW_BDXdSe0K-UPvu2G3NPg2Nt/view?usp=sharing, which contains the density_matrix, hamiltonian, energy_density_matrix and overlap.

Then run mtt train with the typical options yaml:

architecture:
  name: experimental.graph2mat
  model:
    # Graph2mat model options
    basis_yaml: qm9_basis.yaml
    basis_grouping: basis_shape
    # Featurizer options; these are the same as the full architecture options
    # used to train a model, i.e. you can have featurizer_architecture.model
    featurizer_architecture:
      name: soap_bpnn
  training:
    batch_size: 10
    checkpoint_interval: 20
    optimizer_kwargs:
      lr: 0.005
    loss: mae

training_set:
  systems:
    read_from: qm9_100.zip
    length_unit: angstrom
  targets:
    density_matrix:
      type: basis
    # Uncomment the following to train on more targets
    #hamiltonian:
    #  type: basis
    #overlap:
    #  type: basis
    #energy_density_matrix:
    #  type: basis
  # This is needed because, to rearrange things in the collate function, we
  # need to know the system indices (see the sketch after this config).
  extra_data:
    system_index:
      type: scalar
      per_atom: False

validation_set: 0.2
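
For reference, this is one way the per-system system_index extra data could be laid out as a metatensor TensorMap (a hedged sketch; the dimension names and single-block layout are my assumptions, not taken from the PR): a single scalar per system holding its original index in the dataset.

import torch
from metatensor.torch import Labels, TensorBlock, TensorMap

n_systems = 10
block = TensorBlock(
    values=torch.arange(n_systems, dtype=torch.float64).reshape(-1, 1),
    samples=Labels.range("system", n_systems),
    components=[],
    properties=Labels.range("system_index", 1),
)
system_index = TensorMap(keys=Labels.single(), blocks=[block])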

With the qm9_basis.yaml file containing the basis specification:

- type: 9
  R: [1.5940, 1.1662, 1.8989, 1.8989, 1.8989, 1.1957, 1.1957, 1.1957, 1.8989, 1.8989,
    1.8989, 1.8989, 1.8989]
  basis:
  - [2, 0, 1]
  - [2, 1, -1]
  - [1, 2, 1]
  basis_convention: siesta_spherical
- type: 8
  R: [1.7490, 1.3119, 2.0835, 2.0835, 2.0835, 1.3451, 1.3451, 1.3451, 2.0835, 2.0835,
    2.0835, 2.0835, 2.0835]
  basis:
  - [2, 0, 1]
  - [2, 1, -1]
  - [1, 2, 1]
  basis_convention: siesta_spherical
- type: 7
  R: [1.9495, 1.5182, 2.2650, 2.2650, 2.2650, 1.5373, 1.5373, 1.5373, 2.2650, 2.2650,
    2.2650, 2.2650, 2.2650]
  basis:
  - [2, 0, 1]
  - [2, 1, -1]
  - [1, 2, 1]
  basis_convention: siesta_spherical
- type: 1
  R: [2.4919, 1.9896, 2.4919, 2.4919, 2.4919]
  basis:
  - [2, 0, 1]
  - [1, 1, -1]
  basis_convention: siesta_spherical
- type: 6
  R: [2.1635, 1.7712, 2.5773, 2.5773, 2.5773, 1.8389, 1.8389, 1.8389, 2.5773, 2.5773,
    2.5773, 2.5773, 2.5773]
  basis:
  - [2, 0, 1]
  - [2, 1, -1]
  - [1, 2, 1]
  basis_convention: siesta_spherical
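
The file above can be written by hand, but it is also easy to generate programmatically, e.g. with plain PyYAML (an illustration assuming the exact format shown above; graph2mat's own tools can also produce it):

import yaml

# One entry per atomic type; the values here are copied from the hydrogen entry above.
basis_spec = [
    {
        "type": 1,
        "R": [2.4919, 1.9896, 2.4919, 2.4919, 2.4919],
        "basis": [[2, 0, 1], [1, 1, -1]],
        "basis_convention": "siesta_spherical",
    },
    # ... add the entries for types 6, 7, 8 and 9 in the same way
]

with open("qm9_basis.yaml", "w") as f:
    yaml.safe_dump(basis_spec, f, sort_keys=False)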

Hope you think this is nice, and looking forward to having this one merged :)

Contributor (creator of pull-request) checklist

  • Add your architecture to the experimental or stable folder
    (src/metatrain/experimental/<architecture_name>). See the architecture life
    cycle document (docs/src/dev-docs/architecture-life-cycle.rst) for requirements.
  • Document and provide defaults for the hyperparameters of your model.
  • Add tests for your architecture. See https://docs.metatensor.org/metatrain/latest/dev-docs/new-architecture.html#testing-tests
  • Add a test run to the CI (file .github/workflows/architecture-tests.yml)
  • Add a new dependencies entry in the optional-dependencies section in the
    pyproject.toml
  • Add maintainers as codeowners in CODEOWNERS
  • Trigger a GPU test by asking a maintainer to comment "cscs-ci run".

Reviewer checklist

New experimental architectures

  • Capability to fit at least a single quantity and predict it, verified through CI
    tests.
  • Compatibility with JIT compilation using TorchScript (https://pytorch.org/docs/stable/jit.html).
  • Provision of reasonable default hyperparameters.
  • A contact person designated as the maintainer, mentioned in __maintainers__ and the CODEOWNERS file
  • All external dependencies must be pip-installable. While not required to be on
    PyPI, a public git repository or another public URL with a repository is acceptable.

New stable architectures

  • Provision of regression prediction tests with a small (not exported) checkpoint
    file.
  • Comprehensive architecture documentation
  • If an architecture has external dependencies, all must be publicly available on
    PyPI.
  • Adherence to the standard output infrastructure of metatrain, including
    logging and model save locations.

📚 Documentation preview 📚: https://metatrain--979.org.readthedocs.build/en/979/
