Add PhACE architecture #921

Open · frostedoyster wants to merge 303 commits into main from phace-hartmut

Conversation

frostedoyster (Collaborator) commented Nov 18, 2025

Adds the PhACE architecture (supersedes #434, which was closed).

Contributor (creator of pull-request) checklist

  • Add your architecture to the experimental or stable folder, i.e.
    src/metatrain/experimental/<architecture_name>. See the
    [Architecture life cycle](docs/src/dev-docs/architecture-life-cycle.rst)
    document for requirements.
  • Document and provide defaults for the hyperparameters of your model.
  • Add your architecture to the CI in .github/workflows/architecture-tests.yml
  • Add a new dependencies entry in the optional-dependencies section in the
    pyproject.toml
  • Add tests (a minimal invariance sketch follows this checklist):
    • checking that the code is compatible with TorchScript
    • checking the basic functionality (invariance, fitting, prediction)
    • checking that the checkpoints are properly versioned (see the existing
      test_checkpoint.py in other architectures)
  • Add maintainers as codeowners in CODEOWNERS
  • Trigger a GPU test by asking a maintainer to comment "cscs-ci run".
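As promised in the tests item above, here is a minimal sketch of the kind of rotational-invariance check meant there, using a toy stand-in model rather than metatrain's actual test fixtures:

import torch

class ToyInvariantModel(torch.nn.Module):
    # stand-in for a real architecture: the sum of pairwise distances is
    # rotation-invariant by construction
    def forward(self, positions: torch.Tensor) -> torch.Tensor:
        return torch.cdist(positions, positions).sum()

def test_rotational_invariance():
    model = ToyInvariantModel()
    positions = torch.randn(10, 3, dtype=torch.float64)
    # random proper rotation: QR of a Gaussian matrix, determinant fixed to +1
    q, _ = torch.linalg.qr(torch.randn(3, 3, dtype=torch.float64))
    if torch.linalg.det(q) < 0:
        q = -q  # negating a 3x3 orthogonal matrix flips its determinant
    torch.testing.assert_close(model(positions), model(positions @ q.T))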

Reviewer checklist

New experimental architectures

  • Capability to fit at least a single quantity and predict it, verified through CI
    tests.
  • Compatibility with JIT compilation using
    [TorchScript](https://pytorch.org/docs/stable/jit.html).
  • Provision of reasonable default hyperparameters.
  • A contact person designated as the maintainer, mentioned in __maintainers__
    and the CODEOWNERS file.
  • All external dependencies must be pip-installable. They are not required to
    be on PyPI; a public git repository or another public URL hosting the
    package is acceptable.

📚 Documentation preview 📚: https://metatrain--921.org.readthedocs.build/en/921/

frostedoyster (Collaborator, Author) commented:

cscs-ci run

pfebrer (Contributor) previously requested changes on Dec 6, 2025:

Sorry for more change requests :)

class RadialBasisHypers(TypedDict):
    """In some systems and datasets, enabling long-range Coulomb interactions
    might be beneficial for the accuracy of the model and/or
    its physical correctness."""
Contributor:

These are not the long range hypers :)

frostedoyster (Collaborator, Author):

Thanks!

"""Metric used to select the best model checkpoint."""

loss: str | dict[str, LossSpecification] = "mse"
"""Loss function used for training."""
Contributor:
Can we copy the trainer hypers that PhACE shares with PET over from the PET trainer, so that they come with the more extensive documentation?

This would also rename fixed_composition_weights to atomic_baseline, so the trainer and the checkpoint will require changes.

pfebrer (Contributor) commented Dec 6, 2025:

> Just dropping in to say that I have had some quite bad experiences with torch_geometric as a dependency and I would +1 avoiding it. A graph dataloader is indeed useful, but it's also very hard to make generic. I think it's not complex enough to warrant generalisation.

Ok, I just think that if there's no reason for the PhACE dataloader to be different from the MACE one (which is torch_geometric; they just copied the necessary code from the torch_geometric package), they should be the same/compatible. I can't think right now of an experiment where both models are trained simultaneously, but it could be something interesting in the future, and sharing batches would make things more efficient.

Not that they need to be the same for the PR to be merged, but something that I think would be nice to keep in mind for the future.
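For readers unfamiliar with the batching scheme both comments refer to, a minimal sketch of the torch_geometric-style collate (names are illustrative, not metatrain's or MACE's actual API): graphs are concatenated into one large disconnected graph, edge indices are shifted by the running node count, and a batch vector maps each node back to its graph.

import torch

def collate_graphs(graphs: list) -> dict:
    # each graph: {"positions": (N, 3) tensor, "edge_index": (2, E) tensor}
    positions, edge_indices, batch = [], [], []
    node_offset = 0
    for i, graph in enumerate(graphs):
        n_nodes = graph["positions"].shape[0]
        positions.append(graph["positions"])
        # shift edge indices so they point into the concatenated node array
        edge_indices.append(graph["edge_index"] + node_offset)
        batch.append(torch.full((n_nodes,), i, dtype=torch.long))
        node_offset += n_nodes
    return {
        "positions": torch.cat(positions),
        "edge_index": torch.cat(edge_indices, dim=1),
        "batch": torch.cat(batch),  # node -> graph index, used for pooling
    }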

jwa7 (Member) left a comment:
A couple of comments for now. Looking great though!

might be beneficial for the accuracy of the model and/or
its physical correctness."""

max_eigenvalue: float = 25.0
Member:
As this needs tuning with respect to the max L of the target, some heuristics on how to set it would be helpful. For instance, with all other parameters at their defaults, I found I had to set max_eigenvalue: 200.0 to reach L=8.

Contributor:
IIRC the Laplacian-eigenvalues paper uses n_max as the only parameter, and then finds l_max as a consequence. That is far more user-friendly than asking for a mysterious eigenvalue. IDK if this also applies with the physical weighting, but it should.

Member:
Thoughts on this one, @frostedoyster?

frostedoyster (Collaborator, Author):
I don't think it's a good idea, because there are many exact truncations that you can't get (or that become ambiguous) if you move to n_max or l_max. I would rather add more documentation to the eigenvalue.
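As a starting point for that documentation, here is a sketch of how an eigenvalue cutoff selects (l, n) pairs, assuming the Laplacian-eigenvalue construction in which the eigenvalue of basis function (l, n) is the squared n-th zero of the spherical Bessel function j_l; PhACE's exact definition, including the physical weighting mentioned above, may differ.

import numpy as np
from scipy.optimize import brentq
from scipy.special import spherical_jn

def bessel_zeros_below(ell: int, x_max: float) -> list:
    # positive roots of j_ell below x_max, located by sign-change scanning
    zeros, x, step = [], 1e-3, 0.1
    prev = spherical_jn(ell, x)
    while x < x_max:
        curr = spherical_jn(ell, x + step)
        if prev * curr < 0.0:
            zeros.append(brentq(lambda t: spherical_jn(ell, t), x, x + step))
        x, prev = x + step, curr
    return [z for z in zeros if z < x_max]

def le_selection(max_eigenvalue: float) -> list:
    # all (l, n) pairs with eigenvalue z_{l,n}**2 below the cutoff
    x_max = float(np.sqrt(max_eigenvalue))
    kept, ell = [], 0
    while True:
        zeros = bessel_zeros_below(ell, x_max)
        if not zeros:  # first zeros grow with l, so no higher l qualifies
            break
        kept.extend((ell, n) for n in range(len(zeros)))
        ell += 1
    return kept

# the effective L is then e.g. max(l for l, _ in le_selection(200.0))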

Member:
Yes, I second this - I think we need fast and more minimal CG code outside of featomic. Paolo and I have temporarily included featomic as a dependency for PET-H, as we need it for Blocks2Matrix, but it seems to be a bit of a pain for torch versioning etc. Maybe we should discuss moving the CG subpackage to metatomic.

warmup_fraction: float = 0.01
"""Fraction of training steps for learning rate warmup."""

gradient_clipping: Optional[float] = None
Member:
As discussed - shall we set a default here?

Suggested change:

- gradient_clipping: Optional[float] = None
+ gradient_clipping: Optional[float] = 1.0

frostedoyster (Collaborator, Author):

Yes, I think so. I'm running some more experiments to understand what a good default could be.
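For context, this is how such an Optional[float] hyper is typically consumed in a PyTorch training step; clip_grad_norm_ is standard PyTorch API, while the surrounding loop is a generic sketch rather than metatrain's trainer:

import torch

def training_step(model, batch, loss_fn, optimizer, gradient_clipping=None):
    optimizer.zero_grad()
    loss = loss_fn(model(batch["inputs"]), batch["targets"])
    loss.backward()
    if gradient_clipping is not None:
        # rescale gradients so their global norm is at most gradient_clipping
        torch.nn.utils.clip_grad_norm_(model.parameters(), gradient_clipping)
    optimizer.step()
    return loss.detach()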

frostedoyster (Collaborator, Author):

Just to be clear, we can't use any existing implementation of CG products here, because we use a different form of the CG iterations.
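For reference, the standard form that existing implementations provide contracts two spherical-tensor features with a precomputed table of Clebsch-Gordan coefficients; a generic sketch (PhACE's iterations, per the comment above, take a different form):

import torch

def cg_product(x1: torch.Tensor, x2: torch.Tensor, cg: torch.Tensor) -> torch.Tensor:
    # x1: (..., 2*l1 + 1), x2: (..., 2*l2 + 1),
    # cg: (2*l1 + 1, 2*l2 + 1, 2*l3 + 1) precomputed CG coefficients
    return torch.einsum("...m,...n,mnk->...k", x1, x2, cg)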

model(
    [system],
    model.outputs,
)
Contributor:
For the TorchScript tests, I kept them as they were, but because of small differences between models (like deleting these attributes in your case, or compiling with e3nn's function in MACE), I wonder if it is better to just test the exporting functionality instead of calling torch.jit.script explicitly, since every model will have its own requirements in model.export()
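A sketch of the two testing styles being contrasted; the model argument and the model.export() call are placeholders for each architecture's own interfaces, not a fixed metatrain signature:

import torch

def test_torchscript_direct(model):
    # scripting the raw model: breaks whenever a model needs its own
    # preparation first (deleting attributes, e3nn's compile wrapper, ...)
    scripted = torch.jit.script(model)
    assert isinstance(scripted, torch.jit.ScriptModule)

def test_export_path(model):
    # exercising the export path instead: each architecture already
    # encapsulates its own preparation inside export()
    exported = model.export()
    assert exported is not None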

frostedoyster (Collaborator, Author):

cscs-ci run
