[DNM] add experiment of training torsion energies#163
lilyminium wants to merge 7 commits into main
This is amazing!! Cheers. I'll dive into it now.
Hi Lily, thanks again for doing this - it's been a massive help. I just wanted to verify my understanding and ask a (nooby) question about how the c_ij's are calculated for a given molecule. My current understanding is that for ... How exactly are the c_ij values calculated? Is it a unique representation of the dihedral using e.g. all the local environments of the atoms in the dihedral? Initially I thought maybe _SymmetricPoolingLayer(PoolingLayer)'s _get_pooled_representations was doing it, but that seems to operate on the whole molecule. Is there a function somewhere that takes the 4 indices of a dihedral and generates a c_ij value for it? This is probably a very obvious question, so apologies! I've just started validating all the geometry/maths, and otherwise the notebook you provided looks like it's doing exactly what we want. Cheers.
Hi Ben, sorry for the delay, I was on leave. You're totally right: _SymmetricPoolingLayer(PoolingLayer)'s _get_pooled_representations takes an entire molecule and generates a representation for each dihedral term. These representations get passed through the dense feed-forward layers of the readout module, which learn the c_ij tensor that arrives as input to ...
Hopefully yes, the representations pooled in ...
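To make the pooling step above a bit more concrete, here is a minimal plain-Python sketch of one way a per-dihedral representation can be built from per-atom features. It is not the NAGL code: `pool_dihedrals` and `toy_layer` are hypothetical names, and summing the forward and reversed orderings is an assumption about how the symmetry is enforced.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def pool_dihedrals(
    atom_features: Sequence[Vector],
    dihedrals: Sequence[Tuple[int, int, int, int]],
    layer: Callable[[Vector], Vector],
) -> List[Vector]:
    """Return one pooled representation per dihedral (i, j, k, l).

    Assumption: symmetry is obtained by applying the same layer to the
    atom features concatenated in forward and in reversed order, then
    summing the two outputs.
    """
    pooled = []
    for indices in dihedrals:
        forward = [x for idx in indices for x in atom_features[idx]]
        backward = [x for idx in reversed(indices) for x in atom_features[idx]]
        out_forward = layer(forward)
        out_backward = layer(backward)
        pooled.append([a + b for a, b in zip(out_forward, out_backward)])
    return pooled


def toy_layer(x: Vector) -> Vector:
    # Hypothetical stand-in for a learned layer: a position-weighted sum,
    # chosen only because it is sensitive to the order of its input.
    return [sum((pos + 1) * value for pos, value in enumerate(x))]


# One scalar feature per atom, four atoms, one dihedral listed both ways.
atom_features = [[1.0], [2.0], [3.0], [5.0]]
forward_repr = pool_dihedrals(atom_features, [(0, 1, 2, 3)], toy_layer)
backward_repr = pool_dihedrals(atom_features, [(3, 2, 1, 0)], toy_layer)
```

In this sketch the whole molecule's atom features go in once, and every dihedral gets its own representation that is identical for (i, j, k, l) and (l, k, j, i). A readout network would then map each representation to the trained per-dihedral values.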
Do not merge: this is only a very partial, bare-bones experiment to see how much work it would be to use NAGL to a) train to non-atom properties (not so bad), b) incorporate geometric information (more work...), and c) incorporate another trained parameter (quite a lot). My general conclusion is that b) and c) are likely not worth porting to the overall library, although I'm open to anything as usual.
All the maths is very ad-hoc and absolutely needs double-checking and testing, especially the geometry functions. Everything currently is only implemented with the DGL backend.
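One cheap way to double-check a dihedral function is to run it on geometries whose angle is known by construction. The sketch below is a plain-Python reference, not the implementation in this PR; `dihedral_angle` is a hypothetical name, and it uses the standard atan2 form with a right-handed sign convention, so a sign flip relative to another implementation may just mean a different convention.

```python
import math
from typing import Sequence, Tuple

Point = Sequence[float]


def _sub(a: Point, b: Point) -> Tuple[float, float, float]:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Point, b: Point) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a: Point, b: Point) -> Tuple[float, float, float]:
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral_angle(p0: Point, p1: Point, p2: Point, p3: Point) -> float:
    """Signed dihedral angle in radians for four points p0-p1-p2-p3."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane through p1, p2, p3
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.atan2(y, x)


# Reference geometries: central bond along z, first atom along +x.
cis = dihedral_angle((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
trans = dihedral_angle((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
quarter = dihedral_angle((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
```

Comparing a batched tensor implementation against a scalar reference like this on a handful of random and hand-built conformers catches most sign and index-ordering mistakes.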
cc @BenCree -- there's an example notebook here that runs through a very short example of what I think you and Danny were describing to me in our call, if you're interested. Happy to chat more if you want to discuss!
P.S. the Dataset code was a huge pain and very repetitive -- it probably needs refactoring.
PR Checklist