Item | Notes
---|---
Questions on submission process? | IC – You mentioned that there were problems with iodine-containing molecules. How should we take care of that? <br> DD – We considered adding a step that removes iodine-containing molecules, but haven't built it yet. If you could identify iodine-containing molecules in your README, that would let us filter the datasets. <br> (General) – It will save everyone's time to add a workflow step that removes iodine-containing molecules. <br> AG – I don't see much difficulty in doing a grep to remove iodines. <br> DD – We'll have the developers huddle after this and decide whether to supply a simple `grep \| rm` command or to make a more general substructure filter (a sketch of such a filter appears below the table). <br> AG – Once I have the final set of data to share, I need to go to my lawyers about sharing it; this is regardless of other steps.
Upcoming: Schrodinger command tree demo | AG – Which host is used for FFBuilder jobs? <br> CS – Are these defined in the SCHRODINGER_HOSTS file? <br> AG – Which version of Schrodinger? <br> CS – This is just a wrapper, so if I've pre-computed my FFBuilder jobs, could I just take the results from those? <br> AG – Has someone discussed with Schrodinger about publishing these results? <br> DH – This is on the slides; each partner will need to discuss with Schrodinger before sharing. <br> AG – I'd recommend that we not share these results, either publicly or with Schrodinger, until we've discussed as a group what to do.
Upcoming: Torsiondrive one-shot command | GitHub link: https://github.com/openforcefield/openff-benchmark/pull/69
Discussion: extension of analysis features | AG – Could we look for the global minimum in a QM calculation, and then ask how often there is a low-energy MM conformation within a given RMSD of that minimum? The ΔE and RMSD cutoffs could vary, and the result could be something in two dimensions (see the sketch below the table). <br> GT – Was something like this done in the original Lim+Hahn work? <br> DH – This sounds similar to the original match-minima analysis, with a custom cutoff in the RMSD. <br> AG – The goal would be to identify whether there's an MM local minimum in the neighborhood of a QM minimum. <br> AG – Could either start all optimizations from unoptimized generated conformers, or start MM minimizations at QM minima. <br> BS – Possibility of doing dipole moments? Look for the difference in classical vs. quantum dipole moments; this was discussed last fall. I think the information is in the Psi4 output, but we'd also need to figure out how to get the dipole moments from MM (a sketch is below the table). <br> GT – Wonder if we could extract outliers in the torsional fingerprint. Would be useful internally to identify the worst-offending torsions; then we could share those substructures with OpenFF without sharing the whole molecule. <br> XL – Could look at which SMIRKS correspond to the worst energies compared to QM. <br> DH – That's tentatively planned, as point 5. <br> AG – If we shared some form of this data, it would be valuable to submit it for a subsequent round of FF development.
Discussion: Season 2 | If we ran a season 2, what kinds of questions would you want to answer? <br> XL – More focused analysis on torsions. <br> GT – Interested to know how OpenFF can work for noncovalent interactions: aromatic-aromatic, dimeric fragments, etc. <br> JW – Bulk property fitting? This is planned for Sage. <br> GT – Not exactly bulk properties we're looking at. If you look at what DFT people are doing, they're looking at noncovalent interactions for dimers, aromatic rings, etc. <br> CS – INTRAmolecular interactions, like hydrogen bonds. <br> DH – This hasn't been our interest so far, since we haven't reparameterized. <br> AG – Begdb.org has lots of QM datasets for things like this. <br> LD – IOCHem-bd.org. <br> Coordinated season or rolling development? <br> CS + TF + XL – We like deadlines and discrete seasons. <br> AG – Solvation free energies? <br> AG – Conformation generation for macrocycles. <br> XL + CS – Agree. <br> JW – Conformer generation is outside our scope at the moment, but we could couple with an existing method and do the ranking better. <br> CS – Could run high-temperature MD. <br> Parameters for season 2? <br> DD – For example: number of molecules, optimizations, torsion drives, which FFs, ML potentials? <br> BS – Some measure of the stiffness of molecules at low-energy conformations, so some analysis around entropy, vibrational frequencies, Hessians, etc. If molecules are floppier they may bind better, but if they're stiff they may not. <br> CS – In terms of dataset size, I liked having a range of dataset sizes (100-1000 molecules). Also ML potentials. <br> DD – Dataset size? <br> AG – We've run larger sets; I can report on the differences in distribution later. <br> CS – I ran my 1000 and it took longer than expected, so that's a good limit. <br> XL – We could go bigger, but we're also OK with 1000. <br> DH – If we include torsiondrives, things will get a lot more expensive. <br> CS – Could counteract this by having more constraints on dataset composition (like molecule size/rotatable bonds) to keep torsiondrives manageable. <br> TF – Could we have a tool to select diverse torsions from a larger set? I narrowed down my set using random selection, but a diversity filter would be good. <br> CS – Same; agree. <br> XL – Could pick molecules that use a maximum diversity of FF parameters (a selection sketch appears below the table). <br> TF – Random selection of 1000 from a million should be quite diverse. <br> AG – Random selection will probably miss sampling torsions involving things like S and F. <br> TF – Having a tool for selection of diverse torsions would be great. <br> DD – Could couple this to the coverage-report step. <br> JW – Is it a concern that we're using the same QM method for training and testing the FF? <br> GT – I'd like to see a comparison of the method we're using to "ground truth". <br> CS + XL – This is a good point; it is somewhat concerning now. <br> DD – Would it be fair to select a different commonly used QM method for comparison? <br> TF – Could have each partner run a small internal benchmark of different QM methods. <br> XL – Could take molecules from the public set and run them with a more detailed method. <br> AG – The DLPNO method is recommended and only has 3x the computational cost. <br> JW – Would running the public set with a more detailed level of theory as well as our default level of theory be appropriate to test this, or would folks also want to do internal tests?
Post: Remaining roadmap for season 1 |
Post: Personnel for season 2 | JW – Personnel assignments are partly up to PIs (for DD and JW), and partly up to Janssen (for DH and LD). <br> DH – Will be participating as a partner in season 2; can advise but won't be doing direct implementation. <br> LD – Will talk to DH and GT; somewhat cautious about doing production coding. <br> JH – As a Cole lab member, I'm interested in including QUBE in benchmarking, so I'm interested in its health overall. <br> DD – This has been enjoyable, and the work has significant impact. I'll keep thinking about this. In general we should continue doing this kind of benchmarking, so maybe we come back to it after a break? E.g., season 2 in the fall, with nicely implemented torsiondrives and other new features. This would also put less pressure on personnel allocation.
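
The "more general substructure filter" option from the submission-process discussion could look roughly like the following. This is a minimal sketch, not part of the actual openff-benchmark workflow: it assumes RDKit is available and that submissions are SDF files; the `INPUT_SDF` and `OUTPUT_SDF` paths are placeholders.

```python
# Sketch: drop any molecule containing iodine from an SDF file.
# Assumes RDKit; the input/output paths are placeholders.
from rdkit import Chem

INPUT_SDF = "submission_molecules.sdf"    # placeholder path
OUTPUT_SDF = "submission_no_iodine.sdf"   # placeholder path

iodine = Chem.MolFromSmarts("[I]")  # swap in any other SMARTS to generalize the filter

supplier = Chem.SDMolSupplier(INPUT_SDF, removeHs=False)
writer = Chem.SDWriter(OUTPUT_SDF)

n_kept = n_dropped = 0
for mol in supplier:
    if mol is None:
        continue  # skip unparsable records
    if mol.HasSubstructMatch(iodine):
        n_dropped += 1
        continue
    writer.write(mol)
    n_kept += 1
writer.close()

print(f"kept {n_kept} molecules, dropped {n_dropped} iodine-containing molecules")
```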
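
For the two-dimensional analysis AG suggested (is there a low-energy MM conformation within a given RMSD of the QM global minimum, swept over ΔE and RMSD cutoffs), a sketch is below. It assumes that, per molecule, the RMSD of each MM-optimized conformer to the QM global-minimum geometry and each MM conformer's relative energy have already been computed; the example numbers are made up.

```python
# Sketch of the (energy cutoff, RMSD cutoff) match-rate grid. Inputs are precomputed:
#   rmsds_per_mol: per molecule, RMSD of each MM conformer to the QM global minimum (Angstrom)
#   rel_energies_per_mol: per molecule, MM energies relative to the MM global minimum (kcal/mol)
import numpy as np

def match_rate_grid(rmsds_per_mol, rel_energies_per_mol, rmsd_cutoffs, energy_cutoffs):
    """Fraction of molecules with at least one MM conformer inside each
    (energy cutoff, RMSD cutoff) box; shape (len(energy_cutoffs), len(rmsd_cutoffs))."""
    grid = np.zeros((len(energy_cutoffs), len(rmsd_cutoffs)))
    n_mols = len(rmsds_per_mol)
    for rmsds, energies in zip(rmsds_per_mol, rel_energies_per_mol):
        rmsds = np.asarray(rmsds)
        energies = np.asarray(energies)
        for i, e_cut in enumerate(energy_cutoffs):
            for j, r_cut in enumerate(rmsd_cutoffs):
                if np.any((energies <= e_cut) & (rmsds <= r_cut)):
                    grid[i, j] += 1
    return grid / n_mols

# Toy example: two molecules with a few MM conformers each (made-up values).
rmsds = [[0.2, 1.1, 0.8], [0.9, 1.5]]   # Angstrom, vs. QM global minimum
rel_e = [[0.0, 1.8, 3.2], [0.0, 0.4]]   # kcal/mol, vs. MM global minimum
print(match_rate_grid(rmsds, rel_e,
                      rmsd_cutoffs=[0.5, 1.0, 2.0],
                      energy_cutoffs=[1.0, 3.0]))
```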
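
On BS's dipole-moment idea, the MM side of the comparison reduces to the point-charge dipole, μ = Σᵢ qᵢ rᵢ. The sketch below assumes partial charges and an MM-optimized geometry are available from whatever charge model the force field uses; the charges and coordinates shown are made up.

```python
# Sketch: classical dipole moment from point charges, for comparison against the QM dipole.
import numpy as np

EA_TO_DEBYE = 4.80320  # 1 elementary charge * Angstrom, expressed in Debye (approximate)

def point_charge_dipole(charges_e, coords_angstrom):
    """Dipole vector (Debye) of point charges given in elementary-charge units and Angstrom."""
    charges = np.asarray(charges_e)
    coords = np.asarray(coords_angstrom)
    return EA_TO_DEBYE * (charges[:, None] * coords).sum(axis=0)

# Toy example: a water-like geometry with made-up charges.
charges = [-0.8, 0.4, 0.4]
coords = [[0.000, 0.000, 0.000],
          [0.758, 0.587, 0.000],
          [-0.758, 0.587, 0.000]]
mu = point_charge_dipole(charges, coords)
print(mu, np.linalg.norm(mu))  # dipole vector and magnitude in Debye
```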
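
For the Season 2 idea of picking molecules that exercise a maximum diversity of force field parameters (and coupling this to the coverage-report step), a greedy maximum-coverage pass is one simple option. The sketch assumes each candidate molecule has already been labeled with the set of parameter IDs it uses (e.g., the output of a coverage report); the molecule names and parameter IDs below are hypothetical.

```python
# Greedy sketch of parameter-diversity selection over pre-labeled candidates.
def select_diverse(candidates, n_select):
    """candidates: dict mapping molecule name -> set of parameter IDs it uses.
    Greedily picks molecules that add the most not-yet-covered parameters;
    ties (and the tail once everything is covered) fall back to insertion order."""
    covered = set()
    selected = []
    remaining = dict(candidates)
    while remaining and len(selected) < n_select:
        name = max(remaining, key=lambda m: len(remaining[m] - covered))
        selected.append(name)
        covered |= remaining.pop(name)
    return selected, covered

# Toy example with hypothetical SMIRNOFF-style parameter IDs.
candidates = {
    "mol_a": {"b1", "a10", "t43"},
    "mol_b": {"b1", "a10", "t43", "t44"},
    "mol_c": {"b83", "a35", "t101"},
    "mol_d": {"b1", "t43"},
}
picked, covered = select_diverse(candidates, n_select=2)
print(picked, sorted(covered))
```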