Identify and address development issues encountered
Identify and address project risks
Discussion topics
Item
Presenter
Notes
TorsionDrive performance
Xavier Lucas
XL – TorsionDrives of simple molecules take a long time
DD – This seems like something to do with the refresh rate. If we crank this up it should go faster.
Updates from team members
Josh+Jeff+David+David
JH: was working on coverage reporter; Jeff is finishing that up
Also created PR for conformer generation performance
JW: plugging back into these PRs; eager to merge
JH: on
JW+JH – How to handle molecules that will fail AM1BCC?
Option 1: Run AM1BCC in coverage step and also in MM energy evaluation step
Pro – We identify molecules that will fail AM1BCC sooner
Con – We run AM1BCC twice on each good molecule (once during validation and again during energy evaluation)
Option 2: Don’t check during initial steps, just let it fail in energy evaluations
Pro – Only run AM1BCC on good molecules once
Con – We’ll submit wasteful QM jobs on molecules for which we can’t get MM energies
[decision] Option 2 is preferable at this time
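A minimal sketch of what Option 2 could look like in the energy evaluation step: charge assignment runs only there, and a failure is recorded per molecule instead of stopping the run. All names here (`ChargeAssignmentError`, `evaluate_mm_energies`, and the `assign_charges` / `compute_energy` callables) are hypothetical placeholders, not the project's or the toolkit's actual API.

```python
from typing import Callable, Dict, List, Tuple


class ChargeAssignmentError(Exception):
    """Hypothetical error raised when AM1BCC fails for a molecule."""


def evaluate_mm_energies(
    molecule_ids: List[str],
    assign_charges: Callable[[str], Dict[str, float]],
    compute_energy: Callable[[str, Dict[str, float]], float],
) -> Tuple[Dict[str, float], Dict[str, str]]:
    """Run charge assignment once per molecule, only at energy evaluation.

    Returns (energies, failures); failures maps molecule id -> error message.
    """
    energies: Dict[str, float] = {}
    failures: Dict[str, str] = {}
    for mol_id in molecule_ids:
        try:
            # The only place AM1BCC is run under Option 2 (assumed hook).
            charges = assign_charges(mol_id)
        except ChargeAssignmentError as err:
            # Record the failure and move on to the next molecule.
            failures[mol_id] = str(err)
            continue
        energies[mol_id] = compute_energy(mol_id, charges)
    return energies, failures
```

The trade-off noted above still applies: a molecule in `failures` has already had QM jobs submitted for it by the time it shows up here.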
DH: Got first three compute options working; couldn’t get fourth working.
Wanted to run a set over Christmas; had issues with jobs running for a long time without returning a result
DD: we’ll do a working session right after this call
Continued on analysis step; implemented conformer-matching step
takes reference from QM minimized structure, uses best RMS
there will be MM-minimized conformers that will not match any QM with this method
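One possible reading of the conformer-matching step, as a sketch: each QM-minimized structure acts as a reference and is paired with the MM-minimized conformer that has the lowest RMSD to it. The function names are hypothetical, and the sketch assumes identical atom ordering and pre-aligned coordinates (no superposition or symmetry handling), which the real implementation may well do differently.

```python
import math
from typing import Dict, List, Sequence, Tuple

Coords = Sequence[Tuple[float, float, float]]


def rmsd(a: Coords, b: Coords) -> float:
    """Plain coordinate RMSD; assumes same atom order and prior alignment."""
    if len(a) != len(b):
        raise ValueError("conformers must have the same number of atoms")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(a, b)
    )
    return math.sqrt(squared / len(a))


def match_conformers(
    qm_confs: List[Coords], mm_confs: List[Coords]
) -> Tuple[Dict[int, Tuple[int, float]], List[int]]:
    """Pair each QM reference with its best-RMSD MM conformer.

    Returns (matches, unmatched_mm), where matches maps
    QM index -> (MM index, RMSD) and unmatched_mm lists MM indices
    that were not the best match for any QM reference.
    """
    matches: Dict[int, Tuple[int, float]] = {}
    for qm_index, qm in enumerate(qm_confs):
        values = [rmsd(qm, mm) for mm in mm_confs]
        best = min(range(len(values)), key=values.__getitem__)
        matches[qm_index] = (best, values[best])
    matched = {mm_index for mm_index, _ in matches.values()}
    unmatched = [i for i in range(len(mm_confs)) if i not in matched]
    return matches, unmatched
```

The `unmatched` list makes the caveat above explicit: with QM structures as the reference, some MM-minimized conformers end up paired with nothing.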
JW: Nothing major to report; 0.8.2 release is out for toolkit; takes care of majority of good molecules that were being marked invalid
still 2 other areas of stereochemistry issues
80% of bad marking was double-bond stereochemistry
went from 5% error rate to 1-2% error rate
DD: worked with Bill Swope to build out compute approaches
need to put together public submissions this week, next week
Issues to address
Discuss series field addition.
COM-SER-XXXXX-YY
Could this work? Are there fundamental issues with this?
Basically, this is meant to accommodate adding molecules after an initial submission, since this appears likely from at least one user
DD: the series approach addresses a case analogous to conda envs, where creating a new env is more deterministic than updating/evolving an existing one
JW – Could treat series identifier like group name, but allow validate to have a running mode where MORE data is added
JW: Could see two strategies needed for lookahead
a bit of common code that does
JH – validate should have an add option, which checks new inputs against all existing output graphs and flags duplicates as failures
conf gen, coverage checking, and optimization will look ahead to all output prefixes and not overwrite anything (COM-MMMMM) that’s already been run – We should add tests to all steps for this
[decision] – This is the approach we will take. There won’t be a series identifier.
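The agreed approach can be sketched roughly as follows. Both functions are hypothetical illustrations, not the existing validate code: `validate_add` assumes each molecule is represented by a canonical graph key (for example a canonical SMILES string), and `pending_prefixes` assumes output prefixes can be compared as plain strings. Flagging duplicates within the new batch itself is an extra assumption beyond what was discussed.

```python
from typing import Dict, Iterable, List, Set, Tuple


def validate_add(
    new_inputs: Dict[str, str], existing_graphs: Set[str]
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Check new inputs against graphs that already exist in the outputs.

    new_inputs maps input name -> canonical graph key (assumed format).
    Returns (accepted, failed); duplicates are flagged as failures.
    """
    accepted: Dict[str, str] = {}
    failed: Dict[str, str] = {}
    seen = set(existing_graphs)
    for name, graph in new_inputs.items():
        if graph in seen:
            failed[name] = graph
        else:
            accepted[name] = graph
            # Assumption: later duplicates in the same batch also fail.
            seen.add(graph)
    return accepted, failed


def pending_prefixes(
    requested: Iterable[str], existing_outputs: Iterable[str]
) -> List[str]:
    """Look ahead at existing output prefixes and skip anything already run."""
    done = set(existing_outputs)
    return [prefix for prefix in requested if prefix not in done]
```

A per-step test along the lines suggested above could call `pending_prefixes` with a prefix that already has output and assert that it is not returned.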
Determine where indices are losing leading zeros in last field.
JH: will make it so single-conformers submitted with ds.add_molecule don’t experience any id mangling
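Where the zeros are actually being lost has not been determined. As an illustration only, one common way this kind of mangling happens is a round trip through `int`, which can be avoided by formatting with a fixed width and keeping the fields as strings when parsing. The helper names and the field widths (5 digits for the molecule index, 2 for the conformer index) are assumptions based on the `COM-MMMMM` pattern above.

```python
from typing import Tuple


def format_id(company: str, molecule: int, conformer: int) -> str:
    """Build an identifier with zero-padded fields (widths are assumed)."""
    return f"{company}-{molecule:05d}-{conformer:02d}"


def split_id(identifier: str) -> Tuple[str, str, str]:
    """Split an identifier into its fields, keeping them as strings.

    Converting a field such as "03" to int and back to str yields "3",
    which drops the leading zero; keeping strings avoids that.
    """
    company, molecule, conformer = identifier.split("-")
    return company, molecule, conformer
```

For example, `format_id("COM", 42, 3)` gives `"COM-00042-03"`, and `split_id` returns the last field as `"03"` rather than `3`.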
Can't get smirnoff99Frosst-1.1.0 spec to work.
JH: think it’s a validation issue in QCSubmit; will follow up
Project risks
DD: updated schedule; aiming for:
1/15 protocol feature-complete
1/22 present protocol to partners
2/1 start partners up with production approach, get them set up for support