2020-03-23 Roadmap meeting notes
Date
Mar 23, 2020
Participants
@David Mobley
@John Chodera
@Lee-Ping Wang
@Michael Gilson
@Michael Shirts
@Daniel Smith
@Jeffrey Wagner
@Karmen Condic-Jurkic
@Matt Thompson
@David Cerutti
@Simon Boothroyd
Concept
Discuss the ideas and problems at a high level and treat possible solutions as a black box. The goal is to define the most important features, inputs and outputs, restrictions, and timelines. The exact solutions (product development plans) are to be proposed and devised by smaller designated teams.
Goals
Define the main outcomes for the next year for science and infrastructure:
The number of major force fields to be released (for small molecules and biomolecules) and the most important features defining each release (release vs. generation);
The most important infrastructure components to be developed in the next year and desirable features (data management);
Key science issues that need to be addressed to support future plans
These need to proceed independently of major release plans
Science to be incorporated into release plans when ready
What will NOT be included in the next FF releases and infrastructure support
Other important scientific efforts and outcomes (papers, collaborations, data collection, etc)
Community building – OpenFF affiliates?
Additional materials
Discussion topics
Item | Notes |
---|---|
| |
Aspirational planning | MKG – I’m seeing our current path as extending FF in the infra and data on-hand that we need, so that it’s easy to fit new FFs of differing functional form to existing data, or vice versa. Democratizing science around FFs. So I want BCCs, off-site charges, polarizability, and make it easy to mix and match those with datasets, and crank out FFs. Then we can experiment and learn about how these factors combine into speed/accuracy tradeoffs. DC – The value of what David is talking about is the presence of datasets from which we can just pull values; that would be extremely valuable. Always thought that having a library would be extremely useful, even if it means picking a single molecular property. Datasets include sensible, clean molecule sets that cover chemical space. There is one general parameter set that is for all intents and purposes good enough and adding more data and more parameters just doesn’t help. My experience is that common parameters DO work and ARE possible (e.g. protein backbone torsions+charges are nearly identical). Highly flexible atom typing will let us cover a much larger part of chemical space, and do so well. MKG – Concurs with everything said. Systemization of parameterization is central, so we don’t do things that don’t make a difference. We don’t want to get lost in parameter space, wants to improve optimization process. (Concerned about the size of parameter space, and how we may easily be fooled into falling into local minima) LPW – Largely agree with everything said so far, I would like to see development of unified and systematic approach that works for small molecules and biomolecules, systematic optimization and automated generation of new parameters, some real innovation in functional forms. This comes to mind right now. Wants to see automated generation of NEW parameters. Optimization of parameters based on WBO, material improvements to electrostatic models, etc. 
MS – Automated benchmarking is huge, emphasizing data-driven approach. Molecules are molecules – small molecules, lipids, sugars, proteins… This effort makes it possible to do. JW – Excited to build new things and automate everything. SB – Also excited about benchmarking and data-driven approach, including systematic benchmarking. DLM – Automated benchmarking is a key part of getting where we need to be. Then we can really determine what is “better” on even footing. KCJ – Do we want to commission expts? MS – We probably can’t decide that now. But we can let subproject owners identify whether it would be useful to commission new experiments. JDC – Our goals need to be modest on the “commissioning expts” front, given that NIST has a LARGE budget to do this, and they’re struggling. H-G expts from Gilson lab will help, my lab can provide densities. But collabs with e.g. Rafael Wafweiler can provide NMR and x-ray. DLM – Not sure that ThermoML has continuous coverage for everything we need – It may have gaps in coverage MS –
|
Force fields | KCJ – Define the main features of FFs and stick with them; they can be changed if everyone agrees. JDC – What’s more important – Releasing on the DATE we want, or releasing with the FEATURES we want? MS – Depends on what will make our funders happiest. JDC – Releasing on time, rather than adding more features, would make funders happy. KCJ & MS – Disagree. DLM – Remember that we’re doing product development, on top of science. The science can’t be done on a schedule. The FFs can. The infrastructure is in between. JDC – So maybe we should do FF releases on a DATE, with whatever features are available. KCJ – Disagree – We should release FFs guaranteed to have certain FEATURES. It’s disruptive for partners to update to a new FF, so we should make sure there’s a clear value-add each time. DLM – This gets to “what’s a release vs. a generation” of a FF. A “release” may include a bit more training or a fix for an observed issue, whereas a “generation” is a change in the science. JDC – This seems like the best of both worlds. KCJ – Naming scheme is complicated, and is throwing people off. So let’s be very deliberate about this. DLM – Maybe we should get rid of the word “release” entirely. “Generation” is a good name for significant changes. Each “generation” will have a new herb name. X only increments if there’s a change in functional form. (General) – Generation changes should be considered “major releases”
MT – Do we expect to maintain previous herb releases and make point releases of them? DLM – No. What about people outside OpenFF making and naming new FFs? (General) disagreement (General) – Agreed – Sage will be openff-2.0.0 JDC – Short term plan for biomolecular FFs is to just pull in an AMBER FF. Medium term plan is to build out the infrastructure to parameterize a biopolymer, like graph charges. Long term is to pull in NMR and other data to actually optimize the parameters. KCJ – Let’s focus on naming for now. JDC – The new FFs will be exactly AMBER – Just OFFXML representations of the exact same parameters. KCJ – But what will we call them? General (DM) – We’ll just call it DC – metadata with provenance of parameter changing? SB – Could the toolkit just do a MW – what people want out of data? In my past project, we had something similar. (General) – Punt on this topic. We could have the FF person think about this and report what infrastructure would be useful.
|
Feature specification for next FF generations | DM – suggestions for the next FF generations:
More distant future
DM – dividers across the next two generations JDC – more torsion data in QCArchive to do a good job with WBO interpolation DM – we need someone who can take over Chaya’s work – Need to task someone with making FF with few explicit bond orders JDC – LJ stuff seems ready, but we have a data selection problem and the question of how to constrain LJ parameters not to go too crazy. Simon? SB – LJ should be able to come out in the next release. We should add some water properties to make sure we stay compatible JDC – Can we add aqueous measurements of sidechain analogues and other biopolymer-like molecules? MKG – I would vote for this, it would reassure people. It’s easy, since ethanol is like serine SB – it depends if we have data for it MS – the first pass would be just to use it in benchmarking and not in optimization JDC – we need to find data first. MS, you have some data? MS – I wasn’t thinking about all quantities when I built that dataset (some aq. free energies) DLM – We should punt on this so that the appropriate team (Shirts lab) can look into data availability. Biomolecular FFs MKG – AMBER-parameterized sidechain analogues, versus Parsley-parameterized? DLM+MS – Carlos recommended we take the latest QM torsional data from AMBER fits, and refit our torsions to fit that. No need to worry about consistency as we remain internally consistent with LJ parameters. JDC – Fully agree that’s the first step toward our medium-to-long term vision for biomol FFs MKG – I’m not excited about parameterizing our LJ in the context of AMBER LJ JDC – But we need to start converging if we’re hoping to ever be compatible with AMBER FFs. We need to include something to constrain. 
(General) – We should first do this as a benchmarking experiment, and then see how incompatible the FFs are SB – (INFRASTRUCTURE) infrastructure burden, currently not possible, we’d need Jeff or Matt to put some time in it (General) – Disagreement about importance of “compatibility” DLM – I don’t think compatibility is really that important JDC – I’m worried that this could break P-L binding affinities. KCJ – Sacrificing affinity in the short term may be the best way forward. MKG – Why do we expect this to be worse? Why should FFs be incompatible off the bat? If the benchmarking DOES look bad, then we don’t make a release. DLM – Decision – We don’t worry too much about “compatibility”, and benchmark the result before we release. MS – If we want to say “clean break with everything, we’re starting over” – then THAT’s a good time to make our own water. MKG – So, “don’t change the water, and check for fortuitous compatibility” SB – Unsure about whether to include water mixtures in Sage release. We’re not sure if it’s going to make sense. There may be a data availability issue. Will report later. MS + JDC – It’d be good to test out TIP3P-FB while we still have LPW here. LPW – ForceBalance has a lot of water models now, and they all use the same manually-curated datasets. I think TIP3P-FB is the right place to start. If we want to make a big improvement past this, then mixture data needs to be included. We should validate that OpenFF-evaluation can run aqueous mixtures. Feasibility tests should be easy, but settling on a final model will take a lot of work. DC – As a control expt, we might take the AMBER parameter set, and try fitting using those. Sidechain analogues were like 17,000 data points, though protein backbone was fit by hand. So we can bring in the AMBER backbone parameters and optimize the sidechain parameters to the same data. 
DM – that’s what Carlos was saying, more or less, and we want to start with something like that JW – Timing – my experience has left me with a lot of things that I haven’t done, but I’m not sure about priority. I would vote for ranking features and infrastructure tasks based on their priority instead of assigning dates. KCJ – We should add “time windows” for delivery, so we have some semblance of calendar planning KCJ – Revisit QM theory level? LPW – Decision was made carefully with respect to conformational energy and with a plan to go forward for a year, but performing a benchmarking study for torsional drives would be important. Important settings/factors for torsiondrives may not be the same as for geometry optimization. It will be important to look at different levels of theory for OTHER properties we want to fit using QM. JDC – For torsion fitting, Josh (Horton?) is working on automating submission, and may be a good candidate for a study looking into use of minimization trajectories and gradients as an alternative to torsion driving LPW – Will this sample top of torsion barriers, and how important is that in fitting? INFRASTRUCTURE TASKS
SCIENCE TASKS/STUDIES
|
Supporting science | Atom type creation studies
Property collection+selection/Data accessibility
FF fitting science + infra
INFRASTRUCTURE TASKS
SCIENCE TASKS
|
FF improvement | KCJ – This is important to continuously do, since industry partners are interested in continual improvement DLM – Victoria Lim in my group is working on BenchmarkFF repo – uses various metrics to compare FFs. Has allowed her to identify particular parameters that are a source of error JDC – Three categories that we want to present in the dashboard
KCJ – P-L free energy may be too expensive to continuously evaluate right now. Initially we should stick to small systems, since they’ll also be clearer for identifying sources of error. At this point we’re running P-L before host-guest, which seems out of order. MKG + JW – pAPRika - PE integration is not production-ready KCJ – Are there a few H-G systems that we can ship around and test in all frameworks? DLM – Binding is heading toward standardization in David Hahn’s and Hannah MacDonald’s benchmark system repos MKG – Dave Slochower had done a paper on systematic benchmarking, that dataset may be available (data set: ) INFRASTRUCTURE TASKS
SCIENCE TASKS
|
Biomolecular FFs | DC –
DM – we would like to be able to do all of these things and then run experiments to answer these questions by benchmarking different FF versions DC – I can do much of this in DM – We will need to settle DC – Will need to learn our “language” and differences between AMBER atom typing MT – Will different functional forms of “latest” FF exist simultaneously? DM – We originally envisioned this happening. We will probably have one main FF and we’ll focus our efforts on it. But we may end up offering different functional forms that provide different accuracy/cost tradeoffs, and updating them regularly. KCJ – Thoughts about other biopolymers – DNA/RNA? MS – Need toolkit infrastructure (represent biopolymers). Then need benchmark infrastructure.
INFRASTRUCTURE TASKS
SCIENCE TASKS
|
QCArchive strategy | JDC – What can OpenFF push on to best synergize with QCA? DGS – I’m likely leaving MolSSI soon. Things are very stable now, so remaining team should be able to keep current capabilities active, and continue to add new data. MolSSI is happy to continue taking bug reports/feature requests to best support OpenFF needs moving forward. JDC – Should we have an OpenFF-employed QC developer? DGS – Current OpenFF budgeting should account for 50% of a QCA developer. Doing databases and distributed computing is complicated, so there will be a high onboarding cost. JDC – David Dotson could be a good candidate for this. It’d be good for OpenFF’s long-term health to have an in-house QCA person. MS – Our budget doesn’t include 50% of a developer for this in 2020. JDC – Open source science supplement from NIH may be a good source of support for this. https://grants.nih.gov/grants/guide/notice-files/NOT-OD-20-073.html MT – I can poke around with this code and evaluate how well I could fill that role. DGS – It’ll be dangerous to learn on QCA in production – Databases are brittle, so you should pursue more formal education about them if this is of interest. JDC + DLM – Could look at diversity supplements to pull in a CS grad student, or NIH (NOSY?) grant for software scientist |
Final prioritization | What can be done outside OpenFF organization or as one-off projects? What will be included in the next major FF?
INFRASTRUCTURE TASKS
FF SCIENCE TASKS
BIOPOLYMERS / PROTEINS
|
Action items
Decisions
- We agree that the paper writing procedure takes too long, and we should streamline it.
- Sage will be openff-2.0.0 – Generally major increments MAY indicate a functional form/compatibility change, but don’t NEED to. If compatibility DOES change, then the major version number MUST increment.
- Biomolecular FFs naming – We’ll just call it amber14SB.offxml and then they will become a part of openff-X.Y.Z with documentation describing the details.
- Provenance of parameter changes – where to store it, etc. Future STUDY
- Biomolecular FFs: We will take C. Simmerling’s view that consistency is overrated, and we go ahead with Simon’s LJ refitting plan. Thus, we will do benchmarking of combined protein-ligand systems with separately optimized OpenFF small molecule parameters and existing AMBER protein parameters. If benchmarking results are okay, we can release, and the next step will be a jump to co-parameterization of self-consistent small-molecule and protein parameters. If benchmarking results are not okay, go back and optimize OpenFF parameters in the context of some protein side-chain analogs outfitted with AMBER protein parameters.
- Water optimization – start planning co-optimization this year and use LPW’s expertise in designing this study, which will be executed at a later date. TIP3P is a good starting point. Settling on the final parameters will be the slowest step.
- STUDY Charges – decide which models to implement, in which order, coupling extent with LJ, etc.
- STUDY QM torsional benchmarking
- STUDY Inclusion of incidental data in optimization (J. Horton is building infrastructure that will enable it)
- STUDY Off-site charges – long-term goal (Science + infrastructure)
- STUDY Missing PERSON New parameter addition – requires a person whom we don’t have at the moment, so it will have to wait.
- STUDY Identify data missing for property estimation. Start thinking about how to address it. Short-term.
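The versioning decision recorded above (a major increment MAY accompany a compatibility change, but a compatibility change MUST come with a major increment) can be expressed as a small check. The following is only a sketch under stated assumptions: the helper names `parse_version` and `bump_is_valid` are hypothetical and not part of any OpenFF tool, the name format `openff-X.Y.Z` is taken from the decision above, and `openff-1.3.0` is an illustrative earlier version rather than one named in these notes.

```python
from typing import NamedTuple


class Version(NamedTuple):
    """Hypothetical X.Y.Z version triple for a force field name."""
    major: int
    minor: int
    patch: int


def parse_version(name: str) -> Version:
    """Parse a name such as 'openff-2.0.0' into a Version (assumed format)."""
    # Everything after the last hyphen is treated as the X.Y.Z part.
    _prefix, _sep, numbers = name.rpartition("-")
    major, minor, patch = (int(part) for part in numbers.split("."))
    return Version(major, minor, patch)


def bump_is_valid(old: str, new: str, compatibility_changed: bool) -> bool:
    """Check a proposed version bump against the rule in the decisions.

    - The new version must be strictly newer than the old one.
    - If compatibility changed, the major number MUST increase.
    - If compatibility did not change, a major increase is still allowed (MAY).
    """
    old_v, new_v = parse_version(old), parse_version(new)
    if new_v <= old_v:
        return False
    if compatibility_changed:
        return new_v.major > old_v.major
    return True


if __name__ == "__main__":
    # Illustrative version names only.
    print(bump_is_valid("openff-1.3.0", "openff-2.0.0", compatibility_changed=True))
    print(bump_is_valid("openff-1.3.0", "openff-1.4.0", compatibility_changed=True))
```

A check like this encodes only the one-way implication agreed in the meeting; it deliberately does not reject a major increment made without a compatibility change.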