Info |
---|
This scientific roadmap includes the next two planned force field releases and a list of scientific studies which need to be performed in 2020. Each study has a priority assigned to it. This roadmap can be continuously updated, but the overall status and priorities will be revised and updated in June 2020. |
...
Study | Priority | Effort | Science dependencies | Infrastructure dependencies | Comment | Start date | End date | Status | Driver/Team | ||||||||||||||||||
Chemical perception | |||||||||||||||||||||||||||
Addition of new parameters – manually fixing problems |
|
| Made easier by benchmarking dashboard (Optional) | Made easier by benchmarking dashboard (Optional) |
| Hyesu Jang, David Mobley, Jessica Maat, Victoria Lim | |||||||||||||||||||||
Automated typing inference from scratch |
|
| Organise a meeting to coordinate efforts. | Full-time person needed – to be discussed further. Work of Josh Fass and Tobias Huefner may assist here. Owen Madin is interested. | |||||||||||||||||||||||
Mixture Properties | |||||||||||||||||||||||||||
|
| Driver: Simon Boothroyd | |||||||||||||||||||||||||
|
|
| Driver: Simon Boothroyd | ||||||||||||||||||||||||
Chemical potential-like properties |
| Non-bonded optimization | Implementation in | Need to evaluate the data first (testing needed) |
| Simon Boothroyd; spinoff (student) | |||||||||||||||||||||
Octanol-water partition coefficients |
| Implementation in | Data needed, harder problem |
| Simon Boothroyd; spinoff (student) | ||||||||||||||||||||||
Data coverage and availability |
| Feasibility studies | Check the available data and identify missing data points. How to fill the gaps will be decided later; for Sage, we will use what we have. | Ongoing | |||||||||||||||||||||||
QM Data Generation | |||||||||||||||||||||||||||
QM dataset selection (training data) |
| Need to expand to benchmarking set. |
| ||||||||||||||||||||||||
Benchmarking/re-evaluating our choice of QM theory |
| (Optional) QC Dataset submission infrastructure | Test of the whole torsiondrive. Keep within 10-50 torsiondrives. More is better. |
| Hyesu Jang (lead), Lee-Ping Wang; Hyesu Jang is also leading molecule set selection with help from Jessica Maat and Victoria Lim | ||||||||||||||||||||||
Protomer/tautomer enumerated molecules |
| QM level of theory validation (QMLoTV) | Protonation/tautomer enumeration integration (Joshua Horton doing OE version in toolkit; there’s currently no good protonation state enumeration with RDKit – see
| ||||||||||||||||||||||||
Data on molecules with nonzero formal charges |
| QM level of theory validation (QMLoTV) | (Optional) QC Dataset submission infrastructure | ||||||||||||||||||||||||
Enamine REAL fragment coverage |
| Automated fragmentation integration (Joshua Horton) | |||||||||||||||||||||||||
Ligand Expo fragment coverage |
| Automated fragmentation integration (Joshua Horton) | Ligand Expo has higher priority than Enamine REAL. | ||||||||||||||||||||||||
Richer torsion data for WBO fitting |
| WBO torsion implementation | (Person needed to continue the work of Chaya Stern; probably Pavan with input from Jessica Maat, or vice versa. Overseen by Simon Boothroyd?) | ||||||||||||||||||||||||
Biopolymer data selection (ensure sidechain data is available in QCA) |
| ASAP | |||||||||||||||||||||||||
Biopolymer data computation |
| (Optional) QC Dataset submission infrastructure | |||||||||||||||||||||||||
More efficient torsion sampling with fewer grid points during the scan |
|
| |||||||||||||||||||||||||
Fitting | |||||||||||||||||||||||||||
Addition of new parameters – manually fixing problems |
|
| Ongoing |
| Hyesu Jang, David Mobley, Jessica Maat, Victoria Lim | ||||||||||||||||||||||
LJ refitting (Sage) |
|
| |||||||||||||||||||||||||
WBO refitting (Sage) |
| More torsion data | WBO torsion implementation | Implement what Chaya has already done, as soon as the infrastructure is ready. | After May meeting | Late 2020 (Sep 2020) | Jessica Maat, Hyesu Jang; someone else to continue where Chaya left off | ||||||||||||||||||||
BCC refitting |
| LJ refit Patterns for BCCs; could start with something simple like bond SMARTS. | ChargeIncrementModel implementation (early May) | Person needed (
David Mobley can help | |||||||||||||||||||||||
Study how to set prior widths and weights for different sorts of data during FF optimization |
| Lee-Ping Wang, Hyesu Jang; spinoff? | |||||||||||||||||||||||||
Value of data generated “incidentally” during torsiondrive in fitting, e.g. optimization snapshots, gradients, energies (low control over these data points) |
| Some parts of Bespoke workflow |
| ||||||||||||||||||||||||
Benchmarking | |||||||||||||||||||||||||||
Small reference system for fast testing of FE infrastructure – 5-10 small reference systems, possibly subset of SAMPL challenges, for comparison of different free energy methods to avoid using large P-L systems for test calculations |
|
| Should use SAMPLing challenge systems plus a couple more similar ones. | ASAP |
| ||||||||||||||||||||||
Benchmarking/re-evaluating our choice of QM theory |
|
|
| ||||||||||||||||||||||||
CCDC data selection/release |
|
| |||||||||||||||||||||||||
Create a list of tests to judge the “quality” of biopolymer FF with our scientific advisory board |
| Organise the meeting with our IAB, invite to May meeting | April / May | ||||||||||||||||||||||||
| Minor release of Parsley | Benchmarking dashboard | Done in preprint form, but no benchmarking dashboard. Still need torsion benchmarking; utilize work just done for OpenFF 1.0 paper. | Mid 2020 | Done-ish | ||||||||||||||||||||||
| Release of Sage | Benchmarking dashboard | Late 2020 | ||||||||||||||||||||||||
Biopolymers | |||||||||||||||||||||||||||
Which quantum method should we use for biopolymers (should it be the same as small molecules)? |
| QM benchmarking study | |||||||||||||||||||||||||
Feasibility/benchmarking studies of torsional CMAPs |
| After protein FF implementation | CMAP support in OFFTK | ||||||||||||||||||||||||
Feasibility/benchmarking studies of other cross-terms |
| Support for cross-terms in OFFTK | |||||||||||||||||||||||||
Charges | |||||||||||||||||||||||||||
GCN charge model |
| In a few steps:
|
| John Chodera Yuanqing Wang | |||||||||||||||||||||||
Off-site charge SMIRKS definition/fitting/benchmarking |
|
| VirtualSite support in OFFTK | Helpful discussion in Slack: https://openforcefieldgroup.slack.com/archives/C1907SGET/p1590251452068100 |
| ||||||||||||||||||||||
Bayesian inference and surrogate modeling | |||||||||||||||||||||||||||
Testing Bayesian inference on an analytical model |
|
|
| ||||||||||||||||||||||||
Generalizing analytical model for Bayesian inference and testing methods |
|
| We don’t need the Bayesian framework to work immediately |
| |||||||||||||||||||||||
Constructing full Bayesian architecture with reweighting and simulation to build surrogate models |
|
| Analytical Bayesian inference testing |
| |||||||||||||||||||||||
Automated typing inference from scratch |
|
| Full-time person needed – to be discussed further. Work of Josh Fass and Tobias Wulsdorf may assist here. | ||||||||||||||||||||||||
Other | |||||||||||||||||||||||||||
Water co-optimization planning study (to be executed later) – discuss with Lee-Ping Wang |
|
| spinoff | ||||||||||||||||||||||||
Thinking about metals / ions / salts / ionic liquids |
|
| Owen Madin, Matt Thompson; spinoff |
...