Known issues and bugs (@striketeam)

Team members interested in following/pursuing/raising issues (please add yourself if you’re missed): 

striketeam: @Chapin Cavender @Chris Ringrose @Christopher Bayly @Daniel Cole @David Dotson @David Mobley @Jeffrey Wagner @John Chodera @Josh Mitchell @Joshua Horton @Lee-Ping Wang @Lorenzo D'Amore @Matt Thompson @Michael Gilson @Michael Shirts @Pavan Behara @Tobias Huefner @Trevor Gokey @Lily Wang @Alexandra McIsaac

 

Time horizon: Ongoing


Goals

List of FF issues raised by internal/external members

Issue/task

Preliminary work/progress/comments

Status/Timeline

Notes


Halogenated compounds

Discussion link: https://openforcefieldgroup.slack.com/archives/CKSHCE7SB/p1611592728000200

Brief summary: Large discrepancies with experiment in the COVID Moonshot project for compounds with a chloro group on an aromatic scaffold

 

 

Amide issue

Discussion link:

https://openforcefieldgroup.slack.com/archives/CKSHCE7SB/p1596485145006600

Brief summary:

Strong preference for non-planar amides with Parsley 1.2.0

 

 

Problem chemistries from benchmark analysis by Thomas Fox, an industry partner.
Additional analysis by Lorenzo D’Amore and Xavier Lucas

Discussion link:

https://openforcefieldgroup.slack.com/archives/C01BN0C5AKF/p1623935171007300

https://openforcefieldgroup.slack.com/archives/C01BN0C5AKF/p1623935450009200

https://openforcefield.atlassian.net/wiki/spaces/~954728248/pages/2110488578

 

https://openforcefieldgroup.slack.com/archives/C01BN0C5AKF/p1631634575020600

Brief summary:

RMSD analysis comparing the QM minimum-energy conformer with the MM-optimized conformer (1.3.0)

Observations of Thomas Fox:

  • CC bond in oxirane is 0.12 Å too long
    ("C1CO1")

  • CC bond between the two C=O groups in squaric acid is 0.1 Å too long
    ("Oc1c(O)c(=O)c1=O")

  • sulfoximide S=N bond is 0.18 Å too long
    (e.g. "CCCCS(=N)(=O)CCC(N)C(O)=O")

  • 2-vinyl-furan ("C=CC1=CC=CO1") is not planar

  • aromatic nitro groups are not in plane with the ring (e.g. "C1=CC=C(C=C1)[N+](=O)[O-]")

  • aromatic thioethers are out of plane (e.g. "CSC1=CC=CC=C1")

  • ring puckering of spiro-pyrrolidine
    (e.g. "C1CNCC12NC3=CC=CC=C3C(=O)N2")

  • aryl-methoxy groups are not planar (45°)
    (e.g. "COC1=CC=CC=C1")

    Disclaimer: the SMILES above were added as starting points/references for the general chemical-space observations; where marked "e.g.", they may not be the exact molecules Thomas Fox is referring to.
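The bond-length and planarity observations above can be checked directly from conformer coordinates. Below is a minimal sketch in plain Python, assuming you already have Cartesian coordinates (in Å) for the relevant atoms from the QM and MM geometries; the function names are hypothetical helpers, not part of any OpenFF tool, and the sign of the dihedral follows one common convention, so only its magnitude is used for the planarity measure.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def bond_length(p1, p2):
    """Distance between two atoms, in the units of the input coordinates."""
    return math.dist(p1, p2)


def dihedral_deg(p1, p2, p3, p4):
    """Signed dihedral angle p1-p2-p3-p4 in degrees, in the range [-180, 180]."""
    b1, b2, b3 = _sub(p2, p1), _sub(p3, p2), _sub(p4, p3)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)  # normals of the two planes
    b2_len = math.sqrt(_dot(b2, b2))
    y = _dot(b2, _cross(n1, n2)) / b2_len
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def planarity_deviation_deg(angle_deg):
    """Deviation of a dihedral from the nearest planar value (0 or 180 degrees)."""
    a = abs(angle_deg)
    return min(a, 180.0 - a)
```

A typical use would be to compute `bond_length` for the same atom pair in the QM and MM geometries and report the difference, or to compute `planarity_deviation_deg(dihedral_deg(...))` for the C(ring)-C(ring)-O-C(methyl) atoms of an aryl-methoxy group in each geometry.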

No action needed

Problem could not be reproduced with publicly available molecules

Cyclobutane is flat (not puckered) with 1.2.0 and 1.3.0.

Figure legend: QM in green (from the PhalkEtOH dataset), 1.2.0 in magenta, 1.3.0 in cyan.

 

 

Nitrogen-oxygen bonds in constrained systems (such as rings) may become overly long during lengthy MM runs

Discussion link: 2021-07-21 BespokeFit meeting notes

  • From Josh Horton’s lengthy MM run

 

 

Bicyclopentane moieties are difficult to converge with MM

Discussion link:

 

 

 

t49 "*~[#7a]:[#6a:3]~*" in Sage 2.1.0 seems to be handled by the later t84; should t49 be deleted?

(raised by Paul Labute from CCG in a private email)

Redundant parameters in Sage 2.1 @Alexandra McIsaac

Completed

t49 is not redundant

t123 "[*:1]~[#15:2]-[#6:3]-[*:4]" in Sage 2.1.0 seems entirely contained in t123a and t124; should t123 be deleted? The V1 value is suspicious too.

Redundant parameters in Sage 2.1 @Alexandra McIsaac

Completed

t123 is redundant (at least for the molecules in the training set)

t164 "[*:1]~[#7:2]=[#15:3]~[*:4]" in Sage 2.1.0 seems suspicious with V2 = -0.9671, which encourages 90 degrees. This can't be right for a true double bond between P and N (e.g., C-P=N-C). This rule seems intended to cover CN=P(C)(C)C only. Is this true? If so, then #15X4 may be more appropriate.

Missing parameter coverage | t164 @Brent Westbrook: covered by only 3 conformations of 1 molecule

Completed

t164 needs substantially more training data
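Whether a negative V2 such as the t164 value favours 90 degrees depends on the functional form and the phase. The sketch below assumes the common periodic torsion form E = k(1 + cos(nφ - δ)); the phase values tried here are illustrative assumptions and are not read from the Sage 2.1.0 file, and the helper names are hypothetical. It scans φ from 0 to 180 degrees in 1-degree steps and reports where the single term is lowest.

```python
import math


def periodic_torsion_energy(phi_deg, k, periodicity, phase_deg):
    """Single-term periodic torsion: E = k * (1 + cos(n*phi - phase))."""
    return k * (1.0 + math.cos(math.radians(periodicity * phi_deg - phase_deg)))


def minimum_angle(k, periodicity, phase_deg):
    """Angle in [0, 180] degrees (1-degree grid) with the lowest energy."""
    return min(
        range(0, 181),
        key=lambda phi: periodic_torsion_energy(phi, k, periodicity, phase_deg),
    )


K_T164 = -0.9671  # V2 value quoted in the issue above

# Assumed phases, for illustration only.
for phase in (0.0, 180.0):
    print(f"phase={phase:5.1f}  minimum at {minimum_angle(K_T164, 2, phase)} deg")
```

Under this assumed form, a negative k with a 180-degree phase places the minimum at 90 degrees, while the same k with a 0-degree phase favours the planar geometries, so the phase stored alongside V2 should be checked before concluding which geometry the term encourages.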

t129 "[*:1]-[#8X2r5:2]-;@[#7X2r5:3]~[*:4]" in Sage 2.1.0 has a suspicious V = -19.907.  Is this correct?

Redundant parameters in Sage 2.1

@Brent Westbrook

Completed

Reasonable for the training set, but may need more coverage in Sage 2.2

A rather more serious problem is with phosphorus parameters:

t159 "[*:1]-[#8X2:2]-[#15:3]~[*:4]"  9.3828   0 -1.8998   0  0.4283   0  0.0000   0  0.0000  0.0000 
t160 "[#8X2:1]-[#15:2]-[#8X2:3]-[#6X4:4]"  8.2041   0 -1.5697   0 -0.7592   0  0.0000   0  0.0000  0.0000 # t160

Are the phosphorus parameters supposed to handle phosphines or just phosphates?

@Brent Westbrook t159 is covered 142 times in the Sage 2.1.0 dataset; t160 is covered 74 times.

To do: resolve the phosphine vs. phosphate question

 

 

t65 "[*:1]-[#6X4:2]-[#7X3$(*~[#8X1]):3]~[#8X1:4]" 

Should this be nitro, "[#7X3$:3](~[#8X1])~[#8X1:4]", like other similar rules, or is nitroso also intended? If nitroso is intended, then all you need is "[*:1]-[#6X4:2]-[#7X3:3]~[#8X1:4]".

As of Sage 2.1.0, t65 is trained largely on nitro groups but also on some additional N-O groups. However, it is only benchmarked on nitro groups in the industry dataset, and it pretty clearly seems intended for nitro applications. Changing it to the suggested SMARTS may be best here.

 

 

t77-t79 amide rules appear to encourage cis amides

Sage 2.1 t77-t79 cis amides

 

 

Systematic errors in parameters

Brief summary: Many of our parameters seem to have either systematic errors (median error > 0) or sub-populations that suggest the parameter should be split

Systematic errors in parameters
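One simple way to screen for the systematic errors described above is to compute the median signed error per parameter and flag those whose median is clearly away from zero. The sketch below is illustrative only: the parameter IDs and error values are made-up placeholders, the tolerance is an assumed threshold, and the function name is hypothetical. It does not address the sub-population question, which would need a look at the full error distribution.

```python
from statistics import median


def flag_systematic_errors(errors_by_parameter, tolerance):
    """Return {parameter_id: median_error} for parameters whose median
    signed error (e.g. MM value minus QM value) exceeds the tolerance
    in absolute value."""
    flagged = {}
    for parameter_id, errors in errors_by_parameter.items():
        med = median(errors)
        if abs(med) > tolerance:
            flagged[parameter_id] = med
    return flagged


# Placeholder data, NOT real benchmark results.
example_errors = {
    "example_param_1": [0.01, 0.02, 0.015, 0.03],  # consistently positive
    "example_param_2": [-0.01, 0.0, 0.01],         # centred on zero
}

flagged = flag_systematic_errors(example_errors, tolerance=0.005)
for parameter_id, med in flagged.items():
    print(f"{parameter_id}: median error {med:+.4f}")
```

The median is used rather than the mean so that a few outlier molecules do not dominate the flag; the tolerance would need to be chosen per parameter type (bond, angle, torsion).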