Throughput status | OpenFF Protein Capped 1-mer Sidechains v1.3 – 44/61 TDs (3 * 5 new starting points added for the TDs that errored out)
CC – There was one TD that was going very slowly but did eventually finish. For the three that didn't finish, I made a new version of the dataset (this one, v1.3) that attempted to use different starting coordinates (a starting conformer for each grid point generated with Sage). That didn't work, so I made another dataset doing the same thing but with 30 degree spacing, and a further dataset where starting configurations were generated with Sage for random grid points.
JW – Any characteristics of the conformers that are failing to optimize?
CC – Mostly things with hydrogen bonds between backbone and sidechain, e.g. LYS and GLU.
JW – Any chance of proton transfer, or of the geometry being constrained too hard across a hydrogen bond?
CC – I tried running locally but didn't know how to capture the geometry from QCEngine during the optimization.
DD – Would you have time to work with PB and myself to try this out? I'm pretty sure I've done this before.
PB – IIRC it was something like messy=True that made it store the temp files.
(DD, PB, and CC will meet Thursday afternoon Pacific)
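(A minimal sketch of reproducing one optimization locally and inspecting intermediate geometries, assuming a QCEngine/geomeTRIC/psi4 stack; the molecule shown is a placeholder, not a dataset entry, and the exact option name for keeping scratch files may differ by QCEngine release.)

```python
import qcengine as qcng
from qcelemental.models import Molecule, OptimizationInput

# Placeholder molecule; in practice this would be the failing grid-point
# geometry pulled down from QCArchive.
mol = Molecule.from_data("""
O  0.000  0.000  0.000
H  0.758  0.586  0.000
H -0.758  0.586  0.000
""")

opt_input = OptimizationInput(
    initial_molecule=mol,
    input_specification={
        "driver": "gradient",
        "model": {"method": "B3LYP-D3BJ", "basis": "DZVP"},  # default OpenFF QC spec
    },
    keywords={"program": "psi4"},
)

# scratch_messy=True keeps the temporary files on disk after the job finishes
# (presumably the "messy=True" option mentioned above). Newer QCEngine releases
# rename the local_options argument to task_config.
result = qcng.compute_procedure(
    opt_input,
    "geometric",
    raise_error=True,
    local_options={"scratch_messy": True, "scratch_directory": "/tmp"},
)

# Each step of the optimization trajectory also carries its geometry, so the
# intermediate structures can be inspected even without the scratch files.
for i, step in enumerate(result.trajectory):
    print(i, step.properties.return_energy)
    # step.molecule.geometry is the (natom, 3) coordinate array in bohr
```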
PB – Do you use the structures with strong internal hbonds for fitting? In general we try to avoid using those for fitting.
CC – I agree, but what I'm wondering is whether missing some grid points still leaves the rest of the torsiondrive valid.
CC – I looked into this, and these torsiondrives do seem to have structures that are higher in energy relative to the minimum than the successful torsiondrives. So either there's something up with these AAs, or we need to run more optimizations on each grid point to find more relaxed structures.
PB – Some part of QCArchive has an energy cutoff for torsion scans at 30 kcal/mol and disregards structures above that. I don't think structures above that energy are saved at all.
CC – I do see some results in my datasets that are more than 30 kcal/mol above the minimum. Would that cause a problem?
PB – I'm not sure whether that would get cleaned up between optimizations/grid points.
CC – For the lysine sidechain I have an example where, with the backbone dihedrals in a beta-sheet configuration, the highest energy above the minimum is 15 kcal/mol, but with the alpha-helix backbone the LYS scan reaches 35 kcal/mol above the minimum.
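(A short sketch of how the relative grid-point energies can be checked, assuming the legacy qcportal FractalClient API against the public QCArchive server; the record id below is hypothetical and would come from the actual torsiondrive of interest.)

```python
from qcportal import FractalClient

client = FractalClient()  # public QCArchive instance

# Hypothetical torsiondrive record id, e.g. one of the LYS sidechain scans.
td = client.query_procedures(id="12345678")[0]

final_energies = td.get_final_energies()  # {grid point: energy in hartree}
e_min = min(final_energies.values())

HARTREE_TO_KCALMOL = 627.5095
for grid_point, energy in sorted(final_energies.items()):
    rel = (energy - e_min) * HARTREE_TO_KCALMOL
    flag = "  <-- above the 30 kcal/mol cutoff mentioned above" if rel > 30.0 else ""
    print(f"{grid_point}: {rel:6.1f} kcal/mol{flag}")
```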
SPICE sets: around 34.2K jobs last week.
SPICE PubChem Set 2 Single Points Dataset v1.2: 100 consistently incomplete/stale jobs, not showing any error message.
DD – BP, would you be able to take a peek at these incomplete jobs?
BP – Sure, could you send me the IDs? (PB sent job ids)
BP – It looks like the managers are stuck.
DD – Last week I restarted all the PRP and Lilac managers to try and debug this. Can you see which host this is running on?
BP – This is marked as "running" on Lilac, but the manager is marked "inactive". This shouldn't happen, and it will keep the task from getting picked up. I may have a script that can solve this; it sometimes happens when a manager starts up and then shuts down quickly (i.e. it shuts down while a task is in transit from the main QCA).
(BP ran the stuck-job-restarter script and reports that it restarted 487 jobs)
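(For reference, a client-side sketch of the kind of check BP's server-side script presumably performs: find tasks whose assigned manager is no longer active. Written against the legacy qcportal FractalClient API; the record ids are hypothetical and the exact query arguments may differ, and the actual reset itself happened server-side.)

```python
from qcportal import FractalClient

client = FractalClient()  # read-only access to the public QCArchive

# Names of managers the server currently considers active.
active = {m["name"] for m in client.query_managers(status="ACTIVE")}

# Hypothetical placeholder for the ~100 stale record ids that were sent around.
stale_record_ids = ["103548701", "103548702"]

for task in client.query_tasks(base_result=stale_record_ids):
    if task.manager is not None and task.manager not in active:
        # Task still claims a manager that has gone inactive: this is the
        # "stuck" state described above, where the task will never be
        # picked up again until it is reset.
        print(f"orphaned task {task.id} (manager={task.manager}) -> needs restart")
```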
SPICE PubChem Set 3 Single Points Dataset v1.2: 121806 complete, up from 95725 (only ~200 jobs yet to complete).
SPICE PubChem Set 4 Single Points Dataset v1.2: 8162 complete, up from 0.
TG started the QC workers on UCI-hpc3 back up (25 workers: 8 cores/240 GB each); PB has also been running a few workers since Friday (10 workers: 40/48 cores, 180 GB each). Throughput was around 7K jobs yesterday.
|