MT – This looks good. It’s a bit more detail than I’ve thought out myself. A few things:
MT – Things like Dataset.from_qcarchive may need to have some pretty complex configuration to handle things like filters for elements and other loading options.
LW – I’d think that those could be handled after the dataset is loaded.
MT – I think the season 1 benchmarking filtered things in serial like that.
JW – It mostly filtered in serial, but there were places where the filtering happened in-memory, before anything was written to disk.
LW – Could have another step for filters
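A minimal sketch of what a separate post-load filter step could look like (Dataset, Entry, and filter_elements are hypothetical names for discussion, not existing API):

```python
# Hypothetical sketch only: illustrates filtering as a separate step applied
# after the dataset has been loaded, rather than during loading.
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Entry:
    smiles: str
    elements: Set[str]


@dataclass
class Dataset:
    entries: List[Entry] = field(default_factory=list)

    def filter_elements(self, allowed: Set[str]) -> "Dataset":
        # Keep only entries whose elements are a subset of the allowed set.
        kept = [entry for entry in self.entries if entry.elements <= allowed]
        return Dataset(entries=kept)


# Usage: load first (e.g. from QCArchive), then filter as its own step.
dataset = Dataset(entries=[
    Entry("CCO", {"C", "H", "O"}),
    Entry("CC[Si](C)C", {"C", "H", "Si"}),
])
filtered = dataset.filter_elements({"C", "H", "O", "N", "S"})
```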
MT – A method like convert_topology_to_interchange would need to require that the molecules in the Topology have positions, and maybe some other requirements on information content.
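A minimal sketch of the kind of precondition check such a method could make, assuming molecules expose a conformers attribute (None when no positions are set) as in the openff-toolkit Molecule API; the function name itself is hypothetical:

```python
# Hypothetical guard that could run before a convert_topology_to_interchange-style call.
def validate_topology_has_positions(topology) -> None:
    missing = [
        molecule.to_smiles()
        for molecule in topology.molecules
        if molecule.conformers is None or len(molecule.conformers) == 0
    ]
    if missing:
        raise ValueError(
            "All molecules must have positions before conversion to an "
            f"Interchange; missing conformers for: {missing}"
        )
```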
MT – For QM stuff, Dataset.from_qcarchive will pull from the global QCA or a local QCF server. For phys prop it will load from CSV/ThermoML. What if there’s a third set of reference data we’d think about in the future? I’d like to keep in mind that this is the sort of thing we’ll want to extend out. Are there other places where extensions like this may come up?
LW – Stuff like extension to recharge values.
LW – Maybe also Chapin’s stuff - like chemical shifts of small peptides - which he’ll run soon. So it would be cool to support that, but I don’t think we’d have any hope of full protein simulations/properties, so I don’t know how we’d draw the line there. But the property class I outline here would be extensible.
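One hedged sketch of how the reference-data sources could stay extensible is a registry of loader functions, so a future source (recharge targets, chemical shifts, etc.) only needs to register one more loader. All names below are placeholders, not existing API:

```python
# Hypothetical extension point: dataset loaders keyed by source name.
from typing import Callable, Dict

DATASET_LOADERS: Dict[str, Callable[..., "Dataset"]] = {}


def register_loader(source: str):
    def wrapper(func: Callable[..., "Dataset"]) -> Callable[..., "Dataset"]:
        DATASET_LOADERS[source] = func
        return func
    return wrapper


@register_loader("qcarchive")
def load_from_qcarchive(dataset_name: str, server_uri: str) -> "Dataset":
    ...  # pull records from the global QCA or a local QCFractal server


@register_loader("thermoml")
def load_from_thermoml(path: str) -> "Dataset":
    ...  # parse physical-property reference data from CSV/ThermoML


# A future source (e.g. ESP targets or NMR chemical shifts) would only need
# to register one more loader here.
```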
MT – Is the thought that the plugins would be modular enough that most of them could have all their custom code inside of the compute method?
LW – Basically, and there could be some amount of postprocessing available.
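A minimal sketch of that plugin shape, where all metric-specific logic lives in compute() and postprocess() is an optional hook (class and method names are placeholders):

```python
# Hypothetical plugin base class for discussion, not existing API.
from abc import ABC, abstractmethod
from typing import Any


class MetricPlugin(ABC):
    @abstractmethod
    def compute(self, dataset: "Dataset") -> Any:
        """All metric-specific logic goes here."""

    def postprocess(self, result: Any) -> Any:
        """Optional hook; the default is a no-op."""
        return result


class RMSDMetric(MetricPlugin):
    def compute(self, dataset: "Dataset") -> Any:
        ...  # compare MM-minimized conformers against QM reference geometries
```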
MT – Nice, I’m broadly on board with this
MT – What if my Dataset.from_x wants to grab all of Enamine REAL, or some other very large dataset? Is this intended to all be local, or can it reference external storage/compute? Like, what if this generates a ton of bulky trajectories?
LW – I can’t think of a reason why these would need to all be in memory at once. For example, a single phys prop calc doesn’t need to know about others.
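A small sketch of that point: entries can be streamed lazily from disk (here from a hypothetical JSON-lines export) so even a very large source is handled one record at a time; nothing here is existing API:

```python
# Hypothetical streaming iteration over a large dataset export.
import json
from typing import Iterator


def iter_entries(path: str) -> Iterator[dict]:
    # Yield one record per line rather than materializing the whole dataset.
    with open(path) as handle:
        for line in handle:
            yield json.loads(line)


def compute_streaming(path: str):
    for entry in iter_entries(path):
        ...  # each entry is computed and written out independently
```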
(General) Will it always be safe to delete data after it’s used? Or will we get in trouble if a future step may need to reference a previous step?
(General) When will one part of a dataset need to know about other parts of a dataset?
(General) – TorsionDrives are the only case that we’ll think about here.
JW – We also had trouble in the benchmarking season 1 project where we treated the conformers of a molecule SEPARATELY when we computed RMSD, but then we needed to treat them TOGETHER when we calculated dEs
LW – So I think a single molecule (with all its conformers) would be a single computed property.
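A small sketch of why a molecule and all of its conformers should travel together as one unit: per-conformer RMSDs can be computed in isolation, but relative energies need the whole conformer set (class names are illustrative only):

```python
# Hypothetical per-molecule result grouping all conformers together.
from dataclasses import dataclass
from typing import List


@dataclass
class ConformerResult:
    rmsd: float    # per-conformer; can be computed in isolation
    energy: float  # absolute energy, e.g. kcal/mol


@dataclass
class MoleculeResult:
    smiles: str
    conformers: List[ConformerResult]

    def relative_energies(self) -> List[float]:
        # Requires every conformer of the molecule, hence the grouping.
        lowest = min(conformer.energy for conformer in self.conformers)
        return [conformer.energy - lowest for conformer in self.conformers]
```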
MT – Could you walk through what an ESP analysis would look like?
(LW wrote example on slide 7)
MT – This design broadly looks good, I don’t have much feedback at this moment. I think the thing that I’d focus on most with further questions would be extensibility
LW – I think we want to focus on extensibility around the metrics the most. The other areas would be less valuable. I’m not sure how much we’d want to support trajectory analyses.
MT – Should compute have an output structure?
JW – In season 1 benchmarking, the final few steps had csvs and pdfs as outputs
LW – It should, but I can’t specify it right now.
MT – Ok, I’ll aim for something standardish and serializable.
LW – Having everything be pydantic-based would be great.
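A minimal sketch of a pydantic-based, serializable compute() output following the "keep everything pydantic" suggestion; the fields are illustrative, not an agreed schema (pydantic v1-style API):

```python
# Hypothetical result model; field names are placeholders for discussion.
from typing import Dict, List

from pydantic import BaseModel


class MetricResult(BaseModel):
    metric_name: str
    force_field: str
    per_molecule_values: Dict[str, List[float]]  # keyed by SMILES or record id

    class Config:
        allow_mutation = False  # results treated as immutable once computed


# Round-trips cleanly to/from JSON for inspection between stages.
result = MetricResult(
    metric_name="ddE",
    force_field="openff-2.0.0.offxml",
    per_molecule_values={"CCO": [0.0, 1.3]},
)
serialized = result.json()
restored = MetricResult.parse_raw(serialized)
```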
JW – How would a metric know whether it can use a dataset? Like, what if I tried to feed a phys prop dataset into something that wants conformer energies?
LW – I didn’t see datasets as being something that people subclass. So I’d do a decorator for each metric that registers it. If someone registers a new metric, it’ll have some information about what sorts of datasets it’s compatible with.
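A hedged sketch of that registration decorator, where registering a metric also records which dataset types it can consume (all names are placeholders, not existing API):

```python
# Hypothetical metric registry keyed by metric name.
from typing import Dict, Tuple, Type

METRIC_REGISTRY: Dict[str, Tuple[Type, Tuple[Type, ...]]] = {}


def register_metric(*compatible_dataset_types: Type):
    def wrapper(metric_cls: Type) -> Type:
        METRIC_REGISTRY[metric_cls.__name__] = (metric_cls, compatible_dataset_types)
        return metric_cls
    return wrapper


class QMConformerDataset: ...
class PhysicalPropertyDataset: ...


@register_metric(QMConformerDataset)
class ConformerEnergyMetric:
    def compute(self, dataset: QMConformerDataset): ...


def check_compatibility(metric_name: str, dataset) -> None:
    _, allowed = METRIC_REGISTRY[metric_name]
    if not isinstance(dataset, allowed):
        raise TypeError(
            f"{metric_name} cannot consume a {type(dataset).__name__}; "
            f"it expects one of {[t.__name__ for t in allowed]}."
        )


# Feeding a phys prop dataset into a conformer-energy metric would fail early
# with a clear error rather than deep inside compute().
check_compatibility("ConformerEnergyMetric", QMConformerDataset())
```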
Does this design satisfy “inspectability”? Like, will there be stages between each step where users can manipulate the dataset on disk or is that not a requirement?
(General) – If everything’s based on pydantic objects, then things will be inspectable between stages. Users could add/remove entire atomic objects (like a molecule with multiple conformers) safely, but they could NOT modify an atomic object (like removing a single conformer from a molecule).
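A small sketch of what that inspectability could look like under the pydantic assumption: each stage checkpoints its dataset to JSON on disk, where users can add or remove whole top-level entries before the next stage resumes (names are illustrative, pydantic v1-style API):

```python
# Hypothetical on-disk checkpoint between workflow stages.
from pathlib import Path
from typing import List

from pydantic import BaseModel


class MoleculeEntry(BaseModel):
    smiles: str
    conformer_energies: List[float]


class StageDataset(BaseModel):
    entries: List[MoleculeEntry]


def checkpoint(dataset: StageDataset, path: Path) -> None:
    # Users may edit whole entries in this file, but not mutate one in place.
    path.write_text(dataset.json(indent=2))


def resume(path: Path) -> StageDataset:
    return StageDataset.parse_raw(path.read_text())
```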
Will this workflow be expected to parallelize/multithread?
LW – I’d hope so. I kinda assume that, the more modular things are, the easier they’ll parallelize
MT – This seems like it’ll parallelize well, I don’t see anything here that’s necessarily serial.
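A minimal sketch of that parallelization assumption: because each entry is independent, the compute step can simply be mapped over a process pool (function names are illustrative):

```python
# Hypothetical fan-out of independent per-entry computations.
from concurrent.futures import ProcessPoolExecutor


def compute_single_entry(entry):
    ...  # self-contained work for one dataset entry
    return entry


def compute_all(entries, max_workers: int = 4):
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(compute_single_entry, entries))
```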
Do we want to require this to be a superset of the capabilities of the season 1 benchmarking project? Where would steps like “conformer generation” and “QM optimization” fit here?
LW – It would be great to include, but data generation doesn’t seem like it should be considered part of benchmarking. It seems like it could be part of an optimization property. Also, there could be lots of settings/knobs around how many conformers people want to generate, and other settings. This would be very slow and would probably be an external utility, or it could be a classmethod on an OptimizationProperty (sketched below).
JW – If we use QCEngine for MM optimizations (which do seem to be squarely in scope) then we’ll already have the machinery for QM optimizations. And the previous season 1 QCF deployment docs are already written.
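A hedged sketch of the "classmethod on an OptimizationProperty" idea mentioned above; the class and method are hypothetical, while the conformer generation call uses the openff-toolkit Molecule API:

```python
# Hypothetical optimization property that owns its conformer generation.
from openff.toolkit.topology import Molecule


class OptimizationProperty:
    def __init__(self, molecule: Molecule):
        self.molecule = molecule

    @classmethod
    def from_smiles(cls, smiles: str, n_conformers: int = 10) -> "OptimizationProperty":
        # Conformer generation settings (count, RMS cutoff, ...) would be the
        # slow, knob-heavy part discussed above.
        molecule = Molecule.from_smiles(smiles)
        molecule.generate_conformers(n_conformers=n_conformers)
        return cls(molecule=molecule)
```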
Do we want to support trajectory generation at all? If we do, do we still want to provide “setup evaluator on your cluster” instructions? Will there be a way to extend that for e.g. peptide sims?
MT – I think we DO want to support trajectory generation for phys props. So 1) Yes, 2) yes, 3) yes, but those considerations should go in the dataset loading part. So there would be different system preparation considerations for proteins/peptides.
LW – Agree. We definitely need to support traj gen for phys prop. I don’t think proteins/peptides are in scope for the MVP, but we should be able to extend to those later.
MT – Agree.
JW – Ok, great. This aligns with Chapin’s direction, which is to explicitly NOT use evaluator for the initial protein/peptide benchmarks, so he can tightly control/inspect how things go and make adjustments to get the protocol right.
LW – Time commitment moving forward?
JW – I think working quarter- to half-time on this for the next 6 months could work really well. That would reserve MT’s bandwidth to hedge against a flood of user needs/technical stuff following the 0.11 release.
MT – That sounds good. Agree that there’s a lot of uncertainty around user support load following 0.11. Wouldn’t mind if the fraction of my effort committed toward this was higher.
LW – That works for me.
Goals for next week?
LW – Pseudocode season 1 ddEs with SMIRNOFF and a small dataset?
JW – Want to get season 1 running to be able to inspect the workflow?
MT – Sure
JW – I’ll send MT the startup instructions for openff-benchmark and ensure he can get it running