PDB loading review | JM – The whole PDB is 43 GB, which is small. I can load it into the PDBFile classes in about one hour, so one hurdle might be runtime: it might take a lot of time to load into OpenFF Mols, which would make that a place to look for performance improvements. I'm also not sure how long PDBFixer processing will take. The effort has two prongs. One is getting an inventory of the whole PDB according to what we can and can't load, which might involve subsampling or something like that. The other is actually loading things. The former would mean classifying/describing inputs by things like (JM and JW sketch out whiteboard below, update next series of to-dos on Trello)
|
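A rough sketch of what the inventory prong could look like, not something agreed in the meeting. The `inventory` and `summarize` names are hypothetical, and `loader` is a placeholder for whatever call ends up doing the loading (PDBFile, PDBFixer, or the OpenFF loading step). The sketch only records, per file, whether the loader succeeded or which exception type it raised, with optional subsampling.

```python
import random
from collections import Counter


def inventory(paths, loader, sample_size=None, seed=0):
    """Try `loader` on each path and record "ok" or the exception type name.

    `loader` is a hypothetical stand-in for the real loading call.
    If `sample_size` is given, only a reproducible random subsample is tried.
    """
    paths = sorted(paths)
    if sample_size is not None and sample_size < len(paths):
        # Fixed seed so the same subsample is drawn on every run.
        paths = random.Random(seed).sample(paths, sample_size)

    results = {}
    for path in paths:
        try:
            loader(path)
        except Exception as exc:  # broad on purpose: we are classifying failures
            results[str(path)] = type(exc).__name__
        else:
            results[str(path)] = "ok"
    return results


def summarize(results):
    """Count how many inputs fall into each outcome category."""
    return Counter(results.values())
```

Grouping by exception type is only a first-pass classification. The categories sketched on the whiteboard would replace or refine it.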