2020-07-01 `openff-evaluator` Meeting notes

Date

Jul 1, 2020

Participants

  • @David Dotson

  • @Simon Boothroyd

Goals

  • Big picture - desired state, current state

  • Introduction to structure of openff-evaluator

  • Development workflow

  • High priority starting issues

Discussion topics

Item

Presenter

Notes

Big picture issues

Simon

  • The data model for output data is probably the messiest area right now; it is hard to standardize yet

    • performance is a problem as well, since many objects get shuttled around and replicated on dask workers

    • serialization/deserialization across the network is a bottleneck

    • provenance info is duplicated and shuttled around too, adding to the problem

    • we want a satisfying solution for provenance that also meets our performance needs

pAPRika

Simon

  • Issue on TSCC with filesystem writes (#224)

    • Could address with retry wrapping around components at each layer (protocol, individual dask steps, etc.)

    • we already do aggressive checkpointing, so retrying to compensate for filesystem issues shouldn’t waste much work

      • need to verify coverage of checkpointing

  • The pAPRika PR is getting unwieldy, and it currently has to be kept in sync with the mainline branch manually

    • may help to split the pAPRika components out into a separate repo that isn’t highly coupled to evaluator

    • @David Dotson will next touch base with @Jeffry Setiadi on whether this solution works for him
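
The retry wrapping suggested for the filesystem-write issue (#224) could be sketched roughly as below. This is only an illustration: the `retry` decorator, its parameters, and `write_checkpoint` are hypothetical names and are not part of openff-evaluator. It assumes the failures surface as `OSError` and that the wrapped step is safe to repeat.

```python
import functools
import time


def retry(max_attempts=3, delay=0.0, exceptions=(OSError,)):
    """Hypothetical helper: re-run a callable when it raises ``exceptions``."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    # Give up and re-raise once the attempts are used up.
                    if attempt == max_attempts:
                        raise
                    # Wait a little longer after each failed attempt.
                    time.sleep(delay * attempt)

        return wrapper

    return decorator


# Hypothetical example of wrapping a single filesystem write.
@retry(max_attempts=3, delay=0.0)
def write_checkpoint(path, payload):
    with open(path, "w") as handle:
        handle.write(payload)
```

The same decorator could in principle be applied at each layer mentioned above (a whole protocol, or an individual dask step), provided the checkpointing makes repeating that layer cheap.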

Development workflow

Simon

  • Numpy docstrings

  • black / flake8 for formatting / linting

  • Try to keep PRs under 500 lines of changes; not always possible
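
As a reminder of the NumPy docstring layout mentioned above, here is a small illustrative function. The function itself (`mean_density`) is hypothetical and only serves to show the `Parameters`, `Returns`, and `Raises` sections.

```python
def mean_density(masses, volumes):
    """Compute the mean density of a set of samples.

    Parameters
    ----------
    masses : list of float
        Sample masses in grams.
    volumes : list of float
        Sample volumes in millilitres, in the same order as ``masses``.

    Returns
    -------
    float
        The mean of ``mass / volume`` over all samples, in g/mL.

    Raises
    ------
    ValueError
        If the inputs are empty or differ in length.
    """
    if not masses or len(masses) != len(volumes):
        raise ValueError("masses and volumes must be non-empty and equal in length")
    return sum(m / v for m, v in zip(masses, volumes)) / len(masses)
```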

Good first issue

Simon

  • AttrsXXX (#226) should go all in on pydantic. The issue should be adjusted accordingly
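
To give a sense of what a pydantic-based data model looks like, here is a minimal sketch. The class and field names (`Provenance`, `PropertyRecord`, `property_name`, and so on) are hypothetical and do not reflect the actual evaluator classes tracked in #226. It assumes pydantic is installed and uses only `BaseModel` and `ValidationError`.

```python
from pydantic import BaseModel, ValidationError


class Provenance(BaseModel):
    """Hypothetical provenance record."""

    source: str
    version: str


class PropertyRecord(BaseModel):
    """Hypothetical output-data record with typed, validated fields."""

    property_name: str
    value: float
    uncertainty: float = 0.0
    provenance: Provenance


# Fields are validated on construction; the nested dict becomes a Provenance.
record = PropertyRecord(
    property_name="Density",
    value=0.997,
    provenance={"source": "simulation", "version": "0.1"},
)

# Invalid input raises ValidationError instead of being stored silently.
try:
    PropertyRecord(
        property_name="Density",
        value="not a number",
        provenance={"source": "simulation", "version": "0.1"},
    )
    invalid_rejected = False
except ValidationError:
    invalid_rejected = True
```

Declaring the fields once on the model is what would replace the hand-written attribute declarations, which is the appeal of going all in on pydantic.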

Action items

  • @David Dotson will engage with @Jeffry Setiadi and support science from pAPRika

  • @David Dotson will start making pydantic shifts in the data model as individual PRs against issue #226

Decisions