2020-07-01 TSCC/Lustre/pAPRika debugging meeting notes

Date

Jul 1, 2020

Participants

  • @David Dotson

  • @Jeffry Setiadi

Goals

  • Identify current issues with pAPRika blocking scientific goals

  • Agree upon solutions and commit to resolving these issues

Discussion topics

Item

Presenter

Notes

Issue #250

Jeffry

  • APR calculation used to run as a separate step

    • repeating calculation of host was redundant

    • However, this change introduced a problem: all downstream tasks but one now fail

  • Possibly related: we also observe a new issue where identical host-guest (HG) systems yield the same result, but that result is returned by only one execution

Issue #224

Jeffry

  • This hasn’t been a problem lately; after we reached out to TSCC staff, they identified users abusing the filesystem and put a policy and ongoing monitoring in place for the issue. Considered addressed for now.

  • If the issue reappears, we will address with retry logic at various layers in evaluator/paprika.
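
As a rough illustration of the retry logic mentioned above, here is a minimal sketch of a retry wrapper with exponential backoff for filesystem calls. The names `retry_on_oserror` and `read_results` are hypothetical and are not part of evaluator or paprika; the attempt count and delays are placeholder assumptions, not agreed values.

```python
import functools
import time


def retry_on_oserror(max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Retry the wrapped callable on OSError, doubling the delay each time.

    Hypothetical helper for illustration only; the defaults are assumptions.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except OSError:
                    # Out of attempts: let the caller see the original error.
                    if attempt == max_attempts:
                        raise
                    # Wait base_delay, 2*base_delay, 4*base_delay, ...
                    sleep(base_delay * 2 ** (attempt - 1))
        return wrapper
    return decorator


@retry_on_oserror(max_attempts=5, base_delay=2.0)
def read_results(path):
    """Hypothetical example: read a results file from a flaky filesystem."""
    with open(path) as handle:
        return handle.read()
```

Wrapping individual I/O calls like this would cover the lowest layer; the same decorator could in principle be applied to a whole task if the failure shows up there instead.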

Action items

@David Dotson will address openff-evaluator#250; we want to enable @Jeffry Setiadi's science, and this is currently the biggest blocker. Commits will be made directly to PR #44 for the time being.

Decisions