Discussion topics

Item

Notes

Recap

  • We could make a pydantic class for “quantity”

    • This class could encode its own serialization rules

      • Would instances of this class be tied to a fixed unit, or a mutable unit?

        • If mutable, would it be possible to serialize instances using the default unit without any deserialization overhead?

      • Another idea is that we could have two or three classes

        1. Fixed units, e.g. positions: Quantity[Angstrom]

        2. Implicit units, e.g. coordinates{,_angstrom}=[1., 1., 1.]

        3. Mutable units, e.g. positions: Quantity
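The three options above could look roughly like the following. This is a minimal sketch using stdlib dataclasses as a stand-in for pydantic models; every class name, field name, and the small conversion table are hypothetical and only illustrate the trade-off between fixed, implicit, and mutable units.

```python
from dataclasses import dataclass, field
from typing import ClassVar, Dict, List

# Hypothetical conversion factors to angstrom; a real implementation would
# delegate to a unit library rather than a hand-written table.
TO_ANGSTROM: Dict[str, float] = {"angstrom": 1.0, "nanometer": 10.0}


@dataclass
class FixedUnitPositions:
    """Option 1: the unit belongs to the class, not to each instance."""

    unit: ClassVar[str] = "angstrom"  # ClassVar, so it is not an instance field
    values: List[float] = field(default_factory=list)

    def to_dict(self) -> dict:
        return {"values": self.values, "unit": self.unit}


@dataclass
class ImplicitUnitPositions:
    """Option 2: the unit is only implied by the field name."""

    coordinates_angstrom: List[float] = field(default_factory=list)

    def to_dict(self) -> dict:
        return {"coordinates_angstrom": self.coordinates_angstrom}


@dataclass
class MutableUnitPositions:
    """Option 3: the unit is instance data and can be changed in place."""

    values: List[float]
    unit: str = "angstrom"

    def convert_to(self, new_unit: str) -> None:
        # Rescale the values so the physical quantity stays the same.
        self.values = [
            v * TO_ANGSTROM[self.unit] / TO_ANGSTROM[new_unit] for v in self.values
        ]
        self.unit = new_unit

    def to_dict(self) -> dict:
        return {"values": self.values, "unit": self.unit}
```

In this sketch only option 3 needs any logic beyond storage, which is one way to see where the extra customization cost of mutable units would come from.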

Updates from MoSDeF collab

  • MoSDeF has the additional constraint that they want users to EXTEND their classes.

    • If a scientific developer wants to add an additional attribute to GMSO or one of its owned objects, they shouldn’t have to deal with all the complexity of making their new attribute in pydantic

    • Solution: a GLOBAL_JSON_ENCODER_REGISTRY, which the developer populates with items like {type: (serializer_to_dict, serializer_from_dict)}. The benefit of this is that their objects can be extended WITHOUT requiring that every newly added type be derived from BaseModel. (The downside is hideous complexity, since now they need to support TWO initialization pathways.)
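A registry of this shape might work along the following lines. This is a sketch under assumptions, not MoSDeF's actual code: only the name GLOBAL_JSON_ENCODER_REGISTRY and the {type: (serializer_to_dict, serializer_from_dict)} layout come from the notes, while the register, dumps, and loads helpers, the "__type__" tag, and the Charge class are hypothetical.

```python
import json
from typing import Any, Callable, Dict, Tuple, Type

# Layout from the notes: {type: (serializer_to_dict, serializer_from_dict)}
GLOBAL_JSON_ENCODER_REGISTRY: Dict[
    Type, Tuple[Callable[[Any], dict], Callable[[dict], Any]]
] = {}


def register(cls: Type, to_dict: Callable, from_dict: Callable) -> None:
    """Hypothetical helper: add one type to the registry."""
    GLOBAL_JSON_ENCODER_REGISTRY[cls] = (to_dict, from_dict)


def dumps(obj: Any) -> str:
    def default(o: Any) -> dict:
        entry = GLOBAL_JSON_ENCODER_REGISTRY.get(type(o))
        if entry is None:
            raise TypeError(f"No serializer registered for {type(o).__name__}")
        to_dict, _ = entry
        # Assumed convention: tag the payload so loads() can find the decoder.
        return {"__type__": type(o).__name__, "data": to_dict(o)}

    return json.dumps(obj, default=default)


def loads(text: str) -> Any:
    decoders = {
        cls.__name__: funcs[1] for cls, funcs in GLOBAL_JSON_ENCODER_REGISTRY.items()
    }

    def hook(d: dict) -> Any:
        if d.get("__type__") in decoders:
            return decoders[d["__type__"]](d["data"])
        return d

    return json.loads(text, object_hook=hook)


class Charge:
    """A plain class that does not derive from BaseModel."""

    def __init__(self, value: float, unit: str) -> None:
        self.value = value
        self.unit = unit


register(
    Charge,
    lambda c: {"value": c.value, "unit": c.unit},
    lambda d: Charge(d["value"], d["unit"]),
)
```

With this in place, dumps({"q": Charge(-1.0, "elementary_charge")}) produces JSON that loads turns back into a Charge instance. The second initialization pathway mentioned above shows up here as the from_dict callable, which has to construct objects independently of whatever validation the model classes perform.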

Matt’s efforts to implement pydantic backend

  • MT – My initial impression is that option 1 above would be possible, but would require so much customization that it wouldn’t be worth it.

Next decision points

  • Do we continue hammering on option 1?

  • Requirements:

    • Rapid serialization – If all units in an array are the same, only write the name of the unit once

    • Unit mutability – At least one option should be provided where the units can be changed (important for supporting different units of same dimension in force fields)

    • Pydantic compatibility

    • Round trips to/from disk MUST be identical

    • Support cases where unit depends on another field (e.g., “if I represent density, then my unit is g/mL, but if I represent X, then my unit is Y”)

      • This would be the case of an attribute looking at self.X to determine the unit for self.Y

    • Support unit storage/serialization needs for

      • FFs

      • Attributes which need to record which unit was last set for its value

      • conformers

      • physical property data

      • large arrays of positions

  • NOT requirements

    • Heterogeneous units in one array (if this is needed, make it a list of FloatQuantity objects)

  • Implementation

    • Provide FloatQuantity and ArrayQuantity

    • How to concatenate two quantity-wrapped arrays?

      • SB – Pint can basically handle this.
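To make the requirements above concrete, here is a minimal sketch of FloatQuantity and ArrayQuantity in which the unit name is written once per array, the JSON round trip is lossless, and concatenation converts the second array into the first array's unit. It uses stdlib dataclasses and plain lists as stand-ins for pydantic and NumPy; the method names and the conversion table are assumptions, and a real implementation would likely lean on Pint for the unit handling.

```python
import json
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical length conversion factors to angstrom.
TO_ANGSTROM: Dict[str, float] = {"angstrom": 1.0, "nanometer": 10.0}


@dataclass
class FloatQuantity:
    """A single value with its own (mutable) unit."""

    value: float
    unit: str


@dataclass
class ArrayQuantity:
    """Many values that all share one (mutable) unit."""

    values: List[float]
    unit: str

    def to_json(self) -> str:
        # The unit name is written once for the whole array, not per element.
        return json.dumps({"unit": self.unit, "values": self.values})

    @classmethod
    def from_json(cls, text: str) -> "ArrayQuantity":
        data = json.loads(text)
        return cls(values=data["values"], unit=data["unit"])

    def concatenate(self, other: "ArrayQuantity") -> "ArrayQuantity":
        # Convert the other array into this array's unit before joining,
        # so the result stays homogeneous.
        converted = [
            v * TO_ANGSTROM[other.unit] / TO_ANGSTROM[self.unit]
            for v in other.values
        ]
        return ArrayQuantity(values=self.values + converted, unit=self.unit)


# Heterogeneous units are expressed as a list of scalars, not as one array.
mixed: List[FloatQuantity] = [
    FloatQuantity(1.0, "angstrom"),
    FloatQuantity(0.1, "nanometer"),
]
```

For example, concatenating ArrayQuantity([1.0, 2.0], "angstrom") with ArrayQuantity([0.5], "nanometer") gives a single array in angstrom, and from_json(to_json()) compares equal to the original instance.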

                 | Homogeneous contents            | Heterogeneous contents
Mutable units    | FloatQuantity and ArrayQuantity | Iterable[Union[FloatQuantity, ArrayQuantity]]
Immutable units  | positions_angstrom: np.ndarray  | Will never exist

Action items

  •  

Decisions