I think we can treat the AlchemicalNetwork object as our reproducibility container: it contains everything needed to run the calculations, including the components, protocol and run-time settings, which makes it the ideal object to publish when reporting a study. However, for full reproducibility it would be useful to also record provenance information on how the network was planned, for example which atom_mapper, edge_scorer and network_planner were used, along with their settings and software versions. This becomes even more important for users who plan the network through the CLI and yaml settings. I had a go at this in ASAP-Alchemy; see here for an example of the software provenance for the Kartograf atom mapper.
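To make the request concrete, here is a rough sketch of what a planning provenance record could contain. It is only an illustration under stated assumptions: `planning_provenance` and `package_version` are hypothetical helpers, not part of gufe or ASAP-Alchemy, the scorer and planner names are placeholders, and the distribution names passed in `packages` are assumed rather than checked.

```python
import json
import platform
from importlib import metadata


def package_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def planning_provenance(planners, packages):
    """Build a plain-dict provenance record for a network planning step.

    planners: maps a role (e.g. "atom_mapper") to a dict describing it.
    packages: distribution names whose installed versions should be recorded.
    """
    return {
        "python": platform.python_version(),
        "software": {name: package_version(name) for name in packages},
        "planning": planners,
    }


# Hypothetical planning setup; the names and empty settings are placeholders.
record = planning_provenance(
    planners={
        "atom_mapper": {"name": "KartografAtomMapper", "settings": {}},
        "edge_scorer": {"name": "hypothetical_scorer", "settings": {}},
        "network_planner": {"name": "hypothetical_planner", "settings": {}},
    },
    packages=["gufe", "kartograf"],
)

# Sorted keys give a stable text form that can be stored next to the network.
print(json.dumps(record, indent=2, sort_keys=True))
```

A plain dict like this is JSON-serializable, so in principle it could be stored alongside the serialized network; where exactly it should live is the open question here.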
I like this idea, and since provenance information is by definition immutable it very much falls in line with the data model requirements for AlchemicalNetworks, Transformations, etc.
I've added this to the 2.0.0 milestone, and will add it to discussion in an upcoming gufe dev call. If you have an idea for how you'd like to see this provenance done, feel free to draft it up in a PR and we can discuss it sooner.
The idea of serializing software stack information was raised as something we could add to gufe tokenizables. One of the concerns that couldn't be resolved at the time was that provenance info added at the serialization stage is effectively mutable.
For example, if you roundtrip a serialization but switch software stack in between, you suddenly overwrite your old provenance.
Another concern was that Transformation execution is deferred, i.e. you can build one locally in one environment and then execute it elsewhere in another. The Transformation itself is immutable, but the provenance info you care about most is the one captured at Unit execution.
The latter isn't necessarily a blocker: you could set the provenance each time an object is created (e.g. for a Protocol execution, depend on the ProtocolUnitResult object's provenance being set).
The former is more about when we set this information and how we avoid it being overwritten.
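As a rough illustration of one way around both concerns, the sketch below captures provenance only when none is stored yet, so a roundtrip passes the old record through, and it records execution-time provenance separately on the result. Everything here is hypothetical (`current_stack`, `to_dict`, `from_dict`, `execute`); it uses plain dicts and is not the gufe serialization API.

```python
import json
import platform


def current_stack():
    """Describe the environment this process is running in."""
    return {"python": platform.python_version()}


def to_dict(payload, provenance=None):
    """Serialize a payload; capture provenance only if none is stored yet."""
    return {
        "payload": payload,
        "provenance": provenance if provenance is not None else current_stack(),
    }


def from_dict(data):
    """Deserialize, returning the payload and the provenance as stored."""
    return data["payload"], data["provenance"]


def execute(data):
    """Stand-in for unit execution: the result records the stack it ran on."""
    payload, created_with = from_dict(data)
    return {
        "result": f"ran {payload['name']}",
        "created_with": created_with,      # carried over unchanged
        "executed_with": current_stack(),  # captured at execution time
    }


# First serialization, pretending it happened in an older environment.
original = to_dict({"name": "network"}, provenance={"python": "3.9.0"})

# Roundtrip in the current environment: the stored provenance is passed through.
payload, provenance = from_dict(json.loads(json.dumps(original)))
roundtripped = to_dict(payload, provenance=provenance)

outcome = execute(roundtripped)
print(outcome["created_with"], outcome["executed_with"])
```

The design choice is that creation-time provenance is written once and then treated as data, while anything environment-dependent at run time goes on the result object instead.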