LigandNetwork and AlchemicalNetwork provenance information #420

Open
jthorton opened this issue Nov 22, 2024 · 3 comments

@jthorton
Contributor

I think we can consider the AlchemicalNetwork object our reproducibility container: it contains everything needed to run the calculations, including the components, protocol, and run-time settings, which makes it the ideal object to publish when reporting a study. However, to aid total reproducibility it would be nice to also have some provenance information on how the network was planned. For example, details on the atom_mapper, edge_scorer, and network_planner, along with their settings and software versions, would help. I think this becomes even more vital for users who plan the network through the CLI and yaml settings. I did have a go at this in ASAP-Alchemy; see here for an example of the software provenance for the Kartograf atom mapper.
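To make the request concrete, here is a minimal sketch of what a per-component provenance record could look like. It uses only the Python standard library and is not an existing gufe or ASAP-Alchemy API: `ComponentProvenance`, `record`, `package_version`, and the `example_setting` key are all hypothetical names, and the mapper class is referred to by a plain string so nothing needs to be installed.

```python
from dataclasses import dataclass, asdict
from importlib import metadata
import json


def package_version(name: str) -> str:
    """Return the installed version of a package, or 'unknown' if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return "unknown"


@dataclass(frozen=True)
class ComponentProvenance:
    """Hypothetical record of how one planning component was configured."""
    role: str         # e.g. "atom_mapper", "edge_scorer", "network_planner"
    class_name: str   # name of the class that was used, stored as a string
    settings: dict    # settings passed to the component
    software: dict    # package name -> version found at planning time


def record(role, class_name, settings, packages):
    """Build a provenance record, looking up versions in the current environment."""
    return ComponentProvenance(
        role=role,
        class_name=class_name,
        settings=dict(settings),
        software={p: package_version(p) for p in packages},
    )


# "example_setting" is a placeholder, not a real Kartograf option.
mapper_prov = record(
    "atom_mapper", "KartografAtomMapper", {"example_setting": True}, ["kartograf"]
)
print(json.dumps(asdict(mapper_prov), indent=2))
```

The dataclass is frozen so the record's fields cannot be reassigned after planning, although the dicts it holds are still mutable; a real implementation would likely want those frozen too.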

@jthorton jthorton added the enhancement New feature or request label Nov 22, 2024
@dotsdl dotsdl added this to the Release 2.0.0 milestone Nov 26, 2024
@dotsdl
Member

dotsdl commented Nov 26, 2024

I like this idea, and since provenance information is by definition immutable, it very much falls in line with the data model requirements for AlchemicalNetworks, Transformations, etc.

@dotsdl
Member

dotsdl commented Nov 26, 2024

I've added this to the 2.0.0 milestone, and will add it to discussion in an upcoming gufe dev call. If you have an idea for how you'd like to see this provenance done, feel free to draft it up in a PR and we can discuss it sooner.

@IAlibay
Member

IAlibay commented Nov 26, 2024

The idea of serializing software stack information was raised as something we could add to gufe tokenizables. One of the concerns that couldn't be resolved at the time was about the mutable nature of things when you add provenance info at that stage.

I.e. if you roundtrip a serialization but switch software stacks in between, then you suddenly overwrite your old provenance.

Another concern was that Transformation execution is delayed by nature, i.e. you can build one locally in one environment and then execute it elsewhere in another. The Transformation itself is immutable, but the provenance info you care about more is the one captured at Unit execution.

The latter isn't necessarily a blocker, i.e. you could just set the object's provenance properties each time you create it (e.g. for a Protocol execution, depend on the ProtocolUnitResult object's provenance being set).

The former is more about when we set this information and how we avoid it being overridden.
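As a rough illustration of the roundtrip concern, the sketch below contrasts a serializer that stamps the current software stack on every write with one that only stamps it when no provenance is present yet. This is plain Python under stated assumptions, not gufe's serialization code: `naive_to_dict`, `write_once_to_dict`, the `provenance` key, and the package name and version strings are all made up for the example.

```python
import json


def naive_to_dict(payload: dict, stack: dict) -> dict:
    """Stamp the current stack on every serialization (overwrites old provenance)."""
    out = dict(payload)
    out["provenance"] = dict(stack)
    return out


def write_once_to_dict(payload: dict, stack: dict) -> dict:
    """Stamp the current stack only if no provenance was recorded before."""
    out = dict(payload)
    out.setdefault("provenance", dict(stack))
    return out


# Placeholder stacks standing in for two different environments.
planning_stack = {"example-package": "1.0.0"}
later_stack = {"example-package": "2.0.0"}

# Serialize in the planning environment, then load the JSON elsewhere.
stored = json.dumps(write_once_to_dict({"name": "my_network"}, planning_stack))
loaded = json.loads(stored)

# Re-serializing in the later environment:
naive = naive_to_dict(loaded, later_stack)         # planning provenance is lost
kept = write_once_to_dict(loaded, later_stack)     # planning provenance survives

print(naive["provenance"])
print(kept["provenance"])
```

A write-once rule like this only answers the "how do we avoid overriding it" half; it does not settle when the stamp should first be applied, which is the open question above.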
