The goal of a set of provenance fields is to maintain a complete history of the analysis steps that occurred to generate a matrix or annotation. This can be broken down into an easier and a harder problem:
Easier problem: Have a list of fields that describe how the current matrix was created from a previous matrix or matrices. Fields will likely include the software package, the version of the software package, the algorithm, function, or command, the parameters used in the algorithm/command, and links to the ids of parent matrices.
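To make the easier problem concrete, here is a minimal sketch of what one provenance record could look like. The class name, field names, and example values (`ProvenanceRecord`, `example_pkg`, `normalize`, `target_sum`) are all hypothetical placeholders, not a proposed schema.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import Any, Dict, List


@dataclass
class ProvenanceRecord:
    """One analysis step that produced a matrix (field names are hypothetical)."""
    matrix_id: str            # id of the matrix this record describes
    software: str             # software package
    software_version: str     # version of the software package
    command: str              # algorithm, function, or command that was run
    parameters: Dict[str, Any] = field(default_factory=dict)  # parameters used
    parent_ids: List[str] = field(default_factory=list)       # ids of parent matrices

    def to_json(self) -> str:
        # Serialize with sorted keys so records are easy to diff and compare.
        return json.dumps(asdict(self), sort_keys=True)


# Hypothetical example: matrix "B" was created from matrix "A".
record = ProvenanceRecord(
    matrix_id="B",
    software="example_pkg",
    software_version="1.2.0",
    command="normalize",
    parameters={"target_sum": 10000},
    parent_ids=["A"],
)
print(record.to_json())
```

Keeping `parent_ids` as a list rather than a single id allows a matrix to have more than one parent, which matters once matrices are concatenated.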
Harder problem: The ability to have a log of all previous steps recorded. This may not be too bad for a series of matrices that get created from one another (e.g. A -> B -> C). It may be more challenging when matrices get concatenated or subsetted. For example, doublet detection and ambient RNA estimation occur on matrices from individual samples. So the command may only apply to a subset of cells belonging to one individual/sample within a matrix that has multiple samples. Or the tools could have been run on matrices for individuals first, and then these matrices get concatenated later.
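One possible way to approach the harder problem is to treat the steps as a graph keyed by matrix id and walk the parent links to rebuild the full log. The sketch below assumes each step also records which samples it applied to, with `None` meaning "all cells"; the `steps` layout, the sample labels, and the function names are hypothetical and only illustrate the case where per-sample matrices are processed first and concatenated later.

```python
from typing import Dict, List, Optional

# Hypothetical step log: matrix id -> command, parent matrix ids, sample scope.
# "samples": None is an assumed convention meaning the step applies to all cells.
steps: Dict[str, dict] = {
    "s1_raw": {"command": "load", "parents": [], "samples": ["s1"]},
    "s2_raw": {"command": "load", "parents": [], "samples": ["s2"]},
    "s1_clean": {"command": "doublet_detection", "parents": ["s1_raw"], "samples": ["s1"]},
    "s2_clean": {"command": "doublet_detection", "parents": ["s2_raw"], "samples": ["s2"]},
    "merged": {"command": "concatenate", "parents": ["s1_clean", "s2_clean"], "samples": None},
}


def history(matrix_id: str, steps: Dict[str, dict]) -> List[str]:
    """Return every ancestor of matrix_id plus itself, parents before children."""
    seen = set()
    ordered: List[str] = []

    def visit(mid: str) -> None:
        if mid in seen:
            return  # a shared ancestor is only listed once
        seen.add(mid)
        for parent in steps[mid]["parents"]:
            visit(parent)
        ordered.append(mid)

    visit(matrix_id)
    return ordered


def history_for_sample(matrix_id: str, steps: Dict[str, dict], sample: str) -> List[str]:
    """Keep only the steps whose scope covers the given sample."""
    result = []
    for mid in history(matrix_id, steps):
        scope: Optional[List[str]] = steps[mid]["samples"]
        if scope is None or sample in scope:
            result.append(mid)
    return result


print(history("merged", steps))
print(history_for_sample("merged", steps, "s1"))
```

A simple chain such as A -> B -> C is the special case where every step has exactly one parent. The reverse situation, where a tool is run on a subset of cells inside an already merged matrix, would be recorded here as a step whose `samples` scope is narrower than the matrix it belongs to.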