Release #688

Merged
merged 1 commit into from
Feb 20, 2024

Conversation

vivbak
Contributor

@vivbak vivbak commented Feb 20, 2024

No description provided.

…pes' (#675)

* Fixed an issue where sequencing groups of different 'types' from the same sample got duplicate analysis entries.
In this commit, we addressed a bug that caused duplicate analysis entries for samples with sequencing groups of different 'types'. Specifically, if a sample had both 'exome' and 'genome' sequencing groups, the analysis entries for the 'genome' sequencing group were incorrectly copied to the 'exome' sequencing groups. As a result, the 'exome' sequencing groups had the same analysis IDs and metadata as the 'genome' sequencing group.

To fix this, we modified the code to build a more detailed mapping from old sample ID to new sample ID that also tracks the newly created sequencing groups, so that the correct analyses are copied to each newly created sequencing group. The mapping relies on the following rule: each sample has only one sequencing group for each combination of type, platform, and technology.

* Refactored retrieval of the new SG IDs to append analyses to. Needs some tidying up

* Changed name of variable sequencing_group_ids_from_sample to sample_to_sg_attribute_map so that it more accurately reflects the contents of the mapping. This variable maps the unique sequencing group attributes of each sample's sequencing groups, so that we can track which newly created sequencing group receives the analyses from the old sequencing group currently being processed.

* Linting fixes

* iSort linting change

* minor changes to improve readability and error checking

* Add detailed docstring with example data to get_new_sg_id function

* Add early failure check to the get_new_sg_id function that verifies it returns a list of sequencing group IDs of length 1, so that we know data is only added to one sequencing group during Analysis creation inside transfer_analyses()

* Ensuring sample and sequencing group IDs are completely fake (they already were fake) to please the linter.

* Changed to get the external ID of the participant instead of the sample's external ID, as the participant external ID is consistent, whereas a sample external ID could change if a participant has multiple samples. While none of the PA datasets currently have multiple samples per participant (that I'm aware of), the RD team does, and this change was made in the interest of future-proofing
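
For readers without the diff at hand, the mapping and the early failure check described in the commit message can be sketched roughly as follows. This is a minimal illustration and not the code merged in this PR: the names `sample_to_sg_attribute_map` and `get_new_sg_id` come from the commit message, but their signatures, the helper `build_sample_to_sg_attribute_map`, the dictionary fields, and all IDs are assumptions made up for the example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Assumed key shape: one sequencing group per (type, platform, technology) per sample.
SgKey = Tuple[str, str, str]


def build_sample_to_sg_attribute_map(
    new_sgs: List[dict],
) -> Dict[str, Dict[SgKey, List[str]]]:
    """Hypothetical helper: new sample ID -> {(type, platform, technology): [new SG IDs]}."""
    mapping: Dict[str, Dict[SgKey, List[str]]] = defaultdict(lambda: defaultdict(list))
    for sg in new_sgs:
        key = (sg['type'], sg['platform'], sg['technology'])
        mapping[sg['sample_id']][key].append(sg['id'])
    # Convert to plain dicts so lookups of unknown keys do not create entries.
    return {sample: dict(by_key) for sample, by_key in mapping.items()}


def get_new_sg_id(
    old_sg: dict,
    old_to_new_sample: Dict[str, str],
    sample_to_sg_attribute_map: Dict[str, Dict[SgKey, List[str]]],
) -> List[str]:
    """Return the single new SG ID (as a list) matching the old SG's attributes."""
    new_sample_id = old_to_new_sample[old_sg['sample_id']]
    key = (old_sg['type'], old_sg['platform'], old_sg['technology'])
    new_sg_ids = sample_to_sg_attribute_map.get(new_sample_id, {}).get(key, [])
    # Early failure: analyses must be attached to exactly one sequencing group.
    if len(new_sg_ids) != 1:
        raise ValueError(
            f'Expected exactly 1 new sequencing group for {key} on sample '
            f'{new_sample_id}, found {len(new_sg_ids)}'
        )
    return new_sg_ids


# Entirely fake IDs, for illustration only.
new_sgs = [
    {'id': 'SGNEW1', 'sample_id': 'SNEW1', 'type': 'genome',
     'platform': 'illumina', 'technology': 'short-read'},
    {'id': 'SGNEW2', 'sample_id': 'SNEW1', 'type': 'exome',
     'platform': 'illumina', 'technology': 'short-read'},
]
old_to_new_sample = {'SOLD1': 'SNEW1'}
sg_map = build_sample_to_sg_attribute_map(new_sgs)

old_exome_sg = {'id': 'SGOLD2', 'sample_id': 'SOLD1', 'type': 'exome',
                'platform': 'illumina', 'technology': 'short-read'}
exome_target = get_new_sg_id(old_exome_sg, old_to_new_sample, sg_map)
```

Keying on the full (type, platform, technology) tuple rather than on the sample alone is what keeps the 'genome' analyses from being copied onto the 'exome' sequencing group in this sketch.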
@illusional illusional merged commit 6d8a041 into main Feb 20, 2024
6 checks passed