Release #688

vivbak · 2024-02-20T08:32:42Z

No description provided.

…pes' (#675)

* Fixed issue with sequencing groups of different 'types' from the same sample getting duplicate analysis entries. In this commit, we addressed a bug that was causing duplicate analysis entries for samples with sequencing groups of different 'types'. Specifically, if a sample had both 'exome' and 'genome' sequencing groups, the analysis entries for the 'genome' sequencing group were being incorrectly copied to the 'exome' sequencing groups. As a result, the 'exome' sequencing groups had the same analysis IDs and metadata as the 'genome' sequencing group. To fix this, we modified the code to create a more detailed mapping of old sample ID to new sample ID that tracks newly created sequencing groups, so that the correct analyses are copied to each newly created sequencing group according to the following rule: each sample has only one sequencing group for each type, platform, and technology.
* Refactored how the new SG IDs to append analyses to are retrieved. Needs some tidying up.
* Changed the name of the variable sequencing_group_ids_from_sample to sample_to_sg_attribute_map so that it more accurately reflects the contents of the mapping. This variable maps the unique sequencing group attributes of each sample's sequencing groups, so that we can track which newly created sequencing group gets the analyses from the old sequencing group currently being used to update the analyses.
* Linting fixes
* isort linting change
* Minor changes to improve readability and error checking
* Added a detailed docstring with example data to the get_new_sg_id function
* Added an early failure check to the get_new_sg_id function to confirm that it returns a list of sequencing group IDs of length 1, so that we know it is only adding data to one sequencing group during the Analysis creation inside trasnfer_analyses()
* Ensured sample and sequencing group IDs are completely fake (they already were fake) to please the linter.
* Changed to use the participant's external ID instead of the sample's external ID, because the participant external ID is consistent, whereas a sample external ID could change if a participant has multiple samples. While none of the PA datasets currently have multiple samples per participant (that I'm aware of), the RD team's datasets do, so this change was made in the interest of future-proofing.
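
The matching rule described in the commit message (one sequencing group per type, platform, and technology for each sample) can be illustrated with a minimal Python sketch. This is not the code from the PR: the names `get_new_sg_id` and `sample_to_sg_attribute_map` come from the commit message, but their signatures, the record shape (plain dicts with `id`, `sample_id`, `type`, `technology`, `platform` keys), and the helpers `sg_key` and `build_sample_to_sg_attribute_map` are assumptions made for illustration.

```python
from collections import defaultdict

# Attributes assumed to identify a sequencing group uniquely within a sample.
SG_ATTRIBUTES = ('type', 'technology', 'platform')


def sg_key(sg):
    """Return the (type, technology, platform) tuple for a sequencing group dict."""
    return tuple(sg[attr] for attr in SG_ATTRIBUTES)


def build_sample_to_sg_attribute_map(new_sgs):
    """Hypothetical helper: map new sample ID -> list of (attribute key, new SG ID)."""
    mapping = defaultdict(list)
    for sg in new_sgs:
        mapping[sg['sample_id']].append((sg_key(sg), sg['id']))
    return dict(mapping)


def get_new_sg_id(old_sg, old_to_new_sample_id, sample_to_sg_attribute_map):
    """Find the new sequencing group that matches an old one.

    Returns a list with exactly one new SG ID, and fails early otherwise,
    so analyses are only ever attached to a single sequencing group.
    """
    new_sample_id = old_to_new_sample_id[old_sg['sample_id']]
    candidates = sample_to_sg_attribute_map.get(new_sample_id, [])
    matches = [sg_id for key, sg_id in candidates if key == sg_key(old_sg)]
    if len(matches) != 1:
        raise ValueError(
            f'Expected exactly one new sequencing group for {old_sg["id"]}, '
            f'found {len(matches)}'
        )
    return matches


# Usage with entirely fake IDs: one sample with a genome and an exome SG.
new_sgs = [
    {'id': 'SG_NEW_1', 'sample_id': 'S_NEW', 'type': 'genome',
     'technology': 'short-read', 'platform': 'illumina'},
    {'id': 'SG_NEW_2', 'sample_id': 'S_NEW', 'type': 'exome',
     'technology': 'short-read', 'platform': 'illumina'},
]
old_genome_sg = {'id': 'SG_OLD_1', 'sample_id': 'S_OLD', 'type': 'genome',
                 'technology': 'short-read', 'platform': 'illumina'}

sg_map = build_sample_to_sg_attribute_map(new_sgs)
print(get_new_sg_id(old_genome_sg, {'S_OLD': 'S_NEW'}, sg_map))
```

Keying the lookup on the attribute tuple rather than on the sample alone is what keeps the 'genome' analyses from also landing on the 'exome' sequencing group of the same sample.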

vivbak requested a review from illusional February 20, 2024 08:32

illusional approved these changes Feb 20, 2024

illusional merged commit 6d8a041 into main Feb 20, 2024
6 checks passed
