Genes data pipeline: update field mapping for variant cooccurrence #1403

Merged
merged 3 commits into from
Feb 14, 2024

Conversation

nadeaujoshua
Contributor

While running the genes data pipeline, I hit a missing field error in the prepare_heterozygous_variant_cooccurrence_counts task.

The source data README shows that the field names have been updated, so this PR updates the field mapping to use the new names.

This update was tested locally by running the aforementioned pipeline task successfully.
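
For readers unfamiliar with this kind of fix, here is a minimal sketch of what a field-mapping update can look like. It is not the code from this PR: the field names (`renamed_count_field`, `count`, `gene_id`) and the helper `remap_fields` are hypothetical placeholders, and the real names come from the source data README. The sketch assumes rows are plain dictionaries and that a missing source field should fail loudly, much like the error described above.

```python
# Hypothetical mapping from the source data's field names to the names the
# pipeline expects. Both sides are placeholders, not the real field names.
FIELD_MAPPING = {
    "renamed_count_field": "count",
}


def remap_fields(row, mapping):
    """Return a copy of `row` with source field names renamed per `mapping`.

    Raises KeyError if a mapped source field is absent, so a renamed field
    in the source data is caught immediately rather than silently dropped.
    """
    missing = [source for source in mapping if source not in row]
    if missing:
        raise KeyError(f"Missing source fields: {missing}")
    # Fields not listed in the mapping keep their original names.
    return {mapping.get(key, key): value for key, value in row.items()}


if __name__ == "__main__":
    example_row = {"gene_id": "GENE1", "renamed_count_field": 3}
    print(remap_fields(example_row, FIELD_MAPPING))
```

When the source data renames a field, only the keys of the mapping need to change; the rest of the task keeps using the names it already expects.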

Contributor

@rileyhgrant rileyhgrant left a comment

The code changes look good to me. Nice work, and thanks for diving into this pipeline and fixing the things that had been breaking it since the data last changed!

My only comment is a bit nitpicky and not blocking, so I'll approve the PR here and just mention my thoughts on the commit messages.

The commit messages are a bit vague, and it would be hard to tell at first glance what they do if they land in main as-is (i.e. "update field mapping" could refer to quite a few things). If the plan is to squash and merge this PR, that makes sense to me as long as the PR title includes something about the genes pipeline. In my humble opinion, squash and merge seems like a good fit for this PR, since one of the commits only appeases the formatter; the individual commits will still be saved in the PR history while appearing as a single commit in the main branch history.

Nicely done going for it on the pipelines and getting the genes pipeline to successfully run in Dataproc.

@nadeaujoshua nadeaujoshua changed the title Update field mapping for variant cooccurrence Genes data pipeline: update field mapping for variant cooccurrence Feb 14, 2024
@nadeaujoshua nadeaujoshua merged commit 69dbfc6 into main Feb 14, 2024
1 check passed
@nadeaujoshua nadeaujoshua deleted the josh/update-variant-coocurrence-fields branch February 14, 2024 16:56
2 participants