Fix test subset script #519
Codecov Report
Patch and project coverage have no change.

@@           Coverage Diff           @@
##              dev     #519   +/-   ##
=======================================
  Coverage   72.05%   72.05%
=======================================
  Files          90       90
  Lines        7746     7746
=======================================
  Hits         5581     5581
  Misses       2165     2165
- As we run scripts via AR, the interactivity never makes sense.
- Most of the validation is inaccurate now, as those use cases have been handled.
existing_data = query(EXISTING_DATA_QUERY, {'project': target_project})

samples = original_project_subset_data.get('project').get('samples')
transfer_samples_sgs_assays(
Is there a way to add a dry_run command-line option so that these transfer_ methods print what they will do, rather than actually doing it? For example, rather than calling aapi.create_analysis, it would log the details of the analysis that would be created. This would be good to have so that a user can run a quick validation check of the operations that will be performed.
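A minimal sketch of what such an option could look like, assuming the script parsed its arguments with argparse (the actual script may use a different CLI library). The helper name create_analysis_or_log, the --dry-run flag, and the payload shape are hypothetical and not taken from this PR; create_fn stands in for whatever call actually creates the analysis.

```python
import argparse
import logging

logger = logging.getLogger('create_test_subset')


def create_analysis_or_log(create_fn, payload: dict, dry_run: bool):
    """Call create_fn(payload), or only log the payload when dry_run is set.

    create_fn is a stand-in for the real API call (hypothetical signature).
    """
    if dry_run:
        logger.info('DRY RUN: would create analysis %s', payload)
        return None
    return create_fn(payload)


def build_parser() -> argparse.ArgumentParser:
    """Build a parser exposing a hypothetical --dry-run flag."""
    parser = argparse.ArgumentParser(prog='create_test_subset.py')
    parser.add_argument(
        '--dry-run',
        action='store_true',
        help='Log the planned operations without performing them',
    )
    return parser
```

Routing every write through one small wrapper like this keeps the dry-run check in a single place, rather than scattering if-statements through each transfer_ method.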
This is a great suggestion and I just looked into implementing it. However, I think a dry run is less trivial than I anticipated because, for example, transfer_samples_sgs_assays depends on transfer_participants having happened first. Happy to create a ticket to think about this further, but perhaps it is beyond the scope of this PR?
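One possible way around the ordering problem, sketched under heavy assumptions: in dry-run mode the first step returns placeholder IDs, so the dependent step can still be planned and printed. The function names, signatures, and data shapes below are invented stand-ins, not the script's real transfer_participants or transfer_samples_sgs_assays.

```python
def transfer_participants_stub(external_ids, upsert_fn, dry_run=False):
    """Return {external_id: internal_id}; a dry run yields placeholder IDs."""
    if dry_run:
        return {
            ext: f'DRYRUN-{i}' for i, ext in enumerate(external_ids, start=1)
        }
    # Real run: upsert_fn is assumed to return the same mapping shape.
    return upsert_fn(external_ids)


def transfer_samples_stub(samples, participant_ids, upsert_fn, dry_run=False):
    """Build the sample upserts, resolving each participant through the map."""
    planned = [
        {'sample': s['id'], 'participant': participant_ids[s['participant']]}
        for s in samples
    ]
    if dry_run:
        for operation in planned:
            print('DRY RUN: would upsert', operation)
        return planned
    upsert_fn(planned)
    return planned
```

The placeholder IDs only make the plan printable; they say nothing about whether the real upsert would succeed, so this validates the shape of the operations rather than their outcome.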
I haven't dug all the way into this, but a couple of usage thoughts. Apologies for farting this into a comment box, I think it would be useful to have as a conversation, possibly with a lot of pen + paper. Maybe tearing some of it into little shapes and moving them around.
This framing means that any combination of the 4 arguments can be used and should never clash. I do think that any time an SG is transferred to test, it should also be accompanied by its family members and their data too. Not sure how that interacts with this.
create_test_subset.py --add-family FAM1 --add-family FAM2 --add-family FAM3 --add-sample SAM1 --add-sample SAM2
and
create_test_subset.py --add-family FAM1 FAM2 FAM3 --add-sample SAM1 SAM2
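If the intent is for both invocation styles above to be accepted, one way to get that is argparse's built-in extend action combined with nargs='+' (available from Python 3.8). This is only a sketch: it assumes argparse, whereas the script may use another CLI library, and only the two flags shown in the commands above are modelled.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Accept repeated flags and multi-value flags interchangeably."""
    parser = argparse.ArgumentParser(prog='create_test_subset.py')
    # 'extend' appends every value of every occurrence to one flat list.
    parser.add_argument(
        '--add-family', action='extend', nargs='+', default=[], metavar='FAMILY'
    )
    parser.add_argument(
        '--add-sample', action='extend', nargs='+', default=[], metavar='SAMPLE'
    )
    return parser


if __name__ == '__main__':
    args = build_parser().parse_args()
    print(args.add_family, args.add_sample)
```

With this parser, both command lines shown above produce the same two lists, and the styles can be mixed within a single call.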
Hello, please re-review in line with the changes requested:
Looks delightful 👌
Addresses #514
Several changes including
Querying has been significantly simplified in this PR, although unfortunately upserting is more complex. So net neutral.
Note: There is another pending ticket to improve and refactor this script. This doesn't address that, and the main goal was to get the script functional to unblock everyone. Happy to take on all feedback of course, but keen to move improvements upon the initial script to another ticket.