Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Populate tob-wgs-test metamist with nagim gvcf files #653

Draft
wants to merge 50 commits into
base: dev
Choose a base branch
from

Conversation

michael-harper
Copy link
Contributor

No description provided.

illusional and others added 30 commits October 12, 2023 12:55
* Bill 112 (#583)

* START: add billing home page and data page

* FIX: fixed style in the nav. Everythign but search bar

* DONE: menu and billing pages done

* REMOVE: remove redundant code

* User Story 1.3: As a user, I want to sort costs by budget and by the percentage of budget used

* First version of Current Billing Cost page.

* Added dropdown to select grouping by Project, Topic or Dataset.

* [BIL-39] User Story 2.3: As a user, I want to view costs in a time series with daily granularity in the billing dashboard (#588)

* Added gcp-projects API to billing, new Billing CostByTime page

* Updated StackedAreaByDateChart wiht more custom properties.

* Bill 150 (#589)

* API: add api changes to allow the get running cost query to filter using invoice month

* IN PROGRESS: trying to cache the call to the seqr prop map API endpoint. UI cleaned up.

* RM: console.log

* Extended bq looks back time to 300 days as we do not have much loaded in the dev table. (#591)

* REFACOR: move data loading logic into the main page to be used by all charts

* DONE: data table on time view page complete

* Added gcp-projects API to billing, new Billing CostByTime page

* Updated StackedAreaByDateChart with more custom properties.

* First version of Bar and Donut charts.

* Upgrading babel / vulnerability.

* Bil 242 - Hide billing pages when env variables aren't set (#600)

* DONE: menu and billing pages done

* REMOVE: remove redundant code

* API: add api changes to allow the get running cost query to filter using invoice month

* FIX: Dropdown to FieldSelector in BillingInvoiceMonthCost

* FIX: all pages and navigation/links working

* FIX: fixed all of the navigate() issues now all links work

* IN PROGRESS: trying to cache the call to the seqr prop map API endpoint. UI cleaned up.

* LINT: fix typing issues in API

* RM: console.log

* Extended bq looks back time to 300 days as we do not have much loaded in the dev table. (#591)

* Billing cost by time data refactor (#592)

* IN PROGRESS: data table on over time cost page

* DONE: data table on time view page complete

* UPDATE: frontend now checks if the the billing endpoint is returning an OK status or not. Hides billing pages on any failed (not 200) status

---------



* Added gcp-projects API to billing, new Billing CostByTime page

* Updated StackedAreaByDateChart with more custom properties.

* First version of Bar and Dount charts.

* Upgrading babel / vulnerability.

* Billing - show 24H table fields only to the current invoice month (#604)

* Fixing last 24H billing calculations.

* Limit Budget % and 24H to only latest month.

* Pick only the latest monthly budget row per gcp_project.

* Added Last 24H UTC day to the table header.

* Removing unused job_config.

* Fixing docker image building issues.

* Getting gcp_project data from gcp_billing view instead of aggregated view. (#606)

* Small fix to enable query cost by ar-guid, added example to API docs. (#611)

---------

Co-authored-by: Sabrina Yan <[email protected]>
Copy link
Contributor

@illusional illusional left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good! I've left some inline code quality suggestions. I'd also probably suggest that this script can be genericised with a little bit of work. A suggestion: migrate participant metadata to test.

You could parameterise the suffix added on to avoid externalID clashes

scripts/add_nagim_gvcfs_to_tob_wgs_test_metamist.py Outdated Show resolved Hide resolved
required=True,
help='The path to the file containing the newly created samples.',
)
def main(project: str, projectID: int, new_samples_path: str):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason you want the project and project ID? You are getting this in the graphql query above

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call! Will get it from the query_response and remove the param requirement

scripts/add_nagim_gvcfs_to_tob_wgs_test_metamist.py Outdated Show resolved Hide resolved
scripts/add_nagim_gvcfs_to_tob_wgs_test_metamist.py Outdated Show resolved Hide resolved
@click.option(
'--new_samples_path',
required=True,
help='The path to the file containing the newly created samples.',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you be more descriptive to what this means? Including the expected structure of the file, and maybe an anonymised example of the first couple of lines?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the parameter name because it doesn't actually reflect what we're parsing, here's the new:

@click.option(
    '--sample-path-mappings',
    required=True,
    help='''The path to a CSV file containing mappings of `main` CPG ID's, 
            the `external_id` and `gvcf` paths. 
            The file should have at least four columns: sgid, ext_id, gvcf, and gvcf_idx. 
            Here's an example of what the first couple of lines might look like:

            sgid,ext_id,gvcf,gvcf_idx
            sg1,ext1,gvcf1,gvcf_idx1
            sg2,ext2,gvcf2,gvcf_idx2
        '''
)

scripts/add_nagim_gvcfs_to_tob_wgs_test_metamist.py Outdated Show resolved Hide resolved
scripts/add_nagim_gvcfs_to_tob_wgs_test_metamist.py Outdated Show resolved Hide resolved
scripts/add_nagim_gvcfs_to_tob_wgs_test_metamist.py Outdated Show resolved Hide resolved
scripts/add_nagim_gvcfs_to_tob_wgs_test_metamist.py Outdated Show resolved Hide resolved
next(reader) # skip the header
for row in reader:
sgid, extID, gvcf, gvcf_idx = row[0], row[1], row[2], row[3]
extID_to_row[extID] = (sgid, gvcf, gvcf_idx)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should consider using a namedtuple or data class here to be super clear how this data is organised.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh that's super cool. I've just implemented a dataclass!

@codecov-commenter
Copy link

codecov-commenter commented Jan 17, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (fcf07f2) 69.53% compared to head (2f66f18) 69.53%.

Additional details and impacted files
@@           Coverage Diff           @@
##              dev     #653   +/-   ##
=======================================
  Coverage   69.53%   69.53%           
=======================================
  Files         121      121           
  Lines        9743     9743           
=======================================
  Hits         6775     6775           
  Misses       2968     2968           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@illusional illusional left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome work - looking great! Can you rename the file, and retitle the PR.
Once you apply that review feedback, you're good to merge (I don't need to see it again if you can action this feedback).

You'll also need to merge dev to main if you want to run this on main (but I don't think you need to, you can just run this as an analyst - as analysts can write to the metamist test project).

scripts/add_nagim_gvcfs_to_tob_wgs_test_metamist.py Outdated Show resolved Hide resolved
scripts/add_nagim_gvcfs_to_tob_wgs_test_metamist.py Outdated Show resolved Hide resolved
scripts/add_nagim_gvcfs_to_tob_wgs_test_metamist.py Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants