feat: Data upload: upload a CSV data set #352

Merged
merged 32 commits into from
Mar 24, 2024

Conversation

eatyourgreens
Collaborator

@eatyourgreens eatyourgreens commented Feb 27, 2024

  • towards upload data #347.
  • upload a CSV file and map headers.
  • map administration ID to dosing compartment.
  • map observations to model variables.
  • data stratification UI.
  • data stratification in the Django models (subject groups.)
  • generate groups of simulations from datasets.
  • data visualisation.

codecov bot commented Feb 27, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 76.82%. Comparing base (45545e2) to head (1ac9be3).

Additional details and impacted files
@@               Coverage Diff               @@
##           development     #352      +/-   ##
===============================================
+ Coverage        76.35%   76.82%   +0.46%     
===============================================
  Files               95      101       +6     
  Lines             5379     5492     +113     
===============================================
+ Hits              4107     4219     +112     
- Misses            1272     1273       +1     


@eatyourgreens eatyourgreens force-pushed the data-upload branch 4 times, most recently from b572be0 to 084892c Compare March 6, 2024 13:14
@eatyourgreens eatyourgreens changed the title feat: Data upload: upload a CSV and map headers feat: Data upload: upload a CSV data set Mar 6, 2024
@eatyourgreens eatyourgreens force-pushed the data-upload branch 3 times, most recently from 9abe775 to 89673da Compare March 8, 2024 14:47
@eatyourgreens eatyourgreens force-pushed the data-upload branch 6 times, most recently from ebedbe8 to 7d8edea Compare March 12, 2024 11:28
martinjrobins and others added 10 commits March 19, 2024 16:14
Fix type errors for data sub-page tabs

Case insensitive header normalisation

Trim blank lines from CSV
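
The two commits above (case-insensitive header normalisation and blank-line trimming) could be sketched roughly as follows. This is an illustration only, not the code from this PR: the `HEADER_ALIASES` table, the canonical names in it, and the function names are all hypothetical.

```python
import csv
import io

# Hypothetical alias table: lower-cased header variants -> canonical names.
HEADER_ALIASES = {
    "id": "ID",
    "time": "Time",
    "time_unit": "Time Unit",
    "administration id": "Administration ID",
}


def normalise_header(header: str) -> str:
    """Match a header case-insensitively; unknown headers are only stripped."""
    cleaned = header.strip()
    return HEADER_ALIASES.get(cleaned.lower(), cleaned)


def parse_csv(text: str) -> list[dict[str, str]]:
    """Drop blank lines, normalise the header row, and return rows as dicts."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    reader = csv.reader(io.StringIO("\n".join(lines)))
    headers = [normalise_header(h) for h in next(reader)]
    return [dict(zip(headers, row)) for row in reader]
```

Note that this simple approach would also drop blank lines inside quoted multi-line fields, which is assumed not to matter for this kind of data.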

Add Administration ID to the CSV data parser

Allow time_unit as a header
* Map data to model variables

Show a table of dose amount data (and units), with select menus
to map these to model inputs.
Show a table of observation variables (and units), with select menus
to map those to model outputs.

* Add a mapped qname to biomarkers

Store the mapped variable qname on biomarker types. Read it from
OBSERVATION_VARIABLE in a dataset.

* Support ES2015

* Allow for multiple variable mappings

Map each Administration ID to a dosing compartment.
Map each Observation ID to a model output.
Add new columns to the CSV data, with mappings and optional units.

* Preview the final dataset

Add a final step to the upload, which will preview the final CSV before saving it.

* Save modified dataset to backend

- load or create a dataset when we start an upload.
- save the dataset when we finish an upload.
- modify the `/datasets/:dataset_id:/csv` endpoint to accept a JSON string.

- update the dataset API to allow filtering by project ID.

* Allow for a single unit column

- When there's a single unit column, use that column for both dosing and observations.
- Split the CSV data into dosing rows and observation rows.
- Add administration route to the Map Dosing screen.
- Allow for dimensionless observation units.
- Filter mapped observation variables for compatibility with the observation unit value.
- Replace the tabbed interface for data uploads with a stepper.
- Add utilities to group subjects by protocol.
- Add stratification to the stepper.
- Stratify by dose protocols for the time being.
- Display each protocol group as a MUI data grid.
- Display protocol groups as tabs in the trial design view.
- Add the dosing compartment qname to the protocol model.
- Add a 'cohort' field to the CSV.
- always use the global dataset object rather than local state.
- remove a debugging log.
- expand the lists of expected column headers for units and observations.
- set non-numeric observations to 0.0.
- break `create_myokit_simulator` up into smaller methods.
- break `simulate` up into smaller methods.
- add optional `dosing_protocols` to `create_myokit_simulator`. If set, it
overrides the default dosing variable protocols.
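
As a rough sketch of the data handling described in the list above (splitting rows into dosing and observation rows, setting non-numeric observations to 0.0, and grouping subjects by protocol), something like the following would work. The column names (`ID`, `Time`, `Amount`, `Administration ID`), the rule that a dosing row is one with a positive numeric amount, and all function names are assumptions made for illustration; the utilities added in this PR may differ.

```python
from collections import defaultdict


def is_number(value) -> bool:
    """True if the value can be parsed as a float."""
    try:
        float(value)
        return True
    except (TypeError, ValueError):
        return False


def split_rows(rows):
    """Assumed rule: a row with a positive numeric Amount is a dosing row."""
    dosing, observations = [], []
    for row in rows:
        amount = row.get("Amount")
        if is_number(amount) and float(amount) > 0:
            dosing.append(row)
        else:
            observations.append(row)
    return dosing, observations


def observation_value(raw) -> float:
    """Non-numeric observations are set to 0.0."""
    return float(raw) if is_number(raw) else 0.0


def group_subjects_by_protocol(dosing_rows):
    """Subjects whose dose events are identical end up in the same group."""
    doses_by_subject = defaultdict(list)
    for row in dosing_rows:
        dose = (float(row["Time"]), float(row["Amount"]), row.get("Administration ID", ""))
        doses_by_subject[row["ID"]].append(dose)
    groups = defaultdict(list)
    for subject, doses in doses_by_subject.items():
        groups[tuple(sorted(doses))].append(subject)
    return list(groups.values())
```

Using the sorted tuple of dose events as the grouping key means that row order within a subject does not affect which group the subject lands in.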
* feat: subject groups

Add a subject group model to the Django app and the API. Subject groups
have a list of subjects and a list of protocols associated with those
subjects. Datasets have a list of subject groups in `dataset.groups`.

Refactor the Data and Trial Design views to use `dataset.groups` for
tabbed views.

* Change GROUP to GROUP_ID
Refactor the Myokit model mixin to allow multiple simulations to be run.
Each dataset subject group has its own simulation, with subject dosing protocols and dosing events.

Update the simulate API to return multiple simulations.

Show multiple plots in the Simulations tab

Test simulations with a project and dataset
`SimulateResponseSerializer` doesn't need a separate `ListSerializer`.
It can use `many=True` to return a list of simulations.
- add biomarker data to the dataset serialiser.
- add biomarker data to the `useDataset` hook.
- display the dataset as data grids in the Data tab.
- show biomarkers as scatter plots on simulation plots.
When the uploaded CSV doesn't have an observation unit column, allow
the user to pick a unit from a list of units, then pick a compatible
variable for the selected unit.
Better handling of missing unit columns in the CSV.
- When time units are missing, prompt to choose a time unit.
- Change the column headings in the submitted CSV to match
headings that are expected by the Python data parser utility.
- Add a preclinical flag to the dosing step during uploads.
Makes sure that observation units are valid symbols. Symbols loaded
from a CSV sometimes have encoding issues.

If the symbol isn't valid, show the unit menu instead, so that we can
pick a valid unit and correct the uploading error.
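
A minimal sketch of that check, assuming a hypothetical set of valid symbols (in the app these would presumably come from the units API): a symbol that isn't recognised triggers the unit menu instead of being accepted as-is.

```python
# Hypothetical set of valid unit symbols, for illustration only.
VALID_UNIT_SYMBOLS = {"h", "mg", "mg/L", "\u00b5g/mL"}


def needs_unit_menu(symbol: str) -> bool:
    """True when the uploaded symbol is not a known unit, so the menu is shown."""
    return symbol not in VALID_UNIT_SYMBOLS


# A typical encoding error: UTF-8 bytes for the micro sign decoded as Latin-1.
garbled = "\u00b5g/mL".encode("utf-8").decode("latin-1")
print(repr(garbled), needs_unit_menu(garbled))
```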
@eatyourgreens
Collaborator Author

That latest commit should fix a bug where you couldn't upload a CSV if there are encoding errors in the unit symbols. Now you get a dropdown menu so that you can fix the error.

Screenshot of a dataset with a Unicode error in the observation units, and a dropdown menu open showing a list of valid units.

Update the Group ID column when the selected cohort changes.
Use the group column values as the group ID for each row
eg. (1, 2, 3) or (Male, Female.)
When there's no selected unit yet, the observation variables menu
should show all possible model variables.
Choose a mapped variable, then pick a unit for that variable (if required.)
Without this, saving a protocol from the Trial Design tab will
trigger a loop of failed PUT requests.
@eatyourgreens
Collaborator Author

eatyourgreens commented Mar 21, 2024

Switched the Unit and Variable selections around in the previous screenshot. Choosing a variable, then a unit, feels like a more natural interaction (and reduces the number of choices for unit.)

Screenshot of the observations screen showing C1_t selected as the observation variable, and the unit menu open to a list of only concentration units eg. mg/L.

There is a bug in this particular case, in that once you've selected a unit, it stops being editable.

EDIT: fixed by f90611b.

When there are only observation values in the uploaded CSV, we should
always be able to pick from all available model variables.
Map ADDL and II columns, in the CSV, to repeat doses in the model
and simulations.
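
For illustration, here is how one dosing row might expand into repeat doses, assuming the common NONMEM-style convention that ADDL is the number of additional doses and II is the interval between doses. The function name and signature are hypothetical, and the PR's actual mapping onto model protocols may differ.

```python
def expand_doses(time: float, amount: float, addl: int = 0, ii: float = 0.0):
    """Return (time, amount) for the first dose plus `addl` repeats, `ii` apart."""
    return [(time + n * ii, amount) for n in range(addl + 1)]
```

For example, a 100 mg dose at time 0 with `ADDL=2` and `II=24` would expand to doses at 0, 24 and 48 time units.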
In certain cases, the observation units menu could disappear after
selecting a unit. This should fix that by allowing its visible state
to change from false to true, but not from true to false.
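
The one-way visibility change described above is essentially a latch. A minimal sketch, with a hypothetical class name; the real fix lives in the frontend component state:

```python
class VisibilityLatch:
    """Visibility can switch from False to True, but never back to False."""

    def __init__(self, visible: bool = False):
        self.visible = visible

    def update(self, should_show: bool) -> bool:
        # Once visible, later False values are ignored.
        self.visible = self.visible or bool(should_show)
        return self.visible
```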
martinjrobins and others added 2 commits March 22, 2024 10:02
Add checkboxes to control which subject groups are displayed in the
simulations view.
@eatyourgreens eatyourgreens marked this pull request as ready for review March 24, 2024 15:33
Quality Gate failed for 'pkpdapp-team_pkpdapp_frontend'

Failed conditions
0.0% Coverage on New Code (required ≥ 80%)
C Reliability Rating on New Code (required ≥ A)

@eatyourgreens eatyourgreens merged commit 2008d88 into development Mar 24, 2024
8 of 9 checks passed
@eatyourgreens eatyourgreens deleted the data-upload branch March 24, 2024 15:58