C elegans subject metadata #171

Open
bendichter opened this issue Mar 29, 2023 · 4 comments
Comments

@bendichter
Member

We are trying to help a group get C. elegans data into NWB and DANDI and they are coming up against some incompatibilities in the subject metadata.

From a conversation with @rly and @dysprague:

I was wondering if there were possible changes we could make to the restrictions imposed on age and sex. C. elegans growth stages are defined from L1-L4 followed by adulthood. Most of the worms we use are designated 'YA' for young adult. I believe for our purposes, this is a more useful designation than days or weeks old since the lifespan of the worm is relatively short. Furthermore, for sex, C. elegans have two sexes: male 'XO' and hermaphrodite 'XX', which are designated by their sex chromosomes. I see that there is an option for other, but since the hermaphrodite is the typical system we study, I think it would be useful to be able to just input the chromosomal designations if that's possible.

  1. Subject.sex is currently limited to "M", "F", "U", or "O". For C. elegans, would it be possible to also accept "XO" (male) and "XX" (hermaphrodite)? Would it be possible to accept such data in the NWB schema?

  2. Subject.age. For C. elegans, they are not sure if they can recover the age of the worm. It would be more informative to store the growth stage. We could extend Subject to a new neurodata type CElegansSubject that has an additional field growth_stage, and make the necessary changes in NWB Inspector. Would it be possible to accept such data in the NWB schema?
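
To make request (1) concrete, here is a minimal sketch of what a species-aware check on Subject.sex could look like. It is not the NWB Inspector's actual check and not the code in the draft PR; the function name `is_valid_sex` and the species-string comparison are assumptions made for illustration only.

```python
from typing import Optional

# Values currently accepted for Subject.sex, per the issue text.
GENERAL_SEX_VALUES = {"M", "F", "U", "O"}

# Proposed additional values for C. elegans: XX = hermaphrodite, XO = male.
C_ELEGANS_SEX_VALUES = {"XX", "XO"}


def is_valid_sex(sex: str, species: Optional[str] = None) -> bool:
    """Hypothetical check: accept the general values for any species,
    and the chromosomal designations only for C. elegans."""
    if sex in GENERAL_SEX_VALUES:
        return True
    # Assumption: species is given as the full Latin binomial.
    return species == "Caenorhabditis elegans" and sex in C_ELEGANS_SEX_VALUES
```

With this sketch, `is_valid_sex("XX", "Caenorhabditis elegans")` passes while `is_valid_sex("XX", "Mus musculus")` does not, so the extra values stay scoped to the species that needs them.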

We have started a Draft PR to make these changes for (1) in the NWB Inspector: NeurodataWithoutBorders/nwbinspector#353

@satra
Member

satra commented Apr 1, 2023

for c elegans: see supplemental file 7 here: https://www.biorxiv.org/content/10.1101/2020.04.30.066209v3.supplementary-material

also i think it would be good to rethink the nwb core schemas around subject. if you are going to redo pieces of the core schema, let's do a better job of bringing many of the elements in dandischema and from the aind work into the core schema. @saskiad and i also looked at a bunch of related things. finally, more work is coming from cell lines and organoids.

we are also planning on modeling all of this in linkml relatively soon (a lot has already happened connected to this), which may offer a good route for many downstream tasks.

@rly

rly commented Apr 1, 2023

From follow-up emails with @dysprague et al, they will be creating a CElegansSubject with the following fields:

  • growth_stage: value from the list below
  • time_in_stage: ISO 8601 duration format, which allows for minutes, hours, or days (optional)
  • cultivation_temp: float (in C)
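
As a rough illustration of the three fields above, the following plain-Python sketch models them as a dataclass. It is not the actual extension and does not use the NWB extension machinery; the class name `CElegansSubjectSketch` and the simplified duration pattern (days, hours, minutes, seconds only) are assumptions.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Simplified ISO 8601 duration pattern: days, hours, minutes, seconds only.
# Assumption: weeks, months, and years are not needed for worm life stages.
_DURATION_RE = re.compile(
    r"^P(?=\d|T\d)(\d+D)?(T(?=\d)(\d+H)?(\d+M)?(\d+(\.\d+)?S)?)?$"
)


@dataclass
class CElegansSubjectSketch:
    """Hypothetical stand-in for the proposed CElegansSubject fields."""

    growth_stage: str                    # e.g. "L4" or "adult"
    cultivation_temp: float              # degrees Celsius
    time_in_stage: Optional[str] = None  # ISO 8601 duration, optional

    def __post_init__(self) -> None:
        # Reject strings that do not look like an ISO 8601 duration.
        if self.time_in_stage is not None and not _DURATION_RE.match(
            self.time_in_stage
        ):
            raise ValueError(
                f"time_in_stage is not an ISO 8601 duration: {self.time_in_stage!r}"
            )
```

For example, `CElegansSubjectSketch("L4", 20.0, "PT6H")` is accepted, while a free-text value such as `"6 hours"` raises `ValueError`.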

This captures the stages described in supp file 7, except that the paper authors describe estimated hours since birth/hatch rather than time since the start of a growth stage. I am not a C. elegans expert, so I do not know which is more common or useful...

Growth stages that @dysprague et al identified for neurophysiology are:

  • two-fold = 2-fold embryo Ce (WBls:0000019) = The C. elegans life stage spanning 460-520min after first cleavage at 20 Centigrade. Cell number remains at ~560 cells, with some new cells generated and some cells go through programmed cell death. The shape of embryo is elongated and double fold. A stage between 1.5-fold embryo and 3-fold embryo.
  • three-fold = 3-fold embryo Ce (WBls:0000020) = The C. elegans life stage spanning 520-620min after first cleavage at 20 Centigrade. Cell number remains at ~560 cells, with some new cells generated and some cells go through programmed cell death. The shape of embryo is elongated and triple fold. A stage between 2-fold embryo and fully-elongated embryo. Also called pretzel embryo or pretzel stage.
  • L1 = L1 larva Ce (WBls:0000024) = The first stage larva. At 25 Centigrade, it ranges 14-25.5 hours after fertilization, 0-11.5 hours after hatch.
  • L2 = L2 larva Ce (WBls:0000027) = The second stage larva. At 25 Centigrade, it ranges 25.5-32.5 hours after fertilization, 11.5-18.5 hours after hatch.
  • L3 = L3 larva Ce (WBls:0000035) = The third stage larva. At 25 Centigrade, it ranges 32.5-40 hours after fertilization, 18.5-26 hours after hatch.
  • L4 = L4 larva Ce (WBls:0000038) = The fourth stage larva. At 25 Centigrade, it ranges 40-49.5 hours after fertilization, 26-35.5 hours after hatch.
  • adult = adult Ce (WBls:0000041) = The stage that begins when a C. elegans individual is fully-developed and has reached maturity.
  • dauer = dauer larva Ce (WBls:0000032) = A third stage larva specialized for dispersal and long term survival.
  • post-dauer L4 = post dauer L4 stage Ce (WBls:0000828) = A C. elegans L4 larval life stage that occurs after the animal has recovered from the dauer diapause.
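
The list above could be held as a simple lookup from the short stage label to its WBls identifier. This is only a sketch: the labels and identifiers are copied from the list, while the names `GROWTH_STAGE_TO_WBLS` and `wbls_id` are hypothetical.

```python
# Short growth-stage labels mapped to WBls life-stage identifiers,
# copied from the list above.
GROWTH_STAGE_TO_WBLS = {
    "two-fold": "WBls:0000019",
    "three-fold": "WBls:0000020",
    "L1": "WBls:0000024",
    "L2": "WBls:0000027",
    "L3": "WBls:0000035",
    "L4": "WBls:0000038",
    "adult": "WBls:0000041",
    "dauer": "WBls:0000032",
    "post-dauer L4": "WBls:0000828",
}


def wbls_id(growth_stage: str) -> str:
    """Return the WBls identifier for a stage label, or raise ValueError."""
    try:
        return GROWTH_STAGE_TO_WBLS[growth_stage]
    except KeyError:
        allowed = ", ".join(GROWTH_STAGE_TO_WBLS)
        raise ValueError(
            f"Unknown growth stage {growth_stage!r}; expected one of: {allowed}"
        ) from None
```

Keeping the ontology identifier next to the label would let a validator restrict growth_stage to these values while still recording an unambiguous term.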

@rly

rly commented Apr 1, 2023

> we are also planning on modeling all of this in linkml relatively soon (a lot has already happened connected to this), which may offer a good route for many downstream tasks.

The NWB team is also looking at this. Let's discuss at the next sync.

@satra
Member

satra commented Apr 3, 2023

@rly - your proposal looks reasonable to me. the only change i would make is allowing for a range of durations (e.g., 200-400 min) in addition to a single duration. we crafted a patch to ISO8601 to encode such a range in dandi, as is often needed in experiments where a more precise time is not available.
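
To illustrate the idea of accepting either a single duration or a range, here is a hedged sketch. The slash-separated "min/max" encoding used below is an assumption for illustration and is not confirmed to be the format DANDI uses; the function names are hypothetical, and only days, hours, minutes, and whole seconds are handled.

```python
import re
from datetime import timedelta
from typing import Tuple

# Simplified ISO 8601 duration: days, hours, minutes, whole seconds only.
_DURATION_RE = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)


def parse_duration(text: str) -> timedelta:
    """Parse a simplified ISO 8601 duration such as 'PT200M' or 'P1DT2H'."""
    match = _DURATION_RE.match(text)
    # Reject non-matches and empty durations like 'P' or 'PT'.
    if match is None or not any(match.groupdict().values()):
        raise ValueError(f"Not a supported ISO 8601 duration: {text!r}")
    parts = {k: int(v) for k, v in match.groupdict().items() if v is not None}
    return timedelta(**parts)


def parse_duration_or_range(text: str) -> Tuple[timedelta, timedelta]:
    """Return (lower, upper); a single duration yields lower == upper.

    Assumption: a range is written as 'lower/upper', e.g. 'PT200M/PT400M'.
    """
    if "/" in text:
        lower_text, upper_text = text.split("/", 1)
        lower, upper = parse_duration(lower_text), parse_duration(upper_text)
        if lower > upper:
            raise ValueError(f"Range lower bound exceeds upper bound: {text!r}")
        return lower, upper
    single = parse_duration(text)
    return single, single
```

Under these assumptions, the 200-400 min example would be written as `"PT200M/PT400M"` and parsed into a pair of `timedelta` values.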

one consideration is whether there should be a specific class for a specific species, or a more general concept. finally, the concept of age here is related to a concept of environment (which presently seems to be temperature, but one may easily consider other parameters). thus the model could separate those considerations. it also seems that the stage identifier does specify temperature.
