Is ontology URL not supported for characteristics[organism]? #190

@noatgnu

Description

Hi, I am testing validation of an SDRF-proteomics file. The official SDRF documentation (https://github.com/bigbio/proteomics-sample-metadata/tree/master/sdrf-proteomics) states:

Ontology url (Computer readable): Users can provide the corresponding URI (Uniform Resource Identifier) of the ontology/CV term as a value. This is recommended for enriched files where the user does not want to use intermediate tools to map from free text to ontology/CV terms.

source name characteristics[organism]
Sample 1 http://purl.obolibrary.org/obo/NCBITaxon_9606
Sample 2 http://purl.obolibrary.org/obo/NCBITaxon_9606

I tested with the following SDRF file:

source name characteristics[organism] characteristics[organism part] characteristics[disease] characteristics[cell type] characteristics[biological replicate] characteristics[biological replicate] characteristics[age] characteristics[developmental stage] characteristics[sex] characteristics[ancestry category] characteristics[individual] material type assay name technology type comment[proteomics data acquisition method] comment[label] comment[instrument] comment[fraction identifier] comment[technical replicate] comment[cleavage agent details] comment[modification parameters] comment[modification parameters] comment[precursor mass tolerance] comment[fragment mass tolerance] comment[file uri] comment[data file] factor value[organism part]
PXD004684-Sample-1 http://purl.obolibrary.org/obo/NCBITaxon_9606 lung tumor-adjacent tissues control not available 1 1 not available not available male white not available tissue run 1 proteomic profiling by mass spectrometry NT=Data-Independent Acquisition;AC=NCIT:C161786 NT=label free sample;AC=MS:1002038 NT=Q Exactive Plus;AC=MS:1002634 1 1 NT=Trypsin;AC=MS:1001251 NT=Oxidation;AC=UNIMOD:35;MT=Variable;TA=M NT=Carbamidomethyl;AC=UNIMOD:4;TA=C;MT=Variable 10 ppm 0.05 Da ftp://ftp.pride.ebi.ac.uk/pride-archive/2017/02/PXD004684/N294-1.raw N294-1.raw lung tumor-adjacent tissues
PXD004684-Sample-1 http://purl.obolibrary.org/obo/NCBITaxon_9606 lung tumor-adjacent tissues control not available 1 1 not available not available male white not available tissue run 2 proteomic profiling by mass spectrometry NT=Data-Independent Acquisition;AC=NCIT:C161786 NT=label free sample;AC=MS:1002038 NT=Q Exactive Plus;AC=MS:1002634 1 2 NT=Trypsin;AC=MS:1001251 NT=Oxidation;AC=UNIMOD:35;MT=Variable;TA=M NT=Carbamidomethyl;AC=UNIMOD:4;TA=C;MT=Variable 10 ppm 0.05 Da ftp://ftp.pride.ebi.ac.uk/pride-archive/2017/02/PXD004684/N294-2.raw N294-2.raw lung tumor-adjacent tissues
PXD004684-Sample-2 http://purl.obolibrary.org/obo/NCBITaxon_9606 lung tumor-adjacent tissues control not available 2 2 not available not available male white not available tissue run 3 proteomic profiling by mass spectrometry NT=Data-Independent Acquisition;AC=NCIT:C161786 NT=label free sample;AC=MS:1002038 NT=Q Exactive Plus;AC=MS:1002634 1 1 NT=Trypsin;AC=MS:1001251 NT=Oxidation;AC=UNIMOD:35;MT=Variable;TA=M NT=Carbamidomethyl;AC=UNIMOD:4;TA=C;MT=Variable 10 ppm 0.05 Da ftp://ftp.pride.ebi.ac.uk/pride-archive/2017/02/PXD004684/N295-1.raw N295-1.raw lung tumor-adjacent tissues

Validation failed with the following errors:

parse_sdrf validate-sdrf --sdrf_file "9923cc1c-6aab-4cf5-bd0f-438d24bb9f7e.tsv" 
{row: 0, column: "characteristics[organism]"}: "http://purl.obolibrary.org/obo/ncbitaxon_9606" the term name or title can't be found in the ontology -- ncbitaxon
{row: 1, column: "characteristics[organism]"}: "http://purl.obolibrary.org/obo/ncbitaxon_9606" the term name or title can't be found in the ontology -- ncbitaxon
{row: 2, column: "characteristics[organism]"}: "http://purl.obolibrary.org/obo/ncbitaxon_9606" the term name or title can't be found in the ontology -- ncbitaxon
There were validation errors.

After replacing the ontology URL with the exact organism name, Homo sapiens, validation passes:

parse_sdrf validate-sdrf --sdrf_file "9923cc1c-6aab-4cf5-bd0f-438d24bb9f7e.tsv"
Everything seems to be fine. Well done.
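
For anyone hitting the same error, the manual replacement described above can be scripted. The following is only a minimal sketch, assuming the URIs follow the OBO PURL pattern shown in the documentation example. The helper names (`purl_to_curie`, `replace_purl`) and the `LABELS` table are hypothetical and are not part of sdrf-pipelines; the table has to be filled in by hand for the terms in your own file.

```python
import re

# Matches OBO PURLs such as http://purl.obolibrary.org/obo/NCBITaxon_9606
OBO_PURL = re.compile(r"^https?://purl\.obolibrary\.org/obo/([A-Za-z]+)_(\w+)$")

# Hypothetical, hand-maintained lookup table (an assumption for this sketch).
LABELS = {"NCBITaxon:9606": "Homo sapiens"}


def purl_to_curie(value):
    """Return 'PREFIX:ID' for an OBO PURL, or None if value is not one."""
    match = OBO_PURL.match(value.strip())
    if match is None:
        return None
    prefix, local_id = match.groups()
    return f"{prefix}:{local_id}"


def replace_purl(cell):
    """Swap a known OBO PURL for its term name; leave other cells unchanged."""
    curie = purl_to_curie(cell)
    if curie is None:
        return cell
    return LABELS.get(curie, cell)


print(replace_purl("http://purl.obolibrary.org/obo/NCBITaxon_9606"))
```

Applying `replace_purl` to each cell of the `characteristics[organism]` column before validation reproduces the manual fix. It does not address whether the validator itself should accept URIs.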
