Commit 0bc4660

Merge pull request #717 from bigbio/dev
continue improvements to fix datasets.
2 parents 296f3f4 + a271827 commit 0bc4660

22 files changed: +2713 -2714 lines changed
.github/workflows/validate-all.yml

Lines changed: 1 addition & 1 deletion
@@ -52,6 +52,6 @@ jobs:
 python validate.py -v
 else
 export FILELIST=$HOME/filelist.txt
-cat "$FILELIST" | while read line; do echo "Changed file: $line"; parse_sdrf validate-sdrf --sdrf_file $line --skip_factor_validation --skip_experimental_design_validation --use_ols_cache_only; done
+cat "$FILELIST" | grep "*sdrf.tsv" | while read line; do echo "Changed file: $line"; parse_sdrf validate-sdrf --sdrf_file $line --skip_factor_validation --skip_experimental_design_validation --use_ols_cache_only; done
 fi
 
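The added line restricts validation to SDRF files by piping the file list through `grep`. Note that in POSIX basic regular expressions a `*` at the start of a pattern is matched as a literal asterisk, not as a shell-style wildcard, so `grep "*sdrf.tsv"` may not select paths the way a glob would. A minimal sketch of an anchored filter follows; the function name `filter_sdrf_files` and the inline file list are hypothetical and only illustrate the idea, and the `parse_sdrf` call is left as a comment so the snippet runs without that tool installed.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the commit): keep only the paths
# read from stdin that end in "sdrf.tsv".
filter_sdrf_files() {
  # -E enables extended regular expressions; "$" anchors the match at
  # the end of the line. "|| true" keeps the pipeline from failing
  # when no path matches (grep exits with status 1 in that case).
  grep -E 'sdrf\.tsv$' || true
}

# Illustrative file list standing in for $HOME/filelist.txt.
printf '%s\n' \
  'annotated-projects/PXD000228/PXD000228.sdrf.tsv' \
  'README.md' \
  '.github/workflows/validate-all.yml' |
  filter_sdrf_files |
  while IFS= read -r line; do
    echo "Changed file: $line"
    # In the workflow, the validator from the diff would run here:
    # parse_sdrf validate-sdrf --sdrf_file "$line" --skip_factor_validation \
    #   --skip_experimental_design_validation --use_ols_cache_only
  done
```

Quoting `"$line"` and using `read -r` are defensive choices in this sketch so that paths containing spaces or backslashes are passed through unchanged.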

README.md

Lines changed: 1 addition & 2 deletions
@@ -7,13 +7,12 @@
 ![Contributors](https://flat.badgen.net/github/contributors/bigbio/proteomics-metadata-standard)
 ![Watchers](https://flat.badgen.net/github/watchers/bigbio/proteomics-metadata-standard)
 ![Stars](https://flat.badgen.net/github/stars/bigbio/proteomics-metadata-standard)
-[![Read the Docs](https://readthedocs.org/projects/proteomics-sample-metadata/badge/?version=latest)](https://proteomics-sample-metadata.readthedocs.io/en/latest/)
 
 ## Improving metadata annotation of Proteomics datasets
 
 Metadata is essential in proteomics data repositories and is crucial to interpret and reanalyze the deposited data sets. While the dataset general description and standard data file formats are supported and captured for every dataset by ProteomeXchange partners, the information regarding the sample to data files is mostly missing. Recently, members of the European Bioinformatics Community for Mass Spectrometry (EuBIC - https://eubic-ms.org/) have created this open-source project to enable the standardization of sample metadata of public proteomics data sets.
 
-The Proteomics Sample Metadata Project aims to standardize the way ProteomeXchange partners and the proteomics community capture the relation between the _samples_ and the _data_ generated within a PX submission. We have adapted the [MAGE-TAB v1.1 format](https://www.fged.org/projects/mage-tab/) to capture necessary metadata for Proteomics experiments to allow automated re-processing. The MAGE-TAB (MicroArray Gene Expression Tabular) is the file format to store the metadata and sample information on transcriptomics experiments. By repurposing and extending the MAGE-TAB for Proteomics, we aim to provide a format for future submissions of multiomics experiments to ProteomeXchange partners and better integration with other omics data. The MAGE-TAB is divided in two main files: IDF (Investigation Description Format) and SDRF (Sample and Data Relationship Format). We will describe how these two files are adapted for Proteomics.
+The Proteomics Sample Metadata Project aims to standardize the way ProteomeXchange partners and the proteomics community capture the relation between the _samples_ and the _data_ generated within a PX submission. We have adapted the [MAGE-TAB v1.1 format](https://www.fged.org/projects/mage-tab/) to capture the necessary metadata for Proteomics experiments to allow automated re-processing. The MAGE-TAB (MicroArray Gene Expression Tabular) is the file format to store the metadata and sample information on transcriptomics experiments. By repurposing and extending the MAGE-TAB for Proteomics, we aim to provide a format for future submissions of multiomics experiments to ProteomeXchange partners and better integration with other omics data. The MAGE-TAB is divided in two main files: IDF (Investigation Description Format) and SDRF (Sample and Data Relationship Format). We will describe how these two files are adapted for Proteomics.
 
 Our goal is to ensure maximum reusability of the deposited data. Our work aims to define the minimum information required to report the experimental design of proteomics experiments, enabling the use and reuse of the deposited data by the proteomics community. The following _Use Cases_ should be considered to design the Proteomics Sample Metadata Format:
 
annotated-projects/MSV000078494/MSV000078494.sdrf.tsv

Lines changed: 96 additions & 96 deletions
Large diffs are not rendered by default.

annotated-projects/MSV000078535/MSV000078535.sdrf.tsv

Lines changed: 44 additions & 44 deletions
Large diffs are not rendered by default.

annotated-projects/MSV000078555/MSV000078555.sdrf.tsv

Lines changed: 176 additions & 176 deletions
Large diffs are not rendered by default.

annotated-projects/MSV000080451/MSV000080451.sdrf.tsv

Lines changed: 71 additions & 71 deletions
Large diffs are not rendered by default.

annotated-projects/PXD000228/PXD000228.sdrf.tsv

Lines changed: 270 additions & 270 deletions
Large diffs are not rendered by default.

annotated-projects/PXD001224/PXD001224.sdrf.tsv

Lines changed: 109 additions & 109 deletions
Large diffs are not rendered by default.

annotated-projects/PXD002192/PXD002192.sdrf.tsv

Lines changed: 276 additions & 276 deletions
Large diffs are not rendered by default.

annotated-projects/PXD003469/PXD003469.sdrf.tsv

Lines changed: 383 additions & 383 deletions
Large diffs are not rendered by default.
