Organize a benchmark dataset for fmri analysis #4

Open
gkiar opened this issue May 11, 2021 · 18 comments

Comments

@gkiar
Member

gkiar commented May 11, 2021

Ideally including:

  • Test retest data
  • "High" quality samples
  • "Typical" standards
  • Heterogeneity
@gkiar
Member Author

gkiar commented May 17, 2021

Related to #3

@poldrack

poldrack commented May 17, 2021 via email

@gkiar
Member Author

gkiar commented May 17, 2021

Thanks, @poldrack!

@TingsterX, is there a publicly available primate fMRI dataset which could be used for benchmarking tools?

When thinking about #3, which focuses specifically on benchmarks for skull extraction on structural data, does anybody have an idea of an existing collection which has ground truth segmentations for either human or primate? I may push this question to the Twitter-verse...

@gkiar
Member Author

gkiar commented May 17, 2021

Also, cc: @hough129

@audreymhoughton

audreymhoughton commented May 17, 2021

I have already started building a test dataset for something else (possibly related to this) from the studies listed below, with a few subjects each. These datasets have, or will have, BIDS inputs, processed outputs from the abcd-hcp-pipeline, and derivatives (we haven't decided exactly what those derivatives will include).

These currently live on Box.

ABCD (one for each scanner type - two scanner types are ready)
HBN (5 subjects - almost ready - still uploading processed outputs)
PNC (not processed yet - need to modify pipeline)
HCP-D (ready to go - two subjects)
NKI-Rockland (have not processed yet - need BIDS inputs)

@gkiar
Member Author

gkiar commented May 17, 2021

@arueter1

Temporary storage location: S3 bucket.
Goal for future storage location: a LORIS study at MSI at UMN, so that the raw DICOMs, derivatives, and subject data are easier to download whenever we want to share them. This won't be ready until later this year (Oct 2021?).

Action Items:
Are the datasets that Audrey listed above public? Any use considerations?
QSIPrep still needs to be run on some diffusion data.

@TingsterX

For NHPs, check out PRIME-DE (https://fcon_1000.projects.nitrc.org/indi/indiPRIME.html). The Oxford dataset has 20 macaques (~50 min per monkey); UC Davis has 19 macaques, with shorter fMRI scans but higher-resolution T1 and T2 images.

For the brain extraction dataset, Cameron made one manually edited human dataset openly available:
https://academic.oup.com/gigascience/article/5/1/s13742-016-0150-5/2737425?login=true

We recently published a tool that uses a transfer-learning framework: a U-Net model trained on the human dataset and then updated with macaque data. It also works for other species, e.g. chimps, marmosets, and pigs.
https://github.com/HumanBrainED/NHP-BrainExtraction

@gkiar
Member Author

gkiar commented May 17, 2021

Also adding @engfranco to the thread

@engfranco

Folks, let me know if you have any questions about the NKI-Rockland or HBN datasets. If you need 5 good subjects from the NKI-Rockland dataset, I recommend these 5, which have low motion:
sub-A00056703/ses-BAS1
sub-A00055906/ses-BAS1
sub-A00075732/ses-BAS1
sub-A00034073/ses-BAS1
sub-A00063006/ses-BAS1

Links to the S3 bucket of the whole imaging dataset organized in BIDS can be seen here:
http://fcon_1000.projects.nitrc.org/indi/enhanced/aws_links.csv
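
One way to narrow that link list down to the five recommended subjects is a plain text filter. The sketch below is only illustrative: the column layout of aws_links.csv is not described in this thread, so the hypothetical helper `filter_rows` assumes nothing more than that each relevant row mentions the subject ID somewhere in one of its fields. It matches on the bare ID (without the `sub-` prefix) so that it works whether or not the file uses BIDS-style names.

```python
import csv
import io

# Subject IDs recommended in the comment above, without the "sub-" prefix.
LOW_MOTION_SUBJECTS = (
    "A00056703",
    "A00055906",
    "A00075732",
    "A00034073",
    "A00063006",
)


def filter_rows(csv_text, subject_ids=LOW_MOTION_SUBJECTS):
    """Return the CSV rows (lists of fields) that mention any subject ID.

    Assumption: a row is relevant if any of its fields contains the ID
    as a substring. No column names are assumed.
    """
    reader = csv.reader(io.StringIO(csv_text))
    return [
        row
        for row in reader
        if any(sid in field for field in row for sid in subject_ids)
    ]
```

The text passed in would be the downloaded contents of the CSV. This filter keeps every session for a matching subject; restricting to ses-BAS1 would need a second check once the file's actual naming is known.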

@arueter1

Thanks Alexandre. One quick thing: @engfranco I'm not sure that we should have subject IDs out on a public-facing website. Maybe we can share those internally with this team somehow (maybe via email?).

@gkiar
Member Author

gkiar commented May 18, 2021

For skull stripping, this looks awesome: http://preprocessed-connectomes-project.org/NFB_skullstripped/index.html

@gkiar
Member Author

gkiar commented May 18, 2021

@engfranco

No need to worry. All these subject IDs are already public-facing IDs and are available to anyone accessing the NKI-RS website. I'm not sharing anything that isn't already in the public domain. We have internal IDs for these participants as well.

@audreymhoughton

Is there anything I need to do to be able to access this bucket?

@gkiar
Member Author

gkiar commented May 18, 2021

@hough129 I believe it is public if you use the --no-sign-request flag with aws s3.
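
A minimal sketch of what that could look like when scripted, assuming the AWS CLI is installed. The helper `build_sync_command` and the bucket prefix `s3://example-bucket/path/to/bids` are hypothetical placeholders; the real prefix would come from the links file mentioned earlier in the thread. The function only builds the argument list, so nothing touches the network until the command is actually run.

```python
import shlex


def build_sync_command(bucket_prefix, session_path, dest_root):
    """Build an unauthenticated `aws s3 sync` command for one subject/session.

    bucket_prefix: S3 URI of the BIDS root (placeholder here, not the real bucket).
    session_path:  relative path such as "sub-A00056703/ses-BAS1".
    dest_root:     local directory that mirrors the BIDS root.
    """
    session = session_path.strip("/")
    src = f"{bucket_prefix.rstrip('/')}/{session}/"
    dest = f"{dest_root.rstrip('/')}/{session}/"
    # --no-sign-request tells the AWS CLI not to sign the request,
    # which allows access to public buckets without credentials.
    return ["aws", "s3", "sync", "--no-sign-request", src, dest]


if __name__ == "__main__":
    cmd = build_sync_command(
        "s3://example-bucket/path/to/bids",  # hypothetical prefix
        "sub-A00056703/ses-BAS1",
        "./nki",
    )
    # Print the command for inspection instead of executing it.
    print(shlex.join(cmd))
```

To actually download, the list can be passed to `subprocess.run(cmd, check=True)`.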

@ltetrel

ltetrel commented May 19, 2021

Our old document on selecting OpenNeuro datasets: https://docs.google.com/document/d/16xjAPvcbFs1dWFozvpwpoAky8JWmmfDFppDctoNIWrc/edit#heading=h.a8bx6kg8xh6y
