TCGA Dataset Training and Testing Distributions #88

Open
bryanwong17 opened this issue Jan 22, 2024 · 1 comment
Comments

bryanwong17 commented Jan 22, 2024

Hi, could you please share with me the distribution of slides used for training and testing in the TCGA dataset, along with their respective labels?

I noticed that it's mentioned here: "We randomly split the WSIs into 840 training slides and 210 testing slides (4 low-quality corrupted slides are discarded)". However, the TEST_ID.csv file from this link lists 214 testing slides. Could you clarify which slides were discarded and which slides were used for training? Thank you!
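For anyone who wants to reproduce the count, here is a minimal sketch of the check. It assumes the slide IDs are in the first column of the CSV and that the file has a header row; neither is confirmed for the real TEST_ID.csv. The helper names `count_slide_ids` and `surplus` are hypothetical, and the in-memory sample stands in for the actual file.

```python
import csv
import io


def count_slide_ids(csv_file, has_header=True):
    """Count unique, non-empty values in the first column of an open CSV file.

    Assumption: slide IDs are stored in the first column.
    """
    reader = csv.reader(csv_file)
    if has_header:
        next(reader, None)  # skip the header row if there is one
    ids = {row[0].strip() for row in reader if row and row[0].strip()}
    return len(ids)


def surplus(listed, reported):
    """Number of slides listed in the file beyond the reported split size."""
    return listed - reported


# Tiny in-memory stand-in for TEST_ID.csv (made-up IDs, not real data).
sample = io.StringIO("slide_id\nslide_a\nslide_b\nslide_c\n")
print(count_slide_ids(sample))  # 3

# Figures quoted above: 214 slides listed versus 210 reported test slides.
print(surplus(214, 210))  # 4
```

For the real file, `count_slide_ids(open("TEST_ID.csv", newline=""))` would be the equivalent call. The surplus of 4 equals the number of discarded slides quoted above, but the count alone does not show which slides those are.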

@GeorgeBatch
Contributor

@bryanwong17, I went through this. See the results of my investigation in my README file for downloading TCGA.
