This repository contains the Docker image and code required to convert the TCIA PSMA-PET-CT-Lesions dataset from DICOM format into NIfTI format, as used in the AutoPET III Grand Challenge. The script specifically creates the PSMA-PET/CT subset of the challenge dataset and organizes it following the nnUNet format.
Note: The TCIA PSMA-PET-CT-Lesions dataset currently contains a problematic study: 'PSMA_771c8dc6051db4d7/11-19-1998-NA-PETCT whole-body PSMA-82612'. The DICOM SEG in this study references a non-existent PET SOPInstanceUID. Please remove this study before conversion. This study will be fixed by TCIA soon.
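The removal step described in the note can be scripted. The following is a minimal sketch, not part of this repository: it assumes the TCIA download keeps the patient folders (such as `PSMA_771c8dc6051db4d7`) directly below one root directory, and the function name `remove_problematic_study` is hypothetical. It defaults to a dry run so nothing is deleted by accident.

```python
import shutil
from pathlib import Path

# Relative path of the problematic study, exactly as named in the note above.
PROBLEMATIC_STUDY = (
    Path("PSMA_771c8dc6051db4d7") / "11-19-1998-NA-PETCT whole-body PSMA-82612"
)


def remove_problematic_study(dataset_root, dry_run=True):
    """Delete the known-bad study folder below dataset_root, if present.

    Assumption: dataset_root directly contains the PSMA_* patient folders.
    Returns the path that was (or, in a dry run, would be) removed,
    or None if the folder does not exist.
    """
    study_dir = Path(dataset_root) / PROBLEMATIC_STUDY
    if not study_dir.is_dir():
        return None
    if not dry_run:
        shutil.rmtree(study_dir)
    return study_dir
```

Calling `remove_problematic_study("<input-directory>")` first only reports the folder; passing `dry_run=False` actually deletes it.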
- TCIA PSMA-PET/CT Dataset: 10.7937/R7EP-3X37
- NIfTI PSMA-PET/CT Dataset v1: 10.57754/FDAT.5bjzn-0vh28
- NIfTI PSMA-PET/CT Dataset v2: 10.57754/FDAT.gpeq5-yxy63
- Challenge Dataset: AutoPET III Dataset
After conversion, the dataset is structured in nnUNet format as follows:
nnUNet_dataset/
├── imagesTr/
│   ├── psma_patient1_study1_0000.nii.gz   # CT image (resampled to PET space)
│   ├── psma_patient1_study1_0001.nii.gz   # PET image (SUV units)
│   └── ...
└── labelsTr/
    ├── psma_patient1_study1.nii.gz        # Manual tumor lesion segmentations
    └── ...
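To sanity-check a converted dataset against the layout above, a short script can confirm that every case has a CT image (`_0000`), a PET image (`_0001`), and a label file. This is a sketch under the assumption that the output follows exactly the naming shown in the tree; the function name `find_incomplete_cases` is hypothetical and not part of the converter.

```python
from pathlib import Path


def find_incomplete_cases(dataset_dir):
    """Return {case_id: [missing file names]} for cases lacking CT, PET, or label."""
    root = Path(dataset_dir)
    images, labels = root / "imagesTr", root / "labelsTr"

    # Collect case identifiers from the label files ...
    case_ids = {p.name[: -len(".nii.gz")] for p in labels.glob("*.nii.gz")}
    # ... and from the image files, whose names end in _0000 (CT) or _0001 (PET).
    case_ids |= {
        p.name[: -len("_0000.nii.gz")] for p in images.glob("*_000[01].nii.gz")
    }

    problems = {}
    for case in sorted(case_ids):
        expected = [
            images / f"{case}_0000.nii.gz",  # CT, resampled to PET space
            images / f"{case}_0001.nii.gz",  # PET in SUV units
            labels / f"{case}.nii.gz",       # lesion segmentation
        ]
        missing = [p.name for p in expected if not p.is_file()]
        if missing:
            problems[case] = missing
    return problems
```

An empty dictionary means every case found in either folder is complete.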
The converter is packaged as a Docker image and expects:
- <input-directory> mounted at /in (TCIA DICOM data)
- <output-directory> mounted at /out (generated NIfTI dataset)
- Optional behavior is controlled via environment variables
- Pull Docker Image
  docker pull ghcr.io/clinicaldatascience/tcia-psma-pet-ct-preprocessing:latest
- Run the Docker Container
  docker run --rm -it \
    -v <input-directory>:/in \
    -v <output-directory>:/out \
    -e CONVERT_FLAGS="--skip-existing" \
    ghcr.io/clinicaldatascience/tcia-psma-pet-ct-preprocessing:latest
- Pull Python Base Image
  docker pull python:3.8.13
- Build Docker Image
  Clone this repository and run:
  docker build -t <dockername> .
- Run the Docker Container
  docker run --rm -it \
    -v <input-directory>:/in \
    -v <output-directory>:/out \
    -e CONVERT_FLAGS="--skip-existing" \
    <dockername>
- -e CONVERT_FLAGS="--skip-existing": Recommended to skip conversion if output files already exist.
- -e VALIDATE_FLAGS="--validate_hashes": Currently not recommended for use with the TCIA dataset; only use if the input DICOM data is original, i.e. not additionally fully de-faced by TCIA.
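When the conversion is launched from a script rather than a terminal, the same `docker run` invocation can be assembled programmatically. The sketch below is an illustration only: the helper names `build_docker_command` and `run_conversion` are hypothetical, and it uses nothing beyond the mount points, environment variables, and image name listed above. The `-it` flags are left out on the assumption that a scripted run has no interactive terminal.

```python
import subprocess

# Prebuilt image name as given in the pull command above.
IMAGE = "ghcr.io/clinicaldatascience/tcia-psma-pet-ct-preprocessing:latest"


def build_docker_command(input_dir, output_dir, image=IMAGE,
                         skip_existing=True, validate_hashes=False):
    """Assemble the docker run argument list for one conversion."""
    cmd = [
        "docker", "run", "--rm",
        "-v", f"{input_dir}:/in",    # TCIA DICOM data
        "-v", f"{output_dir}:/out",  # generated NIfTI dataset
    ]
    if skip_existing:
        cmd += ["-e", "CONVERT_FLAGS=--skip-existing"]
    if validate_hashes:
        # Not recommended for the de-faced TCIA data (see the flag notes above).
        cmd += ["-e", "VALIDATE_FLAGS=--validate_hashes"]
    cmd.append(image)
    return cmd


def run_conversion(input_dir, output_dir, **options):
    """Run the container and raise CalledProcessError if it exits non-zero."""
    subprocess.run(build_docker_command(input_dir, output_dir, **options), check=True)
```

Passing the arguments as a list avoids shell quoting issues with directory names that contain spaces.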
- The PET images are converted into SUV units.
- The time differences between radiotracer administration and image acquisition for SUV conversion are provided in a JSON file. This ensures that the SUV factors reproduce those of the AutoPET III challenge dataset. For general use, however, we recommend calculating the time differences directly from the appropriate DICOM tags.
- The CT images are resampled to match PET resolution and spacing.
- Annotations correspond to manual segmentations of tumor lesions.
- The resulting dataset is directly compatible with nnUNet and the AutoPET III challenge pipeline.
- The resulting dataset only contains the PSMA-PET/CT subset of the AutoPET III challenge dataset. Links for the FDG-PET/CT subset are provided in the Related Data and Tools section below.
- If the TCIA DICOM dataset is used as source, the NIfTI files produced by this code will not be identical to the official challenge data, as the TCIA source DICOM files are fully de-faced, while the challenge dataset is not.
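For readers who want to derive the time difference from DICOM tags, as recommended in the SUV note above, the following sketch shows one common way to do it. It is not the code used by this repository. It assumes the usual body-weight SUV definition with decay correction of the injected dose, and that the inputs come from the standard DICOM attributes RadiopharmaceuticalStartTime, AcquisitionTime or SeriesTime, RadionuclideTotalDose, RadionuclideHalfLife, and PatientWeight; which acquisition time is the correct reference depends on the scanner's decay-correction setting. All function names are hypothetical.

```python
from datetime import timedelta


def parse_dicom_tm(value):
    """Parse a DICOM TM string (HH, HHMM, HHMMSS or HHMMSS.FFFFFF) as time since midnight."""
    main, _, frac = value.strip().partition(".")
    main = main.ljust(6, "0")  # pad missing minutes/seconds with zeros
    hours, minutes, seconds = int(main[0:2]), int(main[2:4]), int(main[4:6])
    micro = int(frac.ljust(6, "0")[:6]) if frac else 0
    return timedelta(hours=hours, minutes=minutes, seconds=seconds, microseconds=micro)


def uptake_time_seconds(injection_tm, acquisition_tm):
    """Seconds between tracer administration and acquisition."""
    delta = (parse_dicom_tm(acquisition_tm) - parse_dicom_tm(injection_tm)).total_seconds()
    if delta < 0:
        # Assumption: the scan crossed midnight and both times are < 24 h apart.
        delta += 24 * 3600
    return delta


def suv_factor(weight_kg, injected_dose_bq, half_life_s, delta_t_s):
    """Factor that converts activity concentration (Bq/mL) to body-weight SUV (g/mL)."""
    decayed_dose = injected_dose_bq * 2.0 ** (-delta_t_s / half_life_s)
    return weight_kg * 1000.0 / decayed_dose
```

Multiplying the PET voxel values in Bq/mL by `suv_factor(...)` yields SUV under these assumptions; the result will only match the challenge data if the same time differences are used.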
- FDG-PET/CT Dataset: FDG-PET-CT-Lesions
- FDG Conversion Codebase: lab-midas/TCIA_processing
This repository and associated scripts are released under the MIT License.