Slicing of large (compressed) 4D files #1408
NIfTI is Fortran-ordered, so the volume (time) index is the slowest-changing one, and each volume is therefore contiguous on disk. There's nothing special about the compression; it's just gzip over the whole file, and there's no chunking to improve slicing.
You cannot reorder axes in NIfTI, although other formats allow it. You could also convert to a generic format like .zarr, which allows chunked compression to improve data access patterns. @balbasty has written a spec called NIfTI-Zarr (neuroscales/nifti-zarr#7) which allows lossless preservation of NIfTI metadata, so you can easily go back and forth.
I am surprised that coronal/sagittal slicing is slow, as I haven't really experienced that. That's worth looking into, but it may just be that loading chunks with such large strides is inefficient over 30 GB. If you're having trouble loading 30 GB into memory, it could be that swapping and garbage collection are where you're actually spending your time.
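To make the layout argument concrete, here is a minimal sketch in plain Python that counts how many separate contiguous runs have to be read to extract a slice from a Fortran-ordered array. It uses the shape from the question and assumes 4-byte voxels (an assumption; the dtype is not stated in the thread). The helper name `contiguous_runs` is made up for illustration, and the sketch ignores the NIfTI header offset.

```python
from math import prod

def contiguous_runs(shape, fixed_axis):
    """For a Fortran-ordered (first-axis-fastest) array, return
    (number_of_runs, elements_per_run) needed to read the sub-array
    obtained by fixing a single index along `fixed_axis`."""
    # Axes before the fixed one vary fastest, so together they form one unbroken run.
    run_length = prod(shape[:fixed_axis])
    # Every combination of indices on the slower axes starts a new run.
    n_runs = prod(shape[fixed_axis + 1:])
    return n_runs, run_length

shape = (440, 440, 645, 62)   # (x, y, z, t), taken from the question
itemsize = 4                  # ASSUMPTION: 4-byte voxels such as float32

for axis, name in enumerate("xyzt"):
    n_runs, run_length = contiguous_runs(shape, axis)
    print(f"fix {name}: {n_runs:>10} run(s) of {run_length * itemsize:>11} bytes each")
```

Fixing t gives a single run, fixing z gives one run per time frame, and fixing x degenerates into one read per voxel, which is consistent with time slices being cheap and spatial slices getting progressively more expensive.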
I am working with dynamic 4D PET data (x,y,z,t), and one challenge is that it is not possible to load the full data into memory (files are often 30 GB or more).
Luckily, most analyses in dynamic PET can be performed independently on subsets of the array, by slicing either at a specific time point (t=t1) or at a spatial location (z=z1). I want to show you some cool results and hear your thoughts on the general approach to this problem; maybe I missed another handy approach. I use .dataobj to access the stored array dynamically.
My example files:
img.nii (30 GB), shape=(440,440,645,62)
img.nii.gz (2.5 GB), identical but compressed version
The compressed version was created by
import nibabel as nib
nib.save(nib.load("img.nii"), "img.nii.gz")
Now accessing the data:
I assume this has something to do with how the data is stored on disk (Fortran vs. C layout), but maybe I am wrong.
For fun, I also tried this with the .gz compressed file:
I was very surprised that it is possible to slice along the time dimension. I had assumed that the compression would make this kind of dynamic slicing and loading of data subsets from disk impossible. Interestingly, it is not possible to slice in the axial dimension. My guess is that 4D images are compressed in a way where the x-y-z 3D volumes are compressed independently, which makes it possible to quickly obtain these arrays for a specified t.
Some algorithms (for instance voxel-wise Patlak modelling) require that I can slice on one of the spatial dimensions to obtain an x-y-t 3D array. This is possible with the .nii file, but I would love to get it to work with the .nii.gz file due to the significantly lower disk footprint. My idea is to somehow transpose either the data array or the save order on disk. If I am able to save it as (x,y,t,z) or (t,z,y,x), then my hope is that the compression algorithm will make it possible to slice on the last spatial dimension!
However, I was unable to find a good way to do this. Any ideas? :)
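As a rough illustration of what "slicing from compressed storage" would need, here is a toy sketch using only the Python standard library. Each z-slab (standing in for one x-y-t block) is compressed independently and an offset index is kept, so a single slab can be decompressed without touching the others. This is not something .nii.gz provides, since a gzip file is one sequential stream; it only mimics, in a very simplified way, what chunked formats do. The function names `compress_slabs` and `read_slab` and the tiny synthetic data are assumptions made for the example.

```python
import zlib
from array import array

def compress_slabs(slabs):
    """Compress each slab on its own and return (blob, index),
    where index[i] = (offset, length) of slab i inside blob."""
    blob = bytearray()
    index = []
    for slab in slabs:
        packed = zlib.compress(slab)
        index.append((len(blob), len(packed)))
        blob += packed
    return bytes(blob), index

def read_slab(blob, index, i):
    """Decompress only slab i, using the stored offset and length."""
    offset, length = index[i]
    return zlib.decompress(blob[offset:offset + length])

# Synthetic stand-in data: 5 "z-slabs" of 1000 float32 values each.
slabs = [array("f", [float(z)] * 1000).tobytes() for z in range(5)]
blob, index = compress_slabs(slabs)

# Reading slab 3 touches only its own compressed bytes.
restored = array("f", read_slab(blob, index, 3))
print(len(restored), restored[0])
```

The design point is the index: reordering the axes changes which slices are contiguous, but with a single compressed stream the reader still has to decompress everything that comes before the requested bytes, whereas independent chunks can be located and decoded directly.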