improvements to XNAT data fetching

Currently the XNAT data is fetched using a set of functions that are specific to csc-mlops.

It would be good to investigate whether we can create a class that inherits from the base torch dataset type to facilitate integration with other torch tools. This would remove a lot of boilerplate and let us roll validation functions into the dataset object.

For example, something like the following would let us use multiple XNAT projects easily. We can inherit most of the dataset functionality from MONAI's `CacheDataset` (the cache can be disabled by setting `cache_rate` to 0.0).

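```python
from monai.data import CacheDataset


class XNATDataset(CacheDataset):
    def __init__(self, xnat_configuration, **kwargs):
        # Store the XNAT connection details for later fetching
        self.xnat_configuration = xnat_configuration
        # Remaining arguments (data, transform, cache_rate, ...) go to CacheDataset
        super().__init__(**kwargs)
```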

The dataset could include functions for validating data (e.g. checking that all subjects return appropriate data objects). It could then be used like this:

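```python
from torch.utils.data import DataLoader

from mlops.data import XNATDataset  # proposed class, does not exist yet

# Proposed usage; the argument names are placeholders and further options are omitted
training_data = XNATDataset(project_name, actions, xnat_configuration, transforms, workers)

test_data = XNATDataset(holdout_data_project_name, actions, xnat_configuration, transforms, workers)

train_dl = DataLoader(training_data)
```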
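The validation step mentioned above could be sketched roughly as follows. This is only an illustration, not an existing csc-mlops function: the name `find_invalid_samples`, the `required_keys` parameter, and the assumption that each sample is a dict with keys such as `image` and `label` are all hypothetical. It only relies on `len()` and indexing, so it should work with any map-style torch dataset.

```python
from typing import Any, Dict, Sequence


def find_invalid_samples(
    dataset: Sequence[Any],
    required_keys: Sequence[str],
) -> Dict[int, str]:
    """Return {index: reason} for every sample that fails a basic check.

    Works on any map-style dataset, i.e. anything with __len__ and __getitem__.
    """
    problems: Dict[int, str] = {}
    for index in range(len(dataset)):
        try:
            sample = dataset[index]
        except Exception as error:  # a failed fetch or transform counts as invalid
            problems[index] = f"failed to load: {error!r}"
            continue
        if not isinstance(sample, dict):
            problems[index] = f"expected dict, got {type(sample).__name__}"
            continue
        # A key that is absent or set to None is treated as missing
        missing = [key for key in required_keys if sample.get(key) is None]
        if missing:
            problems[index] = f"missing keys: {missing}"
    return problems


# Stand-in for an XNATDataset: a plain list also supports len() and indexing
fake_dataset = [
    {"image": "subject_01.nii.gz", "label": 0},
    {"image": "subject_02.nii.gz"},  # no label
    None,  # subject returned nothing
]
problems = find_invalid_samples(fake_dataset, required_keys=("image", "label"))
```

If this approach turns out to be useful, the same check could live on the dataset class as a method, so that a run can fail early when a subject returns incomplete data.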


This would require some exploratory work to check that it all looks good and works with PyTorch Lightning, MONAI, etc., but it would be really useful in simplifying the DataModule structure.



improvements to XNAT data fetching #153
