Adam J. Stewart edited this page Jun 9, 2021 · 15 revisions

Datasets

There are many different ways in which we can classify our datasets. This classification allows us to create abstract base classes to ensure a uniform API for all subclasses.

Benchmark vs. Generic

  1. Benchmark: contains both images and targets (e.g. COWC, VHR-10, CV4A Kenya)
  2. Generic: contains only images or targets (e.g. Landsat, Sentinel, CDL, Chesapeake)

The problem with this classification is that we want to be able to combine two "generic" datasets to get a single "benchmark" dataset. For example, we need a way for users to specify an image source (e.g. Landsat, Sentinel) and a target source (e.g. CDL, Chesapeake). It isn't yet clear how one would do this.

Image vs. Target

  1. Image: contains raw images
  2. Target: contains ground truth targets

This makes it easy to combine "image" and "target" datasets into a single supervised learning problem, but what about datasets that contain both images and targets? Do we want to allow users to swap image or target sources in these kinds of datasets?

Chip vs. Tile vs. Region

  1. Chip: pre-defined chips/patches (e.g. COWC, VHR-10, DOTA)
  2. Tile: possibly-overlapping tiles we need to sample chips/patches from (e.g. Landsat, Sentinel, CV4A Kenya)
  3. Region: static maps of stitched-together data (e.g. CDL, Chesapeake Land Cover, static Google Earth imagery)

Again, we need to be able to combine datasets from different categories into a single data loader.
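To make the tile-vs-chip distinction concrete, here is a minimal sketch of the sampling step a data loader would need for "tile" datasets. The function name and signature are hypothetical, not part of any existing API; it simply crops a random chip out of a larger tile tensor.

```python
import torch

def random_chip(tile: torch.Tensor, size: int) -> torch.Tensor:
    """Sample a random ``size x size`` chip from a (C, H, W) tile.

    Hypothetical helper illustrating tile -> chip sampling; a real
    implementation would also track geospatial bounds.
    """
    _, height, width = tile.shape
    y = int(torch.randint(0, height - size + 1, (1,)))
    x = int(torch.randint(0, width - size + 1, (1,)))
    return tile[:, y : y + size, x : x + size]

tile = torch.rand(3, 256, 256)  # stand-in for a Landsat/Sentinel tile
chip = random_chip(tile, 64)
print(chip.shape)  # torch.Size([3, 64, 64])
```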

Idea: what if we make our own DataLoader class that takes one or more Datasets? As long as we have a standard method for indexing these datasets, we can handle this. I've never seen a custom DataLoader before, but it should be possible to implement one as a subclass. Alternatively, PyTorch already has ConcatDataset; why not a horizontal concat instead of a vertical concat? That is, a Dataset that wraps around an ImageDataset and a TargetDataset.
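The horizontal-concat idea above can be sketched in a few lines. The class and variable names here are illustrative only, assuming both wrapped datasets share an index space; a real version would need to align samples geospatially rather than by integer index.

```python
import torch
from torch.utils.data import Dataset

class ZipDataset(Dataset):
    """Hypothetical 'horizontal concat': pairs the i-th sample of an
    image dataset with the i-th sample of a target dataset, turning two
    'generic' datasets into one 'benchmark'-style dataset."""

    def __init__(self, image_dataset, target_dataset):
        assert len(image_dataset) == len(target_dataset)
        self.image_dataset = image_dataset
        self.target_dataset = target_dataset

    def __len__(self):
        return len(self.image_dataset)

    def __getitem__(self, index):
        # Return an (image, target) pair, ready for supervised training
        return self.image_dataset[index], self.target_dataset[index]

# Any indexable collection works as a stand-in dataset here
images = torch.rand(4, 3, 32, 32)      # e.g. Landsat chips
targets = torch.randint(0, 10, (4,))   # e.g. CDL labels
dataset = ZipDataset(images, targets)
image, target = dataset[0]
print(image.shape)  # torch.Size([3, 32, 32])
```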

Transforms

Torchvision uses PIL, which isn't compatible with multi-spectral imagery. Although some of our imagery isn't multi-spectral, we don't want to have to implement the same transforms for every possible data structure. Instead, we should probably standardize on torch Tensors. This also has the benefit that transforms can be run on the GPU. Does this mean we need to use nn.Module? See https://discuss.pytorch.org/t/state-of-the-art-for-torchvision-datasets-transforms-models-design/123625 for discussion on this.
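As a sketch of what an nn.Module-based transform could look like, here is a per-band normalization written directly on Tensors. The class name and behavior are assumptions for illustration, not an existing API; the point is that it works for any number of spectral bands and moves to the GPU with `.to(device)`.

```python
import torch
from torch import nn

class Normalize(nn.Module):
    """Hypothetical Tensor-based transform written as an nn.Module, so
    it handles multi-spectral imagery and can run on the GPU."""

    def __init__(self, mean: torch.Tensor, std: torch.Tensor) -> None:
        super().__init__()
        # Register statistics as buffers so .to(device) moves them too
        self.register_buffer("mean", mean[:, None, None])
        self.register_buffer("std", std[:, None, None])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Broadcast per-band statistics over the (C, H, W) image
        return (x - self.mean) / self.std

# Works for any number of bands, e.g. 13 Sentinel-2 bands
transform = Normalize(torch.zeros(13), torch.ones(13))
image = torch.rand(13, 64, 64)
out = transform(image)
print(out.shape)  # torch.Size([13, 64, 64])
```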

Models

Radiant Earth and the Planetary Computer are planning to distribute pre-trained models; we should support these.
