Add Digital typhoon dataset #1748
This is really cool! I wonder if there is any generalization between this and the Cyclone dataset.
Stay tuned :)
)
)

# torchgeo expects a single label
We can fix this if needed. I'm trying to add support for multi-label classification in #2219
# tensor with added channel dimension
tensor = torch.from_numpy(h5f['Infrared'][:]).unsqueeze(0)

# follow normalization procedure
Wondering if this should be in the datamodule instead. Possibly also true for other stuff.
I think it has to be part of the dataset because the normalization values can change. In the datamodule it would become quite messy, and overall I think it's much nicer if normalization happens in the dataset and any augmentations happen in the datamodule.
Only complaint is that the user can't perform their own normalization, or if they do it is now duplicated. Definitely something worth discussing more broadly though (not just in your PR, for all datasets).
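One way to reconcile both positions might look like the sketch below: the dataset normalizes by default but exposes a flag so users can opt out and apply their own scaling. Everything here is hypothetical. The class `ToyTyphoonDataset`, the `normalize` flag, and the `min_value`/`max_value` bounds are illustrative assumptions, not the torchgeo API or the values used in this PR, and plain Python lists stand in for tensors.

```python
from typing import Dict, List, Sequence


class ToyTyphoonDataset:
    """Hypothetical map-style dataset; not the torchgeo implementation."""

    def __init__(
        self,
        images: Sequence[Sequence[float]],
        min_value: float = 170.0,  # assumed lower bound, for illustration only
        max_value: float = 300.0,  # assumed upper bound, for illustration only
        normalize: bool = True,  # hypothetical opt-out switch
    ) -> None:
        if max_value <= min_value:
            raise ValueError("max_value must be greater than min_value")
        self.images = images
        self.min_value = min_value
        self.max_value = max_value
        self.normalize = normalize

    def __len__(self) -> int:
        return len(self.images)

    def __getitem__(self, index: int) -> Dict[str, List[float]]:
        pixels = [float(p) for p in self.images[index]]
        if self.normalize:
            # Min-max scale to [0, 1], clipping values outside the bounds.
            span = self.max_value - self.min_value
            pixels = [
                min(max((p - self.min_value) / span, 0.0), 1.0) for p in pixels
            ]
        return {"image": pixels}


raw = [[170.0, 235.0, 300.0]]
default_ds = ToyTyphoonDataset(raw)  # normalized inside the dataset
raw_ds = ToyTyphoonDataset(raw, normalize=False)  # user normalizes elsewhere
```

With the flag off, the sample is returned unchanged, so a user-supplied transform would not duplicate the built-in scaling.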
name: torch.tensor(feature_df[name].item()).float()
for name in self.features
}
# normalize the targets for regression
Same here
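For context, target normalization for regression usually needs an inverse so predictions can be reported in original units, which is part of why its location matters. The sketch below is a minimal illustration under assumptions: the target names (`wind`, `pressure`) and the statistics in `TARGET_STATS` are made up for the example and are not taken from the PR.

```python
from typing import Dict

# Hypothetical per-target statistics; real values would be computed
# from the training split.
TARGET_STATS: Dict[str, Dict[str, float]] = {
    "wind": {"mean": 50.0, "std": 25.0},
    "pressure": {"mean": 980.0, "std": 20.0},
}


def normalize_target(name: str, value: float) -> float:
    """Standardize a regression target to zero mean and unit variance."""
    stats = TARGET_STATS[name]
    return (value - stats["mean"]) / stats["std"]


def denormalize_target(name: str, value: float) -> float:
    """Map a standardized value back to the original units."""
    stats = TARGET_STATS[name]
    return value * stats["std"] + stats["mean"]
```

If normalization stays in the dataset, the same statistics would also need to be reachable wherever predictions are converted back.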
Looks pretty good now. The only remaining comments worth addressing before we merge are:
1. list -> tuple
2. Dataset name
Item 2 in particular is important to avoid API changes in the future. All other comments can be addressed later.
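The review does not spell out why tuples are preferred. One common motivation, assumed here, is that a list used as a default argument is shared across calls and can be mutated by accident. A minimal sketch with hypothetical function and feature names:

```python
from typing import List, Sequence, Tuple


def features_with_list_default(
    extra: str, features: List[str] = ["wind"]
) -> List[str]:
    # The default list is created once and shared by every call.
    features.append(extra)
    return features


def features_with_tuple_default(
    extra: str, features: Sequence[str] = ("wind",)
) -> Tuple[str, ...]:
    # A tuple cannot be mutated in place, so each call builds a new value.
    return (*features, extra)


first = features_with_list_default("pressure")
second = features_with_list_default("lat")  # same list object as `first`
```

Here `first` and `second` refer to the same list, so the second call silently changes the result of the first, while the tuple variant returns an independent value each time.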
Inspired by this rename I |
Just need to get minimum tests passing now.
This PR adds the Digital Typhoon Dataset.
The implementation allows the following features:
TODO: