Image augmentation in dataset #207

Open
antimora opened this issue Mar 7, 2023 · 4 comments
Labels
feature The feature request

Comments

@antimora
Collaborator

antimora commented Mar 7, 2023

Feature description

Similar to PyTorch's vision utilities, it would be great if Burn provided an interface and a few useful image augmentation utilities, such as random image rotation.
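To make the request concrete, here is a minimal sketch of what such an interface could look like. It uses only the standard library, and every name in it (`GrayImage`, `Augmentation`, `XorShift64`, `RandomRotate90`, `rotate90`) is hypothetical and not part of Burn. As a simplification, the rotation is limited to random multiples of 90 degrees on a single-channel image rather than arbitrary angles.

```rust
/// Hypothetical single-channel image, row-major, `pixels.len() == width * height`.
#[derive(Debug, Clone, PartialEq)]
pub struct GrayImage {
    pub width: usize,
    pub height: usize,
    pub pixels: Vec<u8>,
}

/// Tiny xorshift PRNG so the sketch needs no external crates.
pub struct XorShift64(u64);

impl XorShift64 {
    pub fn new(seed: u64) -> Self {
        // A xorshift state must never be zero, so substitute a fixed constant.
        Self(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Hypothetical augmentation interface (an assumption, not Burn's API).
pub trait Augmentation {
    fn apply(&self, image: &GrayImage, rng: &mut XorShift64) -> GrayImage;
}

/// Rotates an image by 90 degrees clockwise.
pub fn rotate90(image: &GrayImage) -> GrayImage {
    let (w, h) = (image.width, image.height);
    let mut pixels = vec![0u8; w * h];
    for y in 0..h {
        for x in 0..w {
            // Source (x, y) lands at column h - 1 - y, row x.
            // The rotated image has width h, so its row stride is h.
            pixels[x * h + (h - 1 - y)] = image.pixels[y * w + x];
        }
    }
    GrayImage { width: h, height: w, pixels }
}

/// Applies a random number (0 to 3) of clockwise quarter turns.
pub struct RandomRotate90;

impl Augmentation for RandomRotate90 {
    fn apply(&self, image: &GrayImage, rng: &mut XorShift64) -> GrayImage {
        let turns = rng.next_u64() % 4;
        let mut out = image.clone();
        for _ in 0..turns {
            out = rotate90(&out);
        }
        out
    }
}
```

A dataset wrapper could then hold a list of `Box<dyn Augmentation>` values and apply them to each item as it is fetched. Passing the random source explicitly keeps augmentations reproducible under a fixed seed.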

Feature motivation

Instead of having users come up with their own utilities, it would benefit everyone if Burn's dataset supported data augmentation.

(Optional) Suggest a Solution

For image augmentation, we could use one of these image backends: 1) https://lib.rs/gh/imazen/imageflow/imageflow_core or 2) https://github.com/image-rs/imageproc

antimora added the feature label Apr 2, 2023
@laggui
Member

laggui commented Jan 15, 2024

Just leaving this here for future reference.

For some reason imageflow is not available on crates.io (ref issue).

I was just browsing the other possible crates and I found photon (extends image and imageproc crates). It already seems to implement different image transformations. Also supports WASM.

Also found zune-image with zune-imageprocs, which seems to have a focus on performance and support for different image types (u8, u16 and f32), as opposed to photon which at first glance seems to currently work on RGBA images only.

@ronofays

ronofays commented Feb 12, 2024

Hello, I have been working on a first-pass attempt at some vision utilities for Burn tensors.

I'm somewhat new to the open source space, so I am not sure whether what I have done so far warrants a PR, but you can find it in my fork of burn, under burn-vision: https://github.com/ronofays/burn/tree/burn-vision/burn-vision

Currently, I have three things done:

  • reading RGBA images as tensors (omitting the alpha channel for now)
  • converting tensors back to RGBA images
  • padding tensors with 3 color channels

read/write is in io.rs
padding is in transforms.rs
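For readers who want a feel for these three steps, here is a rough standalone sketch on plain `Vec<f32>` buffers. It is not the code from the fork and does not use Burn's tensor API. The function names (`rgba_to_chw`, `chw_to_rgba`, `pad_chw`), the planar `[3, H, W]` layout, and the scaling to the 0.0 to 1.0 range are all assumptions made for illustration.

```rust
/// Converts interleaved RGBA8 bytes into a planar [3, H, W] f32 buffer,
/// scaled to 0.0..=1.0. The alpha channel is dropped.
pub fn rgba_to_chw(rgba: &[u8], width: usize, height: usize) -> Vec<f32> {
    assert_eq!(rgba.len(), width * height * 4, "expected RGBA8 data");
    let plane = width * height;
    let mut out = vec![0.0f32; 3 * plane];
    for (i, px) in rgba.chunks_exact(4).enumerate() {
        for c in 0..3 {
            out[c * plane + i] = px[c] as f32 / 255.0;
        }
    }
    out
}

/// Converts a planar [3, H, W] f32 buffer back to interleaved RGBA8.
/// Alpha was not stored, so every pixel is written as fully opaque.
pub fn chw_to_rgba(chw: &[f32], width: usize, height: usize) -> Vec<u8> {
    let plane = width * height;
    assert_eq!(chw.len(), 3 * plane, "expected 3 planes of H * W values");
    let mut out = Vec::with_capacity(plane * 4);
    for i in 0..plane {
        for c in 0..3 {
            let v = chw[c * plane + i].clamp(0.0, 1.0) * 255.0;
            out.push(v.round() as u8);
        }
        out.push(255);
    }
    out
}

/// Pads each of the 3 channels with `pad` pixels of `value` on every side.
/// Returns the padded buffer together with its new width and height.
pub fn pad_chw(
    chw: &[f32],
    width: usize,
    height: usize,
    pad: usize,
    value: f32,
) -> (Vec<f32>, usize, usize) {
    assert_eq!(chw.len(), 3 * width * height, "expected 3 planes of H * W values");
    let (new_w, new_h) = (width + 2 * pad, height + 2 * pad);
    let mut out = vec![value; 3 * new_w * new_h];
    for c in 0..3 {
        for y in 0..height {
            // Copy one source row into the interior of the padded plane.
            let src = c * width * height + y * width;
            let dst = c * new_w * new_h + (y + pad) * new_w + pad;
            out[dst..dst + width].copy_from_slice(&chw[src..src + width]);
        }
    }
    (out, new_w, new_h)
}
```

Since alpha is discarded on read, a round trip through `rgba_to_chw` and `chw_to_rgba` preserves the color channels but resets alpha to 255.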

Obviously a lot needs to be done to handle images that are not u8 rgba images, but I would be thrilled to have some feedback on the start that I have here, particularly if there are some code style things for burn that I am missing.

Thanks!

@antimora
Collaborator Author

@ronofays Thanks for sharing your work! Yes, any addition in this space would be awesome.

I suggest you create a draft PR and submit it to the project. That makes it easier to see your additions, and we can comment in context.

@antimora
Collaborator Author

Linking related ticket: #2361

3 participants