Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[wip] support reading generic images #154

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions crates/kornia-image/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -10,5 +10,8 @@ pub mod error;
/// module containing ops implementations.
pub mod ops;

/// re-export the safe tensor types so that is easy to use them
pub use kornia_core::SafeTensorType;

pub use crate::error::ImageError;
pub use crate::image::{Image, ImageSize};
4 changes: 4 additions & 0 deletions crates/kornia-io/src/error.rs
Original file line number Diff line number Diff line change
Expand Up @@ -25,4 +25,8 @@ pub enum IoError {
/// Error to decode the image.
#[error("Failed to decode the image")]
ImageDecodeError(#[from] image::ImageError),

/// Error to decode the image.
#[error("Unsupported image format")]
UnsupportedImageFormat,
}
74 changes: 59 additions & 15 deletions crates/kornia-io/src/functional.rs
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
use std::path::Path;
use std::{ops::Deref, path::Path};

Check warning on line 1 in crates/kornia-io/src/functional.rs

View workflow job for this annotation

GitHub Actions / Check

unused import: `ops::Deref`

Check warning on line 1 in crates/kornia-io/src/functional.rs

View workflow job for this annotation

GitHub Actions / Test Suite - x86_64-unknown-linux-gnu

unused import: `ops::Deref`

Check warning on line 1 in crates/kornia-io/src/functional.rs

View workflow job for this annotation

GitHub Actions / Test Suite - i686-unknown-linux-gnu

unused import: `ops::Deref`

Check warning on line 1 in crates/kornia-io/src/functional.rs

View workflow job for this annotation

GitHub Actions / Test Suite - aarch64-unknown-linux-gnu

unused import: `ops::Deref`

use kornia_image::{Image, ImageSize};
use kornia_image::{Image, ImageSize, SafeTensorType};

use crate::error::IoError;

Expand Down Expand Up @@ -77,17 +77,45 @@
Ok(())
}

/// A generic image type that can be any of the supported image formats.
pub enum GenericImage {
/// 8-bit grayscale image
L8(Image<u8, 1>),
/// 8-bit grayscale image with alpha channel
La8(Image<u8, 2>),
/// 8-bit RGB image
Rgb8(Image<u8, 3>),
/// 8-bit RGB image with alpha channel
Rgba8(Image<u8, 4>),
/// 16-bit grayscale image
L16(Image<u16, 1>),
/// 16-bit grayscale image with alpha channel
La16(Image<u16, 2>),
/// 16-bit RGB image
Rgb16(Image<u16, 3>),
/// 16-bit RGB image with alpha channel
Rgba16(Image<u16, 4>),
/// 32-bit float RGB image
Rgb32F(Image<f32, 3>),
/// 32-bit float RGB image with alpha channel
Rgba32F(Image<f32, 4>),
}

// NOTE: another option is to use define types for each of the image formats and then implement
// type Mono8 = Image<u8, 1>;
// type Rgb8 = Image<u8, 3>;

/// Reads an image from the given file path.
///
/// The method tries to read from any image format supported by the image crate.
///
/// # Arguments
///
/// * `file_path` - The path to the image.
/// * `file_path` - The path to a valid image file.
///
/// # Returns
///
/// A tensor containing the image data.
/// An image containing the image data.
///
/// # Example
///
Expand All @@ -101,7 +129,13 @@
/// assert_eq!(image.size().height, 195);
/// assert_eq!(image.num_channels(), 3);
/// ```
pub fn read_image_any(file_path: impl AsRef<Path>) -> Result<Image<u8, 3>, IoError> {
pub fn read_image_any<T, const C: usize>(
file_path: impl AsRef<Path>,
) -> Result<GenericImage, IoError>
where
T: SafeTensorType,
{
// resolve the file path correctly
let file_path = file_path.as_ref().to_owned();

// verify the file exists
Expand All @@ -110,25 +144,32 @@
}

// open the file and map it to memory
// TODO: explore whether we can use a more efficient memory mapping approach
let file = std::fs::File::open(file_path)?;
let mmap = unsafe { memmap2::Mmap::map(&file)? };

// decode the data directly from memory
// TODO: update the image crate
// TODO: explore supporting directly the decoders
#[allow(deprecated)]
let img = image::io::Reader::new(std::io::Cursor::new(&mmap))
.with_guessed_format()?
.decode()?;

// TODO: handle more image formats
// return the image data
let image = Image::new(
ImageSize {
width: img.width() as usize,
height: img.height() as usize,
},
img.to_rgb8().to_vec(),
)?;
let size = ImageSize {
width: img.width() as usize,
height: img.height() as usize,
};

let image = match img.color() {
image::ColorType::L8 => {
GenericImage::L8(Image::<u8, 1>::new(size, img.into_luma8().into_vec())?)
}
image::ColorType::Rgb8 => {
GenericImage::Rgb8(Image::<u8, 3>::new(size, img.into_rgb8().into_vec())?)
}
_ => return Err(IoError::UnsupportedImageFormat),
};

Ok(image)
}
Expand All @@ -137,13 +178,16 @@
mod tests {
use crate::error::IoError;
use crate::functional::read_image_any;
use kornia_image::Image;

#[cfg(feature = "jpegturbo")]
use crate::functional::{read_image_jpeg, write_image_jpeg};

#[test]
fn read_any() -> Result<(), IoError> {
let image = read_image_any("../../tests/data/dog.jpeg")?;
// let image: Image<u8, 3> = read_image_any("../../tests/data/dog.jpeg")?;
let image: super::GenericImage = read_image_any("../../tests/data/dog.jpeg")?;
// NOTE: then how to access the size? we need to reimplment the methods ??
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@emilmgeorge @jandremarais do you have any ideas here how to solve this ? Initially I wanted somehow that the read function could return the Image struct so that we can use directly the underlying api

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about returning a Box<dyn AnyImage> with

trait AnyImage {
    fn size(&self) -> ImageSize;
}

impl<T: SafeTensorType, const C: usize> AnyImage for Image<T, C> {
    fn size(&self) -> ImageSize {
        self.size()
    }
}

Copy link
Member Author

@edgarriba edgarriba Sep 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's another option, however wouldn't be difficult to scale/maintain this as soon as we increase the functionality in Image ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it would be more effort than the current enum approach but I can give it more thought. What if the AnyImage trait owns all the image functionality? Then we don't have to implement it twice

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also think using traits to decouple the functionality may be a good direction, although I haven't fully thought it through.

If considering/implementing the above needs more time, another limited option for now, is to use the enum type only for returning and force the user to convert the enum to a concrete type before further operations. We can provide functions like into_rgb8() -> Result<Image<u8, 3>, XXError> etc. in impl GenericImage {}.

Based on use cases:

  • User who does not care about the pixel type at all
    Not easily possible with this limited solution. User has to either handle all image types with match (maybe macros?) or convert to one specific type (as below).

    Trait based solutions may work for this use case.

  • User who cares only about the final image type

    let gimg: super::GenericImage = read_image_any("../../tests/data/dog.jpeg")?;
    let gray = gimg.into_l8()?;
    // Any operation for Image<u8, 1>
  • User who wants to handle only specific type images

    let gimg: super::GenericImage = read_image_any("../../tests/data/dog.jpeg")?;
    if let GenericImage::Rgb8(image) = gimg {
       // Any operation for Image<u8, 3>
    }

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that sounds a good option. Also in this direction, i want start getting rid of image-rs and implement directly the decoders we want, which can lead to have a pool of free functions to read from different formats in the format of read_rgb8 , read_l8, etc. Possibly by curating first png, jpeg, exrand maybetiff` which are the most common formats used in ml/robotics

assert_eq!(image.size().width, 258);
assert_eq!(image.size().height, 195);
Ok(())
Expand Down
Loading