feat(iba): IBA::perpixel_op #4299

lgritz · 2024-06-17T20:37:27Z

Inspired by a question by Vlad Erium, I have added a simpler way for C++ users of OIIO to construct IBA-like functions for simple unary and binary operations on ImageBufs where each pixel is independent and based only on the corresponding pixel of the input(s).

The user only needs to supply the contents of the inner loop, i.e. just doing one pixel's work, and only needs to work for float values. All format conversion, sizing and allocation of the destination buffer, looping over pixels, and multithreading is automatic.

If the actual buffers in question are not float-based, conversions will happen automatically, at about a 2x slowdown compared to everything being in float all along, which seems reasonable for the extreme simplicity, especially for use cases where the buffers are fairly likely to be float anyway.

What you pass is a function or lambda that takes spans for the output and input pixel values. Here's an example that adds two images channel by channel, producing a sum image:

// Assume ImageBuf A, B are the inputs, ImageBuf R is the output
R = ImageBufAlgo::perpixel_op(A, B,
        [](span<float> r, cspan<float> a, cspan<float> b) {
            for (size_t c = 0, nc = size_t(r.size()); c < nc; ++c)
                r[c] = a[c] + b[c];
            return true;
        });

This is exactly equivalent to calling

R = ImageBufAlgo::add(A, B);

and for float IB's, it's just as fast.

To make the not-float case fast and not require the DISPATCH macro magic, I needed to change the ImageBuf::Iterator just a bit to add store() and load() method templates to the iterators, and add a field that holds the buffer type. That might make a slight ABI tweak, so I am thinking that I will make this for the upcoming OIIO 3.0, and not backport to the release branch.

I think this is ready to introduce at this time, but I'm also studying whether more varieties of this approach are needed, whether the non-float case can be sped up even more, and whether some of the existing IBA functions should switch to using this internally (good candidates would be those that are almost always performed on float buffers, but for which the heavy template expansion of the DISPATCH approach to handling the full type zoo currently makes them very bloated and expensive to compile, for very little real-world gain).

We should probably consider this to be experimental for a little while, just in case the function signature for this changes as I think about it more or add functionality.

lgritz · 2024-06-17T20:38:30Z

I haven't added docs yet, I will do so by the time this is merged. I'm still struggling with exactly where to explain it.

Inspired by a question by Vlad Erium, I have added a simpler way for C++ users of OIIO to construct IBA-like functions for simple unary and binary operations on ImageBufs where each pixel is independent and based only on the corresponding pixel of the input(s). The user only needs to supply the contents of the inner loop, i.e. just doing one pixel's work, and only needs to work for float values. All format conversion, sizing and allocation of the destination buffer, looping over pixels, and multithreading is automatic. If the actual buffers in question are not float-based, conversions will happen automatically, at about a 2x slowdown compared to everything being in float all along, which seems reasonable for the extreme simplicity, especially for use cases where the buffers are fairly likely to be float anyway. What you pass is a function or lambda that takes spans for the output and input pixel values. Here's an example that adds two images channel by channel, producing a sum image: // Assume ImageBuf A, B are the inputs, ImageBuf R is the output R = ImageBufAlgo::perpixel_op(A, B, [](span<float> r, cspan<float> a, cspan<float> b) { for (size_t c = 0, nc = size_t(r.size()); c < nc; ++c) r[c] = a[c] + b[c]; return true; }); This is exactly equivalent to calling R = ImageBufAlgo::add(A, B); and for float IB's, it's just as fast. To make the not-float case fast and not require the DISPATCH macro magic, I needed to change the ImageBuf::Iterator just a bit to add store() and load() method templates to the iterators, and add a field that holds the buffer type. That might make a slight ABI tweak, so I am thinking that I will make this for the upcoming OIIO 3.0, and not backport to the release branch. I think this is ready to introduce at this time, but I'm also studying whether more varieties of this approach are needed, whether the non-float case can be sped up even more, and whether some of the existing IBA functions should switch to using this internally (good candidates would be those that are almost always performed on float buffers, but for which the heavy template expansion of the DISPATCH approach to handling the full type zoo currently makes them very bloated and expensive to compile, for very little real-world gain). We should probably consider this to be experimental for a little while, just in case the function signature for this changes as I think about it more or add functionality. Signed-off-by: Larry Gritz <[email protected]>

Signed-off-by: Larry Gritz <[email protected]>

lgritz · 2024-06-21T15:40:01Z

I just pushed an update in which I do a bit of an overhaul of this: I made the perpixel_op functions take a KWArgs options. This subsumes the prepflags and nthreads arguments and leaves room for lots of expansion going forward without needing to change the function call ABI.

I also added new version of IBAPrep (I am seeking a better name, BTW, ideas appreciated) that takes a KWArgs rather than an integer full of bit flags. Again, I think that leaves more room for expansion in the future, including options that aren't just boolean bits.

lgritz added 3 commits June 20, 2024 22:35

paramlist.h: Add ParamValueSpan::get_bool()

b5eea18

Signed-off-by: Larry Gritz <[email protected]>

TRY: alternate IBAPrep and perpixel_op using KWArgs approach

f74f0ac

Signed-off-by: Larry Gritz <[email protected]>

lgritz force-pushed the lg-ibaperpixel branch from d943a6c to f74f0ac Compare June 21, 2024 15:35

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(iba): IBA::perpixel_op #4299

feat(iba): IBA::perpixel_op #4299

lgritz commented Jun 17, 2024

lgritz commented Jun 17, 2024

lgritz commented Jun 21, 2024

feat(iba): IBA::perpixel_op #4299

Are you sure you want to change the base?

feat(iba): IBA::perpixel_op #4299

Conversation

lgritz commented Jun 17, 2024

lgritz commented Jun 17, 2024

lgritz commented Jun 21, 2024