Inconsistent behaviour of ImageRecordIter depending on encoding format png vs jpeg #15710

anotinelg · 2019-07-31T13:53:51Z

anotinelg
Jul 31, 2019

Description

I was checking the behaviour of ImageRecordIter on a .rec file I created myself.
The way ImageRecordIter handles the order of the color channels seems to depend on how the .rec file was created.

When I pack the images in PNG format, pack_img falls back on the OpenCV encoding function and thus inherits its default channel order, which is BGR. However, if I pack the images in the "jpeg" format, the RGB order is used. Is this inconsistency expected? Can we pass a flag so that it always returns the RGB order?

Environment info (Required)

MXNet = 1.4.0
opencv-python==4.1.0.25

Reproducible step

Create the rec file with png format

import mxnet as mx
import numpy as np

shape = (5, 5, 3)  # (H, W, C)
label = 1

record = mx.recordio.MXRecordIO("temp_png.rec", 'w')
header = mx.recordio.IRHeader(0, label, 1, 0)
img = np.ones(shape) * 255
img[0][0][0] = 0
img[0][0][1] = 1
img[0][0][2] = 2
packed_s = mx.recordio.pack_img(header, img, quality=9, img_fmt=".png")
record.write(packed_s)
record.close()

Create the rec file with jpeg format

record = mx.recordio.MXRecordIO("temp_jpeg.rec", 'w')
header = mx.recordio.IRHeader(0, label, 1, 0)
img = np.ones((5, 5, 3)) * 255
img[0][0][0] = 0
img[0][0][1] = 1
img[0][0][2] = 2
packed_s = mx.recordio.pack_img(header, img, quality=100, img_fmt=".jpeg") # default
record.write(packed_s)
record.close()

iter = mx.io.ImageRecordIter(path_imgrec="temp_png.rec", data_shape=(3, 5, 5), batch_size=1)
batch = iter.next()
# check that the first channel for first pixel is 2
# check that the second channel for first pixel is 1
# check that the third channel for first pixel is 0

iter = mx.io.ImageRecordIter(path_imgrec="temp_jpeg.rec", data_shape=(3, 5, 5), batch_size=1)
batch = iter.next()
# check that the first channel for first pixel is 0
# check that the second channel for first pixel is 1
# check that the third channel for first pixel is 2

vrakesh · 2019-07-31T21:30:43Z

vrakesh
Jul 31, 2019

@anotinelg Thank you for your query; a member of the community will get back to you on this.
@mxnet-label-bot add [Question]

0 replies

kshitij12345 · 2019-08-01T18:29:59Z

kshitij12345
Aug 1, 2019

Hi,

I have tried the code with a slight refactor.

import mxnet as mx
import numpy as np
shape = (3, 3, 3)

label = 1

record = mx.recordio.MXRecordIO("temp_png.rec", 'w')
header = mx.recordio.IRHeader(0, label, 1, 0)
img = np.ones(shape) * 255
img[0][0][0] = 0
img[0][0][1] = 1
img[0][0][2] = 2
packed_s = mx.recordio.pack_img(header, img, quality=9, img_fmt=".png")
record.write(packed_s)
record.close()

iter = mx.io.ImageRecordIter(path_imgrec="temp_png.rec", data_shape=shape, batch_size=1)
batch = iter.next()
data = batch.data[0]
print(data.shape)
print(mx.ndarray.transpose(data, (0, 2, 3, 1))) 

record = mx.recordio.MXRecordIO("temp_jpeg.rec", 'w')
header = mx.recordio.IRHeader(0, label, 1, 0)
img = np.ones(shape) * 255
img[0][0][0] = 0
img[0][0][1] = 1
img[0][0][2] = 2
packed_s = mx.recordio.pack_img(header, img, quality=100, img_fmt=".jpeg") # default
record.write(packed_s)
record.close()

iter = mx.io.ImageRecordIter(path_imgrec="temp_jpeg.rec", data_shape=shape, batch_size=1)
batch = iter.next()
data = batch.data[0]
print(data.shape)
print(mx.ndarray.transpose(data, (0, 2, 3, 1)))

Output that I get is

(1, 3, 3, 3)

[[[[  2.   1.   0.]
   [255. 255. 255.]
   [255. 255. 255.]]

  [[255. 255. 255.]
   [255. 255. 255.]
   [255. 255. 255.]]

  [[255. 255. 255.]
   [255. 255. 255.]
   [255. 255. 255.]]]]
<NDArray 1x3x3x3 @cpu(0)>
[23:48:32] ../src/io/iter_image_recordio_2.cc:178: ImageRecordIOParser2: temp_jpeg.rec, use 4 threads for decoding..
(1, 3, 3, 3)

[[[[  0.   0.   0.]
   [255. 255. 255.]
   [255. 255. 255.]]

  [[255. 255. 255.]
   [255. 255. 255.]
   [255. 255. 255.]]

  [[255. 255. 255.]
   [255. 255. 255.]
   [255. 255. 255.]]]]
<NDArray 1x3x3x3 @cpu(0)>

Note that the zeros are probably due to JPEG being a lossy format.
To verify that:

import numpy as np
import cv2
a = np.ones((3,3,3))
a *= 255
a[0][0][2] = 2
a[0][0][1] = 1
a[0][0][0] = 0
print(a)
print('-------------------')
ret, buf = cv2.imencode('.JPG', a, [cv2.IMWRITE_JPEG_QUALITY, 100])
print(cv2.imdecode(buf, cv2.IMREAD_UNCHANGED))

which outputs

[[[  0.   1.   2.]
  [255. 255. 255.]
  [255. 255. 255.]]

 [[255. 255. 255.]
  [255. 255. 255.]
  [255. 255. 255.]]

 [[255. 255. 255.]
  [255. 255. 255.]
  [255. 255. 255.]]]
-------------------
[[[  0   0   0]
  [255 255 255]
  [255 255 255]]

 [[255 255 255]
  [255 255 255]
  [255 255 255]]

 [[255 255 255]
  [255 255 255]
  [255 255 255]]]

So I don't think there is a problem with the channel order.

Also, a bonus for reading this far:
The actual pack_img code that does the encoding doesn't touch the channel order.
https://github.com/apache/incubator-mxnet/blob/42a47b1cbd53026b7c69c915d17d507f6bc512d6/python/mxnet/recordio.py#L470-L509

0 replies

anotinelg · 2019-08-02T10:02:53Z

anotinelg
Aug 2, 2019
Author

Thanks a lot @kshitij12345 for your reply.

First of all, I apologise for the code I put in my message, which was not even runnable (I have checked since)!
Second, I am a little bit confused by the results:

I think my mistake was to believe there was a difference in behaviour between PNG and JPEG. I have re-run my code, and now I do not find any difference (except that the lossy JPEG format changes the values. NOTA: if we use (0,100,200) instead of [0,1,2] we can easily guess where the channels are).
But I still clearly see an inversion of the channel order: it looks like ImageRecordIter reorders the channels.
In your results, your input image is:

[[[  0.   1.   2.]
  [255. 255. 255.]
  [255. 255. 255.]]

 [[255. 255. 255.]
  [255. 255. 255.]
  [255. 255. 255.]]

 [[255. 255. 255.]
  [255. 255. 255.]
  [255. 255. 255.]]]

and your output image after transposing to (H, W, C) is:

[[[[  2.   1.   0.]
   [255. 255. 255.]
   [255. 255. 255.]]

  [[255. 255. 255.]
   [255. 255. 255.]
   [255. 255. 255.]]

  [[255. 255. 255.]
   [255. 255. 255.]
   [255. 255. 255.]]]]

Am I missing something?
Thanks

0 replies

kshitij12345 · 2019-08-02T17:42:26Z

kshitij12345
Aug 2, 2019

Hi,

NOTA: if we use (0,100,200) instead of [0,1,2] we can easily guess where the channels are)

Thanks. Nice Idea.

ImageRecordIter gives you data with channels first, as mentioned in the docs.
https://mxnet.incubator.apache.org/versions/master/api/python/io/io.html#mxnet.io.ImageRecordIter

You are also correct that the channels are flipped, i.e. if the original is BGR, then after loading it is RGB. This may be due to the use of OpenCV (not really sure).
So there are indeed two transformations: the move to channels first, and the flipping of the channels.

Note: the behaviour is the same for JPEG as for PNG.
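
The two transformations can be illustrated without MXNet. The following is a minimal NumPy sketch, not the iterator's actual code: it assumes the packed array is in OpenCV's (H, W, C) layout with B, G, R channels, and the names hwc_bgr, nchw_rgb and recovered are made up for the illustration.

```python
import numpy as np

# Hypothetical 2x2 image in OpenCV layout: (H, W, C) with channels ordered B, G, R.
hwc_bgr = np.zeros((2, 2, 3), dtype=np.float32)
hwc_bgr[:, :, 0] = 0    # B
hwc_bgr[:, :, 1] = 100  # G
hwc_bgr[:, :, 2] = 200  # R

# What the iterator appears to return: a batch in (N, C, H, W) with channels R, G, B.
nchw_rgb = hwc_bgr[:, :, ::-1].transpose(2, 0, 1)[np.newaxis]

# Undo both steps: channels last again, then reverse the channel axis.
recovered = nchw_rgb.transpose(0, 2, 3, 1)[:, :, :, ::-1]

print(nchw_rgb.shape)      # (1, 3, 2, 2)
print(recovered[0, 0, 0])  # [  0. 100. 200.]
```

Note that going back to channels last uses the axis order (0, 2, 3, 1); the order (0, 3, 2, 1) would also swap height and width, which goes unnoticed only when every pixel has the same value.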

import mxnet as mx
import numpy as np

shape = (3, 3, 3)
label = 1

record = mx.recordio.MXRecordIO("temp_png.rec", 'w')
header = mx.recordio.IRHeader(0, label, 1, 0)
img = np.ones(shape) * 255
img[:, :, 0] = 0
img[:, :, 1] = 100
img[:, :, 2] = 200
print('------Original---------------')
print(img)
packed_s = mx.recordio.pack_img(header, img, quality=9, img_fmt=".png")
record.write(packed_s)
record.close()

iter = mx.io.ImageRecordIter(path_imgrec="temp_png.rec", data_shape=shape, batch_size=1)
batch = iter.next()
data = batch.data[0]
print('--------PNG---------------')
print(data.shape)
recover_data = mx.ndarray.transpose(data, (0, 2, 3, 1)) # From channels first to channels last.
print(recover_data[:,:,:,::-1]) # Reverse the channels.
# Same as original

record = mx.recordio.MXRecordIO("temp_jpeg.rec", 'w')
header = mx.recordio.IRHeader(0, label, 1, 0)
img = np.ones(shape) * 255
img[:, :, 0] = 0
img[:, :, 1] = 100
img[:, :, 2] = 200
packed_s = mx.recordio.pack_img(header, img, quality=100, img_fmt=".jpeg") # default
record.write(packed_s)
record.close()

iter = mx.io.ImageRecordIter(path_imgrec="temp_jpeg.rec", data_shape=shape, batch_size=1)
batch = iter.next()
data = batch.data[0]
print('-------JPEG---------------')
print(data.shape)
recover_data = mx.ndarray.transpose(data, (0, 2, 3, 1))  # From channels first to channels last.
print(recover_data[:,:,:,::-1]) # Reverse the channels.
#  Same as original

Output

------Original---------------
[[[  0. 100. 200.]
  [  0. 100. 200.]
  [  0. 100. 200.]]

 [[  0. 100. 200.]
  [  0. 100. 200.]
  [  0. 100. 200.]]

 [[  0. 100. 200.]
  [  0. 100. 200.]
  [  0. 100. 200.]]]

--------PNG---------------
(1, 3, 3, 3)

[[[[  0. 100. 200.]
   [  0. 100. 200.]
   [  0. 100. 200.]]

  [[  0. 100. 200.]
   [  0. 100. 200.]
   [  0. 100. 200.]]

  [[  0. 100. 200.]
   [  0. 100. 200.]
   [  0. 100. 200.]]]]
<NDArray 1x3x3x3 @cpu(0)>

-------JPEG---------------
(1, 3, 3, 3)

[[[[  0. 100. 199.]
   [  0. 100. 199.]
   [  0. 100. 199.]]

  [[  0. 100. 199.]
   [  0. 100. 199.]
   [  0. 100. 199.]]

  [[  0. 100. 199.]
   [  0. 100. 199.]
   [  0. 100. 199.]]]]
<NDArray 1x3x3x3 @cpu(0)>

0 replies

anotinelg · 2019-08-05T08:52:05Z

anotinelg
Aug 5, 2019
Author

Thanks for the confirmation. But then I have a concern about the color normalisation applied by ImageRecordIter. It seems that it does not apply the mean_[CHANNEL] and std_[CHANNEL] coefficients to the correct channel:

        import mxnet as mx
        import numpy as np

        # Create rec with png packing
        shape = (3, 3, 3)
        label = 1
        record = mx.recordio.MXRecordIO("temp_png.rec", 'w')
        header = mx.recordio.IRHeader(0, label, 1, 0)
        img = np.ones(shape) * 255
        img[0][0][0] = 0
        img[0][0][1] = 100
        img[0][0][2] = 200
        print(img)
        packed_s = mx.recordio.pack_img(header, img, quality=9, img_fmt=".png")
        record.write(packed_s)
        record.close()

        # test the imageRecordIter, with color normalisation only on BLUE Channel
        iter = mx.io.ImageRecordIter(path_imgrec="temp_png.rec", data_shape=shape, batch_size=1,
                                     mean_r=0,
                                     mean_g=0,
                                     mean_b=50,
                                     std_r=1,
                                     std_g=1,
                                     std_b=150
                                     )
        batch = iter.next()
        data = batch.data[0]
        print(data.shape)
        print(data)
        data = mx.ndarray.transpose(data, (0, 2, 3, 1))
        # ImageRecordIter returns the channels in the reverse order of the packed array
        img_after = data.asnumpy()[0]
        print(img_after[0][0])
        # Expected if mean_b/std_b were applied to the value 200: (200 - 50) / 150 = 1
        assert img_after[0][0][0] == 1
        assert img_after[0][0][1] == 100
        assert img_after[0][0][2] == 0

The output is:

[[[  0. 100. 200.]
  [255. 255. 255.]
  [255. 255. 255.]]

 [[255. 255. 255.]
  [255. 255. 255.]
  [255. 255. 255.]]

 [[255. 255. 255.]
  [255. 255. 255.]
  [255. 255. 255.]]]
[10:46:42] src/io/iter_image_recordio_2.cc:170: ImageRecordIOParser2: temp_png.rec, use 1 threads for decoding..
(1, 3, 3, 3)

[[[[200.         255.         255.        ]
   [255.         255.         255.        ]
   [255.         255.         255.        ]]

  [[100.         255.         255.        ]
   [255.         255.         255.        ]
   [255.         255.         255.        ]]

  [[ -0.33333334   1.3666667    1.3666667 ]
   [  1.3666667    1.3666667    1.3666667 ]
   [  1.3666667    1.3666667    1.3666667 ]]]]
<NDArray 1x3x3x3 @cpu_pinned(0)>
[200.         100.          -0.33333334]

It seems that mean_b and std_b are applied to the red channel (the value 0 stored at index 0) instead of the blue channel (the value 200 stored at index 2), as specified in the call to ImageRecordIter.

Is it a bug?
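
The arithmetic behind that observation can be checked by hand. The sketch below is plain Python with no MXNet call; it only assumes the usual (value - mean) / std normalisation, and the names stored, candidates and matched_index are illustrative.

```python
# Values stored at pixel (0, 0) of the packed array, keyed by index in the last axis.
stored = {0: 0.0, 1: 100.0, 2: 200.0}
mean_b, std_b = 50.0, 150.0


def normalise(value, mean, std):
    """Standard per-channel normalisation."""
    return (value - mean) / std


# Apply the blue-channel coefficients to each stored value in turn.
candidates = {index: normalise(value, mean_b, std_b) for index, value in stored.items()}
print(candidates)

# Index of the stored value whose normalised result matches the observed -0.33333334.
observed = -0.33333334
matched_index = min(candidates, key=lambda i: abs(candidates[i] - observed))
print(matched_index)  # 0
```

Only the value at index 0 yields -0.333, so the blue-channel coefficients were applied to index 0 of the packed array. That is where OpenCV's B, G, R convention places blue.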

0 replies

anotinelg · 2019-08-06T07:55:03Z

anotinelg
Aug 6, 2019
Author

What I think is that ImageRecordIter is meant to be used with .rec files in which the images are saved in BGR format, and that it outputs data in RGB, as a Gluon network expects. (By the way, can you confirm that assumption about Gluon?)

Actually, the im2rec.py script shipped with MXNet reads the image with OpenCV (cv2.imread) if the "pass-through" parameter is true, and packs it into the .rec file. So in that case the .rec file stores the data in BGR (the OpenCV format), and ImageRecordIter applies the color normalisation to the correct channel and flips the channels back so that the output image is in RGB.

What seems wrong is that im2rec.py does not always flip the channels, for example when the "pass-through" parameter is not set. Also, the documentation of ImageRecordIter does not mention BGR vs RGB at all.

1 reply

atibaup Oct 29, 2020

It's a funny coincidence @anotinelg that right now I was struggling with exactly the same issue and I just stumbled upon this issue :D .

kshitij12345 · 2019-08-06T18:53:22Z

kshitij12345
Aug 6, 2019

Sorry, I haven't been able to go through im2rec.py.

But I don't think it should be an issue.
Although OpenCV itself holds the image in BGR order in the array, the encoded image is standard.

packed_s = mx.recordio.pack_img(header, img, quality=9, img_fmt=".png")
# packed_s is not an array but the encoded representation of the image, which is standard.
record.write(packed_s)

So if I am correct, it is written in the standard way, and when it is read back, even if it is restored as RGB, the channel mapping is correct.
We can verify that by slightly updating the previous script:

import mxnet as mx
import numpy as np

shape = (3, 3, 3)
label = 1

record = mx.recordio.MXRecordIO("temp_png.rec", 'w')
header = mx.recordio.IRHeader(0, label, 1, 0)
img = np.ones(shape) * 255
img[:, :, 0] = 0 # B
img[:, :, 1] = 100 # G
img[:, :, 2] = 200 # R
print('------Original---------------')
print(img)
packed_s = mx.recordio.pack_img(header, img, quality=9, img_fmt=".png")
record.write(packed_s)
record.close()

iter = mx.io.ImageRecordIter(path_imgrec="temp_png.rec", data_shape=shape, batch_size=1, mean_r = 50)
batch = iter.next()
data = batch.data[0]
print('--------PNG---------------')
print(data.shape)
recover_data = mx.ndarray.transpose(data, (0, 2, 3, 1)) # From channels first to channels last.
print(recover_data[:,:,:,::-1]) # Reverse the channels.
# Same as original

Output

------Original---------------
[[[  0. 100. 200.]
  [  0. 100. 200.]
  [  0. 100. 200.]]

 [[  0. 100. 200.]
  [  0. 100. 200.]
  [  0. 100. 200.]]

 [[  0. 100. 200.]
  [  0. 100. 200.]
  [  0. 100. 200.]]]

--------PNG---------------
(1, 3, 3, 3)

[[[[  0. 100. 150.]
   [  0. 100. 150.]
   [  0. 100. 150.]]

  [[  0. 100. 150.]
   [  0. 100. 150.]
   [  0. 100. 150.]]

  [[  0. 100. 150.]

Notice that mean has been correctly subtracted from the R-channel.
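
One way to read this result is as a per-pixel model. The sketch below is a hypothetical model that matches the outputs posted in this thread, not MXNet's implementation; the function iterator_pixel and its parameters are made up for the illustration.

```python
def iterator_pixel(packed, mean=(0.0, 0.0, 0.0), std=(1.0, 1.0, 1.0)):
    """Hypothetical model: read a packed pixel as (B, G, R), return normalised (R, G, B).

    mean and std are given in (r, g, b) order, like mean_r/mean_g/mean_b.
    """
    b, g, r = packed
    rgb = (r, g, b)
    return tuple((v - m) / s for v, m, s in zip(rgb, mean, std))


packed = (0.0, 100.0, 200.0)

# mean_r=50: the value 200 (index 2 of the packed pixel) is shifted to 150.
out = iterator_pixel(packed, mean=(50.0, 0.0, 0.0))
print(out[::-1])  # (0.0, 100.0, 150.0)
```

Under the same model, mean=(0.0, 0.0, 50.0) with std=(1.0, 1.0, 150.0) gives roughly (200, 100, -0.333), which matches the output reported earlier for mean_b=50 and std_b=150.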

0 replies

anotinelg · 2019-08-06T21:32:04Z

anotinelg
Aug 6, 2019
Author

I agree with you, but this works only if you pack the image into the .rec file in BGR format (as you did in the example):

img[:, :, 0] = 0 # B
img[:, :, 1] = 100 # G
img[:, :, 2] = 200 # R

It does NOT work if you pack the image into the .rec file in RGB format: the normalisation will be applied to the wrong channel.

So I think this should be documented in the ImageRecordIter class.
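
Until that is documented, a possible workaround is to reverse the channel axis before packing. This is a minimal NumPy sketch under the assumption, discussed in this thread, that pack_img passes the array to the OpenCV encoder unchanged; the arrays rgb and bgr are illustrative.

```python
import numpy as np

# Hypothetical 2x2 image held in RGB order, e.g. as loaded by a non-OpenCV library.
rgb = np.zeros((2, 2, 3), dtype=np.uint8)
rgb[:, :, 0] = 200  # R
rgb[:, :, 1] = 100  # G
rgb[:, :, 2] = 0    # B

# Reverse the last axis so the array is in the B, G, R order that OpenCV expects.
# ascontiguousarray makes a real copy rather than a negative-stride view.
bgr = np.ascontiguousarray(rgb[:, :, ::-1])

print(bgr[0, 0])  # [  0 100 200]
# bgr is what would then be handed to mx.recordio.pack_img(header, bgr, ...).
```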

0 replies

kshitij12345 · 2019-08-07T09:13:28Z

kshitij12345
Aug 7, 2019

What you are saying makes sense, as pack_img calls cv2.imencode, which expects BGR format.

So I think this should be documented in the ImageRecordIter class

Agreed.

I'll ping one of the code owners just in case we are missing something.
@larroy Could you please help?

0 replies

ugurkanates · 2020-11-26T21:01:23Z

ugurkanates
Nov 26, 2020

I had the same issue, sadly. I think this is fairly major and should be fixed, no?

0 replies
