-
@anotinelg Thank you for your query; a member of the community will get back to you on this.
-
Hi, I have tried the code with a slight refactor.
import mxnet as mx
import numpy as np
shape = (3, 3, 3)
label = 1
record = mx.recordio.MXRecordIO("temp_png.rec", 'w')
header = mx.recordio.IRHeader(0, label, 1, 0)
img = np.ones(shape) * 255
img[0][0][0] = 0
img[0][0][1] = 1
img[0][0][2] = 2
packed_s = mx.recordio.pack_img(header, img, quality=9, img_fmt=".png")
record.write(packed_s)
record.close()
iter = mx.io.ImageRecordIter(path_imgrec="temp_png.rec", data_shape=shape, batch_size=1)
batch = iter.next()
data = batch.data[0]
print(data.shape)
print(mx.ndarray.transpose(data, (0, 2, 3, 1)))
record = mx.recordio.MXRecordIO("temp_jpeg.rec", 'w')
header = mx.recordio.IRHeader(0, label, 1, 0)
img = np.ones(shape) * 255
img[0][0][0] = 0
img[0][0][1] = 1
img[0][0][2] = 2
packed_s = mx.recordio.pack_img(header, img, quality=100, img_fmt=".jpeg") # default
record.write(packed_s)
record.close()
iter = mx.io.ImageRecordIter(path_imgrec="temp_jpeg.rec", data_shape=shape, batch_size=1)
batch = iter.next()
data = batch.data[0]
print(data.shape)
print(mx.ndarray.transpose(data, (0, 2, 3, 1)))
Output that I get is
Note that the zeros are probably due to the
import numpy as np
import cv2
a = np.ones((3,3,3))
a *= 255
a[0][0][2] = 2
a[0][0][1] = 1
a[0][0][0] = 0
print(a)
print('-------------------')
ret, buf = cv2.imencode('.JPG', a, [cv2.IMWRITE_JPEG_QUALITY, 100])
print(cv2.imdecode(buf, cv2.IMREAD_UNCHANGED))
which outputs
So I don't think there is a problem with the order. Also, bonus for reading till here.
-
Thanks a lot @kshitij12345 for your reply. First of all, I apologise for the code I put in my message, which was not even runnable (I have checked since)!
and your output image after transposing to (H, W, C) is
Am I missing something?
-
Hi,
Thanks. Nice idea. ImageRecordIter gives you data with channels first, as mentioned in the docs. You are also correct that the channels are flipped, i.e. if the original is BGR, then after loading it is RGB. This may be due to the use of OpenCV (not really sure). Note: the behaviour is the same for JPEG as well as PNG.
import mxnet as mx
import numpy as np
shape = (3, 3, 3)
label = 1
record = mx.recordio.MXRecordIO("temp_png.rec", 'w')
header = mx.recordio.IRHeader(0, label, 1, 0)
img = np.ones(shape) * 255
img[:, :, 0] = 0
img[:, :, 1] = 100
img[:, :, 2] = 200
print('------Original---------------')
print(img)
packed_s = mx.recordio.pack_img(header, img, quality=9, img_fmt=".png")
record.write(packed_s)
record.close()
iter = mx.io.ImageRecordIter(path_imgrec="temp_png.rec", data_shape=shape, batch_size=1)
batch = iter.next()
data = batch.data[0]
print('--------PNG---------------')
print(data.shape)
recover_data = mx.ndarray.transpose(data, (0, 2, 3, 1)) # From channels first (N, C, H, W) to channels last (N, H, W, C).
print(recover_data[:,:,:,::-1]) # Reverse the channels.
# Same as original
record = mx.recordio.MXRecordIO("temp_jpeg.rec", 'w')
header = mx.recordio.IRHeader(0, label, 1, 0)
img = np.ones(shape) * 255
img[:, :, 0] = 0
img[:, :, 1] = 100
img[:, :, 2] = 200
packed_s = mx.recordio.pack_img(header, img, quality=100, img_fmt=".jpeg") # default
record.write(packed_s)
record.close()
iter = mx.io.ImageRecordIter(path_imgrec="temp_jpeg.rec", data_shape=shape, batch_size=1)
batch = iter.next()
data = batch.data[0]
print('-------JPEG---------------')
print(data.shape)
recover_data = mx.ndarray.transpose(data, (0, 2, 3, 1)) # From channels first (N, C, H, W) to channels last (N, H, W, C).
print(recover_data[:,:,:,::-1]) # Reverse the channels.
# Same as original
Output
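For anyone checking the layout conversion by hand, here is a minimal sketch using a plain NumPy array as a stand-in for the batch (it is not actual ImageRecordIter output, and the variable names are only illustrative). The height and width differ on purpose: with square test images, the axis orders (0, 2, 3, 1) and (0, 3, 2, 1) produce the same shape, which can hide a swap of height and width.

```python
import numpy as np

# Stand-in for a batch in channels-first layout (N, C, H, W), with H=2 and W=4.
batch_nchw = np.arange(1 * 3 * 2 * 4).reshape(1, 3, 2, 4)

# Channels first -> channels last: (N, C, H, W) -> (N, H, W, C).
batch_nhwc = np.transpose(batch_nchw, (0, 2, 3, 1))

# (0, 3, 2, 1) also moves channels last, but it swaps height and width too.
batch_nwhc = np.transpose(batch_nchw, (0, 3, 2, 1))

# Reverse the channel axis (RGB <-> BGR) on the channels-last array.
batch_flipped = batch_nhwc[..., ::-1]

print(batch_nhwc.shape)  # (1, 2, 4, 3)
print(batch_nwhc.shape)  # (1, 4, 2, 3)
```

Reversing the last axis only makes sense once the channels really are on the last axis, so the transpose has to come first.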
-
Thanks for the confirmation. But then I have a concern about the color normalisation we apply with ImageRecordIter. It seems that it does not apply the mean_[CHANNEL] and std_[CHANNEL] coefficients to the correct channel:
The outputs are:
It seems that mean_r and std_r are applied to the red channel, instead of the blue channel as specified in the call to ImageRecordIter. Is it a bug?
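To make the concern concrete, here is a small NumPy-only simulation. It does not call ImageRecordIter and does not claim to reproduce its internals; `normalize_rgb` is a hypothetical helper that subtracts per-channel means from an array it assumes to be RGB. It just shows what happens when data in one channel order is normalised as if it were in the other.

```python
import numpy as np

def normalize_rgb(img_hwc, mean_r=0.0, mean_g=0.0, mean_b=0.0):
    """Hypothetical helper: subtract per-channel means, assuming (H, W, 3) RGB input."""
    means = np.array([mean_r, mean_g, mean_b], dtype=np.float32)
    return img_hwc.astype(np.float32) - means

# A single pixel with R=200, G=100, B=0.
rgb = np.array([[[200, 100, 0]]], dtype=np.float32)
bgr = rgb[..., ::-1]  # the same pixel stored in BGR order

right = normalize_rgb(rgb, mean_r=50)  # mean_r lands on the red value
wrong = normalize_rgb(bgr, mean_r=50)  # BGR data mistaken for RGB

print(right[0, 0].tolist())  # [150.0, 100.0, 0.0]
print(wrong[0, 0].tolist())  # [-50.0, 100.0, 200.0]
```

In the second call the value that changes is the blue one (0 becomes -50), which is the kind of mismatch being described here.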
-
What I think is that ImageRecordIter is meant to be used with .rec files whose images are saved in BGR format, and that it outputs data in RGB, as a Gluon network expects. (By the way, can you confirm that assumption about Gluon?) Actually, the im2rec.py script shipped with MXNet reads the image with OpenCV (cv2.imread) if the "pass-through" parameter is true, and packs it into the .rec file. So in that case the .rec file stores data in BGR (the OpenCV format), ImageRecordIter applies the color normalisation to the correct channel, and it flips the channels back so that the output image is in RGB. What seems wrong is that im2rec.py does not always flip the channels, for example when the "pass-through" parameter is not set. Also, the documentation of ImageRecordIter makes no mention of BGR vs RGB format.
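If that assumption holds (the packed array is treated as BGR, as OpenCV does), an image held in memory as RGB would need its channel axis reversed before packing. A minimal sketch, where `rgb_to_bgr` is a hypothetical helper and the actual packing call is left as a comment because it needs MXNet installed:

```python
import numpy as np

def rgb_to_bgr(img_hwc):
    """Hypothetical helper: reverse the channel axis of an (H, W, 3) array."""
    # ascontiguousarray returns a real copy rather than a negative-stride view.
    return np.ascontiguousarray(img_hwc[:, :, ::-1])

rgb = np.zeros((3, 3, 3), dtype=np.uint8)
rgb[:, :, 0] = 200  # R
rgb[:, :, 1] = 100  # G
rgb[:, :, 2] = 0    # B

bgr = rgb_to_bgr(rgb)
print(bgr[0, 0].tolist())  # [0, 100, 200]

# bgr could then be passed to mx.recordio.pack_img(header, bgr, ...)
```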
-
Sorry, I haven't been able to go through it all, but I don't think it should be an issue.
packed_s = mx.recordio.pack_img(header, img, quality=9, img_fmt=".png")
# packed_s is not an array but an encoded representation of the image, which is standard.
record.write(packed_s)
So if I am correct, it is written in the standard way, and when it is read, even if it is restored as RGB, the channel mappings are correct.
import mxnet as mx
import numpy as np
shape = (3, 3, 3)
label = 1
record = mx.recordio.MXRecordIO("temp_png.rec", 'w')
header = mx.recordio.IRHeader(0, label, 1, 0)
img = np.ones(shape) * 255
img[:, :, 0] = 0 # B
img[:, :, 1] = 100 # G
img[:, :, 2] = 200 # R
print('------Original---------------')
print(img)
packed_s = mx.recordio.pack_img(header, img, quality=9, img_fmt=".png")
record.write(packed_s)
record.close()
iter = mx.io.ImageRecordIter(path_imgrec="temp_png.rec", data_shape=shape, batch_size=1, mean_r=50)
batch = iter.next()
data = batch.data[0]
print('--------PNG---------------')
print(data.shape)
recover_data = mx.ndarray.transpose(data, (0, 2, 3, 1)) # From channels first (N, C, H, W) to channels last (N, H, W, C).
print(recover_data[:,:,:,::-1]) # Reverse the channels.
# Same as original
Output
Notice that the mean has been correctly subtracted from the R channel.
-
I agree with you, but this works only if you pack the image into the .rec file in BGR format (as you did in the example).
It does NOT work if you pack the image into the .rec file in RGB format: the normalisation will be applied to the wrong channel. So I think this should be documented in the ImageRecordIter class.
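Until this is documented, one possible reader-side workaround for a .rec file that was already packed from RGB arrays is to exchange the red and blue normalisation arguments before passing them to ImageRecordIter. This is only a sketch based on the behaviour observed in this thread: `swap_red_blue_stats` is a hypothetical helper, not part of MXNet, and the mean values are just example numbers.

```python
def swap_red_blue_stats(norm_kwargs):
    """Hypothetical helper: exchange the *_r and *_b normalisation arguments."""
    rename = {
        "mean_r": "mean_b", "mean_b": "mean_r",
        "std_r": "std_b", "std_b": "std_r",
    }
    # Keys that are not red/blue statistics (e.g. mean_g) pass through unchanged.
    return {rename.get(key, key): value for key, value in norm_kwargs.items()}

intended = {"mean_r": 123.68, "mean_g": 116.28, "mean_b": 103.53}
for_rgb_packed_rec = swap_red_blue_stats(intended)

# The result would be used as:
#   mx.io.ImageRecordIter(path_imgrec=..., **for_rgb_packed_rec)
print(for_rgb_packed_rec["mean_b"])  # 123.68
```

Note that in this situation the channel order of the output would still be reversed relative to what the network expects, so the channel axis of each batch would also need flipping. Repacking the file in BGR order avoids both steps.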
-
What you are saying makes sense as
Agreed. I'll ping one of the code owners, just in case we are missing something.
-
I had the same issue, sadly. I think this is kind of major and should be fixed, no?
-
Description
I was checking the behaviour of ImageRecordIter on a .rec file I created myself.
I have seen that, depending on how I created the .rec file, ImageRecordIter handles the order of the color channels differently.
When I create the .rec file with the png format, packing by default falls back on the OpenCV encoding function and thus inherits its default channel format, which is BGR. However, if I pack the image with the "jpeg" format, it uses the RGB order. Is this inconsistency expected? Can we pass a flag so that it always retrieves the RGB format?
Environment info (Required)
MXNet = 1.4.0
opencv-python==4.1.0.25
Steps to reproduce
Create the rec file with png format
Create the rec file with jpeg format