Tensorflow image loader image would cause different result #2

zlin3000 · 2017-06-30T08:52:12Z

I randomly tested several images; the difference is between 0.10 and 0.20.

In fact, I tested the code step by step and found that the resize method might be what causes this.

I also used opencv instead of PIL to do the resize, and the final result is similar to the tensorflow resize. Moreover, I compared the resize results of PIL and opencv, and they are quite different: for example, the max difference in one image is about 25, and the RMSD is about 3.
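The two figures mentioned here (max difference and RMSD) can be computed with a few lines of plain Python. This is only a minimal sketch: the helper names and the sample pixel values are made up for illustration, and it assumes both resized images have already been flattened into equal-length sequences of 8-bit values.

```python
import math


def _check_inputs(a, b):
    """Both inputs must be non-empty and contain the same number of values."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same number of values")
    if not a:
        raise ValueError("inputs must not be empty")


def max_abs_diff(a, b):
    """Largest absolute per-value difference between two sequences."""
    _check_inputs(a, b)
    return max(abs(x - y) for x, y in zip(a, b))


def rmsd(a, b):
    """Root-mean-square deviation between two sequences."""
    _check_inputs(a, b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


# Hypothetical flattened pixel values from two different resize implementations.
resized_a = [10, 200, 35, 90]
resized_b = [12, 175, 35, 91]

print("max difference:", max_abs_diff(resized_a, resized_b))
print("RMSD:", rmsd(resized_a, resized_b))
```

A large max difference combined with a small RMSD, as reported above, would suggest that most pixels agree closely and only a few (typically along sharp edges) diverge.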

Lastly, I read some articles which point out that adding noise to an image might cause a totally different result, even though a human cannot see the difference between the two images.

PS: Thanks for this repository, which saved me a lot of time; otherwise I might have needed to spend lots of time converting caffe to tensorflow. :)

delta9 · 2017-06-30T09:07:54Z

I've nothing to add, but I've seen similar results during testing, which has led me to use the original classifier with caffe.

I haven't had the time to look through the code more thoroughly but I'd suggest comparing the resize logic:

https://github.com/yahoo/open_nsfw/blob/master/classify_nsfw.py#L19
https://github.com/mdietrichstein/tensorflow-open_nsfw/blob/master/image_utils.py#L4

mdietrichstein · 2017-07-01T17:33:19Z

Hey @zlin3000,

I've put a lot of time into investigating this issue to no avail...

@delta9 and you might be right in suspecting different resize implementations.

Another reason might be different jpeg decoding mechanisms (see here and here).

I haven't found the time to investigate this further, but I would love to solve this once and for all. I don't know when I'll get around to looking into it again though.

Help is always appreciated :)

hristorv · 2017-10-11T19:11:44Z

@zlin3000 @delta9 @mdietrichstein Has anyone found a solution for the issue?

mdietrichstein · 2017-10-20T06:11:36Z

@hristorv Not yet, I'm afraid

mdietrichstein · 2017-11-17T19:42:09Z

I have fixed a bug in the model definition (e1ada8d) which definitely corrupted some classifications.

It would be awesome if some of you could run your checks again and let me know if there are still major differences between the implementations.

Thanks!

delta9 · 2017-11-18T00:25:54Z

Hey @mdietrichstein

I just did some quick random tests and still found major differences:

50 KB Image

Tensorflow: 0.21753324568271637
Caffe: 0.570083081722

449 KB Image

Tensorflow: 0.9021902084350586
Caffe: 0.981046199799

1.7 MB Image

Tensorflow: 0.6308742761611938
Caffe: 0.943047463894

These are the NSFW scores for some images from .. reddit

First I thought it had something to do with the image size but sadly it's all over the place.

Thank you so much for your work though!

mdietrichstein · 2017-11-20T09:02:34Z

Hey @delta9

Thanks for your help!

> First I thought it had something to do with the image size but sadly it's all over the place.

I've found out that tensorflow and caffe (the original implementation) use different approaches with regard to padding when doing convolutions, pooling, etc.
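To make the padding difference concrete, here is a rough sketch of the output-size arithmetic on one spatial axis. It assumes the usual rules: Caffe pads symmetrically with an explicit `pad` value, rounds convolution output sizes down and pooling output sizes up, while tensorflow's `SAME` mode derives the padding from the input size and puts any odd leftover pixel at the end. The layer sizes in the example are illustrative and not taken from this thread.

```python
def ceil_div(a, b):
    """Integer ceiling division."""
    return -(-a // b)


def caffe_conv_out(size, kernel, stride, pad):
    """Caffe convolution: symmetric padding, output size rounded down."""
    return (size + 2 * pad - kernel) // stride + 1


def caffe_pool_out(size, kernel, stride, pad):
    """Caffe pooling: symmetric padding, output size rounded up."""
    out = ceil_div(size + 2 * pad - kernel, stride) + 1
    # Caffe drops the last window if it would start entirely inside the padding.
    if pad > 0 and (out - 1) * stride >= size + pad:
        out -= 1
    return out


def tf_same(size, kernel, stride):
    """tensorflow padding='SAME': returns (output size, pad_before, pad_after)."""
    out = ceil_div(size, stride)
    total = max((out - 1) * stride + kernel - size, 0)
    return out, total // 2, total - total // 2


def tf_valid_out(size, kernel, stride):
    """tensorflow padding='VALID': no padding, output size rounded down."""
    return (size - kernel) // stride + 1


# Illustrative 7x7 / stride 2 convolution on a 224-pixel axis.
print("caffe conv, pad=3:", caffe_conv_out(224, 7, 2, 3))
print("tf SAME (out, before, after):", tf_same(224, 7, 2))

# Illustrative 3x3 / stride 2 pooling on a 112-pixel axis.
print("caffe pool, pad=0:", caffe_pool_out(112, 3, 2, 0))
print("tf VALID:", tf_valid_out(112, 3, 2))
print("tf SAME (out, before, after):", tf_same(112, 3, 2))
```

Even when the output sizes agree, the padding can be split differently (symmetric in Caffe, uneven in `SAME`), which shifts every window and changes the activations. One common workaround, presumably similar in spirit to the adaptations described here, is to pad explicitly and then run the op with `VALID` padding.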

I've made some adaptations to the model and it looks like it delivers better results now when using the yahoo image loader (-l yahoo). It's still not perfect, but at least I know what the problem is.

mdietrichstein · 2017-11-21T18:54:18Z

I've spent some more time on this and have identified two serious problems:

1. Padding issues when doing pooling/convolutions
   This was due to the aforementioned different padding approaches between caffe and tensorflow. I've managed to fix that. The current version of this project now delivers essentially the same results as the original implementation when using the yahoo image loader.

2. Replicating the original image loading and preprocessing procedure is hard
   Yahoo did some weird things in their preprocessing code, like:

   - Decoding the input file using PIL
   - Resizing it using PIL
   - Encoding the resized image with PIL as JPEG in memory...
   - Decoding the JPEG again, but this time with skimage?!

On top of that, their model is very sensitive to changes in e.g. the JPEG codec, quality level, ...

I don't think it's possible to perfectly replicate the whole process with plain tensorflow due to different jpeg encoding/decoding and resize implementations/configurations between PIL, skimage and tensorflow.
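The decode, resize, re-encode, decode sequence described above can be sketched roughly as follows. This is not the original code: the function names are hypothetical, the 256x256 target size, bilinear filter and JPEG quality of 75 are assumptions, and the second decode uses Pillow as a stand-in for skimage, so it only illustrates the shape of the pipeline.

```python
import io

from PIL import Image, ImageChops


def roundtrip_preprocess(image_bytes, size=(256, 256), quality=75):
    """Decode, resize, re-encode as JPEG in memory, then decode again.

    Returns (resized, decoded) so the effect of the extra JPEG round trip
    can be inspected. Size, filter and quality are assumed values.
    """
    img = Image.open(io.BytesIO(image_bytes)).convert("RGB")
    resized = img.resize(size, resample=Image.BILINEAR)

    buffer = io.BytesIO()
    resized.save(buffer, format="JPEG", quality=quality)
    buffer.seek(0)

    # Stand-in: the original reportedly uses skimage for this second decode.
    decoded = Image.open(buffer).convert("RGB")
    return resized, decoded


def max_channel_diff(a, b):
    """Largest absolute per-channel difference between two same-sized images."""
    extrema = ImageChops.difference(a, b).getextrema()
    return max(high for _low, high in extrema)


# Synthetic 64x48 test image so the sketch runs without any input file.
width, height = 64, 48
pixels = bytes(
    value
    for y in range(height)
    for x in range(width)
    for value in (x * 4 % 256, y * 5 % 256, (x + y) % 256)
)
source = io.BytesIO()
Image.frombytes("RGB", (width, height), pixels).save(source, format="JPEG")

resized, decoded = roundtrip_preprocess(source.getvalue())
print("max per-channel change from the JPEG round trip:",
      max_channel_diff(resized, decoded))
```

Running this with different `quality` values gives a feel for how much the extra JPEG round trip alone changes the pixels the model ends up seeing.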

That being said, I was still able to adapt the tensorflow loading code in a way that makes the difference a lot smaller than before (at least for my tests). The biggest difference I've observed was about 0.02.

@delta9 @zlin3000 It would be awesome if you could check out the new version and test if the results have improved for you too.

delta9 · 2017-11-22T08:04:47Z

Thanks for the detailed explanation!

My use case would have been to use the converted model with Tensorflow Serving in conjunction with a mobile app backend to check user generated content on upload.

I was hoping for higher performance over the original caffe script since invoking the python script using a wrapper each time has a lot of overhead.

> The current version of this project now delivers essentially the same results as the original implementation when using the yahoo image loader.

So I would need to preprocess the images using the yahoo image loader and then send over the data for prediction - if I want to use it with Tensorflow Serving?

mdietrichstein · 2017-11-22T17:34:52Z

> So I would need to preprocess the images using the yahoo image loader and then send over the data for prediction - if I want to use it with Tensorflow Serving?

That's correct.
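As a rough sketch of what "preprocess first, then send the data" could look like against the Tensorflow Serving REST API: the model name, host and the tiny dummy input below are hypothetical, and the real input shape depends on how the model is exported. Only the request is built here; nothing is sent.

```python
import json


def build_predict_request(batch, model_name="open_nsfw", host="localhost", port=8501):
    """Return (url, body) for a Tensorflow Serving REST predict call.

    `batch` is a list of already preprocessed images as nested lists
    (height x width x channels). The model name and host are placeholders.
    """
    url = f"http://{host}:{port}/v1/models/{model_name}:predict"
    body = json.dumps({"instances": batch})
    return url, body


# Dummy 2x2 RGB "image"; a real request would carry the output of the
# yahoo image loader converted to nested lists (e.g. via ndarray.tolist()).
dummy_image = [
    [[0.0, 0.5, 1.0], [0.1, 0.2, 0.3]],
    [[0.4, 0.5, 0.6], [0.7, 0.8, 0.9]],
]

url, body = build_predict_request([dummy_image])
print(url)
print(len(body), "bytes of JSON")
```

The body can then be POSTed with any HTTP client. Sending raw floats as JSON is simple but verbose, so the gRPC API may be a better fit for larger images.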

You could also try to use the improved tensorflow image loader and check if the results are good enough for your use case.

I'm currently trying to get access to an NSFW dataset to evaluate both image loader implementations and get some real numbers on the differences between them.

waheebyaqub · 2017-12-21T06:47:41Z

Using the three datasets listed below:

6785 Non-porn easy images
3555 Non-porn difficult images
4373 Porn images

yahoo image loader gave the following results:
all values are averaged

non-porn easy images:
SFW : 0.91932448
NSFW : 0.08067552
non-porn difficult images:
SFW : 0.70118753
NSFW : 0.29881247
porn images:
SFW : 0.19570181
NSFW : 0.80429819

Original caffe yahoo NSFW gave the following results:
all values are averaged

non-porn easy images:
SFW : 0.91932540
NSFW : 0.08067460
non-porn difficult images:
SFW : 0.70118952
NSFW : 0.29881049
porn images:
SFW : 0.19570224
NSFW : 0.80429776

Overall, the 4.3*10^-6 difference between the yahoo image loader implemented in tensorflow and the original caffe yahoo NSFW model is not significant on this dataset.

tensorflow image loader results:

3)porn images:
SFW: 0.20717194
NSFW: 0.79282895
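The per-category gaps can be recomputed directly from the averaged NSFW scores listed above. The snippet below only copies those numbers; the dictionary and function names are made up for the example.

```python
# Averaged NSFW scores copied from the results above.
caffe_original = {
    "non-porn easy": 0.08067460,
    "non-porn difficult": 0.29881049,
    "porn": 0.80429776,
}
yahoo_loader = {
    "non-porn easy": 0.08067552,
    "non-porn difficult": 0.29881247,
    "porn": 0.80429819,
}
# Only the porn category was reported for the tensorflow image loader.
tensorflow_loader = {"porn": 0.79282895}


def abs_gaps(reference, other):
    """Absolute difference per category, for categories present in both."""
    return {
        name: abs(reference[name] - score)
        for name, score in other.items()
        if name in reference
    }


for label, scores in (("yahoo image loader", yahoo_loader),
                      ("tensorflow image loader", tensorflow_loader)):
    for name, gap in abs_gaps(caffe_original, scores).items():
        print(f"{label:24s} {name:20s} {gap:.2e}")
```

With these numbers, the yahoo image loader stays within roughly 2e-6 of the caffe averages in every category, while the tensorflow image loader's porn average is lower by about 0.011.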

mdietrichstein · 2018-01-02T09:07:07Z

Hey @waheebyaqub!

Thank you so much for posting your results here. May I ask which dataset you were using for your test?

I'm planning to use this dataset for a detailed comparison in the future.

waheebyaqub · 2018-01-03T17:21:56Z

@mdietrichstein, I actually used the same data that you linked, with some preprocessing on the porn frames.

liudanking · 2018-01-04T09:38:30Z

So has this problem been solved?

mdietrichstein · 2018-01-05T09:10:47Z

@waheebyaqub Alright, thanks!

> So has this problem been solved?

@liudanking If you use the yahoo image loader then yes, the issue is fixed.
The tensorflow image loader still gives different results because of the reasons mentioned in this issue though.

liudanking · 2018-01-05T10:28:56Z

@mdietrichstein Partially solved is still cool!
BTW, is there any plan to give a guide for fine-tuning tensorflow-open_nsfw just like the yahoo one?

mdietrichstein · 2018-01-07T11:52:33Z

> BTW, is there any plan to give a guide for fine-tuning tensorflow-open_nsfw just like the yahoo one?

Not in the near future since I'm spending most of my time on a different project at the moment. I'd like to look into it once I have a bit more time though.

tower506 · 2018-01-08T21:15:32Z

Hi guys, @mdietrichstein @waheebyaqub, is there any chance you can provide an alternative link to download the dataset? The one on Google Sites no longer seems to be working (the dataset download is broken, though the site is up).
I'm trying to compare how much difference there is between a pretrained inception model and this implementation / the original open nsfw.
Thanks,

loretoparisi · 2018-03-12T13:35:29Z

@mdietrichstein @waheebyaqub thanks for your suggestions. Could you please share the final results of this new tuning, and how it improves on the original Yahoo! weights of open_nsfw?

Also, are the score ranges posted above specific to this test, or can we consider them valid in general?
