ROIPooling layer in fast and faster R-CNN #565

chuzui · 2016-01-06T11:20:07Z

Fast R-CNN and Faster R-CNN are currently state-of-the-art object detection methods. The most important component of these methods is the ROI pooling layer, which the authors implemented in Caffe.

I think it may be difficult to implement the ROI pooling layer using the existing ops in Theano. Does anyone have an idea how to do it? Or can it only be implemented as a C extension?

benanne · 2016-01-06T12:47:08Z

It looks like this is just max-pooling with a pool size dependent on the input, so that the output always has the same size (e.g. 7x7)? That should be fairly simple to implement in pure Theano. And since it's unlikely to be a very time-consuming part of a network, making a faster C implementation probably isn't worth it.

chuzui · 2016-01-06T13:28:58Z

> It looks like this is just max-pooling with a pool size dependent on the input, so that the output always has the same size (e.g. 7x7)? That should be fairly simple to implement in pure Theano. And since it's unlikely to be a very time-consuming part of a network, making a faster C implementation probably isn't worth it.

But the input to the ROIPooling layer in a batch is a set of object-proposal sub-windows of the same image. They have different sizes and are all max-pooled to the same output size (e.g. 7x7), each with a different pooling size. So I think it's not very easy to implement.
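To make the "different pooling sizes" point concrete, here is a small sketch of how the bins along one axis could be laid out for proposals of different lengths. The helper name `bin_edges` and the rounding rule (start rounded down, end rounded up) are assumptions for illustration, not something taken from this thread.

```python
import math

def bin_edges(length, n_bins):
    """Split `length` feature-map cells into `n_bins` pooling bins.

    Returns (start, end) index pairs with `end` exclusive. Starts are
    rounded down and ends rounded up (an assumed convention), so
    neighbouring bins may share a cell but never leave a gap.
    """
    return [(math.floor(i * length / n_bins),
             math.ceil((i + 1) * length / n_bins))
            for i in range(n_bins)]

# Two proposals of different heights, both reduced to 7 output rows.
print(bin_edges(14, 7))  # length divisible by 7: every bin spans 2 rows
print(bin_edges(10, 7))  # not divisible: bin sizes vary and bins overlap
```

Because the bin layout depends on each proposal's size, a single fixed-size pooling op cannot cover all proposals in the batch at once.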

benanne · 2016-01-06T23:32:02Z

Right, in that case it's going to be tough to avoid scan, or something like that. A custom CUDA kernel might even be worth considering (it's fairly easy to wrap them in Theano using PyCUDA).

kshmelkov · 2016-01-11T10:40:09Z

Can't it be emulated via TransformerLayer? IIRC, it should do the trick if you transform bbox coordinates into affine transform parameters.
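A rough sketch of that conversion, assuming the usual spatial-transformer convention: the six parameters form a 2x3 matrix [[sx, 0, tx], [0, sy, ty]] that maps output grid coordinates in [-1, 1] to input coordinates in [-1, 1], with -1 and +1 at the first and last pixel of each axis. The function name `bbox_to_theta` is hypothetical, and the exact normalization should be checked against the layer actually used.

```python
import numpy as np

def bbox_to_theta(box, height, width):
    """Convert a pixel-space box (x1, y1, x2, y2) to affine parameters.

    Assumption: normalized coordinates -1 and +1 correspond to the first
    and last pixel along each axis of the input feature map.
    """
    x1, y1, x2, y2 = box
    # Box corners in normalized [-1, 1] coordinates.
    nx1 = 2.0 * x1 / (width - 1) - 1.0
    nx2 = 2.0 * x2 / (width - 1) - 1.0
    ny1 = 2.0 * y1 / (height - 1) - 1.0
    ny2 = 2.0 * y2 / (height - 1) - 1.0
    # Scale is half the box extent, translation is the box centre.
    sx, tx = (nx2 - nx1) / 2.0, (nx2 + nx1) / 2.0
    sy, ty = (ny2 - ny1) / 2.0, (ny2 + ny1) / 2.0
    return np.array([sx, 0.0, tx, 0.0, sy, ty], dtype=np.float32)

# A box covering the whole 5x5 map yields the identity transform.
print(bbox_to_theta((0, 0, 4, 4), height=5, width=5))
```

With no rotation or shear, only the scale and translation entries are non-zero, so each proposal needs four numbers in practice.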

f0k · 2016-01-11T17:28:55Z

The TransformerLayer will do bilinear interpolation, though, not max-pooling. You could use it to extract regions scaled to a fixed target size, but not to implement the ROI pooling discussed here.
Looking at the implementation, it seems the most efficient solution will indeed be wrapping it into a custom kernel.
Implementing it in pure Theano will probably be slow. You'd need to theano.scan() over the region proposals, extract the corresponding subtensor, subdivide that again (to get, e.g., 7x7 subregions) and take the maximum of each.
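The steps just described can be sketched in plain NumPy as a reference (not Theano code, and not the Caffe implementation). The outer loop over proposals is the part that would become `theano.scan()`. The function name `roi_max_pool`, the inclusive (x1, y1, x2, y2) box format in feature-map cells, and the floor/ceil bin rounding are assumptions made for this example.

```python
import numpy as np

def roi_max_pool(feature_map, rois, out_h=7, out_w=7):
    """Max-pool each region of interest to a fixed (out_h, out_w) grid.

    feature_map: array of shape (channels, H, W).
    rois: sequence of (x1, y1, x2, y2) integer boxes in feature-map
          cells, inclusive on both ends (assumed format).
    Returns an array of shape (len(rois), channels, out_h, out_w).
    """
    channels = feature_map.shape[0]
    out = np.empty((len(rois), channels, out_h, out_w),
                   dtype=feature_map.dtype)
    for r, (x1, y1, x2, y2) in enumerate(rois):
        # Extract the subtensor that belongs to this proposal.
        region = feature_map[:, y1:y2 + 1, x1:x2 + 1]
        h, w = region.shape[1:]
        for i in range(out_h):
            # Bin start rounded down, bin end rounded up (ceil division).
            ys, ye = (i * h) // out_h, -(-(i + 1) * h // out_h)
            for j in range(out_w):
                xs, xe = (j * w) // out_w, -(-(j + 1) * w // out_w)
                # Maximum over the spatial extent of the bin, per channel.
                out[r, :, i, j] = region[:, ys:ye, xs:xe].max(axis=(1, 2))
    return out

fmap = np.arange(2 * 4 * 4).reshape(2, 4, 4)
pooled = roi_max_pool(fmap, [(0, 0, 3, 3), (1, 1, 2, 2)], out_h=2, out_w=2)
print(pooled.shape)  # (2, 2, 2, 2)
```

The nested Python loops make the cost obvious: work scales with the number of proposals times the output grid size, which is why a custom kernel looks attractive.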

kshmelkov · 2016-01-11T19:21:16Z

Fair point. Somehow I missed that it is called pooling for a reason. Anyway, I am experimenting with Faster R-CNN and have almost finished an implementation of ROI 'pooling' via TransformerLayer. I hope it doesn't make a significant difference.

faizankshaikh · 2016-05-10T23:27:53Z

Has anything been done for this issue?

f0k · 2016-05-10T23:41:36Z

> Has anything been done for this issue?

No, but the deepdetect issue linking to ours has a Theano implementation posted: https://github.com/ddtm/theano-roi-pooling
This could be integrated into Theano and wrapped as a Lasagne layer, or integrated into Lasagne, or just be used as a basis for a Lasagne Recipe.

faizankshaikh · 2016-05-10T23:46:58Z

That seems reasonable. Thanks!

f0k · 2016-05-10T23:55:22Z

Feel free to submit a PR to Lasagne/Recipes once you get it working, or send a PR to Theano for that Op and ping us back!

Sentient07 · 2017-02-16T19:16:54Z

Hi, I have made a draft here: Theano/Theano#5189. Could you please have a look and let me know if the Op is implemented correctly?

Ramana

Sentient07 · 2017-03-15T20:29:30Z

Hi, if anyone is interested, the code for Fast R-CNN, with installation instructions, is here: Lasagne/Recipes#35 (comment)

beniz mentioned this issue Jan 23, 2016

Object Detection via Faster R-CNN jolibrain/deepdetect#43

Closed
