Softmax in routing algorithm incorrect? #12

Open
geefer opened this issue May 17, 2018 · 3 comments
geefer commented May 17, 2018

Hi,
I think the softmax in the routing algorithm is being calculated over the wrong dimension.

Currently the code has:

        # Initialize routing logits to zero.
        b_ij = Variable(torch.zeros(1, self.in_channels, self.num_units, 1)).cuda()

        # Iterative routing.
        num_iterations = 3
        for iteration in range(num_iterations):
            # Convert routing logits to softmax.
            # (batch, features, num_units, 1, 1)
            c_ij = F.softmax(b_ij)

Since the dim parameter is not passed to the F.softmax call, it defaults to dim=1 and computes the softmax over the self.in_channels dimension (1152 here). Instead, the softmax should be computed so that the coupling coefficients c_ij between each input capsule i and all the capsules j in the next layer sum to 1.

Thus the correct call should be:

           c_ij = F.softmax(b_ij, dim=2)
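
To make the difference concrete, here is a minimal standalone check (not from this repo; it assumes the same tensor layout as b_ij above, and uses random logits instead of zeros so the two normalizations are visibly different):

    import torch
    import torch.nn.functional as F

    # Routing logits with the same layout as b_ij above:
    # (batch=1, in_channels=1152 input capsules, num_units=10 digit capsules, 1)
    b_ij = torch.randn(1, 1152, 10, 1)

    # Softmax over dim=2 (the digit-capsule dimension): for each input capsule i,
    # the coefficients over the 10 output capsules sum to 1.
    c_ij = F.softmax(b_ij, dim=2)
    print(c_ij.sum(dim=2)[0, :3, 0])      # tensor([1., 1., 1.])

    # Softmax over dim=1 (the 1152 input capsules), which is what the current
    # code effectively does: it is the sum over input capsules that equals 1.
    c_wrong = F.softmax(b_ij, dim=1)
    print(c_wrong.sum(dim=1)[0, :3, 0])   # tensor([1., 1., 1.]), but along the wrong axis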

InnovArul commented Jun 13, 2018

Have you tried to implement and test with dim=2?

The implementation here (https://github.com/gram-ai/capsule-networks) is similar to the code in this repo, i.e. it also takes the softmax over dim=1. But when I implemented it with the dim you suggested, the network does not learn.


geefer commented Jun 14, 2018

Hi,

Yes, my implementation applies the softmax over the dimension that corresponds to the number of digit capsules (10) and it appears to work well, giving a best test accuracy of 99.68% (not as good as reported in the paper, but I have yet to see another implementation that matches their results).

If you check the code in the naturomics TensorFlow implementation that you reference, you will see that it also applies the softmax over the dimension that holds the number of digit capsules (i.e. 10).

In the paper, equation (3) defines the softmax as c_ij = exp(b_ij) / Σ_k exp(b_ik). The summation in the denominator is over k in b_ik, so the coupling coefficients c_ij between capsule i and all the capsules j in the layer above sum to one. Thus the softmax should be taken over the dimension of size 10.

See also the implementation by the author of the paper at https://github.com/Sarasra/models/blob/master/research/capsules/models/layers/layers.py line 110.

I am not sure why your network does not learn when you change the softmax; possibly there is another problem somewhere else? For reference, a rough sketch of a routing iteration with the softmax over the digit-capsule dimension is below.
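
This is my own sketch, not this repo's code: the shapes, the variable name u_hat, and the squash helper are assumptions based on the paper, but it shows where dim=2 enters the loop:

    import torch
    import torch.nn.functional as F

    def squash(s, dim=-1, eps=1e-8):
        # Squashing nonlinearity from the paper: shrinks short vectors toward 0,
        # long vectors to just under unit length.
        sq_norm = (s ** 2).sum(dim=dim, keepdim=True)
        return (sq_norm / (1.0 + sq_norm)) * s / torch.sqrt(sq_norm + eps)

    def route(u_hat, num_iterations=3):
        # u_hat: prediction vectors, shape (batch, in_caps=1152, out_caps=10, out_dim=16)
        b_ij = torch.zeros(u_hat.size(0), u_hat.size(1), u_hat.size(2), 1,
                           device=u_hat.device)
        for _ in range(num_iterations):
            # Softmax over the out_caps dimension, so that for each input capsule i
            # the coupling coefficients c_ij over all output capsules j sum to 1.
            c_ij = F.softmax(b_ij, dim=2)
            # Weighted sum of predictions over input capsules, then squash.
            s_j = (c_ij * u_hat).sum(dim=1, keepdim=True)   # (batch, 1, 10, 16)
            v_j = squash(s_j, dim=-1)                       # (batch, 1, 10, 16)
            # Agreement update: dot product between predictions and outputs.
            b_ij = b_ij + (u_hat * v_j).sum(dim=-1, keepdim=True)
        return v_j.squeeze(1)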


lcwy220 commented Apr 1, 2020

Hi, I also noticed the softmax problem, and I agree that the softmax should be applied over dim=2.
