The training does not converge #10

cactuslei · 2018-02-23T14:16:27Z

Hi, thanks a lot for your great work, your code is very clear and easy to understand.

However, when I train with your main.py, the network does not converge. The only change I made was setting "SAMPLE" from True to False, since with True only 500 samples are used for training. The training loss always stays around 2.3, and training is terminated because there is no improvement after 1000 steps. Could you tell me the best accuracy you achieved with your code? Thanks a lot.
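
For context, a loss that stays near 2.3 is what a classifier produces when it predicts a uniform distribution over 10 classes, since ln(10) ≈ 2.303. The sketch below just checks that arithmetic. It assumes a 10-class problem trained with cross-entropy measured in nats, which the thread does not state; `uniform_cross_entropy` is a hypothetical helper, not part of the repository.

```python
import math


def uniform_cross_entropy(num_classes):
    """Cross-entropy (in nats) of a uniform prediction over num_classes."""
    return -math.log(1.0 / num_classes)


# Assumption: 10 classes (not stated in the thread).
chance_loss = uniform_cross_entropy(10)
print(round(chance_loss, 4))
```

If that assumption holds, a loss pinned at this value would suggest the network is not learning anything beyond chance.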

jacobunderlinebenseal · 2018-03-19T10:06:45Z

me too

bit1002lst · 2018-04-11T03:34:33Z

me too, and all of the images in samples are empty

kevinzakka · 2018-04-18T07:47:57Z

Hey guys, I'll take a look at the code when I get the time.

moormoon · 2018-04-30T02:00:13Z

I have the same issue. It looks like it is related to the initialization of the conv and fc layers. I tried using only fc layers for regression, and the training converged. Remember to initialize the weights to zeros and the bias to the identity.
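
A minimal sketch of that initialization, in plain Python rather than TensorFlow. It assumes the last fc layer regresses the 6 parameters of a 2x3 affine transform, so "bias to identity" means a bias of `[1, 0, 0, 0, 1, 0]`. The function names are hypothetical and are not taken from the repository.

```python
def init_localization_output(num_features):
    """Zero weights and identity bias for a layer that outputs 6 affine parameters."""
    weights = [[0.0] * 6 for _ in range(num_features)]
    # Assumption: theta is a row-major 2x3 affine matrix, identity = [[1, 0, 0], [0, 1, 0]].
    bias = [1.0, 0.0, 0.0, 0.0, 1.0, 0.0]
    return weights, bias


def predict_theta(features, weights, bias):
    """Compute theta = features @ weights + bias and reshape it to 2x3."""
    theta = list(bias)
    for value, row in zip(features, weights):
        for j, weight in enumerate(row):
            theta[j] += value * weight
    return [theta[0:3], theta[3:6]]


weights, bias = init_localization_output(4)
theta = predict_theta([0.3, -1.2, 5.0, 0.7], weights, bias)
print(theta)
```

With zero weights, the predicted transform is the identity for any input features, so at the start of training the sampler passes the image through unchanged.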

I haven't figured out how to initialize the conv layers yet. If anyone makes progress on this, please let us know.

BlueWinters · 2018-05-02T12:33:26Z

I think the bilinear interpolation in the function `bilinear_sampler` is wrong. A good example of this process can be found at https://github.com/tensorflow/models/tree/master/research/transformer

```python
# get pixel value at corner coords
Ia = get_pixel_value(img, x0, y0)
Ib = get_pixel_value(img, x0, y1)
Ic = get_pixel_value(img, x1, y0)
Id = get_pixel_value(img, x1, y1)

# recast as float for delta calculation
x0 = tf.cast(x0, 'float32')
x1 = tf.cast(x1, 'float32')
y0 = tf.cast(y0, 'float32')
y1 = tf.cast(y1, 'float32')

# calculate deltas
wa = (x1-x) * (y1-y)
wb = (x1-x) * (y-y0)
wc = (x-x0) * (y1-y)
wd = (x-x0) * (y-y0)

# add dimension for addition
wa = tf.expand_dims(wa, axis=3)
wb = tf.expand_dims(wb, axis=3)
wc = tf.expand_dims(wc, axis=3)
wd = tf.expand_dims(wd, axis=3)

# compute output
out = tf.add_n([wa*Ia, wb*Ib, wc*Ic, wd*Id])
```
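
To make the weighting scheme easier to check by hand, here is a small scalar reference in plain Python. It is only a sketch under stated assumptions: the image is a 2D list indexed as `img[y][x]`, coordinates are in pixel units, and out-of-range corners are clamped to the border. `bilinear_sample` and `pixel` are hypothetical names, and this is not the repository's or the linked example's implementation.

```python
import math


def bilinear_sample(img, x, y):
    """Bilinearly interpolate img (list of rows, img[y][x]) at the point (x, y)."""
    height, width = len(img), len(img[0])
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = x0 + 1, y0 + 1

    # Weights use the unclamped corners, so they always sum to 1.
    wa = (x1 - x) * (y1 - y)  # weight of corner (x0, y0)
    wb = (x1 - x) * (y - y0)  # weight of corner (x0, y1)
    wc = (x - x0) * (y1 - y)  # weight of corner (x1, y0)
    wd = (x - x0) * (y - y0)  # weight of corner (x1, y1)

    def pixel(px, py):
        # Assumption: clamp out-of-range indices to the image border.
        px = min(max(px, 0), width - 1)
        py = min(max(py, 0), height - 1)
        return img[py][px]

    return (wa * pixel(x0, y0) + wb * pixel(x0, y1)
            + wc * pixel(x1, y0) + wd * pixel(x1, y1))


image = [[0.0, 1.0],
         [2.0, 3.0]]
center = bilinear_sample(image, 0.5, 0.5)
print(center)
```

Sampling at the exact center of the four pixels should return their mean, and sampling at an integer coordinate should return that pixel's value. Comparing a TensorFlow sampler against a reference like this on a few points is one way to test whether the interpolation itself is at fault.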

lifan9880 · 2018-11-02T02:11:56Z

me too

kakashi571 · 2019-03-27T12:17:42Z

how to start the training?

wanziz · 2019-08-22T01:28:36Z

How do I start training?

Excuse me, do you know how to train now?

wanziz · 2019-08-22T01:37:14Z

Excuse me, I would like to know how to start training the model. Thank you!

turandai · 2021-11-17T05:20:51Z

I think the problem is that TensorFlow cannot properly auto-generate the gradients of bilinear sampling. In the original paper, the authors defined special gradients for this step, and this package does not include them yet.
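
As a rough illustration of what the sampling gradient looks like, the sketch below differentiates the bilinear weights with respect to the sampling coordinates by hand and compares the result with a finite-difference estimate. It is plain Python, works only for points strictly inside one pixel cell, and uses hypothetical function names. It does not test whether TensorFlow's automatic differentiation produces the same values.

```python
import math


def bilinear(img, x, y):
    """Bilinear interpolation of img[y][x] at an interior point (x, y)."""
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = x0 + 1, y0 + 1
    return ((x1 - x) * (y1 - y) * img[y0][x0]
            + (x1 - x) * (y - y0) * img[y1][x0]
            + (x - x0) * (y1 - y) * img[y0][x1]
            + (x - x0) * (y - y0) * img[y1][x1])


def bilinear_grad_xy(img, x, y):
    """Hand-derived partial derivatives of bilinear() with respect to x and y."""
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = x0 + 1, y0 + 1
    ia, ib = img[y0][x0], img[y1][x0]
    ic, id_ = img[y0][x1], img[y1][x1]
    d_dx = (y1 - y) * (ic - ia) + (y - y0) * (id_ - ib)
    d_dy = (x1 - x) * (ib - ia) + (x - x0) * (id_ - ic)
    return d_dx, d_dy


image = [[0.0, 1.0, 4.0],
         [2.0, 3.0, 8.0],
         [5.0, 7.0, 6.0]]
x, y, eps = 0.3, 0.6, 1e-6
d_dx, d_dy = bilinear_grad_xy(image, x, y)
# Central finite differences, staying inside the same pixel cell.
fd_dx = (bilinear(image, x + eps, y) - bilinear(image, x - eps, y)) / (2 * eps)
fd_dy = (bilinear(image, x, y + eps) - bilinear(image, x, y - eps)) / (2 * eps)
print(abs(d_dx - fd_dx) < 1e-6, abs(d_dy - fd_dy) < 1e-6)
```

The same kind of finite-difference check could be run against the gradients TensorFlow returns for the sampler, which would help confirm or rule out this explanation.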
