Hi,

First of all, thanks for publicly sharing your implementations of these reinforcement learning algorithms. I find your repos very useful!
As I was playing around with QR-DQN, I think I noticed a bug in your implementation of the quantile Huber loss function. The code runs fine if batch_size == atoms, but if the two differ, you get an error due to incompatible tensor shapes at line 75 of QR-DQN.py:
loss = tf.where(tf.less(error_loss, 0.0), inv_tau * huber_loss, tau * huber_loss)
I think the error is related to the fact that the TF2 implementation of the Huber loss reduces the rank of the output by 1 relative to the inputs (see the docs), even when setting reduction=tf.keras.losses.Reduction.NONE. This differs from the behavior in TF1, where the output shape matches that of the input (docs). Therefore, if I am not mistaken, one could fix this by changing self.huber_loss to tf.compat.v1.losses.huber_loss? I am having a bit of a hard time working out the exact dimensions that the different operations act on, so I would be happy to hear from your side whether my theory is correct :P
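To illustrate what I mean, here is a minimal sketch of the shape difference (the batch_size and atoms values are just hypothetical examples, and I am assuming self.huber_loss is a tf.keras.losses.Huber instance with reduction=NONE, as the code suggests):

import tensorflow as tf

batch_size, atoms = 32, 51  # hypothetical sizes; any batch_size != atoms triggers the error
y_true = tf.random.normal((batch_size, atoms))
y_pred = tf.random.normal((batch_size, atoms))

# TF2 Keras Huber: even with reduction=NONE, the last axis is averaged away,
# so the result has shape (batch_size,) and no longer broadcasts against the
# (batch_size, atoms)-shaped tau / inv_tau tensors in the tf.where call above.
huber_tf2 = tf.keras.losses.Huber(reduction=tf.keras.losses.Reduction.NONE)
print(huber_tf2(y_true, y_pred).shape)  # (32,)

# TF1-style Huber: with Reduction.NONE the output keeps the full input shape,
# which is what the element-wise quantile weighting expects.
huber_tf1 = tf.compat.v1.losses.huber_loss(
    y_true, y_pred, reduction=tf.compat.v1.losses.Reduction.NONE)
print(huber_tf1.shape)  # (32, 51)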
mcshel changed the title from "Bug in the Quantile Hubler loss?" to "Bug in the Quantile Huber loss?" on Feb 25, 2021.