Hi,
I have observed that including a bias term seems to cause the losses produced by my models to diverge significantly from the losses produced without cut cross-entropy. With bias, the losses tend to be much lower and are often negative. Omitting the bias seems to improve things dramatically: the losses are then within +/-0.05 of the originals.
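
For context on why the negative values look wrong: cross-entropy over hard labels is a log-sum-exp minus the target logit, so it cannot go below zero whatever the bias is. Below is a minimal, dependency-free sketch of the reference computation I would expect the biased loss to match. It assumes the logits are `embeddings @ classifier^T + bias` and a mean reduction. The function name `reference_loss` and the toy values are hypothetical, chosen only for illustration, and it does not call the cut cross-entropy API.

```python
import math


def reference_loss(embeddings, classifier, bias, targets):
    """Naive mean cross-entropy with logits = e @ c^T + bias (assumed layout)."""
    total = 0.0
    for e, t in zip(embeddings, targets):
        # One logit per class: dot(e, class_row) + class_bias.
        logits = [
            sum(x * w for x, w in zip(e, row)) + b
            for row, b in zip(classifier, bias)
        ]
        # Numerically stable log-sum-exp over the classes.
        m = max(logits)
        lse = m + math.log(sum(math.exp(z - m) for z in logits))
        # Per-example loss: lse >= logits[t], so this term is never negative.
        total += lse - logits[t]
    return total / len(targets)


# Toy inputs (hypothetical): 2 examples, hidden size 2, 3 classes.
embeddings = [[0.5, -1.0], [1.5, 0.25]]
classifier = [[1.0, 0.0], [0.0, 1.0], [-1.0, 1.0]]
targets = [0, 2]

no_bias = reference_loss(embeddings, classifier, [0.0, 0.0, 0.0], targets)
with_bias = reference_loss(embeddings, classifier, [2.0, -1.0, 0.5], targets)
print(no_bias, with_bias)
```

Adding a bias changes the value of the loss, but both results stay non-negative, so a negative loss with bias enabled points at the bias handling rather than at the model.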