Add SVRG parameter updater #7
Just trying to make sense of the paper. It seems like they are proposing the gradient update for weight w:

    w_t = w_{t-1} - η(∇ψ_i(w_{t-1}) - ∇ψ_i(w̃) + μ̃)

where

    μ̃ = (1/n) Σ_i ∇ψ_i(w̃)

is the full gradient evaluated at the snapshot weights w̃ kept from the previous outer iteration, and i is the index of the randomly sampled example. Do you agree this is their method?
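For reference, the property the paper's argument leans on (writing P(w) = (1/n) Σ_i ψ_i(w) for the full objective) is that the corrected step is still unbiased; a quick LaTeX sketch of it:

```latex
\mathbb{E}_i\!\left[\nabla\psi_i(w_{t-1}) - \nabla\psi_i(\tilde w) + \tilde\mu\right]
  = \nabla P(w_{t-1}),
\qquad\text{since}\qquad
\mathbb{E}_i\!\left[\nabla\psi_i(\tilde w)\right]
  = \tilde\mu
  = \frac{1}{n}\sum_{i=1}^{n} \nabla\psi_i(\tilde w).
```

So each inner step is an unbiased estimate of the full-gradient step, and because ∇ψ_i(w_{t-1}) - ∇ψ_i(w̃) shrinks as both iterates approach the optimum, the variance of the step shrinks as well, which is what lets SVRG use a constant learning rate.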
This paper maybe has a slightly better summary of it: https://arxiv.org/pdf/1603.06160v2.pdf (see Algorithm 2). I think you have the basic idea... It might be slightly more difficult to fit this into the ParameterUpdater framework since there is a nested loop. Here is some pseudocode:

function svrg_pseudocode(data)
    # hold the current parameters and the snapshot params from the last outer iteration
    w = initialize_weights()
    w_prev = similar(w)

    # outer loop: take a snapshot and compute the full gradient
    for s = 1:iterations
        # store snapshot weights
        copy!(w_prev, w)
        # full (batch) gradient at the snapshot (w == w_prev right after the copy)
        mu = mean([grad(w_prev, target, output) for (target, output) in data])

        # inner loop: variance-reduced stochastic updates
        for t = 1:epoch_length
            (target, output) = rand_sample(data)
            # gradients of the sampled example at the current and snapshot weights
            ∇w = grad(w, target, output)
            ∇w_prev = grad(w_prev, target, output)
            # SVRG update
            w -= learnrate*(∇w - ∇w_prev + mu)
        end
    end
    return w
end
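If it helps, here is a self-contained sketch of the same recursion on a toy least-squares problem. It is plain Julia, not tied to the ParameterUpdater API; the function name `svrg_ls`, the step size, and the epoch length are all made up for illustration:

```julia
using LinearAlgebra

# Minimize P(w) = (1/2n) Σᵢ (xᵢ'w - yᵢ)², so ψᵢ(w) = ½(xᵢ'w - yᵢ)²
# and ∇ψᵢ(w) = (xᵢ'w - yᵢ) xᵢ.
function svrg_ls(X, y; η = 0.05, outer = 50, inner = 2 * size(X, 1))
    n, d = size(X)
    w = zeros(d)
    ∇ψ(w, i) = (dot(X[i, :], w) - y[i]) * X[i, :]    # gradient of the i-th term
    for s in 1:outer
        w_prev = copy(w)                              # snapshot weights w̃
        μ = sum(∇ψ(w_prev, i) for i in 1:n) / n       # full gradient at the snapshot
        for t in 1:inner
            i = rand(1:n)                             # sample one example
            w -= η * (∇ψ(w, i) - ∇ψ(w_prev, i) + μ)   # variance-reduced step
        end
    end
    return w
end

# quick check on a noiseless synthetic problem
X = randn(200, 5); w_true = randn(5); y = X * w_true
w_svrg = svrg_ls(X, y)
# w_svrg should land close to X \ y (and to w_true); the step size may need
# tuning for other data.
```

On the nested-loop concern: one way to avoid it might be to keep `mu` and the snapshot weights as updater state and refresh them every `epoch_length` calls, so each call still looks like a single-gradient update from the framework's point of view. Just a thought; I have not checked how cleanly that maps onto the current ParameterUpdater interface.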
Related? https://arxiv.org/abs/1604.07070v2
This is an interesting stochastic optimizer with some nice theoretical guarantees for convex problems. Would be interesting to compare to the others we have implemented already.
https://papers.nips.cc/paper/4937-accelerating-stochastic-gradient-descent-using-predictive-variance-reduction.pdf