# Optimizers


Optimizers minimize an objective function (an op::Operation) with respect to one or more variables (also op::Operations). All optimizers inherit from the optimizer::Optimizer<T> class and define a minimize function that takes the objective along with a vector of variables to minimize with respect to.
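
As a rough sketch of that interface (not the library's verbatim header; the parameter types, pointer ownership, and return type here are assumptions), the base class can be pictured like this:

```cpp
#include <vector>

namespace op { template <typename T> class Operation; /* forward declaration */ }

namespace optimizer {

/* hypothetical sketch of the Optimizer<T> interface described above;
   the real declaration in the library may differ */
template <typename T>
class Optimizer {
public:
    virtual ~Optimizer() = default;

    /* take one optimization step on `objective` w.r.t. each
       operation in `variables` */
    virtual void minimize(op::Operation<T>* objective,
                          const std::vector<op::Operation<T>*>& variables) = 0;
};

} // namespace optimizer
```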

For example, consider the following code, which minimizes the expression x^2 + c w.r.t. x using Stochastic Gradient Descent.

```cpp
/* initialize our expression */
auto x = op::var<double> ("x", {1}, {CONSTANT, {5.0}}, mem);
auto c = op::var<double> ("c", {1}, {CONSTANT, {10.0}}, mem);
auto expr = op::add(op::pow(x, 2), c);

/* create our optimizer */
double learning_rate = 0.05;
optimizer::GradientDescent<double> optim (learning_rate);

unsigned int n_iter = 100;
for (unsigned int i = 0; i < n_iter; i++) {
    optim.minimize(expr, {x});
}
```

This should bring the value of x->eval() close to 0 and expr->eval() close to 10. How close it gets depends on the learning rate and the number of iterations. For a neural network, a learning rate around 0.01-0.1 is typically sufficient, and the number of training iterations is highly model-dependent.
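
To see why the example converges, note that the gradient of x^2 + c with respect to x is 2x, so each SGD step computes x <- x - learning_rate * 2x = (1 - 2 * learning_rate) * x. The standalone C++ snippet below (plain standard library, no graph API) replays this update by hand to confirm the numbers:

```cpp
#include <cstdio>

int main() {
    /* replay the SGD update for f(x) = x^2 + c by hand:
       grad f = 2x, so x <- x - lr * 2x = (1 - 2*lr) * x */
    double x = 5.0, c = 10.0, lr = 0.05;
    for (int i = 0; i < 100; ++i) {
        x -= lr * 2.0 * x;  /* same step the optimizer takes */
    }
    std::printf("x = %g, f(x) = %g\n", x, x * x + c);
    /* prints x ~ 1.3e-4 and f(x) ~ 10: each step scales x by 0.9,
       so after 100 steps x = 5 * 0.9^100 ~ 1.3e-4 */
    return 0;
}
```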