… There is 'native' support for the following activation functions. If you define …
### Training Methods
Once the MLP type is constructed, we train it using one of several provided training functions.

* `train(nn, trainx, valx, traint, valt)`: This training method relies on the external [Optim.jl](https://github.com/JuliaOpt/Optim.jl) package. By default it uses the `gradient_descent` algorithm; by setting the `train_method` parameter, `levenberg_marquardt`, `momentum_gradient_descent`, or `nelder_mead` can be selected instead. The function accepts two data sets: the training set (inputs and outputs given with `trainx` and `traint`) and the validation set (`valx`, `valt`). Input data must be a matrix with each data point occurring as a column. Optional parameters include (see the usage sketch after this list):
  * `maxiter` (default: 100): Maximum number of iterations before giving up.
  * `tol` (default: 1e-5): Convergence threshold. Does not affect `levenberg_marquardt`.
  * `ep_iter` (default: 5): Performance is evaluated on the validation set every `ep_iter` iterations. A smaller value gives slightly better convergence, but each iteration takes slightly longer.
  * `verbose` (default: true): Whether to print information about the training state of the network.
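A minimal usage sketch follows. Only `train` and its keyword parameters come from the list above; the `MLP` constructor signature, the module name, the form of the `train_method` value, and the data are hypothetical placeholders.

```julia
# Hypothetical sketch: constructor, module name, and data are placeholders;
# `train` and its keywords are as documented in the list above.
using MLP

# Each data point is a column: 2 inputs, 1 output per sample.
trainx = rand(2, 100); traint = rand(1, 100)   # training set
valx   = rand(2, 20);  valt   = rand(1, 20)    # validation set

nn = MLP([2, 4, 1])   # assumed constructor: layer sizes 2 -> 4 -> 1

# The default algorithm is gradient_descent; another can be selected
# via train_method (shown here as a symbol; check the package for the
# exact expected form).
nn = train(nn, trainx, valx, traint, valt;
           train_method = :levenberg_marquardt,
           maxiter = 500, tol = 1e-6, verbose = false)
```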

* `gdmtrain(nn, x, t)`: This is a natively implemented gradient descent training algorithm with momentum. Returns `(N, L)`, where `N` is the trained network and `L` is the (optional) list of training losses over time. Optional parameters include (see the sketch below):
  * `batch_size` (default: `n`, the full data set): Size of the randomly selected subset of `x` used at each iteration when training extremely large data sets. Use this feature for 'stochastic' gradient descent.
  * `maxiter` (default: 1000): Maximum number of iterations before giving up.
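A corresponding sketch for the native trainer, reusing the hypothetical `MLP` constructor and data from the example above; `gdmtrain`, its return values, and the keywords shown come from this list.

```julia
# Hypothetical sketch: mini-batch gradient descent with momentum using
# the native trainer; MLP([...]) is a placeholder constructor as above.
nn = MLP([2, 4, 1])

# Train on random mini-batches of 32 columns of trainx, for up to
# 2000 iterations.
N, L = gdmtrain(nn, trainx, traint; batch_size = 32, maxiter = 2000)

# N is the trained network; L is the recorded training loss over time.
```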