diff --git a/README.md b/README.md
index 3511a97..64a036a 100644
--- a/README.md
+++ b/README.md
@@ -37,12 +37,6 @@ There is 'native' support for the following activation functions. If you define
 
 ### Training Methods
 Once the MLP type is constructed we train it using one of several provided training functions.
-* `train(nn, trainx, valx, traint, valt)`: This training method relies on calling the external [Optim.jl](https://github.com/JuliaOpt/Optim.jl) package. By default it uses the `gradient_descent` algorithm. However, by setting the `train_method` parameter, the following algorithms can also be selected: `levenberg_marquardt`, `momentum_gradient_descent`, or `nelder_mead`. The function accepts two data sets: the training data set (inputs and outputs given with `trainx` and `traint`) and the validation set (`valx`, `valt`). Input data must be a matrix with each data point occurring as a column of the matrix. Optional parameters include:
-  * `maxiter` (default: 100): Number of iterations before giving up.
-  * `tol` (default: 1e-5): Convergence threshold. Does not affect `levenberg_marquardt`.
-  * `ep_iterl` (default: 5): Performance is evaluated on the validation set every `ep_iterl` iterations. A smaller number gives slightly better convergence, but each iteration takes slightly longer.
-  * `verbose` (default: true): Whether or not to print out information on the training state of the network.
-
 * `gdmtrain(nn, x, t)`: This is a natively-implemented gradient descent training algorithm with momentum. Returns `(N, L)`, where `N` is the trained network and `L` is the (optional) list of training losses over time. Optional parameters include:
   * `batch_size` (default: n): Randomly selected subset of `x` to use when training extremely large data sets. Use this feature for 'stochastic' gradient descent.
   * `maxiter` (default: 1000): Number of iterations before giving up.
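
For reference, a minimal usage sketch of the `gdmtrain` path that this hunk keeps. It assumes the package defining `gdmtrain` is already loaded, that `nn` is an MLP constructed as described earlier in the README (the constructor is not shown in this hunk), and that the optional parameters listed above are accepted as Julia keyword arguments; none of this is confirmed by the diff itself.

```julia
# Toy regression data; per the README, each data point is a column of the matrix.
x = rand(2, 100)            # 100 two-dimensional inputs
t = sum(x, dims = 1)        # 1×100 targets, one column per input point

# `nn` is assumed to be an already-constructed MLP (constructor not shown here).
# Keyword names follow the README bullets; treating them as keyword arguments is
# an assumption. Returns the trained network and, optionally, the loss history.
nn_trained, losses = gdmtrain(nn, x, t; batch_size = 10, maxiter = 1000)
```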