Installation (note that the package name is precondition, but the PyPI distribution name is precondition-opt):
pip3 install -U precondition-opt
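Since the importable package name (precondition) differs from the PyPI distribution name (precondition-opt), a quick post-install sanity check is:
python3 -c "import precondition"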
This repository currently contains several preconditioning optimizer implementations; please refer to the citations below.
Shampoo (distributed_shampoo.py)
@article{anil2020scalable,
title={Scalable second order optimization for deep learning},
author={Anil, Rohan and Gupta, Vineet and Koren, Tomer and Regan, Kevin and Singer, Yoram},
journal={arXiv preprint arXiv:2002.09018},
year={2020}
}
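As a quick orientation, the sketch below shows how an optax-style Shampoo optimizer is typically wired into a training step. The import path and the distributed_shampoo(learning_rate, block_size, ...) factory signature are assumptions based on the scalable Shampoo reference implementation; consult distributed_shampoo.py for the authoritative API.

# Hedged sketch, not the authoritative API: assumes distributed_shampoo.py exposes
# an optax-style factory distributed_shampoo(learning_rate, block_size, ...)
# returning an optax.GradientTransformation.
import jax
import jax.numpy as jnp
import optax
from precondition import distributed_shampoo  # import path assumed

params = {'w': jnp.ones((4, 4))}
opt = distributed_shampoo.distributed_shampoo(learning_rate=0.1, block_size=128)
opt_state = opt.init(params)

grads = jax.grad(lambda p: jnp.sum(p['w'] ** 2))(params)
updates, opt_state = opt.update(grads, opt_state, params)
params = optax.apply_updates(params, updates)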
Sketchy (distributed_shampoo.py), whose logical reference implementation is provided as a branch within the Shampoo implementation.
@article{feinberg2023sketchy,
title={Sketchy: Memory-efficient Adaptive Regularization with Frequent Directions},
author={Feinberg, Vladimir and Chen, Xinyi and Sun, Y Jennifer and Anil, Rohan and Hazan, Elad},
journal={arXiv preprint arXiv:2302.03764},
year={2023}
}
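The "Frequent Directions" in the title refers to the deterministic matrix-sketching routine Sketchy uses to keep a low-rank approximation of the gradient second-moment statistics. The snippet below is only a textbook Frequent Directions update in plain NumPy, shown for intuition; it is not the repository's implementation, and the function name and rank parameter are chosen purely for exposition.

import numpy as np

def frequent_directions(gradients, ell):
    # Illustrative only: maintain an ell x d sketch B such that B.T @ B
    # approximates sum_t g_t g_t^T using far less memory than the full d x d matrix.
    d = gradients.shape[1]
    B = np.zeros((ell, d))
    for g in gradients:
        zero_rows = np.flatnonzero(~B.any(axis=1))
        if zero_rows.size == 0:
            # No empty row left: shrink the spectrum so at least one row becomes zero.
            _, s, vt = np.linalg.svd(B, full_matrices=False)
            s = np.sqrt(np.maximum(s ** 2 - s[-1] ** 2, 0.0))
            B = s[:, None] * vt
            zero_rows = np.flatnonzero(~B.any(axis=1))
        B[zero_rows[0]] = g
    return B

# Example: an 8 x 50 sketch summarizing 1000 gradients in 50 dimensions.
B = frequent_directions(np.random.randn(1000, 50), ell=8)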
In Appendix A of the aforementioned paper, S-Adagrad is tested along with other optimization algorithms (including RFD-SON, Adagrad, OGD, Ada-FD, FD-SON) on three benchmark datasets (a9a, cifar10, gisette_scale). To recreate the results in Appendix A, first download the benchmark datasets (available at https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/) to a local folder (e.g. ~/data). For each DATASET in {'a9a', 'cifar10', 'gisette'}, run the following two commands: the first runs a sweep over all the hyperparameter sets, and the second plots a graph for the best set of hyperparameters found by that sweep. Consistent with a fair allocation of the hyperparameter tuning budget, for FD-SON and Ada-FD we tune delta and the learning rate, each over 7 points uniformly spaced from 1e-6 to 1 on a logarithmic scale. For the remaining algorithms, where delta is taken to be 0, we tune the learning rate over 49 points uniformly spaced from 1e-6 to 1 on a logarithmic scale; these grids are sketched below, followed by the commands themselves.
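For concreteness, the hyperparameter grids described above can be generated as follows (illustrative only; in practice each value is passed explicitly via repeated --lr / --delta flags, as in the commands below):

import numpy as np

# 7 log-uniformly spaced points from 1e-6 to 1 (lr and delta grids for FD-SON and Ada-FD).
grid_7 = np.logspace(-6, 0, num=7)    # 1e-6, 1e-5, ..., 1e-1, 1
# 49 log-uniformly spaced points from 1e-6 to 1 (lr grid for the delta-free algorithms).
grid_49 = np.logspace(-6, 0, num=49)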
To run the sweep over the different sets of hyperparameters (where ... hides the remaining values of the swept hyperparameters), run:
python3 oco/sweep.py --data_dir='~/data' --dataset=DATASET --save_dir='/tmp/results' \
--lr=1 ... --lr=1e-6 --alg=ADA --alg=OGD --alg=S_ADA --alg=RFD_SON
and
python3 oco/sweep.py --data_dir='~/data' --dataset=DATASET --save_dir='/tmp/results' --lr=1 ... --lr=1e-6 \
--delta=1 ... --delta=1e-6 --alg=ADA_FD --alg=FD_SON
After running the two commands above, the results are saved in folders named with the timestamp at execution time (e.g. '/tmp/results/YYYY-MM-DD' for the first command and '/tmp/results/yyyy-mm-dd' for the second).
To plot the graph for the best sets of hyperparameters found in the previous runs, run:
python3 oco/sweep.py --data_dir='~/data' --dataset=DATASET --save_dir='/tmp/results' \
--use_best_from='/tmp/results/YYYY-MM-DD' --use_best_from='/tmp/results/yyyy-mm-dd'
For detailed documentation of the supported flags, run:
python3 oco/sweep.py --help
SM3 (sm3.py).
@article{anil2019memory,
title={Memory efficient adaptive optimization},
author={Anil, Rohan and Gupta, Vineet and Koren, Tomer and Singer, Yoram},
journal={Advances in Neural Information Processing Systems},
volume={32},
year={2019}
}
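As with Shampoo, the sketch below only illustrates the typical optax-style usage pattern; the sm3 factory name, its arguments, and the import path are assumptions (optax ships a similar optax.sm3 transform), so consult sm3.py for the actual interface.

# Hedged sketch, not the authoritative API: assumes sm3.py exposes an
# optax-style factory sm3(learning_rate, ...) returning a GradientTransformation.
import jax
import jax.numpy as jnp
import optax
from precondition import sm3  # import path assumed

params = {'w': jnp.zeros((8, 8))}
opt = sm3.sm3(learning_rate=0.1)
opt_state = opt.init(params)

grads = jax.grad(lambda p: jnp.sum((p['w'] - 1.0) ** 2))(params)
updates, opt_state = opt.update(grads, opt_state, params)
params = optax.apply_updates(params, updates)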
This external repository was seeded from existing open-source work available in the google-research repository.
This is not an officially supported Google product.