hep_ml provides machine learning tools tailored to high energy physics. It is written in python.
- uniform classifiers - classifiers whose predictions have low correlation with mass (or some other variable(s))
  - optimized implementation of uBoost
  - UGradientBoosting (with different losses; FlatnessLoss is especially interesting)
  - measures of uniformity (see hep_ml.metrics)
- advanced losses for classification, regression and ranking for UGradientBoosting (see hep_ml.losses).
- hep_ml.nnet - theano-based flexible neural networks
- hep_ml.reweight - reweighting multidimensional distributions
  (multi here means 2, 3, 5 and more dimensions - see GBReweighter!)
- hep_ml.splot - minimalistic sPlot-ting
- hep_ml.speedup - building models for fast classification (Bonsai BDT)
- sklearn-compatibility of estimators.
Basic installation:
pip install hep_ml
If you're new to python and have never used pip, first install scikit-learn with these instructions.
To use the latest development version, clone it and install with pip:
git clone https://github.com/arogozhnikov/hep_ml.git
cd hep_ml
sudo pip install .
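After either installation route, a quick way to check what ended up in the environment is to query the installed package metadata. This is a minimal sketch using only the python standard library; the helper name `installed_version` is hypothetical and not part of hep_ml.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("hep_ml")
if version is None:
    print("hep_ml is not installed in this environment")
else:
    print(f"hep_ml {version} is installed")
```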
- documentation
- notebooks, code examples
- repository
- issue tracker (discussion is also there)
- if you have a question, please open a ticket in this repository
Libraries you'll need to make your life easier:
- IPython Notebook — web-shell for python
- scikit-learn — general-purpose library for machine learning in python
- yandex/REP — python wrappers around different machine learning libraries (including TMVA) + goodies, required to plot learning curves and reports after classification. Required to execute howtos from this repository
- numpy — 'MATLAB in python', vector operations in python. Use it whenever you need to do any number crunching.
- theano — optimized vector analytical math engine in python
- ROOT — main data format in high energy physics
- root_numpy — python library to deal with ROOT files (without pain)
License: Apache 2.0. The library is open-source.
Linux, Mac OS X and Windows are supported.