Transpile trained scikit-learn estimators to C, Java, JavaScript and others.
It's recommended for resource-constrained embedded systems and critical applications where performance matters most.
We are working hard on the first major release of sklearn-porter.
Until then, only bug fixes will be released for the stable version.
| Estimator | Programming language | |||||
| Classifier | Java * | JS | C | Go | PHP | Ruby |
| svm.SVC | ✓, ✓ ᴵ | ✓ | ✓ | ✓ | ✓ | |
| svm.NuSVC | ✓, ✓ ᴵ | ✓ | ✓ | ✓ | ✓ | |
| svm.LinearSVC | ✓, ✓ ᴵ | ✓ | ✓ | ✓ | ✓ | ✓ |
| tree.DecisionTreeClassifier | ✓, ✓ ᴱ, ✓ ᴵ | ✓, ✓ ᴱ | ✓, ✓ ᴱ | ✓, ✓ ᴱ | ✓, ✓ ᴱ | ✓, ✓ ᴱ |
| ensemble.RandomForestClassifier | ✓ ᴱ, ✓ ᴵ | ✓ ᴱ | ✓ ᴱ | ✓ ᴱ | ✓ ᴱ | ✓ ᴱ |
| ensemble.ExtraTreesClassifier | ✓ ᴱ, ✓ ᴵ | ✓ ᴱ | ✓ ᴱ | ✓ ᴱ | ✓ ᴱ | |
| ensemble.AdaBoostClassifier | ✓ ᴱ, ✓ ᴵ | ✓ ᴱ, ✓ ᴵ | ✓ ᴱ | |||
| neighbors.KNeighborsClassifier | ✓, ✓ ᴵ | ✓, ✓ ᴵ | ||||
| naive_bayes.GaussianNB | ✓, ✓ ᴵ | ✓ | ||||
| naive_bayes.BernoulliNB | ✓, ✓ ᴵ | ✓ | ||||
| neural_network.MLPClassifier | ✓, ✓ ᴵ | ✓, ✓ ᴵ | ||||
| Regressor | Java * | JS | C | Go | PHP | Ruby |
| neural_network.MLPRegressor | ✓ | |||||
✓ = is full-featured, ᴱ = with embedded model data, ᴵ = with imported model data, * = default language
$ pip install sklearn-porter
If you want the latest changes, you can install this package from the master branch:
$ pip uninstall -y sklearn-porter
$ pip install --no-cache-dir https://github.com/nok/sklearn-porter/zipball/master
The following example demonstrates how you can transpile a decision tree estimator to Java:
from sklearn.datasets import load_iris
from sklearn import tree
from sklearn_porter import Porter
# Load data and train the classifier:
samples = load_iris()
X, y = samples.data, samples.target
clf = tree.DecisionTreeClassifier()
clf.fit(X, y)
# Export:
porter = Porter(clf, language='java')
output = porter.export(embed_data=True)
print(output)
The exported result matches the official human-readable version of the decision tree.
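To illustrate the general idea behind transpiling a tree, here is a minimal sketch that renders a tiny, hand-written decision tree as nested if/else source code. It is not sklearn-porter's implementation. The parallel lists mirror the `children_left`, `children_right`, `feature` and `threshold` arrays of scikit-learn's `tree_` attribute, where `-1` marks a leaf and samples with `features[i] <= threshold` go to the left child. The `classes` list and the function name `tree_to_source` are assumptions made for this example; in scikit-learn the predicted class would be derived from `tree_.value`.

```python
def tree_to_source(children_left, children_right, feature, threshold, classes,
                   node=0, depth=1):
    """Recursively render a decision tree as nested if/else Python source lines."""
    indent = "    " * depth
    if children_left[node] == -1:  # -1 marks a leaf node
        return ["%sreturn %d" % (indent, classes[node])]
    # Inner node: samples with feature value <= threshold go to the left child.
    lines = ["%sif features[%d] <= %r:" % (indent, feature[node], threshold[node])]
    lines += tree_to_source(children_left, children_right, feature, threshold,
                            classes, children_left[node], depth + 1)
    lines.append("%selse:" % indent)
    lines += tree_to_source(children_left, children_right, feature, threshold,
                            classes, children_right[node], depth + 1)
    return lines


# Hand-written toy tree with a single split on feature 2 (illustrative values).
children_left = [1, -1, -1]
children_right = [2, -1, -1]
feature = [2, -2, -2]
threshold = [2.45, -2.0, -2.0]
classes = [0, 0, 1]  # predicted class per node; only the leaf entries are used

source = "\n".join(["def predict(features):"]
                   + tree_to_source(children_left, children_right,
                                    feature, threshold, classes))
print(source)
```

The generated text is plain source code with the model parameters embedded as literals, which is the same principle as exporting an estimator with embedded model data.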
You should always verify that the transpiled estimator produces the same predictions as the original by computing the integrity score:
# ...
porter = Porter(clf, language='java')
# Compute integrity score:
integrity = porter.integrity_score(X)
print(integrity)  # 1.0
You can compute the prediction(s) in the target programming language:
# ...
porter = Porter(clf, language='java')
# Prediction(s):
Y_java = porter.predict(X)
y_java = porter.predict(X[0])
y_java = porter.predict([1., 2., 3., 4.])
You can run and test all notebooks by starting a Jupyter notebook server locally:
$ make open.examples
$ make stop.examples
In general, you can use the porter on the command line:
$ porter <pickle_file> [--to <directory>]
[--class_name <class_name>] [--method_name <method_name>]
[--export] [--checksum] [--data] [--pipe]
[--c] [--java] [--js] [--go] [--php] [--ruby]
[--version] [--help]
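If you want to script the command line tool from Python, a small wrapper can assemble the arguments. The following is a minimal sketch: the helper `build_porter_command` is hypothetical and not part of sklearn-porter, and it only uses the `--to`, `--pipe` and language flags listed in the synopsis above.

```python
from typing import List, Optional

# Language flags taken from the usage synopsis above.
LANGUAGE_FLAGS = ("c", "java", "js", "go", "php", "ruby")


def build_porter_command(pickle_file: str, language: str = "java",
                         to: Optional[str] = None,
                         pipe: bool = False) -> List[str]:
    """Assemble an argument list for the `porter` command (hypothetical helper)."""
    if language not in LANGUAGE_FLAGS:
        raise ValueError("unsupported language: %r" % language)
    command = ["porter", pickle_file]
    if to is not None:
        command += ["--to", to]  # destination directory
    command.append("--" + language)
    if pipe:
        command.append("--pipe")  # pass the result on instead of only saving it
    return command


print(" ".join(build_porter_command("estimator.pkl", "js", pipe=True)))
```

The returned list can be handed to `subprocess.run(command, check=True)` once the `porter` executable is installed and the pickle file exists.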
The following example shows how you can save a trained estimator to the pickle format:
# ...
import joblib

# Save the trained estimator:
joblib.dump(clf, 'estimator.pkl', compress=0)
After that the estimator can be transpiled to JavaScript by using the following command:
$ porter estimator.pkl --js
The target programming language is changeable on the fly:
$ porter estimator.pkl --c
$ porter estimator.pkl --java
$ porter estimator.pkl --php
$ porter estimator.pkl --go
$ porter estimator.pkl --ruby
For further processing the argument --pipe can be used to pass the result:
$ porter estimator.pkl --js --pipe > estimator.js
For instance the result can be minified by using UglifyJS:
$ porter estimator.pkl --js --pipe | uglifyjs --compress -o estimator.min.js
You have to install required modules for broader development:
$ make install.environment # conda environment (optional)
$ make install.requirements.development # pip requirements
Independently, the following compilers and interpreters are required to cover all tests:
| Name | Version | Command |
| GCC | >=4.2 | gcc --version |
| Java | >=1.6 | java -version |
| PHP | >=5.6 | php --version |
| Ruby | >=2.4.1 | ruby --version |
| Go | >=1.7.4 | go version |
| Node.js | >=6 | node --version |
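Before running the full test suite, it can help to check which of these tools are available. The sketch below is only an illustration: it looks up the executables from the table on the PATH with `shutil.which` and does not compare versions. The helper name `find_missing_tools` is made up for this example.

```python
import shutil

# Executables taken from the "Command" column of the table above.
REQUIRED_TOOLS = {
    "GCC": "gcc",
    "Java": "java",
    "PHP": "php",
    "Ruby": "ruby",
    "Go": "go",
    "Node.js": "node",
}


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the sorted names of tools whose executable is not on the PATH."""
    return sorted(name for name, exe in tools.items()
                  if shutil.which(exe) is None)


missing = find_missing_tools()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("All compilers and interpreters were found.")
```

To verify the minimum versions as well, run the commands from the table manually.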
The tests cover module functions as well as matching predictions of transpiled estimators. Start all tests with:
$ make test
The test files follow the naming pattern '[Algorithm][Language]Test.py', so you can run a subset of them:
$ pytest tests -v -o python_files='RandomForest*Test.py'
$ pytest tests -v -o python_files='*JavaTest.py'
While you are developing new features or fixes, you can reduce the test duration by changing the number of tests:
$ N_RANDOM_FEATURE_SETS=5 N_EXISTING_FEATURE_SETS=10 \
pytest tests -v -o python_files='*JavaTest.py'
It is highly recommended to check the code quality with Pylint. Start the linter with:
$ make lint
If you use this implementation in your work, please cite it. You can use the following BibTeX entry:
@unpublished{skpodamo,
author = {Darius Morawiec},
title = {sklearn-porter},
note = {Transpile trained scikit-learn estimators to C, Java, JavaScript and others},
url = {https://github.com/nok/sklearn-porter}
}
The module is Open Source Software released under the MIT license.
Don't be shy and feel free to contact me on Twitter or Gitter.