Releases: sdpython/onnx-extended
0.3.0
- #189: use onnxruntime==1.19.2 as default, pybind11 2.13.5, MatX 0.8.0
- #187: fix compilation with GCC>=13
- #185: adds custom operator MulMulSigmoid on CUDA
- #184: use onnxruntime==1.18.0 as default
- #181: adds MaskedScatterNDOfShape custom operator
- #175: adds custom operator MulSub and SubMul on CUDA
- #173: adds custom operator AddSharedInput, MulSharedInput on CUDA
- #170: adds custom operator TriMatrix on CUDA
- #169: adds custom operator ReplaceZero on CUDA
- #168: adds custom operator MulSigmoid on CUDA
- #167: adds custom operator Rotary on CUDA
- #166, #178: adds custom operators AddMul, MulAdd on CUDA
- #165: adds custom operators AddAddAdd, MulMulMul on CUDA
- #163: use onnxruntime==1.17.3 as default
- #162: add ScatterNDOfShape implementation on CUDA without atomics
- #159: add AddAdd custom operator on CUDA
- #158: add MulMul custom operator on CUDA
- #157: add ScatterNDOfShape custom operator
- #155: add a function to draw a timeline from a profile
- #154: improves plotting legend for profiling
- #151: refactoring of TreeEnsemble code to make them faster
- #129, #132: support sparse features for TreeEnsemble
0.2.4
0.2.3
- #99: use onnxruntime==1.16.1 as default
- #96: implements a function to convert a ModelProto into a string (not bytes), add a function to multiply the number of trees in a TreeEnsemble
- #75: add an implementation of murmurhash3 to validate some options
- #93: validates the wheels in CI
- #89: add a function to merge models and update them if both have different opsets
0.2.2
0.2.1
- #79: update to onnxruntime v1.16.0
- #77: helpers to benchmark a model
- #74: add a function to enumerate all intermediate results with onnxruntime
- #71, #72, #73: add function to analyse a profile produced by onnxruntime
- #68, #69, #70: add CPU implementation for CustomGemmFloat8
- #67: add a function to extract a subgraph of a model
- #59, #60, #61, #62, #63, #65, #66, #68, #69, #70: add local functions to quantize into float 8, float 16
- #57: add C implementation for DynamicQuantizeLinear (for experimentation)
- #56: add C implementation to cast a float into float 8
- #55, #58: add basic functionality to transform a graph, starts with basic quantization
- #51: fix optimized TreeEnsembleRegressor and add TreeEnsembleClassifier as custom ops
- #50: add command line store to store intermediate outputs
- #49: add option to save intermediate results in CReferenceEvaluator
- #45: add option cuda-link to setup.py to specify how to link with CUDA library
- #41: implements a custom kernel for RandomForestRegressor easier to optimize
- #34: update to onnxruntime v1.15.1
- #31: implement a custom CUDA kernel (gemm)
- #32: update to onnxruntime v1.15.0
- #27: add a custom kernel with parameters to onnxruntime
- #26: add a custom kernel to onnxruntime
- #24: use Eigen to implement Conv operator
- #23: make pip wheel . work
- #22: rename cmake into _cmake to avoid warnings related to cmake package
- #19: minimal settings to use onnxruntime
- #14: minimal setting to use CUDA
- #8: support for C++ unit test
0.1.0
- #18: adds action for the documentation (co-authored by xavier dupré)