Significant discrepancies between `HistGradientBoostingRegressor` scikit and ONNX output #1192

When exporting pipelines that contain `HistGradientBoostingRegressor` with skl2onnx, we see significant discrepancies between the scikit-learn and ONNX outputs (double-digit relative error on some inputs), even when we use `CastTransformer` to cast the regressor's inputs to float32. For reference, our pipeline looks something like this:
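```py
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import HistGradientBoostingRegressor
from sklearn.pipeline import Pipeline
from skl2onnx.sklapi import CastTransformer

# TargetEncoder, categorical_features and numeric_features are defined elsewhere.
Pipeline([
    ("preprocessor", ColumnTransformer([("categorical", TargetEncoder(), categorical_features), ("numeric", "passthrough", numeric_features)])),
    ("cast", CastTransformer(dtype=np.float32)),
    ("regressor", HistGradientBoostingRegressor())
])
```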
My suspicion is that the bucket boundaries and decision thresholds produced by histogram-based gradient boosting need the same float64->float32 threshold adjustment that the regular `GradientBoostingRegressor` gets, even when the inputs are already float32.
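
To illustrate the kind of mismatch I suspect, here is a minimal standalone sketch. It is not skl2onnx's actual code. It assumes a `x <= threshold` split rule, a threshold learned in float64, and an exported model that stores the threshold rounded to float32. The threshold value `0.1` and the helpers `to_float32` and `float32_at_most` are made up for the example.

```python
import struct


def to_float32(value: float) -> float:
    """Round a float64 to the nearest float32 (returned as a Python float)."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def float32_at_most(value: float) -> float:
    """Largest float32 that is <= value, i.e. a threshold rounded downwards."""
    rounded = to_float32(value)
    if rounded <= value:
        return rounded
    # Rounding went up, so step one float32 towards -infinity via the bit pattern.
    bits = struct.unpack("<I", struct.pack("<f", rounded))[0]
    if rounded > 0:
        bits -= 1
    elif rounded < 0:
        bits += 1
    else:
        bits = 0x80000001  # below zero: smallest-magnitude negative float32
    return struct.unpack("<f", struct.pack("<I", bits))[0]


threshold64 = 0.1        # hypothetical split threshold learned in float64
x = to_float32(0.1)      # input feature after the float32 cast

# Reference behaviour: float32 input compared against the float64 threshold.
goes_left_f64 = x <= threshold64

# Naive export: the threshold is rounded to the nearest float32 first.
goes_left_f32 = x <= to_float32(threshold64)

# Adjusted export: the threshold is rounded down to a float32 instead.
goes_left_adjusted = x <= float32_at_most(threshold64)

print(goes_left_f64, goes_left_f32, goes_left_adjusted)  # prints: False True False
```

With nearest rounding the sample takes the other branch of the split, so it can land in a different leaf. Rounding the threshold down restores the float64 decision for every float32 input, which is the kind of adjustment I would expect to be needed here as well.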
