### What changes were proposed in this pull request?
Adds
```python
class _IsotonicRegressionBase(HasFeaturesCol, HasLabelCol, HasPredictionCol, HasWeightCol): ...
```
with related `Params`, and uses it to replace `JavaPredictor` and `HasWeightCol` in the base classes of `IsotonicRegression`, and `JavaPredictionModel` in the base classes of `IsotonicRegressionModel`.
### Why are the changes needed?
Previous work (apache#25776) on [SPARK-28985](https://issues.apache.org/jira/browse/SPARK-28985) replaced `JavaEstimator`, `HasFeaturesCol`, `HasLabelCol`, `HasPredictionCol` in `IsotonicRegression` and `JavaModel` in `IsotonicRegressionModel` with newly added `JavaPredictor`:
https://github.com/apache/spark/blob/e97b55d32285052a1f76cca35377c4b21eb2e7d7/python/pyspark/ml/wrapper.py#L377
and `JavaPredictionModel`
https://github.com/apache/spark/blob/e97b55d32285052a1f76cca35377c4b21eb2e7d7/python/pyspark/ml/wrapper.py#L405
respectively.
This, however, is inconsistent with the Scala counterpart, where both classes extend the private `IsotonicRegressionBase`:
https://github.com/apache/spark/blob/3cb1b57809d0b4a93223669f5c10cea8fc53eff6/mllib/src/main/scala/org/apache/spark/ml/regression/IsotonicRegression.scala#L42-L43
The previous change preserves some of the existing inconsistencies (with `model` defined as in [the official example](https://github.com/apache/spark/blob/master/examples/src/main/python/ml/isotonic_regression_example.py)), i.e.
```python
from pyspark.ml.regression import IsotonicRegressionModel
from pyspark.ml.param.shared import HasWeightCol
issubclass(IsotonicRegressionModel, HasWeightCol)
# False
hasattr(model, "weightCol")
# True
```
and also introduces a bug by adding an unsupported `predict` method:
```python
import inspect
hasattr(model, "predict")
# True
inspect.getfullargspec(IsotonicRegressionModel.predict)
# FullArgSpec(args=['self', 'value'], varargs=None, varkw=None, defaults=None, kwonlyargs=[], kwonlydefaults=None, annotations={})
IsotonicRegressionModel.predict.__doc__
# 'Predict label for the given features.\n\n .. versionadded:: 3.0.0'
model.predict(dataset.first().features)
# Py4JError: An error occurred while calling o49.predict. Trace:
# py4j.Py4JException: Method predict([class org.apache.spark.ml.linalg.SparseVector]) does not exist
# ...
```
Furthermore, the existing implementation can cause further problems in the future if the `Predictor` / `PredictionModel` API changes.
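The failure mode can be reproduced without Spark in a minimal sketch. `FakeJavaModel` and `PredictionModelSketch` are hypothetical stand-ins, not pyspark classes; the sketch assumes only that the wrapper forwards `predict` to a backend object that does not implement it. In the real case the call fails with a `Py4JError`, while here it surfaces as an `AttributeError`.

```python
class FakeJavaModel:
    """Hypothetical stand-in for the JVM-side model: supports transform only."""

    def transform(self, value):
        return value


class PredictionModelSketch:
    """Hypothetical stand-in for a generic prediction-model wrapper."""

    def __init__(self, java_obj):
        self._java_obj = java_obj

    def predict(self, value):
        # Forwards blindly to the backend; nothing verifies up front that
        # the backend actually implements predict.
        return self._java_obj.predict(value)


model = PredictionModelSketch(FakeJavaModel())

# The inherited method is visible on the Python side ...
exposed = hasattr(model, "predict")

# ... but calling it fails, because the backend has no such method.
try:
    model.predict([1.0])
    call_failed = False
except AttributeError:
    call_failed = True
```

The method passes an `hasattr` check and carries a docstring, so the mismatch only shows up at call time, which is why dropping the generic base class is the safer option.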
### Does this PR introduce any user-facing change?
Yes. It:
- Removes invalid `IsotonicRegressionModel.predict` method.
- Adds `HasWeightCol` to `IsotonicRegressionModel`.
However, the faulty implementation hasn't been released yet, and the proposed additions have negligible potential for breaking existing code (and none compared to the changes already made in apache#25776).
### How was this patch tested?
- Existing unit tests.
- Manual testing.
CC huaxingao, zhengruifeng
Closes apache#26023 from zero323/SPARK-28985-FOLLOW-UP-isotonic-regression.
Authored-by: zero323 <[email protected]>
Signed-off-by: Sean Owen <[email protected]>