Skip to content

Commit 8556710

Browse files
zero323srowen
authored andcommitted
[SPARK-28985][PYTHON][ML][FOLLOW-UP] Add _IsotonicRegressionBase
### What changes were proposed in this pull request? Adds ```python class _IsotonicRegressionBase(HasFeaturesCol, HasLabelCol, HasPredictionCol, HasWeightCol): ... ``` with related `Params` and uses it to replace `JavaPredictor` and `HasWeightCol` in `IsotonicRegression` base classes and `JavaPredictionModel,` in `IsotonicRegressionModel` base classes. ### Why are the changes needed? Previous work (apache#25776) on [SPARK-28985](https://issues.apache.org/jira/browse/SPARK-28985) replaced `JavaEstimator`, `HasFeaturesCol`, `HasLabelCol`, `HasPredictionCol` in `IsotonicRegression` and `JavaModel` in `IsotonicRegressionModel` with newly added `JavaPredictor`: https://github.com/apache/spark/blob/e97b55d32285052a1f76cca35377c4b21eb2e7d7/python/pyspark/ml/wrapper.py#L377 and `JavaPredictionModel` https://github.com/apache/spark/blob/e97b55d32285052a1f76cca35377c4b21eb2e7d7/python/pyspark/ml/wrapper.py#L405 respectively. This however is inconsistent with Scala counterpart where both classes extend private `IsotonicRegressionBase` https://github.com/apache/spark/blob/3cb1b57809d0b4a93223669f5c10cea8fc53eff6/mllib/src/main/scala/org/apache/spark/ml/regression/IsotonicRegression.scala#L42-L43 This preserves some of the existing inconsistencies (`model` as defined in [the official example](https://github.com/apache/spark/blob/master/examples/src/main/python/ml/isotonic_regression_example.py)), i.e. ```python from pyspark.ml.regression impor IsotonicRegressionMode from pyspark.ml.param.shared import HasWeightCol issubclass(IsotonicRegressionModel, HasWeightCol) # False hasattr(model, "weightCol") # True ``` as well as introduces a bug, by adding unsupported `predict` method: ```python import inspect hasattr(model, "predict") # True inspect.getfullargspec(IsotonicRegressionModel.predict) # FullArgSpec(args=['self', 'value'], varargs=None, varkw=None, defaults=None, kwonlyargs=[], kwonlydefaults=None, annotations={}) IsotonicRegressionModel.predict.__doc__ # Predict label for the given features.\n\n .. versionadded:: 3.0.0' model.predict(dataset.first().features) # Py4JError: An error occurred while calling o49.predict. Trace: # py4j.Py4JException: Method predict([class org.apache.spark.ml.linalg.SparseVector]) does not exist # ... ``` Furthermore existing implementation can cause further problems in the future, if `Predictor` / `PredictionModel` API changes. ### Does this PR introduce any user-facing change? Yes. It: - Removes invalid `IsotonicRegressionModel.predict` method. - Adds `HasWeightColumn` to `IsotonicRegressionModel`. however the faulty implementation hasn't been released yet, and proposed additions have negligible potential for breaking existing code (and none, compared to changes already made in apache#25776). ### How was this patch tested? - Existing unit tests. - Manual testing. CC huaxingao, zhengruifeng Closes apache#26023 from zero323/SPARK-28985-FOLLOW-UP-isotonic-regression. Authored-by: zero323 <[email protected]> Signed-off-by: Sean Owen <[email protected]>
1 parent df22535 commit 8556710

File tree

1 file changed

+33
-24
lines changed

1 file changed

+33
-24
lines changed

Diff for: python/pyspark/ml/regression.py

+33-24
Original file line numberDiff line numberDiff line change
@@ -465,8 +465,38 @@ def totalIterations(self):
465465
return self._call_java("totalIterations")
466466

467467

468+
class _IsotonicRegressionBase(HasFeaturesCol, HasLabelCol, HasPredictionCol, HasWeightCol):
469+
"""
470+
Params for :py:class:`IsotonicRegression` and :py:class:`IsotonicRegressionModel`.
471+
472+
.. versionadded:: 3.0.0
473+
"""
474+
475+
isotonic = Param(
476+
Params._dummy(), "isotonic",
477+
"whether the output sequence should be isotonic/increasing (true) or" +
478+
"antitonic/decreasing (false).", typeConverter=TypeConverters.toBoolean)
479+
featureIndex = Param(
480+
Params._dummy(), "featureIndex",
481+
"The index of the feature if featuresCol is a vector column, no effect otherwise.",
482+
typeConverter=TypeConverters.toInt)
483+
484+
def getIsotonic(self):
485+
"""
486+
Gets the value of isotonic or its default value.
487+
"""
488+
return self.getOrDefault(self.isotonic)
489+
490+
def getFeatureIndex(self):
491+
"""
492+
Gets the value of featureIndex or its default value.
493+
"""
494+
return self.getOrDefault(self.featureIndex)
495+
496+
468497
@inherit_doc
469-
class IsotonicRegression(JavaPredictor, HasWeightCol, JavaMLWritable, JavaMLReadable):
498+
class IsotonicRegression(JavaEstimator, _IsotonicRegressionBase, HasWeightCol,
499+
JavaMLWritable, JavaMLReadable):
470500
"""
471501
Currently implemented using parallelized pool adjacent violators algorithm.
472502
Only univariate (single feature) algorithm supported.
@@ -503,16 +533,6 @@ class IsotonicRegression(JavaPredictor, HasWeightCol, JavaMLWritable, JavaMLRead
503533
504534
.. versionadded:: 1.6.0
505535
"""
506-
507-
isotonic = \
508-
Param(Params._dummy(), "isotonic",
509-
"whether the output sequence should be isotonic/increasing (true) or" +
510-
"antitonic/decreasing (false).", typeConverter=TypeConverters.toBoolean)
511-
featureIndex = \
512-
Param(Params._dummy(), "featureIndex",
513-
"The index of the feature if featuresCol is a vector column, no effect otherwise.",
514-
typeConverter=TypeConverters.toInt)
515-
516536
@keyword_only
517537
def __init__(self, featuresCol="features", labelCol="label", predictionCol="prediction",
518538
weightCol=None, isotonic=True, featureIndex=0):
@@ -547,26 +567,15 @@ def setIsotonic(self, value):
547567
"""
548568
return self._set(isotonic=value)
549569

550-
def getIsotonic(self):
551-
"""
552-
Gets the value of isotonic or its default value.
553-
"""
554-
return self.getOrDefault(self.isotonic)
555-
556570
def setFeatureIndex(self, value):
557571
"""
558572
Sets the value of :py:attr:`featureIndex`.
559573
"""
560574
return self._set(featureIndex=value)
561575

562-
def getFeatureIndex(self):
563-
"""
564-
Gets the value of featureIndex or its default value.
565-
"""
566-
return self.getOrDefault(self.featureIndex)
567-
568576

569-
class IsotonicRegressionModel(JavaPredictionModel, JavaMLWritable, JavaMLReadable):
577+
class IsotonicRegressionModel(JavaModel, _IsotonicRegressionBase,
578+
JavaMLWritable, JavaMLReadable):
570579
"""
571580
Model fitted by :class:`IsotonicRegression`.
572581

0 commit comments

Comments
 (0)