Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces the SkproAdapter class, enabling FunctionalCPD to integrate with external skpro-like estimators. The changes include refactoring the FunctionalCPD fitting logic to support skpro tags and adding corresponding unit tests. Review feedback highlights that both SkproAdapter and FunctionalCPD are missing sample methods necessary for model simulation and inference. Additionally, FunctionalCPD needs to initialize the variables attribute to ensure compatibility with the BaseFactor class requirements.
| @@ -0,0 +1,22 @@ | |||
| class SkproAdapter: | |||
There was a problem hiding this comment.
The SkproAdapter class is missing a sample method. In pgmpy, CPDs are expected to provide a way to generate samples (e.g., for use in model.simulate()). Since skpro models are probabilistic estimators, the adapter should implement a sample method that delegates to the underlying model's prediction or sampling API to ensure the fitted model is actually usable for inference.
| def __init__(self, variable, tag, estimator=None): | ||
| self.variable = variable | ||
| self.tag = tag | ||
| self.estimator = estimator |
There was a problem hiding this comment.
The FunctionalCPD class inherits from BaseFactor, which requires a variables attribute to function correctly within pgmpy models. This attribute is currently not initialized in the constructor. Additionally, since the parents argument was removed from the constructor in this refactor, you should ensure self.variables is correctly updated during the fit process (e.g., self.variables = [self.variable] + self.parents_) to maintain compatibility with methods like check_model().
| def __init__(self, variable, tag, estimator=None): | |
| self.variable = variable | |
| self.tag = tag | |
| self.estimator = estimator | |
| def __init__(self, variable, tag, estimator=None): | |
| self.variable = variable | |
| self.variables = [variable] | |
| self.tag = tag | |
| self.estimator = estimator |
| def _fit_external_ml(self): | ||
| if self.estimator is None: | ||
| raise ValueError("For skpro tag, `estimator` must be provided.") | ||
|
|
||
| self.fitted_cpd_ = SkproAdapter(variable=self.variable, model=self.estimator, parents=self.parents_).fit( | ||
| self.data_ | ||
| ) |
There was a problem hiding this comment.
The refactored FunctionalCPD class is missing a sample method. As this class acts as a wrapper for various fitted CPD implementations (such as TabularCPD, LinearGaussianCPD, and the new SkproAdapter), it must provide a unified sample interface that delegates the call to self.fitted_cpd_.sample(...). Without this, any model containing this CPD will fail during simulation or sampling-based inference.
Motivation
FunctionalCPDto use external (skpro-like) estimators when a user specifies askprotag and supplies a model object.FunctionalBayesianNetworkorchestration simple somodel.fit(data)delegates fitting to each CPD implementation.Description
_fit_external_ml()inFunctionalCPD_Refactorto instantiateSkproAdapterand run.fit(X, y)via the providedestimator, and allowedestimator=Noneby default on the CPD constructor.SkproAdapter(pgmpy/factors/hybrid/SkproAdapter.py) which wraps an external estimator and maps parent columns toXand the CPD variable toybefore callingmodel.fit(X, y).FunctionalCPD_Refactorrecognizesskproand tags starting withskpro.(e.g.tag=['skpro.BayesianLinearRegressor']).FunctionalBayesianNetwork_Refactorto import the refactor CPD class and leftfitorchestration unchanged so each CPD is responsible for fitting itself.DummySkproRegressorto verify the adapter receives the correctX/yshapes and thatmodel.fit(data)triggers the external fit flow.Testing
pytest -v pgmpy/tests/test_models/test_FunctionalBayesianNetwork_Refactor.pyand the suite passed (2 passed).pre-commit run --all-filesbut the environment lackspre-commit(command not found) so hooks were not executed here.Codex Task