Classification example? #89
Hi, we should definitely add more examples to the docs, including one with a binary response. In the meantime, you can check this notebook: https://github.com/Grupo-de-modelado-probabilista/BART/blob/master/experiments/space_influenza.ipynb
When you say "As far as I understand from the paper", which paper is that?
If you write the model like this:
with pm.Model() as model:
    x = pm.BART('x', X, Y)
    y = pm.Bernoulli('y', p=pm.math.sigmoid(x), observed=Y)
you will get, after sampling, the values of x only. If you instead write:
with pm.Model() as model:
    x = pm.BART('x', X, Y)
    p = pm.Deterministic('p', pm.math.sigmoid(x))
    y = pm.Bernoulli('y', p=p, observed=Y)
then you will also get the values of p.
Whether you should choose a logistic, probit, or some other inverse link function such as cloglog is a modeling choice. Personally, I have not evaluated the impact of this choice on BART models. I assume that with PyMC-BART the choice has essentially the same pros and cons as in a typical generalized linear model.
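To make the link-function choice concrete, here is a minimal sketch in plain Python (no PyMC needed) of the three inverse links mentioned above. The function names are illustrative and not part of PyMC or PyMC-BART; each one maps an unbounded latent value, such as the output of the BART variable, to a probability in (0, 1).

```python
import math


def inv_logit(x: float) -> float:
    """Logistic (sigmoid) inverse link, branched to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def inv_probit(x: float) -> float:
    """Inverse probit link: the standard normal CDF, written with erf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def inv_cloglog(x: float) -> float:
    """Inverse complementary log-log link: 1 - exp(-exp(x)).

    Fine for moderate x; math.exp(x) overflows for very large x.
    """
    return -math.expm1(-math.exp(x))


# Compare the three links on a few latent values.
for latent in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(
        f"latent={latent:+.1f}  "
        f"logit={inv_logit(latent):.3f}  "
        f"probit={inv_probit(latent):.3f}  "
        f"cloglog={inv_cloglog(latent):.3f}"
    )
```

The logistic and probit links are symmetric around 0 (both give 0.5 there), while cloglog is asymmetric, which is one of the usual considerations when picking a link in a generalized linear model.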
@aloctavodia thanks for the fast response. I will check out the notebook. I was referring to the paper "BART: Bayesian additive regression trees", which is the original paper for BART, as far as I understand. Thank you for the explanation.
The implementation proposed in the original BART paper that you mention is different from the one in PyMC-BART; the original is not suitable for a probabilistic programming language like PyMC. For instance, PyMC-BART does not use conjugate priors as described in the original BART paper. Welcome to PyMC (and PyMC-BART)!
@aloctavodia I tried running the code with your suggestion:
Training went well. In test, I'm getting the error:
7877 is the length of my training data and 2626 is the length of my test data. I'm not sure if this is the right place to discuss this, or should I open a separate issue for it?
You need to specify that the shape of the likelihood is the shape of the mutable variable. This is related to PyMC itself and not directly to PyMC-BART. Something like this:
with pm.Model() as model:
    ...
    X_s = pm.MutableData("X_s", X)
    ...
    _ = pm.Normal("y_obs", mu=μ, sigma=ϵ, observed=Y, shape=X_s.shape)
with model:
    pm.set_data({"X_s": new_values})
    pm.sample_posterior_predictive(idata, extend_inferencedata=True)
@aloctavodia thank you, my model is finally working on binary classification! However, I still have trouble with a multiclass one. Is it even possible with the default BART? I tried defining my model as:
I get an error:
Using
I guess that makes sense, probability does not sum to 1. I have seen your response here, and I:
So my code is:
And I get an error:
Which is quite weird, since
Lastly, I tried your code here:
But this way, I get yet another error:
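On the "probability does not sum to 1" remark in the multiclass comment: the exact code and error are not shown in this thread, so the following is only a sketch under the assumption that per-class sigmoid outputs were passed as a class-probability vector. It uses plain Python with hypothetical latent values and shows that independent sigmoids generally do not sum to 1, whereas a softmax over the same values does.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function applied to a single latent value."""
    return 1.0 / (1.0 + math.exp(-x))


def softmax(scores: list[float]) -> list[float]:
    """Normalize a vector of latent scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical latent values for one observation and three classes.
latent = [0.3, -1.2, 2.0]

independent = [sigmoid(s) for s in latent]  # each is in (0, 1), but unnormalized
normalized = softmax(latent)                # a valid probability vector

print("sigmoid per class:", independent, "sum =", sum(independent))
print("softmax:          ", normalized, "sum =", sum(normalized))
```

A categorical likelihood expects a vector like the second one, so a multiclass model typically needs one latent value per class followed by a normalizing transform rather than an elementwise sigmoid.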
As far as I understand from the paper, the only thing needed to perform classification with BART, i.e. to have a binary response, is adding a probit link.
However, this question uses sigmoid (logistic link), but without pm.Deterministic:
Bayesian Computation Book, in exercises 7M10 and 7M11, suggests modifying this code, which uses sigmoid, but with pm.Deterministic, like:
Another discussion uses the inverse probit function, also with pm.Deterministic:
Which option should be used for BART binary classification? Also, adding a BART classification example (even a very small code snippet) to the documentation would be really useful.