do operator / conditioning #5280
-
I like it, and we can definitely weigh the pros and cons of the different approaches. I don't think we should restrict ourselves to what is possible right now; we should also think about what we would like the ideal API to look like. For my money, I do like the last code snippet, where we don't even set an observed when defining the model:

# 1. Define the joint P(mu, sigma, x)
with pm.Model() as m:
    mu = pm.Normal("mu", 0, 1)
    sigma = pm.HalfNormal("sigma", 1)
    x = pm.Normal("obs", mu, sigma)

# 2. Generate N samples from P(x | mu=0, sigma=0.1)
with m:
    data = pm.sample_cond(var=x, mu=0, sigma=0.1, samples=100)
    # alternative 2: sample all RVs not specified - infer x
    data = pm.sample_cond(mu=0, sigma=0.1, samples=100)
    # alternative 3
    data = pm.sample_prior_predictive(mu=0, sigma=0.1, samples=100)

# 3. Sample from the posterior
with m:
    # idea 1
    idata = pm.sample(observed={x: data})
    # idea 2
    m.set_observed(x=data)  # or x.set_observed(data)
    idata = pm.sample()
    # idea 3: do posterior inference with sample_cond. Do we still need
    # pm.sample()? Related to alternative 2 above; in this case, we might
    # want to just add the conditioning to pm.sample() directly.
    idata = pm.sample_cond(x=data, samples=1000)
-
Wanted to contribute to the discussion a notebook on counterfactuals. I think I have it done right, though if others are better versed in the topic than I am, I'd be open to correction. https://github.com/ericmjl/causality/blob/master/docs/07-do-operator.ipynb
-
@ericmjl had another really neat idea:

counterfactual_model = model.do(x=3)
with counterfactual_model:
    pm.sample()

Could also do stochastic do operators:

counterfactual_model = model.do(x=pm.Normal("x", mu=3))

I think what this is really doing though is just replacing RVs, so maybe the more apt name would be:
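To make the "just replacing RVs" point concrete, here is a minimal sketch of that replacement done directly at the graph level with aesara.clone_replace. This only illustrates the underlying mechanism; the model and variable names are made up, and it is not the proposed model.do API.

import aesara
import aesara.tensor as at
import pymc as pm

with pm.Model() as m:
    mu = pm.Normal("mu", 0, 1)
    sigma = pm.HalfNormal("sigma", 1)
    x = pm.Normal("x", mu, sigma)

# "do(mu=3)" as graph surgery: clone x's graph with the mu RV swapped out
# for a constant, which also cuts mu's incoming edges in the cloned graph
x_do = aesara.clone_replace(x, replace={mu: at.constant(3.0)})

A model.do() that returns a new model would presumably need to apply this kind of replacement to every variable downstream of the intervened node.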
-
I like
-
So this is either a genius idea or it isn't... We can replace a node with a constant, or a stochastic, and cut the incoming edges. But does it also make sense to be able to do more advanced graph surgery? We should be able to surgically attach an entire model into another model, right?

What I'm thinking of is kind of inspired by human cognition. Let's say we go to school and learn about the causal structure of Interesting Thing A. The next day we go to school and learn about the causal structure of Interesting Thing B. Then, if it is pointed out to us that Interesting Things A and B are related (e.g. they share a node), we can combine our understanding into a larger causal graph.

This could feasibly be a big deal. If we were interested not just in parameter estimation of a proposed model but also in causal discovery, then the ability to focus in on sub-problems would make the overall problem much more tractable. Kind of speculating here, but can't we think of this sort of thing happening when we revise our beliefs about the causal structure of the world?

Anyway... while we are in brainstorming mode, I'd just like to put forward the proposal that we could replace a node (with the do operator) with an entire model and get back a new, combined model.
-
Replacing a node with a model: isn't that similar to a stochastic node? We do no inference there and only query the model about our hypothesis. If you had a model, you could pass in the random variable from its idata posterior. Is that what you mean? The result would combine estimated model 1 with estimated model 2 into a counterfactual trace with predictions, i.e. yet another idata.
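A rough sketch of what "passing a node from another model's idata posterior" could look like in practice; this is purely illustrative, and the stand-in samples, variable names, and the use of an interpolated empirical prior are my assumptions.

import numpy as np
import pymc as pm
from scipy import stats

# Stand-in for posterior draws of the shared node from a previously fitted
# model, e.g. idata.posterior["shared_node"].values.flatten()
samples = np.random.normal(1.0, 0.2, size=2000)

# Turn the draws into an empirical (interpolated) prior for the second model
kde = stats.gaussian_kde(samples)
x_points = np.linspace(samples.min(), samples.max(), 200)
pdf_points = kde(x_points)

with pm.Model() as model2:
    shared_node = pm.Interpolated("shared_node", x_points, pdf_points)
    # ... build the rest of model 2 on top of shared_node ...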
-
Here are a few citations that discuss weak/unreliable/uncertain interventions. These include scenarios in which the causal intervention is successful only to some extent (e.g., a doctor recommending a lifestyle change to a patient) and scenarios in which the intervention impacts the observed variables in unknown ways (e.g., a new drug that may impact one or many different genes). In all cases, the causal interventions can be modeled by augmenting the causal graph (e.g., an "intervention" node added to the graph) in ways that are more elaborate than the traditional do-operator implies.
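A hedged sketch of what such an augmented graph could look like in PyMC, with an explicit node for whether the intervention actually took hold; the names and the particular mixture structure here are illustrative assumptions, not taken from the citations.

import pymc as pm

with pm.Model() as m:
    # How reliable the intervention is (unknown, to be inferred from data)
    p_effective = pm.Beta("p_effective", 2, 2)
    # Explicit "intervention" node: did the intervention actually take hold?
    intervened = pm.Bernoulli("intervened", p_effective)
    # Natural (unintervened) mechanism for x's mean
    mu_natural = pm.Normal("mu_natural", 0, 1)
    # Value the intervention tries to set x's mean to
    mu_target = 3.0
    # x follows the target if the intervention took hold, otherwise its
    # natural mechanism; a hard do() would always use mu_target
    x = pm.Normal("x", mu=pm.math.switch(intervened, mu_target, mu_natural), sigma=1)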
-
Desired functionality
It would be pretty cool if you could:
1. define a joint distribution,
2. generate samples from it while conditioning some of its variables on chosen values, and
3. run inference on the remaining variables, conditioning on the data generated in step 2.
Below I have summarised some approaches and relevant information. This is doable now, but it would be nice if it became more of an in-built language feature with a smoother and simpler API. Thoughts and ideas very welcome.
An existing solution
Chatting with @ricardoV94 and @lucianopaz, we came up with the following code that works (in v4) with a simple example:
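The exact snippet is not shown here; a rough sketch of the kind of model-factory pattern described below might look like this (the function name, signature, and details are illustrative assumptions, not the original code):

import pymc as pm

def model_factory(mu=None, sigma=None, observed_x=None):
    """Joint P(mu, sigma, x), optionally fixing mu/sigma or observing x."""
    with pm.Model() as model:
        if mu is None:
            mu = pm.Normal("mu", 0, 1)
        if sigma is None:
            sigma = pm.HalfNormal("sigma", 1)
        pm.Normal("x", mu, sigma, observed=observed_x)
    return model

# Step 2: condition on mu=0, sigma=0.1 and draw 100 samples of x
with model_factory(mu=0.0, sigma=0.1):
    prior = pm.sample_prior_predictive(100)
data = prior.prior["x"].values.flatten()

# Step 3: condition on the generated data and infer mu and sigma
with model_factory(observed_x=data):
    idata = pm.sample()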
This is already pretty neat. You can relatively concisely define a joint distribution (or rather a function which returns a model) in step 1, condition on chosen parameter values to generate data in step 2, then in step 3 run inference, conditioning on the data generated in step 2.
While the use of the model factory is pretty simple, it would not necessarily be obvious to a newcomer to PyMC. So while it is pretty concise (and neat!), I don't think it represents the ideal API.
An alternative approach
@ricardoV94 also suggested this approach. I favour this one a little less, as it requires you to pre-commit to N at the time of defining the joint distribution.
Alternative approaches for step 2
@ricardoV94 also suggested aesara.function([mu, sigma], x), which can be used for steps 1 and 2.
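A minimal sketch of that suggestion is below; it is illustrative only (the model is made up), and note that repeated fresh draws would also require the random number generators to be updated between calls.

import aesara
import pymc as pm

with pm.Model() as m:
    mu = pm.Normal("mu", 0, 1)
    sigma = pm.HalfNormal("sigma", 1)
    x = pm.Normal("x", mu, sigma)

# Treat mu and sigma as graph inputs (cutting out their own priors) and
# compile a function that returns a draw of x for concrete parameter values
draw_x = aesara.function([mu, sigma], x)
draw_x(0.0, 0.1)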
do operator?
@ericmjl raised the point that step 2, where we condition on given values, is basically the do operator from Pearl/causal inference. So the API for step 2 could be called conditioning, or something like do, to make the link with causal inference stronger. I think this would be pretty cool!