Inverse filtering for hidden Markov models
Consider a weather Markov chain: each day is either sunny or rainy (2 states, X = {x_sunny, x_rainy}), and there is a transition matrix which gives us the conditional probability table of X_t | X_{t-1}.
So, if we know P(X_1 = x_sunny), we can propagate forward: P(X_t = x_sunny) = Σ_{x_{t-1} ∈ X} P(X_t = x_sunny | x_{t-1}) P(x_{t-1})
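A minimal NumPy sketch of this forward propagation; the transition probabilities and the initial distribution below are made-up example values, not taken from these notes:

```python
import numpy as np

# States: index 0 = sunny, index 1 = rainy.
# T[i, j] = P(X_t = j | X_{t-1} = i)  -- assumed example values.
T = np.array([[0.8, 0.2],
              [0.4, 0.6]])

# Known distribution at t = 1 (e.g. we are sure it is sunny).
p = np.array([1.0, 0.0])

# P(X_t) = sum over x_{t-1} of P(X_t | x_{t-1}) P(x_{t-1}).
for t in range(2, 6):
    p = p @ T   # row vector times transition matrix implements the summation
    print(f"t={t}: P(sunny)={p[0]:.3f}, P(rainy)={p[1]:.3f}")
```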
Now, in an HMM, we have observations as well, so that we can keep our belief up to date
X is the state (rainy or sunny?), E is the evidence variable (did the professor carry an umbrella?). The evidence E depends only on the state X
Transitions: define how the state at time t depends on past states (this is independent of the evidence; the evidence is just an observation and does not influence the transitions)
Emissions: the probability of seeing the evidence given each underlying state (so, if it is raining, the umbrella is present with probability 0.9, for example)
And we have B, the belief (the posterior), which represents what we know about the state of the system. So, we have: B_t(X) = P(X_t | e_1, ..., e_t)
We start with B_1(X) in an initial setting, usually uniform
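A small sketch of this belief update for the umbrella example: predict with the transition model, then weight by the emission probability of the observed evidence and renormalize. The numbers in T and E are assumed example values:

```python
import numpy as np

T = np.array([[0.8, 0.2],    # transition: row = previous state, col = next state
              [0.4, 0.6]])
# Emission: E[i, e] = P(evidence e | state i); columns: 0 = no umbrella, 1 = umbrella.
E = np.array([[0.8, 0.2],    # sunny: umbrella with prob 0.2
              [0.1, 0.9]])   # rainy: umbrella with prob 0.9

belief = np.array([0.5, 0.5])          # B_1(X): uniform initial belief
evidence = [1, 1, 0]                   # observed: umbrella, umbrella, no umbrella

for e in evidence:
    belief = belief @ T                # predict: push belief through the transitions
    belief = belief * E[:, e]          # update: weight by the emission likelihood
    belief = belief / belief.sum()     # renormalize -> B_t(X) = P(X_t | e_1..e_t)
    print(belief)
```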
If the state space (|X|) is too large, it is difficult to do exact inference over the full belief (posterior) distribution. So, we can do approximate inference:
- track samples of X, not all values
- samples are called particles
- the more samples there are in a state, the more probable that state is estimated to be
So, in the grid example above 🔝: instead of keeping 9 numbers (there are 9 possible states here, so 9 numbers giving the probability of being in each square, inferred from the posterior), we can keep 10 particles, and the probability of being in a particular state is ∝ the number of particles in that square
We start with the particles placed uniformly, and then we move them through the transition model. When we observe evidence, each sample gets a weight, and we then resample according to the samples' weights
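A rough particle-filter sketch of that loop (move the particles through the transition model, weight them by the emission likelihood of the observed evidence, resample); T and E are the same assumed example values as above:

```python
import numpy as np

rng = np.random.default_rng(0)
T = np.array([[0.8, 0.2], [0.4, 0.6]])     # transition probabilities (example values)
E = np.array([[0.8, 0.2], [0.1, 0.9]])     # emission probabilities (example values)

n = 100
particles = rng.integers(0, 2, size=n)     # start: particles placed uniformly over states

for e in [1, 1, 0]:                        # sequence of observed evidence
    # Move each particle by sampling its next state from the transition model.
    particles = np.array([rng.choice(2, p=T[s]) for s in particles])
    # Weight particles by how well they explain the evidence.
    weights = E[particles, e]
    weights = weights / weights.sum()
    # Resample particles in proportion to their weights.
    particles = rng.choice(particles, size=n, p=weights)
    # Estimated belief: fraction of particles in each state.
    print(np.bincount(particles, minlength=2) / n)
```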
The paper aims to take an HMM and, looking at the sequence of its state posteriors, give us:
- corresponding sequence of observations
- the observation likelihood probabilities (the matrix which gives us the accuracy of the sensors)
We have:
- X → the state vector
- y_k → the evidence vector (what we observed) at time k
- P → the transition probability matrix for X; P_ij is the probability of going from state i to state j
- π_0 → the posterior (here, at time 0)
- B → the matrix representing how accurate the sensors are (what we want to find out)
We’ll recover the observations and B from the sequence of posteriors π
The update equation is simple:
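As a hedged sketch of that update in the matrix notation above (the standard HMM Bayesian filter: predict the previous posterior through P, reweight by the column of B for the observed y_k, renormalize):

```python
import numpy as np

def filter_update(pi_prev, P, B, y):
    """One HMM filter step: pi_k ∝ diag(B[:, y]) @ P.T @ pi_{k-1}.

    pi_prev : posterior over states at time k-1 (vector summing to 1)
    P       : transition matrix, P[i, j] = probability of going from state i to j
    B       : observation likelihoods, B[i, y] = probability of observing y in state i
    y       : index of the observation at time k
    """
    unnormalized = B[:, y] * (P.T @ pi_prev)
    return unnormalized / unnormalized.sum()
```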
We’ll assume we know P
3 problems we’ll solve:
- Inverse filtering with unknown observations
    - we have P, B, and the sequence of posteriors
    - we'll reconstruct the observed evidence (the observations); see the sketch after this list
- Inverse filtering with unknown sensor
    - we have P, the sequence of posteriors, and the sequence of observations
    - we'll reconstruct B (how accurate are the sensors?)
- Inverse filtering with unknown sensor and unknown observations
    - we have P and the sequence of posteriors (noisy posteriors are okay)
    - we'll reconstruct B (how accurate are the sensors?) and the sequence of observations
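For the first problem, here is an illustrative brute-force sketch (not necessarily the paper's exact construction): with exact posteriors and known P and B, each filter step is explained by exactly one observation, so we can test every candidate observation and keep the one whose resulting posterior matches.

```python
import numpy as np

def reconstruct_observations(posteriors, P, B):
    """Given exact posteriors pi_0, pi_1, ..., pi_N and known P, B,
    return the observation index that explains each filter step."""
    recovered = []
    for pi_prev, pi_next in zip(posteriors[:-1], posteriors[1:]):
        predicted = P.T @ pi_prev                  # predict step through the transitions
        best_y, best_err = None, np.inf
        for y in range(B.shape[1]):                # try every candidate observation
            candidate = B[:, y] * predicted
            candidate = candidate / candidate.sum()
            err = np.linalg.norm(candidate - pi_next)
            if err < best_err:
                best_y, best_err = y, err
        recovered.append(best_y)
    return recovered
```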