Should models agree more? #73
@lawinslow Here is the relevant part of the email I sent you regarding this issue (raised by N Lottig, so thanks to him!):

**Testing/experimenting**

So I ran some experiments. I took the noise out of the simulated data, then ran 2 scenarios: one where K was 0, and one where K was 0.4.

**Expectations**

With no noise, Kalman == Bayes == MLE. I would expect OLS to be somewhat close to those, too. Then I would expect BK to be a bit different, because its model structure is so unique.

With no noise AND no K, I would expect Kalman == Bayes == MLE == OLS. This is b/c OLS has the same model structure as the dynamic models (updated at every time step: Kalman, Bayes, kinda MLE), but now w/o the K the per-time-step updates upon which the next prediction is formed should matter less.

**Results: No Noise, No K**

OLS == Kalman == Bayes. BK is close to these, but not quite the same. But MLE is like wtf low. This is confusing to me.

**Results: No Noise, with K**

Kalman == OLS ~~ BK. MLE is still wtf low. But now Bayes is higher ... why? I would have expected Kalman and Bayes to be the most similar pair of methods in this scenario.

It might be possible this has something to do with permitting uncertainty on K. The Bayes model is written so that the variance of K is 10% of the mean K value that day; except when K is 0, then the model says that the variance around K is 1E-9 ... so super tiny. K is not a parameter, so its value isn't fitted at all. However, by putting uncertainty on K, you allow the estimates of DO at each time step to be more uncertain due to the influence of uncertain gas exchange. As a result, this should make the process variance higher, at least relative to the observation variance. In the case where K is 0, process variance should be more similar to the observation variance. Remember that the key in fitting the parameters isn't just the size of these variances, but their sizes relative to each other, because that tells the model how much to trust the model prediction vs. the observations when deciding what "true" DO is.

So, digging into the attributes of the model output for Bayes, I see that with K, obs variance = 0.008 and process variance = 0.021. In the no-K scenario, obs variance = 0.012 and process variance = 0.029. So there isn't much change, but the variances go up, and observation variance gets slightly closer to process variance. So maybe that's part of why Bayes differs from Kalman so much when K is turned on.
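To make the "sizes relative to each other" point concrete, here is a tiny scalar Kalman-gain calculation using the variances above. This is just the textbook scalar gain in plain R, not LakeMetabolizer code, and it treats the reported process variance as the prediction variance, which is a simplification:

```r
# Scalar Kalman gain: the weight given to the observation when correcting
# the model prediction. gain = process.var / (process.var + obs.var)
kalman_gain <- function(process.var, obs.var) process.var / (process.var + obs.var)

kalman_gain(process.var = 0.021, obs.var = 0.008)  # with K: ~0.72
kalman_gain(process.var = 0.029, obs.var = 0.012)  # no K:   ~0.71
```

Either way the weighting barely moves, which fits the observation that the variances don't change much between the two scenarios.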
I cannot find any reason why MLE is so screwed up on its R. BTW, I don't think I mentioned this yet: it is the R that is screwed up, and I'm not just inferring that from the NEP. The MLE R parameter is ~double what it is for the others, whereas the MLE GPP parameter is only marginally higher.

Looking at a quick histogram of do.obs - do.sat for the simulated data shows that most of the time the fake DO data is undersaturated. On average, the water DO is 1.18 mg/L O2 lower than atmospheric equilibrium. In the no-K scenario, no net change in DO simply means that NEP should be 0 (all models except MLE basically get this answer). In the scenario where K is not 0, things become a bit more complicated in this regard, but over time it should be clear that no net change in DO that is undersaturated must be maintained by negative NEP. So Bayes (and to a lesser extent BK) get this wrong. Kalman and OLS get this right, but MLE gets it a little too right.

If MLE were only showing hugely negative R in the with-K scenario, I would say that something is wrong in the MLE model where it's saying that the atmospheric exchange is too high, and this is being counterbalanced by really high R. But the fact that the problem exists without K is really stumping me.

**Conclusion**

I'm tempted to retest these models with a simplified version of the exchange, i.e., one where we don't do the fancy calculus between time steps, because that would make it much easier to ensure that something isn't screwed up in that part of the model.

A final proposed solution is that the optimization is incomplete. BK and OLS are just closed-form, and they consistently perform well. Kalman somehow produces reasonable answers, even though it uses optim() too. With Bayes, it may be that the chains aren't being run for long enough, and for MLE it could be an oddity with getting stuck in a crappy local minimum. So my 2 guesses are 1) something is wrong with the models and I don't know what it is; 2) the variability among models is the result of shitty fitting of parameters.
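One cheap way to probe guess 2 for MLE (stuck in a crappy local minimum) would be to refit from a bunch of random starting values and see whether the optima scatter. The sketch below is generic base R, not LakeMetabolizer code; `negloglik` stands in for whatever negative log-likelihood the MLE model minimizes, and the two-parameter setup and bounds are made up for illustration:

```r
# Refit from several random starting values and compare the optima.
# `negloglik` is a placeholder for the metabolism model's negative
# log-likelihood; its parameters (gpp, r) and ranges are illustrative.
multi_start_fit <- function(negloglik, n.starts = 25,
                            lower = c(gpp = 0, r = -20),
                            upper = c(gpp = 20, r = 0)) {
  fits <- lapply(seq_len(n.starts), function(i) {
    start <- runif(length(lower), min = lower, max = upper)
    optim(par = start, fn = negloglik, method = "BFGS")
  })
  data.frame(
    gpp  = sapply(fits, function(f) f$par[1]),
    r    = sapply(fits, function(f) f$par[2]),
    nll  = sapply(fits, function(f) f$value),
    conv = sapply(fits, function(f) f$convergence)
  )
}
# If the rows with the lowest nll all agree but other starts wander off,
# the original fit may simply have started in a bad place.
```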
Wow, that's a little concerning. Thanks for bringing this up, Ryan (and N Lottig).

> would say that something is wrong in the MLE model where it's saying

Yeah, that is really weird. Did you ever explore trying different initial values?

> both b/c it doesn't agree with Kalman, and because it says that NEP is

Bayes is really positive! Maybe worth reexamining the bayes code too.

> I'm tempted to retest these models with a simplified version of the

This is a good idea and a quick check.

> Kalman somehow produces reasonable answers, even though it uses optim()

Or maybe something is wrong with the MLE code, typo, etc... I'll look into it.

Jacob A. Zwart
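On the "reexamining the bayes code" / "chains aren't being run for long enough" angle: if the MCMC samples can be pulled out of the Bayes fit, coda's standard diagnostics give a quick read on whether the chains are long enough. The accessor below is hypothetical; how to extract the samples depends on what metab.bayesian actually returns.

```r
library(coda)

# Hypothetical: `bayes.mcmc` is an mcmc.list of posterior samples extracted
# from the Bayes fit object.
gelman.diag(bayes.mcmc)    # needs >= 2 chains; values near 1 suggest convergence
effectiveSize(bayes.mcmc)  # small effective sample sizes = chains run too short
```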
@lawinslow: we've talked about this a bit.

Using the simulated data at the end of `?metab` produces the following (output figure not preserved in this copy of the thread), and here is the associated R code (original block also not preserved):
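Since the original block is missing, here is a rough sketch of what that kind of side-by-side comparison might look like. It assumes this is the LakeMetabolizer `metab()` wrapper taking a data.frame (called `sim` here, hypothetically, like the simulated example in `?metab`) with datetime, do.obs, do.sat, k.gas, z.mix, irr, and wtr columns plus a `method` string; the column names and method strings are assumptions, so check `?metab` for the real interface.

```r
# Sketch only: fit all five metabolism models to one simulated data set and
# line up their daily estimates. Column names and method strings are guesses.
library(LakeMetabolizer)

# `sim`: hypothetical data.frame in the shape the metab() wrapper expects,
# e.g. the simulated example at the end of ?metab.
methods <- c("bookkeep", "ols", "mle", "kalman", "bayesian")
fits <- lapply(methods, function(m) metab(sim, method = m))
names(fits) <- methods

# Compare the daily GPP/R/NEP estimates side by side.
lapply(fits, head)
```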