You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
cc @dvukcevic - I would love a second opinion on this - when you get a chance of course.
This PR was motivated by the (somewhat embarrassing) realization that our summary() function did not work for the class conditional or hierarchical Dawid-Skene model. This PR fixes this but in doing so made some other major changes.
In particular I realized there was a large inconsistency with how point_estimate(), posterior_interval() and posterior_samples() interacted with the theta parameter in the class conditional model.
At some point in the past I had decided that for the CC model point_estimate() would return the full theta array (i.e. the same number of parameters as in the full DS model) calculated from it's own J by K theta parameter. I made this decision so that plotting theta for the class conditional model would work and be directly comparable to the full DS model. (With the idea that the CC model would generally be fit after and then compared to the full DS model).
At the time posterior_samples() and posterior_interval() were implemented however I had forgotten about the behavior of point_estimate() so posterior_samples() only returned the actual 'reduced' theta parameter of the CC model and this caused problems in posterior_interval() which was (one of) the problems in the summary method...
(NB: We don't support returning the theta parameter from the hierarchical DS model at all because it can't be interpreted in the same way as the full and CC DS model)
To get around this problem I decided the easiest thing to do was to simply ALWAYS convert the theta parameter in the CC model to the full DS form (and that is what is implemented in this PR). This ensures consistenty across all functions in the interface from point_estimate() to mcmc_diagnostics(). I think this the best approach but I'd appreciate your opinion on it!
Advantages:
Easier for users to compare the CC and full DS
Simplest implementation by far (full and CC methods can have identical code to deal with theta),
Disadvantages:
May confuse the user about what the CC model is and how it differs from the full DS model.
It's important to note the 'true' parameters of the CC model can easily be extract from all the returned functions.
This sounds like a good approach for the initial version of the package.
Later on I think it would be worth discussing in more detail about the return values and behaviour for all of these functions for all of the different models (that's quite a few combinations!). We might want to have a standardised way of extracting/viewing either the underlying parameters specific to each model (the 'true' parameters, as you called them) or the full set of parameters derived from them (corresponding to the parameters of the standard DS model).
Jeffrey:
I agree that some standardized way of extracting 'true' vs Dawid-Skeneised parameter values is the way to go long term.
I am going to think a bit more about whether the approach implemented here is the correct one for now though I think it probably is.
Thanks for your feedback!
The text was updated successfully, but these errors were encountered:
Context:
Jeffrey:
cc @dvukcevic - I would love a second opinion on this - when you get a chance of course.
This PR was motivated by the (somewhat embarrassing) realization that our summary() function did not work for the class conditional or hierarchical Dawid-Skene model. This PR fixes this but in doing so made some other major changes.
In particular I realized there was a large inconsistency with how point_estimate(), posterior_interval() and posterior_samples() interacted with the theta parameter in the class conditional model.
At some point in the past I had decided that for the CC model point_estimate() would return the full theta array (i.e. the same number of parameters as in the full DS model) calculated from it's own J by K theta parameter. I made this decision so that plotting theta for the class conditional model would work and be directly comparable to the full DS model. (With the idea that the CC model would generally be fit after and then compared to the full DS model).
At the time posterior_samples() and posterior_interval() were implemented however I had forgotten about the behavior of point_estimate() so posterior_samples() only returned the actual 'reduced' theta parameter of the CC model and this caused problems in posterior_interval() which was (one of) the problems in the summary method...
(NB: We don't support returning the theta parameter from the hierarchical DS model at all because it can't be interpreted in the same way as the full and CC DS model)
To get around this problem I decided the easiest thing to do was to simply ALWAYS convert the theta parameter in the CC model to the full DS form (and that is what is implemented in this PR). This ensures consistenty across all functions in the interface from point_estimate() to mcmc_diagnostics(). I think this the best approach but I'd appreciate your opinion on it!
Advantages:
Disadvantages:
It's important to note the 'true' parameters of the CC model can easily be extract from all the returned functions.
Fixes #133 + a lot more besides
Damjan:
This sounds like a good approach for the initial version of the package.
Later on I think it would be worth discussing in more detail about the return values and behaviour for all of these functions for all of the different models (that's quite a few combinations!). We might want to have a standardised way of extracting/viewing either the underlying parameters specific to each model (the 'true' parameters, as you called them) or the full set of parameters derived from them (corresponding to the parameters of the standard DS model).
Jeffrey:
I agree that some standardized way of extracting 'true' vs Dawid-Skeneised parameter values is the way to go long term.
I am going to think a bit more about whether the approach implemented here is the correct one for now though I think it probably is.
Thanks for your feedback!
The text was updated successfully, but these errors were encountered: