additional tests of pivoting logic; bootstrapped effects for mixed models #75
Conversation
Codecov Report: All modified and coverable lines are covered by tests ✅

```
@@ Coverage Diff @@
##             main      #75      +/-   ##
==========================================
+ Coverage   99.45%   99.53%   +0.07%
==========================================
  Files           6        6
  Lines         185      214      +29
==========================================
+ Hits          184      213      +29
  Misses          1        1
```
The only thing I'm not really clear on is the logic of the new test employing `β`: I think I'm getting mixed up about which coefficients map to what. Would be useful to spell the logic out in a comment.
LGTM
ext/EffectsMixedModelsExt.jl (Outdated)

```julia
bootdf = unstack(DataFrame(boot.β), :coefname, :β)
β = Vector{Float64}(undef, ncol(bootdf) - 1)
boot_err = mapreduce(vcat, 1:nrow(bootdf)) do iter
    copyto!(β, @view bootdf[iter, 2:end])
```
It took me a minute to realize why it's indexed like this: you're doing `unstack` above. Is that easier than grouping by `coefname` and using the `β` column in the multiplication?
yes
Can you elaborate on why that's the case? It's not obvious to me.
because that's how I thought about it first, but here ya go 😄 51e1aab
```julia
    copyto!(β, @view bootdf[iter, 2:end])
    return X * β
end
err = map(std, eachcol(boot_err))
```
Won't `boot_err` only ever have one column? `X * β` should be a vector, unless I'm missing something.
There was a small error here that led to it always being a single column, but the aggregation was correct.
Checking my understanding now: because of the transpose above, `boot_err` is an iteration × observation matrix, so `std` over each column is the standard deviation across iterations for each observation. Assuming that's correct, 👍
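The exchange above can be sketched end-to-end with a toy example. Everything here is invented for illustration (the coefficient names, the model matrix, and the bootstrap values); only the `unstack` / `mapreduce` / `std` pattern mirrors the code under review, including the transpose that makes `boot_err` an iteration × observation matrix:

```julia
using DataFrames, Statistics

# Hypothetical long-format bootstrap results: one row per (iteration, coefficient)
boot = DataFrame(iter=repeat(1:3; inner=2),
                 coefname=repeat(["(Intercept)", "x"]; outer=3),
                 β=[1.0, 2.0, 1.1, 1.9, 0.9, 2.1])

# unstack pivots to one row per iteration, one column per coefficient;
# column 1 is the `iter` id column, hence the `2:end` indexing below
bootdf = unstack(boot, :coefname, :β)

X = [1.0 0.0;
     1.0 1.0]                       # made-up model matrix (2 observations)
β = Vector{Float64}(undef, ncol(bootdf) - 1)
boot_err = mapreduce(vcat, 1:nrow(bootdf)) do iter
    copyto!(β, @view bootdf[iter, 2:end])
    return (X * β)'                 # transpose: one row per iteration
end

# boot_err is iterations × observations, so `std` over each column is the
# standard deviation across iterations for each observation
err = map(std, eachcol(boot_err))   # here: [0.1, 0.0]
```

Without the transpose, `vcat` would stack the per-iteration prediction vectors into one long column, which is exactly the "only ever have one column" bug discussed above.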
ext/EffectsMixedModelsExt.jl (Outdated)

```julia
end

reference_grid[!, lower_col] = first.(ci)
reference_grid[!, upper_col] = last.(ci)
```
Should these be transformed with the link function?
oops, yes
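For reference, the fix amounts to mapping both interval endpoints through the inverse link before storing them. A minimal sketch, with the link choice and the interval values invented for illustration (`linkinv` is GLM.jl's inverse-link function; monotone links preserve the lower/upper ordering):

```julia
using GLM  # provides LogitLink and linkinv

# Hypothetical CI endpoints computed on the linear-predictor (link) scale
link = LogitLink()
ci = [(-0.5, 0.5), (0.0, 1.0)]

# Map both endpoints back to the response scale with the inverse link
lower = [linkinv(link, first(b)) for b in ci]
upper = [linkinv(link, last(b)) for b in ci]
```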
```julia
               level=nothing)
    grid = expand_grid(design)
    dv = something(eff_col, _responsename(model))
    level = isnothing(level) ? 0.68 : level
```
You can just make the `level` keyword argument default to 0.68. Also, can you add a comment about why 0.68?
This is explained in the other methods for `effects` -- the `nothing` is chosen to match their behavior in terms of arguments accepted.
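For readers without that context: 0.68 is approximately the coverage of a ±1 standard error interval under a normal approximation, which is why it is a common default for effects-style intervals. A quick check (assuming the Distributions package):

```julia
using Distributions

# Coverage of ±1 standard error under a normal approximation
level = cdf(Normal(), 1) - cdf(Normal(), -1)   # ≈ 0.6827
```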
The pivoting logic is very tricky and easy to get wrong, so I added a few more tests to increase our confidence in the results.
I also added an experimental method for computing effects from a `MixedModelBootstrap`. This may be interesting on its own (though I seriously doubt it will differ much from the vcov-based method, because it's a parametric bootstrap), but it also doesn't rely on pivoting to work with rank-deficient models and so provides an additional way to cross-check the pivoted, vcov-based method. (The pivoted coefficients are stored in place as -0.0, so they just cancel out for the prediction itself, but the associated entries in the vcov matrix are NaN, which are poisonous and thus must be pivoted out.)

It also looks like there is a complex compatibility problem between certain Julia, MixedModels, and StatsModels 0.6 versions, which becomes apparent in the use of the bootstrap. I'm just going to drop explicit StatsModels 0.6 testing. We still test against it via the Julia 1.6 test, which limits the MixedModels version to one requiring StatsModels 0.6.
Otherwise, we can backport bugfixes to the 1.2 series and StatsModels 0.7 is well supported nowadays.