Rethinking `degroup()` for cross-classified data #637

_Follow-up to @jmgirard's #520_

I think the current implementation of cross-classified disaggregation is missing a desideratum.

First let's note some desiderata that we _do_ have:
1. The "between" variables are simply the separate group means.

<details><summary>Make some crossed data</summary>

```R
mu <- 100
ul <- setNames(c(-1, -3, 0, 4), nm = letters[1:4])
uL <- setNames(c(10, 30, 0, -40), nm = LETTERS[1:4])
um <- setNames(c(100, 150, -250), nm = month.abb[1:3])

dat <- expand.grid(l = letters[1:4], L = LETTERS[1:4], m = month.abb[1:3])

set.seed(111)
e <- rnorm(nrow(dat)-1) |> round(2)
e <- append(e, -sum(e))

dat$y <- mu + ul[dat$l] + uL[dat$L] + um[dat$m] + e
dat$z <- mu + ul[dat$l] + uL[dat$L] + um[dat$m] + 10*e
```
</details>

```R
dat_dem <- datawizard::demean(dat, by = c("l", "L", "m"), select = c("y","z"))

all.equal(c(dat_dem$y_l_between), ave(dat$y, dat$l))
#> TRUE
all.equal(c(dat_dem$y_L_between), ave(dat$y, dat$L))
#> TRUE
all.equal(c(dat_dem$y_m_between), ave(dat$y, dat$m))
#> TRUE
```

2. The sum of an observation's "between" and "within" variables is equal to the original observation.

```R
all.equal(rowSums(dat_dem[grepl("^y_", colnames(dat_dem))]), dat$y)
#> TRUE
```

What we don't have is a mean-centered "within" variable, which we do get with a single grouping variable or with nested designs:

```R
mean(dat_dem$y_within)
#> -200
```

This is equal to $(-\bar{Y})\times (\text{number of grouping vars} - 1)$:

```R
-mean(dat$y) * (3-1)
#> -200
```
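
One way to see where this constant comes from, assuming the crossed "within" variable is computed as the observation minus each of its $K$ "between" variables (which is what desiderata 1 and 2 jointly imply). The notation $g_k(i)$ for the group of observation $i$ under grouping variable $k$ is introduced here for illustration only:

$$
\frac{1}{N}\sum_{i=1}^{N}\Big(Y_i - \sum_{k=1}^{K}\bar{Y}_{g_k(i)}\Big)
= \bar{Y} - \sum_{k=1}^{K}\frac{1}{N}\sum_{i=1}^{N}\bar{Y}_{g_k(i)}
= \bar{Y} - K\bar{Y}
= -(K-1)\,\bar{Y}
$$

The middle step uses the fact that, for any single grouping variable, averaging the group means over observations gives back the grand mean $\bar{Y}$. With $K = 1$ the result is 0, which matches the single-grouping-variable case.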

I think this is something we want for consistency (typically "within" is considered to be automatically double-centered). However, with a crossed design this cannot be achieved without compromising on desideratum 1 or 2.
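
To make the trade-off concrete, here is a minimal base-R sketch on a tiny made-up crossed data set (not the data above, and not a proposal for how `degroup()` should behave). It assumes "within" is defined as the observation minus its group means, and shows that shifting "within" to have mean zero breaks desideratum 2 by a constant. All object names (`d`, `b1`, `b2`, `w`, `w_centered`) are hypothetical.

```r
# Tiny balanced crossed design; values are invented for illustration
d <- expand.grid(g1 = c("a", "b"), g2 = c("A", "B", "C"))
d$y <- c(3, 5, 8, 6, 10, 16)

k <- 2                 # number of grouping variables
b1 <- ave(d$y, d$g1)   # "between" for g1: plain group means (desideratum 1)
b2 <- ave(d$y, d$g2)   # "between" for g2: plain group means (desideratum 1)
w  <- d$y - b1 - b2    # "within" chosen so the parts sum to y (desideratum 2)

# The "within" part is off-center by -(k - 1) * grand mean
all.equal(mean(w), -(k - 1) * mean(d$y))

# Shifting "within" by that constant centers it ...
w_centered <- w + (k - 1) * mean(d$y)
abs(mean(w_centered)) < 1e-12

# ... but then the parts no longer add up to the original observation
isTRUE(all.equal(b1 + b2 + w_centered, d$y))
```

The first two checks should come out true and the last one false: with the "between" variables held fixed at the group means, centering "within" and preserving the sum cannot both hold unless the grand mean is zero.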





