Nested random intercepts? #406

slamander · 2025-02-18T18:28:07Z

Hi sdmTMB team,

I'm curious if it is possible to use nested random intercepts. Below I create a nested grouping variable, made up of group and site, based on rounded coordinates. However, I can only implement them as crossed random intercepts, instead of nested ones:

library(tidyverse)
library(sdmTMB)
#> Warning: package 'sdmTMB' was built under R version 4.3.3

# create lower level grouping variable, "site":
sites <- pcod %>%
  mutate(
         x_round1 = round(X, -1) %>% as.factor(),
         y_round1 = round(Y, -1) %>% as.factor()) %>%
  group_by(x_round1, y_round1) %>%
  mutate(site = row_number()) 

# create higher level grouping variable, "group":
groups <- pcod %>%
  mutate(x_round2 = round(X, -2) %>% as.factor(),
         y_round2 = round(Y, -2) %>% as.factor()) %>%
  group_by(x_round2, y_round2) %>%
  mutate(group = row_number()) 

# incorporate site and group into the pcod data:
pcod_nesting <- pcod %>%
  mutate(x_round1 = round(X, -1) %>% as.factor(),
         y_round1 = round(Y, -1) %>% as.factor(),
         x_round2 = round(X, -2) %>% as.factor(),
         y_round2 = round(Y, -2) %>% as.factor()) %>%
  left_join(sites) %>%
  left_join(groups) %>%
  select(-c(x_round1, x_round2, y_round1, y_round2)) %>%
  mutate(site = as.factor(site),
         group = as.factor(site))

# create mesh
mesh <- make_mesh(pcod_nesting, c("X", "Y"), cutoff = 10)

# run sdmTMB with crossed random intercepts for "group" and "site":
cross <- sdmTMB(
  formula = present ~ depth_scaled + depth_scaled2 + (1 | group) + (1 | site),
  mesh = mesh, # can be omitted for a non-spatial model
  family = binomial(link = "logit"),
  spatial = "off",
  data = pcod_nesting)

# run sdmTMB with nested random intercepts for "group" and "site":
nest <- sdmTMB(
  formula = present ~ depth_scaled + depth_scaled2 + (1 | group / site),
  mesh = mesh, # can be omitted for a non-spatial model
  family = binomial(link = "logit"),
  spatial = "off",
  data = pcod_nesting)
#> Error: Random effect group column `site:group` is not a factor.

An error is thrown indicating that it recognizes the syntax of the nested term, but it is looking for a single column for the data?

Any help would be appreciated.

-Alex.

The text was updated successfully, but these errors were encountered:

seananderson · 2025-02-18T19:34:32Z

The nesting syntax is just a convenience as far as I know. It's useful if subgroup IDs are not unique across groups. But you can make those subgroup IDs unique. I believe you can always fit the same model by recoding the IDs. E.g.

library(dplyr)
set.seed(99)
d <- tibble::tibble(
  frog_dens = rlnorm(100), 
  vegetation = rnorm(100), 
  pond = gl(25, 4), # generates factor levels
  transect = rep(gl(2, 2), 25))
d <- d |> mutate(transect2 = gl(50, 2))
d
#> # A tibble: 100 × 5
#>    frog_dens vegetation pond  transect transect2
#>        <dbl>      <dbl> <fct> <fct>    <fct>    
#>  1     1.24      -1.53  1     1        1        
#>  2     1.62      -0.501 1     1        1        
#>  3     1.09      -1.21  1     2        2        
#>  4     1.56      -0.630 1     2        2        
#>  5     0.696     -1.45  2     1        3        
#>  6     1.13      -0.167 2     1        3        
#>  7     0.422      1.59  2     2        4        
#>  8     1.63      -0.230 2     2        4        
#>  9     0.695     -0.573 3     1        5        
#> 10     0.274      0.563 3     1        5        
#> # ℹ 90 more rows

(m1 <- lmer(log(frog_dens) ~ vegetation + (1 | pond/transect), data = d))
#> Linear mixed model fit by REML ['lmerMod']
#> Formula: log(frog_dens) ~ vegetation + (1 | pond/transect)
#>    Data: d
#> REML criterion at convergence: 265.1362
#> Random effects:
#>  Groups        Name        Std.Dev.
#>  transect:pond (Intercept) 0.2107  
#>  pond          (Intercept) 0.3100  
#>  Residual                  0.8243  
#> Number of obs: 100, groups:  transect:pond, 50; pond, 25
#> Fixed Effects:
#> (Intercept)   vegetation  
#>    -0.09739      0.07563

(m2 <- lmer(log(frog_dens) ~ vegetation + (1 | pond) + (1 | pond:transect), data = d))
#> Linear mixed model fit by REML ['lmerMod']
#> Formula: log(frog_dens) ~ vegetation + (1 | pond) + (1 | pond:transect)
#>    Data: d
#> REML criterion at convergence: 265.1362
#> Random effects:
#>  Groups        Name        Std.Dev.
#>  pond:transect (Intercept) 0.2107  
#>  pond          (Intercept) 0.3100  
#>  Residual                  0.8243  
#> Number of obs: 100, groups:  pond:transect, 50; pond, 25
#> Fixed Effects:
#> (Intercept)   vegetation  
#>    -0.09739      0.07563

(m3 <- lmer(log(frog_dens) ~ vegetation +  (1 | pond) + (1 | transect2), data = d))
#> Linear mixed model fit by REML ['lmerMod']
#> Formula: log(frog_dens) ~ vegetation + (1 | pond) + (1 | transect2)
#>    Data: d
#> REML criterion at convergence: 265.1362
#> Random effects:
#>  Groups    Name        Std.Dev.
#>  transect2 (Intercept) 0.2107  
#>  pond      (Intercept) 0.3100  
#>  Residual              0.8243  
#> Number of obs: 100, groups:  transect2, 50; pond, 25
#> Fixed Effects:
#> (Intercept)   vegetation  
#>    -0.09739      0.07563

^{Created on 2025-02-18 with reprex v2.1.1.9000}

On a related topic, we have random slopes working in a branch and should merge into main soon. I'm not sure if we get the nested syntax for free in that branch.

slamander · 2025-02-18T21:07:57Z

Thanks for getting back to me so quickly! I'm not sure if you were identifying my factor formatting or random intercept syntax as the focal issue... I should have also mentioned that the alternative syntax for a nested intercept fails.

But, I took your example and made fake coordinates so I could put it into a sdmTMB model. I get the same original error for the first two nested intercept syntax calls. Except, on the final call (m3), which uses a crossed random effect and your modified transect factor (transect2) it actually does run successfully! So, I'm curious, is this a work around--in which you can use the syntax of a crossed random intercept but get a nested random intercept in function?

library(tidyverse)
library(sdmTMB)

d <- tibble::tibble(
  x = runif(100, 70, 82),
  y = runif(100, 120, 150),
  frog_dens = rlnorm(100), 
  vegetation = rnorm(100), 
  pond = gl(25, 4), # generates factor levels
  transect = rep(gl(2, 2), 25))

d <- d |> mutate(transect2 = gl(50, 2))

mesh <- make_mesh(d, c("x", "y"), n_knots = 85); plot(mesh)

m1 <- sdmTMB(log(frog_dens) ~ vegetation + (1 | pond/transect), mesh = mesh, data = d)
#> Error: Random effect group column `transect:pond` is not a factor.

m2 <- sdmTMB(log(frog_dens) ~ vegetation + (1 | pond) + (1 | pond:transect), mesh = mesh, data = d)
#> Error: Random effect group column `transect:pond` is not a factor.

m3 <- sdmTMB(log(frog_dens) ~ vegetation +  (1 | pond) + (1 | transect2), mesh = mesh, data = d)

Any clarification about what's happening on my end would be great!

Best,

-Alex.

seananderson · 2025-02-18T21:30:35Z

I'm not sure if you were identifying my factor formatting or random intercept syntax as the focal issue

It's sort of both. A given factor coding requires a given random effect grouping syntax.

is this a work around--in which you can use the syntax of a crossed random intercept but get a nested random intercept in function?

You could think of it as a workaround to the identical model. The nested random effect syntax is just a convenience you can use if your random effects are coded as:

> dplyr::select(d, pond, transect)
   pond  transect
 1 1     1       
 2 1     1       
 3 1     2       
 4 1     2       
 5 2     1       
 6 2     1       
 7 2     2       
 8 2     2

instead of:

> dplyr::select(d, pond, transect2)
   pond  transect2
 1 1     1        
 2 1     1        
 3 1     2        
 4 1     2        
 5 2     3        
 6 2     3        
 7 2     4        
 8 2     4

I.e., the subgroups don't have unique IDs across groups.

A quick way to fix this would be to paste the group and subgroup IDs together into a new unique subgroup ID. That's what the pond:transect syntax above basically does by using the interaction syntax. And, yeah, that's probably not set up to work with our more simplistic parsing in sdmTMB.

So, these are all identical models:

set.seed(99)
d <- tibble::tibble(
  x = runif(100, 70, 82),
  y = runif(100, 120, 150),
  frog_dens = rlnorm(100), 
  vegetation = rnorm(100), 
  pond = gl(25, 4), # generates factor levels
  transect = rep(gl(2, 2), 25))
d <- d |> dplyr::mutate(transect2 = gl(50, 2))

# all identical models:
m1 <- glmmTMB::glmmTMB(log(frog_dens) ~ vegetation +  (1 | pond/transect), data = d)
m2 <- glmmTMB::glmmTMB(log(frog_dens) ~ vegetation +  (1 | pond) + (1 | transect2), data = d)
m_sdmTMB <- sdmTMB::sdmTMB(log(frog_dens) ~ vegetation +  (1 | pond) + (1 | transect2), spatial = "off", data = d)

logLik(m1)
#> 'log Lik.' -146.6231 (df=5)
logLik(m2)
#> 'log Lik.' -146.6231 (df=5)
logLik(m_sdmTMB)
#> 'log Lik.' -146.6231 (df=5)

^{Created on 2025-02-18 with reprex v2.1.1.9000}

So if you want a 'nested' random intercept in sdmTMB, just make sure to code your subgroups as unique.

slamander · 2025-02-19T18:25:28Z

Thank you for spelling this out for me! So would this be an example of using "natural" versus "implicit nesting" notation--in which the subgroupings is the only thing affecting the interaction variance? E.g. Schielzeth & Nakagawa, 2012.

Also, apologies for my misleading example--my real data do indeed have unique subgroupings. I hadn't considered making them unique in the example, as I assumed the issue arose elsewhere.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Nested random intercepts? #406

Nested random intercepts? #406

slamander commented Feb 18, 2025

seananderson commented Feb 18, 2025

slamander commented Feb 18, 2025 •

edited

Loading

seananderson commented Feb 18, 2025

slamander commented Feb 19, 2025

Nested random intercepts? #406

Nested random intercepts? #406

Comments

slamander commented Feb 18, 2025

seananderson commented Feb 18, 2025

slamander commented Feb 18, 2025 • edited Loading

seananderson commented Feb 18, 2025

slamander commented Feb 19, 2025

slamander commented Feb 18, 2025 •

edited

Loading