Environment variable state breaks mice() #527

Open
mculbert opened this issue Dec 16, 2022 · 3 comments

@mculbert

A somewhat obscure error—if:

  1. The data frame passed to mice() contains a variable called state that consists of either: (a) a character vector (of potentially different values) or (b) only a single (repeated) value (of any type), AND
  2. There is a variable called state available in the environment (either the global environment or an attached data frame),

then mice() throws the error:

Error in s$it : $ operator is invalid for atomic vectors

Examples:

library(mice) # version 3.15.0
mynhanes <- mice::nhanes
state <- "zen"

mynhanes$state <- rnorm(25)
imp <- mice(mynhanes)  # No error

mynhanes$state <- sample(c("WA", "OR", "CA"), 25, replace = TRUE)
imp <- mice(mynhanes)  # Error

mynhanes$state <- 3.1415
imp <- mice(mynhanes)  # Error

rm(state)
imp <- mice(mynhanes)  # No error (warning about logged events)

attach(mynhanes)
imp <- mice(mynhanes)  # Error

The error is coming from here:

it = s$it,

because the call to ma_exists("state", ...) on either line 100 or 103 is apparently accessing the wrong variable in the environment through some kind of iterated search of parent environments here:
pos <- parent.frame(n = nn)
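
To illustrate the general failure mode, here is a minimal sketch of how a name-based search through calling frames can pick up an unrelated variable. The helpers `find_by_name()`, `logger()`, `run_ok()`, `run_early()`, and `user_session()` are hypothetical stand-ins for illustration and are not mice's actual code; the assumption is only that the lookup walks up frames with `parent.frame(n = nn)` until it finds something called `state`.

```r
# Hypothetical helper (NOT mice's code): walk up the calling frames and
# return the first object called `name` that is found.
find_by_name <- function(name, max_up = 5) {
  for (nn in seq_len(max_up)) {
    pos <- parent.frame(n = nn)
    if (exists(name, envir = pos, inherits = FALSE)) {
      return(get(name, envir = pos, inherits = FALSE))
    }
  }
  NULL
}

# Hypothetical logger that expects `state` to be a list with element `it`.
logger <- function() {
  s <- find_by_name("state")
  s$it
}

# Intended case: the caller owns a list called `state`.
run_ok <- function() {
  state <- list(it = 1L)
  logger()
}

# Problem case: the caller has no `state` of its own (yet), so the search
# continues upward and finds the user's character variable instead.
run_early <- function() logger()
user_session <- function() {
  state <- "zen"
  run_early()
}

ok  <- run_ok()
msg <- tryCatch(user_session(), error = function(e) conditionMessage(e))
print(ok)
print(msg)
```

In the second case the lookup returns the string `"zen"`, and applying `$` to an atomic vector produces the same kind of error message as reported above.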

The intended state variable (wherever it comes from) should perhaps be encapsulated a little more explicitly in a mice-specific data structure, rather than doing an open search of the environment. But, as I'm not familiar with mice()'s innards, I'm not sure what the best fix would be. Maybe it's as simple as renaming state to something a little less generic, like mice_internal_state_ so there is less likely to be a conflict with user variable names.
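
As a rough sketch of what "encapsulated more explicitly" could look like, the state could live in a dedicated environment that is only reached through accessor functions, so no search of calling frames is needed. The names `.pkg_state`, `set_state()`, and `get_state()` are made up for this example, and the fields `it` and `im` are only placeholders; this is not how mice is implemented.

```r
# Hypothetical sketch, not mice's implementation: a private environment
# holds the logging state, isolated from user variables.
.pkg_state <- new.env(parent = emptyenv())

set_state <- function(it, im) {
  assign("it", it, envir = .pkg_state)
  assign("im", im, envir = .pkg_state)
  invisible(NULL)
}

get_state <- function() {
  # mget() looks only in .pkg_state (inherits = FALSE by default)
  mget(c("it", "im"), envir = .pkg_state)
}

state <- "zen"              # a user variable with the same generic name
set_state(it = 2L, im = 1L)
s <- get_state()
print(s$it)                 # reads from the private environment, not from `state`
```

Because the accessors never look up the name `state`, a user variable or data column with that name cannot shadow the internal structure.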

@stefvanbuuren
Member

Thanks for noting.

mice uses a list named "state" for logging. I never realised that its name could clash with a variable named "state", which is quite common. Renaming it to something less used could be a quick and practical fix. Need to think about side effects renaming may have.

@dannychu1108

Hi, I was trying to use mice() on my dataset, which has no variable named "state", and I am still getting this error. How can I solve it?
It ran fine before, but today it suddenly throws the error.

@stefvanbuuren
Member

I would expect that calling rm(state) and not using attach() (or attach-like operations) should avoid the problem, as suggested by the original post.
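
A small check along these lines might help locate a conflicting variable before calling mice(). The helper `where_visible()` is a hypothetical name for this example; it uses only base R and reports every entry on the search path (global environment, attached data frames, packages) that contains an object with the given name.

```r
# Hypothetical helper: list the search-path entries that define `name`.
where_visible <- function(name) {
  hits <- vapply(
    search(),
    function(e) exists(name, envir = as.environment(e), inherits = FALSE),
    logical(1)
  )
  names(hits)[hits]
}

# Example with a throwaway variable in the global environment.
assign("zz_demo_var", 1, envir = globalenv())
found <- where_visible("zz_demo_var")
print(found)
rm("zz_demo_var", envir = globalenv())
```

Running `where_visible("state")` in the affected session should show whether the name comes from the global environment (remove it with rm()) or from an attached data frame (remove it with detach()).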

If this does not work for you, you might have hit a new problem case. I would then need a few more details to reproduce.
