effects coding #46
When using logitr with effects-coded categorical variables (contr.sum), the transformation from the categorical variable to dummy variables produces only one dummy column, irrespective of how many categorical variables are in the model. To solve this, a check should be done on the contrasts of the categorical variables, and an alternative to fastDummies should be used.
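As a rough sketch of that direction (plain base R, not logitr's actual internals): stats::model.matrix() honours whatever contrasts attribute is set on a factor, so it is one possible contrasts-aware alternative to expanding factors into per-level 0/1 dummies:

library(logitr)  # only for the yogurt example data
yogurt$brand <- as.factor(yogurt$brand)
contrasts(yogurt$brand) <- contr.sum(4)

# model.matrix() picks up the contr.sum contrasts and returns three
# effects-coded columns with -1/0/1 entries; a per-level dummy expansion
# (the fastDummies approach) would instead create one 0/1 column per
# level and never consult the contrasts attribute.
X <- model.matrix(~ brand, data = yogurt)[, -1]  # drop the intercept column
head(X)
colnames(X)
#> [1] "brand1" "brand2" "brand3"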
I can't reproduce this issue. Here is an example.
{logitr} uses dummy coding by default for categorical variables. In this case, you get 3 brand coefficients as expected:

library(logitr)
model <- logitr(
  data = yogurt, outcome = 'choice', obsID = 'obsID',
  pars = c('price', 'feat', 'brand')
)
coef(model)
#>      price      feat brandhiland brandweight brandyoplait
#> -0.3665546 0.4914392  -3.7154773  -0.6411384    0.7345195

Now if I use contr.sum to use effects coding, I still get 3 brand coefficients:

yogurt$brand <- as.factor(yogurt$brand)
contrasts(yogurt$brand) = contr.sum(4)
model <- logitr(
  data = yogurt, outcome = 'choice', obsID = 'obsID',
  pars = c('price', 'feat', 'brand')
)
coef(model)
#>      price      feat    brand1     brand2    brand3
#> -0.3665883 0.4913432 0.9055508 -2.8100654 0.2643329
Dear John,
I should have been more precise. Indeed, with the conditional logit model everything works fine. The problem arises with the mixed logit model, when categorical variables are given a random distribution.
Best wishes,
Karin
Ah yes, I see. When I include randPars with effects coding, it appears to be ignored. I just get back the same prior model results where brand is modeled with fixed parameters:

library(logitr)
yogurt$brand <- as.factor(yogurt$brand)
contrasts(yogurt$brand) = contr.sum(4)
model <- logitr(
  data = yogurt, outcome = 'choice', obsID = 'obsID',
  pars = c('price', 'feat', 'brand'),
  randPars = c(brand = 'n')
)
coef(model)
#>      price      feat    brand1     brand2    brand3
#> -0.3665883 0.4913432 0.9055508 -2.8100654 0.2643329

I believe this is probably a pretty small issue in the code. It looks like it might be rooted in the names of the variables changing when using effects coding. I'll look into it.
I may also show this as an example in the documentation for those who want to use different coding schemes.
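The name change is easy to see outside of logitr with base R's model.matrix(): under the default dummy coding the expanded columns are named after the factor levels, whereas under contr.sum they get generic numbered names, which is presumably what any internal matching of randPars entries against column names trips over:

library(logitr)  # for the yogurt data
yogurt$brand <- as.factor(yogurt$brand)

# Default (treatment/dummy) coding: columns are named after the levels.
colnames(model.matrix(~ brand, data = yogurt))
#> [1] "(Intercept)"  "brandhiland"  "brandweight"  "brandyoplait"

# Effects (sum) coding: columns get generic numbered names instead.
contrasts(yogurt$brand) <- contr.sum(4)
colnames(model.matrix(~ brand, data = yogurt))
#> [1] "(Intercept)" "brand1"      "brand2"      "brand3"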
I get one standard deviation estimated, but not the others.
It would be great if that example were added.
I am sending you my code, run on my own simulated data (which I generated for a tutorial we are currently writing).
Best wishes,
Karin
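A quick way to check which random-parameter standard deviations actually made it into a fitted model is to filter the coefficient names. This assumes, as in recent logitr versions, that the standard deviations are reported with an "sd_" prefix; adjust the pattern if your version names them differently:

ests <- coef(model)               # model from the mixed logit attempt above
ests[grepl("^sd_", names(ests))]  # standard deviations of the random parameters

With the effects-coded model above this comes back empty, matching the output in which no sd parameters appear at all.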