---
title: 'Computer Whisperers: Stepwise Regression Analyses'
author: "Chantel S. Prat, Tara M. Madhyastha, Malayka J. Mottarella, Chu-Hsuan Kuo"
date: "`r format(Sys.time(), '%d %B, %Y')`"
output:
  pdf_document:
    toc: yes
  html_document:
    df_print: paged
    toc: yes
---
\pagebreak
# Intro
This R Markdown file contains our stepwise regression analyses predicting 1) Learning Rate, 2) Programming Accuracy, and 3) Declarative Knowledge (please see the corresponding readme.txt file in our GitHub repository for detailed information on our dependent variables and our predictors).
For our stepwise regressions, we used the olsrr package (version 0.5.2), developed by Aravind Hebbali:
https://www.rdocumentation.org/packages/olsrr/versions/0.5.2
Specifically, the stepwise regression method we chose (ols_step_both_p) builds a "regression model from a set of candidate predictor variables by entering and removing predictors based on p values, in a stepwise manner until there is no variable left to enter or remove any more." Missing values are removed in a listwise manner before the stepwise regression is run for each dependent variable. Only participants with an EEG Usability score of 1 or greater (indicating that they had at least one detectable alpha peak in the O2 channel) were included in the analyses.
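For readers unfamiliar with the function, the sketch below (not evaluated) shows the general form of the call; `full_model` and `analysis_data` are placeholder names, and the entry/removal thresholds shown are the defaults given in the olsrr documentation linked above (argument names may differ in other olsrr versions).
```{r, eval=FALSE}
# Sketch only (not run): ols_step_both_p() takes a full lm() fit and performs
# stepwise selection, entering predictors whose p value falls below `pent` and
# removing predictors whose p value rises above `prem`.
full_model <- lm(DV ~ ., data = analysis_data)  # placeholder data frame
ols_step_both_p(full_model, pent = 0.1, prem = 0.3, details = TRUE)
```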
Finally, Bayesian Information Criterion (BIC) values are reported for every step of the regression analysis for each dependent variable, up to the final model selected by the stepwise regression. BIC values are also reported for models that add one of the excluded predictors to the variables in the final model, allowing those models to be compared against the final model selected by the stepwise regression (lower BIC values indicate a better trade-off between fit and model complexity).
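The BIC values reported below come from R's built-in BIC() function; as a point of reference, the sketch below (not evaluated, and using the built-in mtcars data rather than our study data) checks that, for an lm fit, BIC() equals -2 times the log-likelihood plus log(n) times the number of estimated parameters.
```{r, eval=FALSE}
# Illustrative check on built-in data: for an lm() fit, BIC() equals
# -2*logLik + log(n)*df, where df counts all estimated parameters
# (intercept, slopes, and the residual variance).
toy <- lm(mpg ~ wt + hp, data = mtcars)
BIC(toy)
-2 * as.numeric(logLik(toy)) + log(nobs(toy)) * attr(logLik(toy), "df")
```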
```{r setup, include=FALSE}
knitr::opts_chunk$set(warning=FALSE, message=FALSE)
library(tidyverse, warn.conflicts=FALSE, quietly=TRUE)
library(knitr)
library(olsrr)
# Keep only participants with a usable EEG recording (EEG_Usability >= 1,
# i.e., at least one detectable alpha peak in the O2 channel).
rawdata <- read.csv("Computer_Whisperers_Stepwise_Regression_Data_for_pub.csv") %>%
  filter(EEG_Usability >= 1)
```
\pagebreak
# DV #1: Learning Rate
## Predictors:
* Fluid Intelligence
* Language Aptitude
* Numeracy
* Working Memory Updating
* Working Memory Span
* Right Fronto-Temporal Beta Power
```{r, include=FALSE}
# Listwise deletion: keep only complete cases for this DV and its predictors.
learning_rate <- rawdata[, c("Learning_Rate", "Fluid_Intelligence", "Language_Aptitude", "Numeracy",
                             "Working_Memory_Updating", "Working_Memory_Span",
                             "Right_FrontoTemporal_Beta_Power")] %>% na.omit()
```
## Stepwise Regression: Predicting Learning Rate
All six (6) predictors are entered as candidates in the stepwise regression. The final model consists of Language Aptitude, Fluid Intelligence, Right Fronto-Temporal Beta Power, and Numeracy.
```{r, echo=FALSE}
# Fit the full model, then run p-value-based stepwise selection (ols_step_both_p).
learning_rate_model <- lm(Learning_Rate ~ ., data = learning_rate)
learning_rate_model_summary <- ols_step_both_p(learning_rate_model, details = TRUE)
```
## Obtaining Bayesian Information Criterion Values
m1 through m4 are the models produced by each step of the stepwise regression, with m4 being the final model and having the best (lowest) BIC value. m5a (+ Working Memory Span) and m5b (+ Working Memory Updating) add one of the remaining predictors to the final model; both have worse (higher) BIC values than m4.
```{r, echo=FALSE}
m1_LR <- lm(Learning_Rate ~ Language_Aptitude, learning_rate)
m2_LR <- lm(Learning_Rate ~ Language_Aptitude + Fluid_Intelligence, learning_rate)
m3_LR <- lm(Learning_Rate ~ Language_Aptitude + Fluid_Intelligence + Right_FrontoTemporal_Beta_Power, learning_rate)
m4_LR <- lm(Learning_Rate ~ Language_Aptitude + Fluid_Intelligence + Right_FrontoTemporal_Beta_Power + Numeracy, learning_rate)
m5a_LR <- lm(Learning_Rate ~ Language_Aptitude + Fluid_Intelligence + Right_FrontoTemporal_Beta_Power + Numeracy + Working_Memory_Span, learning_rate)
m5b_LR <- lm(Learning_Rate ~ Language_Aptitude + Fluid_Intelligence + Right_FrontoTemporal_Beta_Power + Numeracy + Working_Memory_Updating, learning_rate)
kable(BIC(m1_LR, m2_LR, m3_LR, m4_LR, m5a_LR, m5b_LR))
```
```{r, include=FALSE}
#These summaries were used to obtain R squared and p-values for models m5a and m5b.
summary(m5a_LR)
summary(m5b_LR)
```
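The R squared and p-values for the comparison models are obtained from summary() calls (hidden in the knit output); the helper below is a hypothetical convenience function (not used in the reported analyses) that extracts the same statistics from an lm fit programmatically.
```{r, eval=FALSE}
# Hypothetical helper (not part of the reported analyses): extract the
# R squared and overall F-test p-value from an lm() fit.
model_fit_stats <- function(model) {
  s <- summary(model)
  f <- s$fstatistic  # named vector: value, numdf, dendf
  data.frame(
    r_squared = s$r.squared,
    p_value   = pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE)
  )
}
# e.g., model_fit_stats(m5a_LR)
```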
\pagebreak
# DV #2: Programming Accuracy
## Predictors:
* Fluid Intelligence
* Language Aptitude
* Numeracy
* Working Memory Updating
* Working Memory Span
* Left Posterior Beta Coherence
* Left Posterior Theta Coherence
```{r, include=FALSE}
# Listwise deletion: keep only complete cases for this DV and its predictors.
prog_accuracy <- rawdata[, c("Programming_Accuracy", "Fluid_Intelligence", "Language_Aptitude", "Numeracy",
                             "Working_Memory_Updating", "Working_Memory_Span",
                             "Left_Posterior_Beta_Coherence", "Left_Posterior_Theta_Coherence")] %>% na.omit()
```
## Stepwise Regression: Predicting Programming Accuracy
All seven (7) predictors are entered as candidates in the stepwise regression. The final model consists of Fluid Intelligence, Language Aptitude, and Working Memory Updating.
```{r, echo=FALSE}
# Fit the full model, then run p-value-based stepwise selection (ols_step_both_p).
prog_accuracy_model <- lm(Programming_Accuracy ~ ., data = prog_accuracy)
prog_accuracy_model_summary <- ols_step_both_p(prog_accuracy_model, details = TRUE)
```
## Obtaining Bayesian Information Criterion Values
m1 through m3 are the models produced by each step of the stepwise regression, with m3 being the final model and having the best (lowest) BIC value. m4a (+ Numeracy), m4b (+ Working Memory Span), m4c (+ Left Posterior Beta Coherence), and m4d (+ Left Posterior Theta Coherence) add one of the remaining predictors to the final model; all have worse (higher) BIC values than m3.
```{r, echo=FALSE}
m1_PA <- lm(Programming_Accuracy ~ Fluid_Intelligence, prog_accuracy)
m2_PA <- lm(Programming_Accuracy ~ Fluid_Intelligence + Language_Aptitude, prog_accuracy)
m3_PA <- lm(Programming_Accuracy ~ Fluid_Intelligence + Language_Aptitude + Working_Memory_Updating, prog_accuracy)
m4a_PA <- lm(Programming_Accuracy ~ Fluid_Intelligence + Language_Aptitude + Working_Memory_Updating + Numeracy, prog_accuracy)
m4b_PA <- lm(Programming_Accuracy ~ Fluid_Intelligence + Language_Aptitude + Working_Memory_Updating + Working_Memory_Span, prog_accuracy)
m4c_PA <- lm(Programming_Accuracy ~ Fluid_Intelligence + Language_Aptitude + Working_Memory_Updating + Left_Posterior_Beta_Coherence, prog_accuracy)
m4d_PA <- lm(Programming_Accuracy ~ Fluid_Intelligence + Language_Aptitude + Working_Memory_Updating + Left_Posterior_Theta_Coherence, prog_accuracy)
kable(BIC(m1_PA, m2_PA, m3_PA, m4a_PA, m4b_PA, m4c_PA, m4d_PA))
```
```{r, include=FALSE}
#These summaries were used to obtain R squared and p-values for models m4a through m4d.
summary(m4a_PA)
summary(m4b_PA)
summary(m4c_PA)
summary(m4d_PA)
```
\pagebreak
# DV #3: Declarative Knowledge
## Predictors:
* Fluid Intelligence
* Language Aptitude
* Numeracy
* Working Memory Updating
* Inhibitory Control
* Vocabulary
* Right Posterior Low Gamma Power
* Right Fronto-Temporal Low Gamma Power
```{r, include=FALSE}
# Listwise deletion: keep only complete cases for this DV and its predictors.
declarative <- rawdata[, c("Declarative_Knowledge", "Fluid_Intelligence", "Language_Aptitude", "Numeracy",
                           "Working_Memory_Updating", "Inhibitory_Control", "Vocabulary",
                           "Right_Posterior_Low_Gamma_Power", "Right_FrontoTemporal_Low_Gamma_Power")] %>% na.omit()
```
## Stepwise Regression: Predicting Declarative Knowledge
All eight (8) predictors are entered as candidates in the stepwise regression. The final model consists of Fluid Intelligence and Right Fronto-Temporal Low Gamma Power.
```{r, echo=FALSE}
# Fit the full model, then run p-value-based stepwise selection (ols_step_both_p).
dec_knowledge_model <- lm(Declarative_Knowledge ~ ., data = declarative)
dec_knowledge_model_summary <- ols_step_both_p(dec_knowledge_model, details = TRUE)
```
## Obtaining Bayesian Information Criterion Values
m1 and m2 are the models produced by each step of the stepwise regression, with m2 being the final model and having the best (lowest) BIC value. m3a (+ Language Aptitude), m3b (+ Numeracy), m3c (+ Working Memory Updating), m3d (+ Inhibitory Control), m3e (+ Vocabulary), and m3f (+ Right Posterior Low Gamma Power) add one of the remaining predictors to the final model; all have worse (higher) BIC values than m2.
```{r, echo=FALSE}
m1_DK <- lm(Declarative_Knowledge ~ Fluid_Intelligence, declarative)
m2_DK <- lm(Declarative_Knowledge ~ Fluid_Intelligence + Right_FrontoTemporal_Low_Gamma_Power, declarative)
m3a_DK <- lm(Declarative_Knowledge ~ Fluid_Intelligence + Right_FrontoTemporal_Low_Gamma_Power + Language_Aptitude, declarative)
m3b_DK <- lm(Declarative_Knowledge ~ Fluid_Intelligence + Right_FrontoTemporal_Low_Gamma_Power + Numeracy, declarative)
m3c_DK <- lm(Declarative_Knowledge ~ Fluid_Intelligence + Right_FrontoTemporal_Low_Gamma_Power + Working_Memory_Updating, declarative)
m3d_DK <- lm(Declarative_Knowledge ~ Fluid_Intelligence + Right_FrontoTemporal_Low_Gamma_Power + Inhibitory_Control, declarative)
m3e_DK <- lm(Declarative_Knowledge ~ Fluid_Intelligence + Right_FrontoTemporal_Low_Gamma_Power + Vocabulary, declarative)
m3f_DK <- lm(Declarative_Knowledge ~ Fluid_Intelligence + Right_FrontoTemporal_Low_Gamma_Power + Right_Posterior_Low_Gamma_Power, declarative)
kable(BIC(m1_DK, m2_DK, m3a_DK, m3b_DK, m3c_DK, m3d_DK, m3e_DK, m3f_DK))
```
```{r, include=FALSE}
#These summaries were used to obtain R squared and p-values for models m3a through m3f.
summary(m3a_DK)
summary(m3b_DK)
summary(m3c_DK)
summary(m3d_DK)
summary(m3e_DK)
summary(m3f_DK)
```