1️⃣ test the nonlinear term, if significant leave it in
2️⃣ if you have enough dfs, include the nonlinear term regardless of significance
3️⃣ never include nonlinear terms
4️⃣ comment
📃 source: onlinelibrary.wiley.com/doi/abs/10.100…
In the boy who cried wolf 🙆♂️🗣🐺, the villagers ☝️ first committed a Type 1 error (thinking the wolf was there when it was not!) then ✌️ they committed a Type 2 error (thinking the wolf was not there when it was!)
install.packages("rms") in R
library(rms)  # provides ols(), rcs(), and the anova() method used below
sim_1 <- function(){
  y <- rnorm(30)
  x <- rnorm(30)
  mod <- ols(y ~ rcs(x))
  if (anova(mod)[[" Nonlinear", "P"]] > 0.05){
    # if non-linearity is not "significant", remove terms
    mod <- ols(y ~ x)
  }
  anova(mod)[["x", "P"]]
}
test <- replicate(10000, sim_1())
mean(test <= 0.05)
When I run this, I get 0.0832 - uh oh! that’s definitely above our 0.05 target 🎯
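A quick sanity check on whether 0.0832 could just be simulation noise: a minimal sketch, assuming the 10,000 replicates are independent and that a well-calibrated procedure rejects exactly 5% of the time. The variable names here are mine, not from the thread.

```r
# Monte Carlo standard error of an estimated rejection rate.
# Assumption: independent replicates, true rejection rate equal to alpha.
n_sims <- 10000   # number of replicates used above
alpha  <- 0.05    # nominal Type 1 error rate
mc_se  <- sqrt(alpha * (1 - alpha) / n_sims)

# Rough 95% range for the observed rate if the procedure were calibrated
calibrated_range <- alpha + c(-1, 1) * qnorm(0.975) * mc_se

observed <- 0.0832  # rate reported above for sim_1
z_score  <- (observed - alpha) / mc_se

print(round(mc_se, 4))
print(round(calibrated_range, 3))
print(round(z_score, 1))
```

Under those assumptions the standard error is about 0.002, so a calibrated procedure should land roughly between 0.046 and 0.054. The observed 0.0832 sits about 15 standard errors above 0.05, which is hard to blame on chance.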
sim_2 <- function(){
  y <- rnorm(30)
  x <- rnorm(30)
  # always keep the nonlinear terms, regardless of "significance"
  mod <- ols(y ~ rcs(x))
  anova(mod)[["x", "P"]]
}
test <- replicate(10000, sim_2())
mean(test <= 0.05)
Now I get 0.0522 - much better!
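If you don't have rms installed, the same comparison can be sketched in base R. This is an analogue, not the code from the thread: I'm assuming a quadratic term from `poly(x, 2)` as a stand-in for the restricted cubic spline, and `sim_pretest_base()` and `sim_always_base()` are hypothetical names. The rejection rates will not match the numbers above exactly.

```r
# Base-R analogue of the two strategies.
# Assumption: poly(x, 2) replaces rcs(x), and lm() replaces ols().

# Overall F-test p-value for a fitted lm (all slope terms jointly)
overall_p <- function(fit) {
  f <- summary(fit)$fstatistic
  pf(f[["value"]], f[["numdf"]], f[["dendf"]], lower.tail = FALSE)
}

# Strategy 1: test the nonlinear term, drop it if not "significant"
sim_pretest_base <- function(n = 30) {
  y <- rnorm(n)
  x <- rnorm(n)
  full <- lm(y ~ poly(x, 2))
  # row 3 of the coefficient table is the quadratic term
  p_nonlinear <- summary(full)$coefficients[3, "Pr(>|t|)"]
  if (p_nonlinear > 0.05) {
    reduced <- lm(y ~ x)
    return(summary(reduced)$coefficients["x", "Pr(>|t|)"])
  }
  overall_p(full)
}

# Strategy 2: always keep the nonlinear term and test x as a whole
sim_always_base <- function(n = 30) {
  y <- rnorm(n)
  x <- rnorm(n)
  overall_p(lm(y ~ poly(x, 2)))
}

set.seed(1)  # arbitrary seed, only for reproducibility
rate_pretest <- mean(replicate(2000, sim_pretest_base()) <= 0.05)
rate_always  <- mean(replicate(2000, sim_always_base()) <= 0.05)
print(c(pretest = rate_pretest, always = rate_always))
```

Both functions simulate under the null (y and x are independent), so each printed value estimates a Type 1 error rate and can be compared against 0.05.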