Selecting models from lists and running anova()
I was making a notebook with a list of related models, like
m <- list(
mpg_wt = lm(mpg ~ wt, mtcars),
mpg_disp = lm(mpg ~ disp, mtcars),
mpg_disp_wt = lm(mpg ~ disp + wt, mtcars),
mpg_disp_wt_int = lm(mpg ~ disp * wt, mtcars),
hp_wt = lm(hp ~ wt, mtcars),
hp_disp = lm(hp ~ disp, mtcars),
hp_disp_wt = lm(hp ~ disp + wt, mtcars),
hp_disp_wt_int = lm(hp ~ disp * wt, mtcars)
)
I want to select a subset of models and anova() them, so I made a quick
tidyselect function and an S3 method method of lists with anova():
library(tidyverse)
#> Warning: package 'tibble' was built under R version 4.5.2
#> Warning: package 'tidyr' was built under R version 4.5.2
#> Warning: package 'readr' was built under R version 4.5.2
#> Warning: package 'purrr' was built under R version 4.5.2
#> Warning: package 'dplyr' was built under R version 4.5.2
#> Warning: package 'stringr' was built under R version 4.5.2
#> Warning: package 'lubridate' was built under R version 4.5.2
#> ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
#> ✔ dplyr 1.2.0 ✔ readr 2.2.0
#> ✔ forcats 1.0.1 ✔ stringr 1.6.0
#> ✔ ggplot2 4.0.2 ✔ tibble 3.3.1
#> ✔ lubridate 1.9.5 ✔ tidyr 1.3.2
#> ✔ purrr 1.2.1
#> ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
#> ✖ dplyr::filter() masks stats::filter()
#> ✖ dplyr::lag() masks stats::lag()
#> ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
list_select <- function(l, ...) {
pos <- tidyselect::eval_select(c(...), l)
rlang::set_names(l[pos], names(pos))
}
anova.list <- function(object) {
do.call(anova, unname(object))
}
m |>
list_select(starts_with("mpg")) |>
list_select(matches("disp")) |>
anova()
#> Analysis of Variance Table
#>
#> Model 1: mpg ~ disp
#> Model 2: mpg ~ disp + wt
#> Model 3: mpg ~ disp * wt
#> Res.Df RSS Df Sum of Sq F Pr(>F)
#> 1 30 317.16
#> 2 29 246.68 1 70.476 11.694 0.001942 **
#> 3 28 168.75 1 77.934 12.931 0.001227 **
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova(m$mpg_disp, m$mpg_disp_wt, m$mpg_disp_wt_int)
#> Analysis of Variance Table
#>
#> Model 1: mpg ~ disp
#> Model 2: mpg ~ disp + wt
#> Model 3: mpg ~ disp * wt
#> Res.Df RSS Df Sum of Sq F Pr(>F)
#> 1 30 317.16
#> 2 29 246.68 1 70.476 11.694 0.001942 **
#> 3 28 168.75 1 77.934 12.931 0.001227 **
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
list_select() is extremely basic. It’s based on the examples in the
tidyselect::eval_select() documentation.
For list_select(), the closest option available from purrr (the
tidyverse package for working on lists) is purrr::keep_at():
m |>
purrr::keep_at(function(x) startsWith(x, "mpg")) |>
str(max.level = 1)
#> List of 4
#> $ mpg_wt :List of 12
#> ..- attr(*, "class")= chr "lm"
#> $ mpg_disp :List of 12
#> ..- attr(*, "class")= chr "lm"
#> $ mpg_disp_wt :List of 12
#> ..- attr(*, "class")= chr "lm"
#> $ mpg_disp_wt_int:List of 12
#> ..- attr(*, "class")= chr "lm"
Leave a comment