December 2015
# Development version from GitHub
install.packages("devtools")
devtools::install_github("garthtarr/mplot")
require(mplot)
# Or the released version from CRAN
install.packages("mplot")
vis() for variable inclusion and model stability plots
af() for the adaptive fence
bglmnet() for bootstrapping glmnet
mplot() for an interactive shiny interface
Aim: To provide scientists/researchers/analysts with tools that give them more information about the model selection choices that they are making.
Concept of model stability independently introduced by Meinshausen and Bühlmann (2010) and Müller and Welsh (2010) for different linear regression situations.
Key idea: small changes should have small effects
With important special cases:
Variable | Description |
---|---|
age | Age |
sex | Gender |
bmi | Body mass index |
map | Mean arterial pressure (average blood pressure) |
tc | Total cholesterol (mg/dL) |
ldl | Low-density lipoprotein ("bad" cholesterol) |
hdl | High-density lipoprotein ("good" cholesterol) |
tch | Blood serum measurement |
ltg | Blood serum measurement |
glu | Blood serum measurement (glucose?) |
y | A quantitative measure of disease progression one year after baseline |
Aim: To visualise inclusion probabilities as a function of the penalty multiplier \(\lambda\in [0,2\log(n)]\).
require(mplot)
lm.d = lm(y ~ ., data = diabetes)
vis.d = vis(lm.d, B = 200)
plot(vis.d, which = "vip")
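To make the idea behind the variable inclusion plot concrete, here is a minimal base R sketch on simulated data. It is not mplot's implementation: the criterion (-2 log-likelihood plus \(\lambda\) times the number of slopes), the plain resampling bootstrap, and the names `sim`, `gic_select` and `incl_prob` are all assumptions made for illustration.

```r
# Sketch: bootstrap inclusion probabilities over a grid of penalty multipliers.
# Simulated data and all helper names are illustrative, not part of mplot.
set.seed(1)
n <- 100
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
sim$y <- 1 + 2 * sim$x1 + rnorm(n)   # only x1 has a true effect

vars <- c("x1", "x2", "x3")
# Every subset of the candidate variables: one row per model, TRUE = included
subsets <- as.matrix(expand.grid(rep(list(c(FALSE, TRUE)), length(vars))))
colnames(subsets) <- vars

# Return the subset minimising -2 * logLik + lambda * (number of slopes)
gic_select <- function(data, lambda) {
  gic <- apply(subsets, 1, function(keep) {
    rhs <- if (any(keep)) paste(vars[keep], collapse = " + ") else "1"
    fit <- lm(as.formula(paste("y ~", rhs)), data = data)
    as.numeric(-2 * logLik(fit)) + lambda * sum(keep)
  })
  subsets[which.min(gic), ]
}

lambdas <- seq(0, 2 * log(n), length.out = 5)
B <- 50
incl_prob <- sapply(lambdas, function(lambda) {
  picks <- replicate(B, {
    boot <- sim[sample(n, replace = TRUE), ]   # resample rows with replacement
    gic_select(boot, lambda)
  })
  rowMeans(picks)   # share of resamples in which each variable was selected
})
colnames(incl_prob) <- round(lambdas, 2)
print(incl_prob)
```

Each row of `incl_prob` traces one variable across the penalty grid. Plotting the rows against `lambdas` gives a rough analogue of the curves drawn by `plot(vis.d, which = "vip")`.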
plot(vis.d, which = "lvk", highlight = "hdl")
plot(vis.d, which = "lvk", highlight = "bmi")
Aim: To add value to the loss against size plots using a symbol size proportional to a measure of stability.
plot(vis.d, which = "boot", highlight = "ltg")
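One way to attach a stability measure to each point of a loss against size plot is to count how often a model is the best of its size across bootstrap resamples. The sketch below does this in base R on simulated data. The loss (-2 log-likelihood), the resampling scheme and the names `sim`, `loss`, `wins` and `stab` are assumptions for illustration, not a description of what `vis()` does internally.

```r
# Sketch: bootstrap frequency with which each model is the best of its size.
# Simulated data and all helper names are illustrative, not part of mplot.
set.seed(3)
n <- 100
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
sim$y <- 1 + 2 * sim$x1 + rnorm(n)

vars <- c("x1", "x2", "x3")
subsets <- as.matrix(expand.grid(rep(list(c(FALSE, TRUE)), length(vars))))
colnames(subsets) <- vars
size <- rowSums(subsets)   # number of slopes in each candidate model

# Loss of every candidate model on a data set: -2 * log-likelihood
loss <- function(data) {
  apply(subsets, 1, function(keep) {
    rhs <- if (any(keep)) paste(vars[keep], collapse = " + ") else "1"
    as.numeric(-2 * logLik(lm(as.formula(paste("y ~", rhs)), data = data)))
  })
}

B <- 100
wins <- numeric(nrow(subsets))
for (b in seq_len(B)) {
  l <- loss(sim[sample(n, replace = TRUE), ])
  for (k in unique(size)) {
    idx <- which(size == k)
    best <- idx[which.min(l[idx])]   # best model of size k in this resample
    wins[best] <- wins[best] + 1
  }
}

stab <- data.frame(
  model = apply(subsets, 1, function(keep) paste(c("1", vars[keep]), collapse = " + ")),
  size = size,
  loss = loss(sim),
  freq = wins / B
)
print(stab[order(stab$size, -stab$freq), ])
```

A bubble plot then follows from something like `plot(stab$size, stab$loss, cex = 1 + 4 * stab$freq)`, where larger symbols mark models that win their size class more consistently.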
The fence is based around the inequality: \[Q(\alpha) \leq Q(\alpha_f) + c .\]
Jiang et al. (2008)
Source: Jiang, Nguyen, and Rao (2009)
af.d = af(lm.d, B = 200, n.c = 100, c.max = 100)
plot(af.d)
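For a fixed value of \(c\), the fence inequality can be applied directly: keep every model whose lack of fit is within \(c\) of the full model, then prefer the smallest of those. The base R sketch below illustrates that single step on simulated data. Taking \(Q\) to be -2 log-likelihood, breaking ties by smallest \(Q\), and the names `sim`, `Q_full` and `fence_select` are assumptions for illustration. The adaptive choice of \(c\) made by `af()` is not reproduced here.

```r
# Sketch: fence selection for a fixed c over all subsets of three variables.
# Simulated data and all helper names are illustrative, not part of mplot.
set.seed(2)
n <- 100
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
sim$y <- 1 + 2 * sim$x1 + rnorm(n)

vars <- c("x1", "x2", "x3")
subsets <- as.matrix(expand.grid(rep(list(c(FALSE, TRUE)), length(vars))))
colnames(subsets) <- vars
size <- rowSums(subsets)

# Q(alpha): lack of fit of each model, here -2 * log-likelihood (assumed choice)
Q <- apply(subsets, 1, function(keep) {
  rhs <- if (any(keep)) paste(vars[keep], collapse = " + ") else "1"
  as.numeric(-2 * logLik(lm(as.formula(paste("y ~", rhs)), data = sim)))
})
Q_full <- Q[size == length(vars)]   # Q(alpha_f), the full model

fence_select <- function(c_val) {
  inside <- which(Q <= Q_full + c_val)                    # models inside the fence
  smallest <- inside[size[inside] == min(size[inside])]   # most parsimonious ones
  best <- smallest[which.min(Q[smallest])]                # ties broken by smallest Q
  vars[subsets[best, ]]
}

fence_select(0)     # only the full model satisfies the inequality
fence_select(10)    # a wider fence admits smaller models
```

Small values of `c_val` keep the selection at the full model, while very large values let the intercept-only model through. The interesting behaviour lies in between, which is why the choice of \(c\) matters.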
Tibshirani (1996) did regression with an \(L_1\) norm penalty and called it the lasso (least absolute shrinkage and selection operator).
The lasso parameter estimates are obtained by minimising the residual sum of squares subject to the constraint that \[\sum_j |\beta_j| \leq t.\]
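The constrained problem is often written in an equivalent penalised form, which is the version most software works with. As a sketch, using \(\gamma \geq 0\) as an assumed symbol for the penalty weight (chosen to avoid a clash with the penalty multiplier \(\lambda\) used earlier) and \(x_{ij}\) for the predictor values:
\[\hat{\beta}(\gamma) = \arg\min_{\beta} \Big\{ \sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_j x_{ij}\beta_j \Big)^2 + \gamma \sum_j |\beta_j| \Big\}.\]
Each bound \(t\) corresponds to some \(\gamma\): a smaller \(t\) means a larger \(\gamma\), and more coefficients are shrunk exactly to zero.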
require(lars)
x = as.matrix(subset(diabetes, select = -y))
y = as.matrix(subset(diabetes, select = y))
art.lars = lars(x, y)  # lasso path fit (default lars() settings assumed)
par(mar = c(2.7, 2.7, 1, 2), mgp = c(1.5, 0.5, 0), bg = "transparent")
plot(art.lars, xvar = "step", breaks = FALSE, lwd = 2)
plot(art.lars, xvar = "step", plottype = "Cp", lwd = 2)
bgn.g = bglmnet(lm.d)
plot(bgn.g, which = "boot", highlight = "ltg")
bgn.d = bglmnet(lm.d)
plot(bgn.d, which = "vip")
Slides: garthtarr.com/pres/hobart2015
sessionInfo()
## R version 3.2.2 (2015-08-14)
## Platform: x86_64-apple-darwin13.4.0 (64-bit)
## Running under: OS X 10.10.5 (Yosemite)
##
## locale:
## [1] en_AU.UTF-8/en_AU.UTF-8/en_AU.UTF-8/C/en_AU.UTF-8/en_AU.UTF-8
##
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base
##
## other attached packages:
## [1] lars_1.2    mplot_0.7.7 knitr_1.11
##
## loaded via a namespace (and not attached):
##  [1] Rcpp_0.12.2          codetools_0.2-14     digest_0.6.8
##  [4] foreach_1.4.3        mime_0.4             R6_2.1.1
##  [7] xtable_1.8-0         formatR_1.2.1        magrittr_1.5
## [10] evaluate_0.8         stringi_1.0-1        googleVis_0.5.10
## [13] rmarkdown_0.8.1      RJSONIO_1.3-0        iterators_1.0.8
## [16] tools_3.2.2          stringr_1.0.0        shiny_0.12.2
## [19] httpuv_1.3.3         yaml_2.1.13          parallel_3.2.2
## [22] shinydashboard_0.5.1 htmltools_0.2.6
Jiang, Jiming, Thuan Nguyen, and J. Sunil Rao. 2009. “A Simplified Adaptive Fence Procedure.” Statistics & Probability Letters 79 (5): 625–29. doi:10.1016/j.spl.2008.10.014.
Jiang, Jiming, J. Sunil Rao, Zhonghua Gu, and Thuan Nguyen. 2008. “Fence Methods for Mixed Model Selection.” The Annals of Statistics 36 (4): 1669–92. doi:10.1214/07-AOS517.
Meinshausen, N, and P Bühlmann. 2010. “Stability Selection.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 72 (4): 417–73. doi:10.1111/j.1467-9868.2010.00740.x.
Murray, K, S Heritier, and S Müller. 2013. “Graphical Tools for Model Selection in Generalized Linear Models.” Statistics in Medicine 32 (25): 4438–51. doi:10.1002/sim.5855.
Müller, S, and AH Welsh. 2010. “On Model Selection Curves.” International Statistical Review 78 (2): 240–56. doi:10.1111/j.1751-5823.2010.00108.x.
Tibshirani, Robert. 1996. “Regression Shrinkage and Selection via the Lasso.” Journal of the Royal Statistical Society: Series B (Methodological) 58 (1): 267–88.