- Shiny: the democratisation of R
- Graphical network example
- The mplot package
- Variable inclusion plots
- Model stability plots
- Bootstrapping the lasso
- Diabetes data example
- Technology involved
The mplot package
I wanted a way to interactively explore the results of a mega-simulation study from my thesis.
There's a button in RStudio that lets you push a Shiny app to their server.
shiny::runGitHub('network', 'garthtarr')
More info: http://shiny.rstudio.com/articles/deployment-local.html
require(mplot)
lm.art = lm(y ~ ., data = artificialeg)
 | Estimate | Std. Error | t value | Pr(>|t|) |
---|---|---|---|---|
(Intercept) | -0.10 | 0.33 | -0.31 | 0.76 |
x1 | 0.64 | 0.69 | 0.92 | 0.36 |
x2 | 0.26 | 0.62 | 0.42 | 0.68 |
x3 | -0.51 | 1.24 | -0.41 | 0.68 |
x4 | -0.30 | 0.25 | -1.18 | 0.24 |
x5 | 0.36 | 0.60 | 0.59 | 0.56 |
x6 | -0.54 | 0.96 | -0.56 | 0.58 |
x7 | -0.43 | 0.63 | -0.68 | 0.50 |
x8 | 0.15 | 0.62 | 0.24 | 0.81 |
x9 | 0.40 | 0.64 | 0.63 | 0.53 |
step.art = step(lm.art)
## Start:  AIC=79.3
## y ~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9
##
##        Df Sum of Sq    RSS    AIC
## - x8    1    0.2423 163.94 77.374
## - x3    1    0.6946 164.39 77.512
## - x2    1    0.7107 164.41 77.517
## - x6    1    1.3051 165.00 77.698
## - x5    1    1.4425 165.14 77.739
## - x9    1    1.6065 165.31 77.789
## - x7    1    1.8835 165.58 77.873
## - x1    1    3.4999 167.20 78.358
## - x4    1    5.7367 169.44 79.023
## <none>              163.70 79.301
##
## Step:  AIC=77.37
## y ~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x9
## ...
##
## Call:
## lm(formula = y ~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x9, data = artificialeg)
##
## Coefficients:
## (Intercept)           x1           x2           x3           x4
##     -0.1143       0.8019       0.4011      -0.8083      -0.3514
##          x5           x6           x7           x9
##      0.4927      -0.7738      -0.5772       0.5478
 | Estimate | Std. Error | t value | Pr(>|t|) |
---|---|---|---|---|
(Intercept) | -0.11 | 0.32 | -0.36 | 0.72 |
x1 | 0.80 | 0.19 | 4.13 | 0.00 |
x2 | 0.40 | 0.18 | 2.26 | 0.03 |
x3 | -0.81 | 0.19 | -4.22 | 0.00 |
x4 | -0.35 | 0.12 | -2.94 | 0.01 |
x5 | 0.49 | 0.19 | 2.55 | 0.01 |
x6 | -0.77 | 0.15 | -5.19 | 0.00 |
x7 | -0.58 | 0.15 | -3.94 | 0.00 |
x9 | 0.55 | 0.19 | 2.90 | 0.01 |
The stepwise procedure dropped x8, yet the true data generating process is:
\[y = 0.6x_8 + \varepsilon.\]
art.true = lm(y ~ x8, data = artificialeg)
 | Estimate | Std. Error | t value | Pr(>|t|) |
---|---|---|---|---|
(Intercept) | 0.03 | 0.29 | 0.11 | 0.91 |
x8 | 0.55 | 0.05 | 10.43 | 0.00 |
anova(art.true, step.art)
 | Res.Df | RSS | Df | Sum of Sq | F | Pr(>F) |
---|---|---|---|---|---|---|
1 | 48 | 200.97 | | | | |
2 | 41 | 163.94 | 7 | 37.03 | 1.32 | 0.2643 |

The F-test shows no significant lack of fit for the true model: the seven extra terms chosen by stepwise do not significantly improve on y ~ x8.
With important special cases: the AIC (\(\lambda = 2\)) and the BIC (\(\lambda = \log(n)\)).
To provide scientists/researchers with tools that give them more information about the model selection choices that they are making.
Concept of model stability independently introduced by Meinshausen and Bühlmann (2010) and Müller and Welsh (2010) for different linear regression situations.
To visualise inclusion probabilities as a function of the penalty multiplier \(\lambda\in [0,2\log(n)]\).
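As a rough illustration of the idea (a simplified, hypothetical version of what `vis()` automates, on made-up toy data rather than the package's internals), inclusion probabilities can be estimated by pairing the bootstrap with stepwise selection, using `step()`'s `k` argument as the penalty multiplier:

```r
# Toy sketch of variable-inclusion probabilities over a lambda grid.
set.seed(1)
n <- 50
x <- matrix(rnorm(n * 3), n, 3, dimnames = list(NULL, c("x1", "x2", "x3")))
y <- 0.6 * x[, "x1"] + rnorm(n)          # only x1 matters
dat <- data.frame(y, x)

lambdas <- seq(0.1, 2 * log(n), length.out = 10)
B <- 20                                   # bootstrap replicates
incl <- matrix(0, length(lambdas), 3, dimnames = list(NULL, colnames(x)))

for (b in seq_len(B)) {
  db <- dat[sample(n, replace = TRUE), ]  # bootstrap resample
  full <- lm(y ~ ., data = db)
  for (i in seq_along(lambdas)) {
    # step() minimises -2*logLik + k*p, so k plays the role of lambda
    sel <- step(full, k = lambdas[i], trace = 0)
    kept <- attr(terms(sel), "term.labels")
    incl[i, kept] <- incl[i, kept] + 1
  }
}
probs <- incl / B  # empirical inclusion probability per (lambda, variable)
```

Plotting each column of `probs` against `lambdas` gives the variable inclusion plot: signal variables stay near 1 across the grid while noise variables decay towards 0.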
require(mplot)
vis.art = vis(lm.art)
plot(vis.art, which = "vip")
plot(vis.art, which = "lvk")
plot(vis.art, which = "lvk", highlight = "x6")
To add value to the loss against size plots by choosing a symbol size proportional to a measure of stability.
plot(vis.art, which = "boot")
The fence (Jiang et al. 2008) is based around the inequality: \[ Q(\alpha) \leq Q(\alpha_f) + c \] where \(\alpha_f\) denotes the full model.
Source: Jiang, Nguyen, and Rao (2009)
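The constant \(c\) is chosen adaptively: over a grid of \(c\) values, bootstrap the data, record how often the most frequently selected model is chosen (call that proportion \(p^*\)), and pick \(c\) where \(p^*\) peaks. A simplified, hypothetical sketch with \(Q(\alpha)\) taken to be the RSS — the real procedure, as implemented in `af()`, treats the boundary cases (very small and very large \(c\), where \(p^*\) is trivially 1) more carefully:

```r
# Simplified adaptive-fence sketch: Q(alpha) = RSS, four candidate models.
set.seed(1)
n <- 50
x1 <- rnorm(n); x2 <- rnorm(n)
dat <- data.frame(y = 0.6 * x1 + rnorm(n), x1, x2)

models <- c("y ~ 1", "y ~ x1", "y ~ x2", "y ~ x1 + x2")
sizes  <- c(1, 2, 2, 3)  # number of parameters in each candidate
rss <- function(f, d) sum(resid(lm(as.formula(f), data = d))^2)

# Smallest model inside the fence Q(alpha) <= Q(alpha_f) + c.
# The full model attains the minimum RSS, so min(q) = Q(alpha_f).
fence_pick <- function(d, c) {
  q <- vapply(models, rss, numeric(1), d = d)
  inside <- which(q <= min(q) + c)
  inside[which.min(sizes[inside])]
}

cs <- seq(1, 40, by = 2)
B <- 30
pstar <- vapply(cs, function(c) {
  picks <- replicate(B, fence_pick(dat[sample(n, replace = TRUE), ], c))
  max(table(picks)) / B  # agreement on the modal model at this c
}, numeric(1))
c_star <- cs[which.max(pstar)]  # c at the peak of p*
```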
Uses the leaps and bestglm packages.

af.art = af(lm.art, B = 100, n.c = 50, n.cores = 2)
plot(af.art)
Tibshirani (1996) suggested doing regression with an \(L_1\) norm penalty and called it the lasso (least absolute shrinkage and selection operator).
The lasso parameter estimates are obtained by minimising the residual sum of squares subject to the constraint that \[\sum_j |\beta_j| \leq t.\]
"The encouraging results reported here suggest that absolute value constraints might prove to be useful in a wide variety of statistical estimation problems. Further study is needed to investigate these possibilities." - Tibshirani (1996)
require(lars)
x = as.matrix(subset(artificialeg, select = -y))
y = as.matrix(subset(artificialeg, select = y))
art.lars = lars(x, y)
##
## Call:
## lars(x = x, y = y)
## R-squared: 0.751
## Sequence of LASSO moves:
##      x8 x6 x1 x4 x3 x7 x5 x2 x5 x6 x5 x9 x6 x2 x2 x3 x3
## Var   8  6  1  4  3  7  5  2 -5 -6  5  9  6 -2  2 -3  3
## Step  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
bgn.art = bglmnet(lm.art, lambda = seq(0.05, 2.75, 0.1))
plot(bgn.art, which = "models")
bgn.art = bglmnet(lm.art)
plot(bgn.art, which = "variables")
install.packages("devtools")
require(devtools)
install_github("garthtarr/mplot", quick = TRUE)
require(mplot)
vignette("mplot-guide", package = "mplot")
vignette("mplot-stepwise", package = "mplot")
- vis() for VIP and model stability plots
- af() for the adaptive fence
- bglmnet() for bootstrapping glmnet
- mplot() for an interactive Shiny interface

Variable | Description |
---|---|
age | Age |
sex | Gender |
bmi | Body mass index |
map | Mean arterial pressure (average blood pressure) |
tc | Total cholesterol (mg/dL) |
ldl | Low-density lipoprotein ("bad" cholesterol) |
hdl | High-density lipoprotein ("good" cholesterol) |
tch | Blood serum measurement |
ltg | Blood serum measurement |
glu | Blood serum measurement (glucose?) |
y | A quantitative measure of disease progression one year after baseline |
Source: Efron, B., Hastie, T., Johnstone, I., and Tibshirani, R. (2004). "Least Angle Regression." The Annals of Statistics 32 (2): 407–451. doi:10.1214/009053604000000067.
- roxygen2 generates package documentation from inline comments
- devtools is an essential package for building and loading R packages

http://joey711.github.io/shiny-phyloseq
install.packages("shiny")
shiny::runGitHub("shiny-phyloseq", "joey711")
sessionInfo()
## R version 3.1.2 (2014-10-31)
## Platform: x86_64-apple-darwin13.4.0 (64-bit)
##
## locale:
## [1] en_AU.UTF-8/en_AU.UTF-8/en_AU.UTF-8/C/en_AU.UTF-8/en_AU.UTF-8
##
## attached base packages:
## [1] parallel  stats  graphics  grDevices  utils  datasets  methods
## [8] base
##
## other attached packages:
##  [1] lars_1.2         xtable_1.7-4     mplot_0.5.0      glmnet_1.9-8
##  [5] Matrix_1.1-5     mvoutlier_2.0.6  sgeostat_1.0-25  shiny_0.11.1
##  [9] googleVis_0.5.8  doParallel_1.0.8 iterators_1.0.7  foreach_1.4.2
## [13] bestglm_0.34     leaps_2.9        knitr_1.9
##
## loaded via a namespace (and not attached):
##  [1] cluster_2.0.1         codetools_0.2-11      colorspace_1.2-6
##  [4] DEoptimR_1.0-2        digest_0.6.8          evaluate_0.5.5
##  [7] formatR_1.0           GGally_0.5.0          ggplot2_1.0.0
## [10] grid_3.1.2            gtable_0.1.2          htmltools_0.2.6
## [13] httpuv_1.3.2          lattice_0.20-30       MASS_7.3-39
## [16] mime_0.2              munsell_0.4.2         mvtnorm_1.0-2
## [19] pcaPP_1.9-60          pls_2.4-3             plyr_1.8.1
## [22] proto_0.3-10          R6_2.0.1              Rcpp_0.11.5
## [25] reshape_0.8.5         reshape2_1.4.1        RJSONIO_1.3-0
## [28] rmarkdown_0.5.1       robCompositions_1.9.0 robustbase_0.92-3
## [31] rrcov_1.3-8           scales_0.2.4          stats4_3.1.2
## [34] stringr_0.6.2         tools_3.1.2           yaml_2.1.13
Efron, Bradley, Trevor Hastie, Iain Johnstone, and Robert Tibshirani. 2004. “Least Angle Regression.” The Annals of Statistics 32 (2): 407–51. doi:10.1214/009053604000000067.
Jiang, Jiming, Thuan Nguyen, and J. Sunil Rao. 2009. “A Simplified Adaptive Fence Procedure.” Statistics & Probability Letters 79 (5): 625–29. doi:10.1016/j.spl.2008.10.014.
Jiang, Jiming, J. Sunil Rao, Zhonghua Gu, and Thuan Nguyen. 2008. “Fence Methods for Mixed Model Selection.” The Annals of Statistics 36 (4): 1669–92. doi:10.1214/07-AOS517.
Meinshausen, N, and P Bühlmann. 2010. “Stability Selection.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 72 (4): 417–73. doi:10.1111/j.1467-9868.2010.00740.x.
Murray, K, S Heritier, and S Müller. 2013. “Graphical Tools for Model Selection in Generalized Linear Models.” Statistics in Medicine 32 (25): 4438–51. doi:10.1002/sim.5855.
Müller, S, and AH Welsh. 2010. “On Model Selection Curves.” International Statistical Review 78 (2): 240–56. doi:10.1111/j.1751-5823.2010.00108.x.
Tibshirani, Robert. 1996. “Regression Shrinkage and Selection via the Lasso.” Journal of the Royal Statistical Society: Series B (Methodological), 267–88. http://www.jstor.org/stable/10.2307/2346178.