- What's wrong with existing variable selection techniques?
- Variable inclusion plots
- Model stability plots
- Adaptive fence method
- The `mplot` package
- Examples with real data

require(mplot)
lm.art = lm(y ~ ., data = artificialeg)
| | Estimate | Std. Error | t value | Pr(>\|t\|) |
|---|---|---|---|---|
(Intercept) | -0.10 | 0.33 | -0.31 | 0.76 |
x1 | 0.64 | 0.69 | 0.92 | 0.36 |
x2 | 0.26 | 0.62 | 0.42 | 0.68 |
x3 | -0.51 | 1.24 | -0.41 | 0.68 |
x4 | -0.30 | 0.25 | -1.18 | 0.24 |
x5 | 0.36 | 0.60 | 0.59 | 0.56 |
x6 | -0.54 | 0.96 | -0.56 | 0.58 |
x7 | -0.43 | 0.63 | -0.68 | 0.50 |
x8 | 0.15 | 0.62 | 0.24 | 0.81 |
x9 | 0.40 | 0.64 | 0.63 | 0.53 |
step.art = step(lm.art)
## Start:  AIC=79.3
## y ~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9
##
##        Df Sum of Sq    RSS    AIC
## - x8    1    0.2423 163.94 77.374
## - x3    1    0.6946 164.39 77.512
## - x2    1    0.7107 164.41 77.517
## - x6    1    1.3051 165.00 77.698
## - x5    1    1.4425 165.14 77.739
## - x9    1    1.6065 165.31 77.789
## - x7    1    1.8835 165.58 77.873
## - x1    1    3.4999 167.20 78.358
## - x4    1    5.7367 169.44 79.023
## <none>              163.70 79.301
##
## Step:  AIC=77.37
## y ~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x9
## ...
##
## Call:
## lm(formula = y ~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x9, data = artificialeg)
##
## Coefficients:
## (Intercept)          x1          x2          x3          x4
##     -0.1143      0.8019      0.4011     -0.8083     -0.3514
##          x5          x6          x7          x9
##      0.4927     -0.7738     -0.5772      0.5478
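As a hedged aside (not part of the original slides), the same backward search can be run with the BIC penalty by setting `k = log(n)` in place of the AIC default `k = 2`; BIC penalises model size more heavily and typically returns a smaller model.

```r
# Backward stepwise with the BIC penalty instead of AIC (a sketch).
# step() defaults to k = 2 (AIC); k = log(n) gives BIC.
n = nrow(artificialeg)
lm.art = lm(y ~ ., data = artificialeg)
step.bic = step(lm.art, k = log(n), trace = 0)
```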
| | Estimate | Std. Error | t value | Pr(>\|t\|) |
|---|---|---|---|---|
(Intercept) | -0.11 | 0.32 | -0.36 | 0.72 |
x1 | 0.80 | 0.19 | 4.13 | 0.00 |
x2 | 0.40 | 0.18 | 2.26 | 0.03 |
x3 | -0.81 | 0.19 | -4.22 | 0.00 |
x4 | -0.35 | 0.12 | -2.94 | 0.01 |
x5 | 0.49 | 0.19 | 2.55 | 0.01 |
x6 | -0.77 | 0.15 | -5.19 | 0.00 |
x7 | -0.58 | 0.15 | -3.94 | 0.00 |
x9 | 0.55 | 0.19 | 2.90 | 0.01 |
Note that the stepwise procedure has dropped x8 from the selected model.
The true data generating process is:
\[y = 0.6x_8 + \varepsilon.\]
art.true = lm(y ~ x8, data = artificialeg)
| | Estimate | Std. Error | t value | Pr(>\|t\|) |
|---|---|---|---|---|
(Intercept) | 0.03 | 0.29 | 0.11 | 0.91 |
x8 | 0.55 | 0.05 | 10.43 | 0.00 |
anova(art.true,step.art)
| | Res.Df | RSS | Df | Sum of Sq | F | Pr(>F) |
|---|---|---|---|---|---|---|
| 1 | 48 | 200.97 | | | | |
| 2 | 41 | 163.94 | 7 | 37.03 | 1.32 | 0.2643 |
Tibshirani (1996) suggested doing regression with an \(L_1\) norm penalty and called it the lasso (least absolute shrinkage and selection operator).
The lasso parameter estimates are obtained by minimising the residual sum of squares subject to the constraint that \[\sum_j |\beta_j| \leq t.\]
"The encouraging results reported here suggest that absolute value constrants might prove to be useful in a wide variety of statistical estimation problems. Further study is needed to investigate these possibilities." - Tibshirani (1996)
require(lars)
x = as.matrix(subset(artificialeg, select = -y))
y = as.matrix(subset(artificialeg, select = y))
art.lars = lars(x, y)
##
## Call:
## lars(x = x, y = y)
## R-squared: 0.751
## Sequence of LASSO moves:
##      x8 x6 x1 x4 x3 x7 x5 x2 x5 x6 x5 x9 x6 x2 x2 x3 x3
## Var   8  6  1  4  3  7  5  2 -5 -6  5  9  6 -2  2 -3  3
## Step  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17

Note that x8 is the first variable to enter the lasso path. A negative entry in the Var row indicates a variable leaving the active set, so variables can be dropped and later re-enter along the path.
The concept of model stability was independently introduced by Meinshausen and Bühlmann (2010) and Müller and Welsh (2010) for different linear regression settings.
To provide scientists/researchers with tools that give them more information about the model selection choices that they are making.
To visualise inclusion probabilities as a function of the penalty multiplier \(\lambda\in [0,2\log(n)]\).
require(mplot)
vis.art = vis(lm.art)
plot(vis.art, which = "vip")
plot(vis.art, which = "lvk")
plot(vis.art, which = "lvk", highlight = "x6")
plot(vis.art, which = "lvk", highlight = "x7")
To add value to the loss against size plots by choosing a symbol size proportional to a measure of stability.
plot(vis.art, which = "boot")
The fence (Jiang et al. 2008) is based around the inequality: \[ Q(\alpha) \leq Q(\alpha_f) + c, \] where \(Q\) is a measure of lack of fit, \(\alpha_f\) denotes the full model and \(c\) is a cutoff.
Source: Jiang, Nguyen, and Rao (2009)
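The inequality can be illustrated with a minimal sketch, assuming \(Q(\alpha) = -2\log L(\alpha)\) as the lack-of-fit measure (one common choice; the adaptive fence chooses the cutoff \(c\) via the bootstrap, which this sketch does not attempt).

```r
# Check whether a candidate model falls "inside the fence" relative to
# the full model, for a fixed illustrative cutoff c.
full = lm(y ~ ., data = artificialeg)    # alpha_f, the full model
cand = lm(y ~ x8, data = artificialeg)   # candidate model alpha
Q.full = -2 * as.numeric(logLik(full))
Q.cand = -2 * as.numeric(logLik(cand))
c.cut = 10  # illustrative cutoff only; af() selects c adaptively
Q.cand <= Q.full + c.cut  # TRUE means the candidate is inside the fence
```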
Uses the leaps and bestglm packages.

af.art = af(lm.art, B = 100, n.c = 50, n.cores = 2)
plot(af.art)
install.packages("devtools") require(devtools) install_github("garthtarr/mplot",quick=TRUE) require(mplot)
vignette("mplot-guide",package="mplot") vignette("mplot-stepwise",package="mplot")
- af() for the adaptive fence
- vis() for VIP and model stability plots
- mplot() for an interactive shiny interface

roxygen2 helps generate package documentation; devtools is an essential package for building and loading R packages.

Cores are easy to find:
require(parallel)
ncores = 1
n = 10
expt = function(j) rep(j, 4)
result = mclapply(1:n, FUN = expt, mc.cores = ncores)
simplify2array(result)
##      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
## [1,]    1    2    3    4    5    6    7    8    9    10
## [2,]    1    2    3    4    5    6    7    8    9    10
## [3,]    1    2    3    4    5    6    7    8    9    10
## [4,]    1    2    3    4    5    6    7    8    9    10
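When choosing `mc.cores` for `mclapply()` above, the parallel package can report how many cores the machine actually has, which gives a sensible upper bound:

```r
# detectCores() reports the number of available cores.
require(parallel)
detectCores()
```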
require(doMC, quietly = TRUE)
require(foreach)
registerDoMC(cores = ncores)
n = 10
result = foreach(j = 1:n, .combine = rbind) %dopar% {
  # EXPERIMENT
  # last line is returned as a row in the result matrix
  rep(j, 4)
}
head(result)
##          [,1] [,2] [,3] [,4]
## result.1    1    1    1    1
## result.2    2    2    2    2
## result.3    3    3    3    3
## result.4    4    4    4    4
## result.5    5    5    5    5
## result.6    6    6    6    6
sessionInfo()
## R version 3.1.1 (2014-07-10)
## Platform: x86_64-apple-darwin13.1.0 (64-bit)
##
## locale:
## [1] en_AU.UTF-8/en_AU.UTF-8/en_AU.UTF-8/C/en_AU.UTF-8/en_AU.UTF-8
##
## attached base packages:
## [1] parallel  stats     graphics  grDevices utils     datasets  methods
## [8] base
##
## other attached packages:
## [1] doMC_1.3.3      iterators_1.0.7 foreach_1.4.2   lars_1.2
## [5] xtable_1.7-4    mplot_0.4.7     shiny_0.10.2.1  googleVis_0.5.6
## [9] bestglm_0.34    leaps_2.9       knitr_1.7
##
## loaded via a namespace (and not attached):
## [1] codetools_0.2-9  compiler_3.1.1   digest_0.6.4     evaluate_0.5.5
## [5] formatR_1.0      htmltools_0.2.6  httpuv_1.3.2     mime_0.2
## [9] R6_2.0.1         Rcpp_0.11.3      RJSONIO_1.3-0    rmarkdown_0.3.10
## [13] stringr_0.6.2   tools_3.1.1      yaml_2.1.13
Efron, Bradley, Trevor Hastie, Iain Johnstone, and Robert Tibshirani. 2004. “Least Angle Regression.” The Annals of Statistics 32 (2): 407–51. doi:10.1214/009053604000000067.
Jiang, Jiming, Thuan Nguyen, and J. Sunil Rao. 2009. “A Simplified Adaptive Fence Procedure.” Statistics & Probability Letters 79 (5): 625–29. doi:10.1016/j.spl.2008.10.014.
Jiang, Jiming, J. Sunil Rao, Zhonghua Gu, and Thuan Nguyen. 2008. “Fence Methods for Mixed Model Selection.” The Annals of Statistics 36 (4): 1669–92. doi:10.1214/07-AOS517.
Meinshausen, Nicolai, and Peter Bühlmann. 2010. “Stability Selection.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 72 (4): 417–73. doi:10.1111/j.1467-9868.2010.00740.x.
Murray, K, S Heritier, and Samuel Müller. 2013. “Graphical Tools for Model Selection in Generalized Linear Models.” Statistics in Medicine 32 (25): 4438–51. doi:10.1002/sim.5855.
Müller, Samuel, and Alan H. Welsh. 2010. “On Model Selection Curves.” International Statistical Review 78 (2): 240–56. doi:10.1111/j.1751-5823.2010.00108.x.
Tibshirani, Robert. 1996. “Regression Shrinkage and Selection via the Lasso.” Journal of the Royal Statistical Society: Series B (Methodological), 267–88. http://www.jstor.org/stable/10.2307/2346178.