Feature Selection with Shadow Variable Search
Source: R/FSelectorBatchShadowVariableSearch.R
mlr_fselectors_shadow_variable_search.Rd
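Feature selection using the Shadow Variable Search algorithm. For each feature, the search creates a permuted copy (a shadow variable) and then adds features step by step from the pool of original features and shadow variables. It stops as soon as a shadow variable is selected.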
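To illustrate why a permuted copy is a useful reference point, the following base R sketch builds one shadow variable by hand. The simulated data and the variable names are assumptions made for illustration; only the `permuted__` prefix mirrors the column names shown in the archive output on this page.

```r
set.seed(1)
n = 500
x = rnorm(n)
y = 2 * x + rnorm(n)  # the target depends on x

# shadow variable: the same values as x, but in random row order
permuted__x = x[sample.int(n)]

# the marginal distribution is unchanged ...
identical(sort(permuted__x), sort(x))  # TRUE

# ... but the association with the target is broken
cor(x, y)
cor(permuted__x, y)
```

A feature that does not improve the model more than such a shadow variable is unlikely to carry real signal, which is the idea behind the stopping rule.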
Source
Thomas J, Hepp T, Mayr A, Bischl B (2017). “Probing for Sparse and Fast Variable Selection with Model-Based Boosting.” Computational and Mathematical Methods in Medicine, 2017, 1–8. doi:10.1155/2017/1421409.
Wu Y, Boos DD, Stefanski LA (2007). “Controlling Variable Selection by the Addition of Pseudovariables.” Journal of the American Statistical Association, 102(477), 235–243. doi:10.1198/016214506000000843.
Details
The feature selection terminates on its own as soon as the first shadow variable is selected, so no termination criterion needs to be set.
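The stopping rule can be sketched in base R as a forward search that ends once the best candidate is a shadow variable. This is a simplified illustration, not the package's implementation: `shadow_forward_search()` is a hypothetical helper, and the in-sample residual sum of squares from `lm()` stands in for the resampled performance measure that an mlr3 learner would provide.

```r
# Hypothetical helper: forward selection over real and shadow features.
shadow_forward_search = function(data, target, features, prefix = "permuted__") {
  # create one permuted copy per feature
  for (f in features) {
    data[[paste0(prefix, f)]] = data[[f]][sample.int(nrow(data))]
  }
  candidates = c(features, paste0(prefix, features))
  selected = character(0)

  while (length(candidates) > 0) {
    # score every remaining candidate when added to the current subset
    rss = vapply(candidates, function(cand) {
      fit = lm(reformulate(c(selected, cand), response = target), data = data)
      sum(residuals(fit)^2)
    }, numeric(1))
    best = names(rss)[which.min(rss)]

    # built-in termination: stop once a shadow variable wins
    if (startsWith(best, prefix)) break

    selected = c(selected, best)
    candidates = setdiff(candidates, best)
  }
  selected
}

set.seed(42)
n = 200
df = data.frame(x1 = rnorm(n), x2 = rnorm(n), noise = rnorm(n))
df$y = 3 * df$x1 - 2 * df$x2 + rnorm(n, sd = 0.5)

selected = shadow_forward_search(df, target = "y", features = c("x1", "x2", "noise"))
selected
```

Because the loop always ends (either a shadow variable is chosen or only shadow variables remain), no separate budget such as a maximum number of evaluations is needed in this sketch.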
Resources
The gallery features a collection of case studies and demos about optimization.
Run a feature selection with Shadow Variable Search.
Super classes
mlr3fselect::FSelector
-> mlr3fselect::FSelectorBatch
-> FSelectorBatchShadowVariableSearch
Methods
Method optimization_path()
Returns the optimization path.
Arguments
inst
(FSelectInstanceBatchSingleCrit)
Instance optimized with FSelectorBatchShadowVariableSearch.
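A possible way to call this method is sketched below, assuming that mlr3fselect is installed. The fselector object is kept in a variable so that its method can be called after the search has finished. The task, learner, resampling and measure simply mirror the Examples section.

```r
library(mlr3fselect)

# keep a handle on the fselector to call its method afterwards
fselector = fs("shadow_variable_search")

instance = fselect(
  fselector = fselector,
  task = tsk("penguins"),
  learner = lrn("classif.rpart"),
  resampling = rsmp("holdout"),
  measure = msr("classif.ce")
)

# pass the optimized instance as `inst`
path = fselector$optimization_path(instance)
path
```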
Examples
# Feature Selection
# \donttest{
# load the package, then retrieve the task and the learner
library(mlr3fselect)
task = tsk("penguins")
learner = lrn("classif.rpart")
# run feature selection on the Palmer Penguins data set
instance = fselect(
  fselector = fs("shadow_variable_search"),
  task = task,
  learner = learner,
  resampling = rsmp("holdout"),
  measure = msr("classif.ce"),
)
# best performing feature subset
instance$result
#> bill_depth bill_length body_mass flipper_length island sex year
#> <lgcl> <lgcl> <lgcl> <lgcl> <lgcl> <lgcl> <lgcl>
#> 1: FALSE TRUE FALSE TRUE TRUE FALSE FALSE
#> features n_features classif.ce
#> <list> <int> <num>
#> 1: bill_length,flipper_length,island 3 0.04347826
# all evaluated feature subsets
as.data.table(instance$archive)
#> bill_depth bill_length body_mass flipper_length island sex year
#> <lgcl> <lgcl> <lgcl> <lgcl> <lgcl> <lgcl> <lgcl>
#> 1: TRUE FALSE FALSE FALSE FALSE FALSE FALSE
#> 2: FALSE TRUE FALSE FALSE FALSE FALSE FALSE
#> 3: FALSE FALSE TRUE FALSE FALSE FALSE FALSE
#> 4: FALSE FALSE FALSE TRUE FALSE FALSE FALSE
#> 5: FALSE FALSE FALSE FALSE TRUE FALSE FALSE
#> 6: FALSE FALSE FALSE FALSE FALSE TRUE FALSE
#> 7: FALSE FALSE FALSE FALSE FALSE FALSE TRUE
#> 8: FALSE FALSE FALSE FALSE FALSE FALSE FALSE
#> 9: FALSE FALSE FALSE FALSE FALSE FALSE FALSE
#> 10: FALSE FALSE FALSE FALSE FALSE FALSE FALSE
#> 11: FALSE FALSE FALSE FALSE FALSE FALSE FALSE
#> 12: FALSE FALSE FALSE FALSE FALSE FALSE FALSE
#> 13: FALSE FALSE FALSE FALSE FALSE FALSE FALSE
#> 14: FALSE FALSE FALSE FALSE FALSE FALSE FALSE
#> 15: TRUE FALSE FALSE TRUE FALSE FALSE FALSE
#> 16: FALSE TRUE FALSE TRUE FALSE FALSE FALSE
#> 17: FALSE FALSE TRUE TRUE FALSE FALSE FALSE
#> 18: FALSE FALSE FALSE TRUE TRUE FALSE FALSE
#> 19: FALSE FALSE FALSE TRUE FALSE TRUE FALSE
#> 20: FALSE FALSE FALSE TRUE FALSE FALSE TRUE
#> 21: FALSE FALSE FALSE TRUE FALSE FALSE FALSE
#> 22: FALSE FALSE FALSE TRUE FALSE FALSE FALSE
#> 23: FALSE FALSE FALSE TRUE FALSE FALSE FALSE
#> 24: FALSE FALSE FALSE TRUE FALSE FALSE FALSE
#> 25: FALSE FALSE FALSE TRUE FALSE FALSE FALSE
#> 26: FALSE FALSE FALSE TRUE FALSE FALSE FALSE
#> 27: FALSE FALSE FALSE TRUE FALSE FALSE FALSE
#> 28: TRUE TRUE FALSE TRUE FALSE FALSE FALSE
#> 29: FALSE TRUE TRUE TRUE FALSE FALSE FALSE
#> 30: FALSE TRUE FALSE TRUE TRUE FALSE FALSE
#> 31: FALSE TRUE FALSE TRUE FALSE TRUE FALSE
#> 32: FALSE TRUE FALSE TRUE FALSE FALSE TRUE
#> 33: FALSE TRUE FALSE TRUE FALSE FALSE FALSE
#> 34: FALSE TRUE FALSE TRUE FALSE FALSE FALSE
#> 35: FALSE TRUE FALSE TRUE FALSE FALSE FALSE
#> 36: FALSE TRUE FALSE TRUE FALSE FALSE FALSE
#> 37: FALSE TRUE FALSE TRUE FALSE FALSE FALSE
#> 38: FALSE TRUE FALSE TRUE FALSE FALSE FALSE
#> 39: FALSE TRUE FALSE TRUE FALSE FALSE FALSE
#> 40: TRUE TRUE FALSE TRUE TRUE FALSE FALSE
#> 41: FALSE TRUE TRUE TRUE TRUE FALSE FALSE
#> 42: FALSE TRUE FALSE TRUE TRUE TRUE FALSE
#> 43: FALSE TRUE FALSE TRUE TRUE FALSE TRUE
#> 44: FALSE TRUE FALSE TRUE TRUE FALSE FALSE
#> 45: FALSE TRUE FALSE TRUE TRUE FALSE FALSE
#> 46: FALSE TRUE FALSE TRUE TRUE FALSE FALSE
#> 47: FALSE TRUE FALSE TRUE TRUE FALSE FALSE
#> 48: FALSE TRUE FALSE TRUE TRUE FALSE FALSE
#> 49: FALSE TRUE FALSE TRUE TRUE FALSE FALSE
#> 50: FALSE TRUE FALSE TRUE TRUE FALSE FALSE
#> bill_depth bill_length body_mass flipper_length island sex year
#> classif.ce runtime_learners timestamp batch_nr
#> <num> <num> <POSc> <int>
#> 1: 0.32173913 0.012 2024-11-07 21:47:49 1
#> 2: 0.25217391 0.012 2024-11-07 21:47:49 1
#> 3: 0.26956522 0.013 2024-11-07 21:47:49 1
#> 4: 0.20869565 0.012 2024-11-07 21:47:49 1
#> 5: 0.28695652 0.011 2024-11-07 21:47:49 1
#> 6: 0.54782609 0.013 2024-11-07 21:47:49 1
#> 7: 0.54782609 0.013 2024-11-07 21:47:49 1
#> 8: 0.50434783 0.010 2024-11-07 21:47:49 1
#> 9: 0.58260870 0.011 2024-11-07 21:47:49 1
#> 10: 0.53043478 0.010 2024-11-07 21:47:49 1
#> 11: 0.56521739 0.011 2024-11-07 21:47:49 1
#> 12: 0.59130435 0.013 2024-11-07 21:47:49 1
#> 13: 0.54782609 0.015 2024-11-07 21:47:49 1
#> 14: 0.54782609 0.011 2024-11-07 21:47:49 1
#> 15: 0.20869565 0.014 2024-11-07 21:47:49 2
#> 16: 0.05217391 0.013 2024-11-07 21:47:49 2
#> 17: 0.20000000 0.012 2024-11-07 21:47:49 2
#> 18: 0.13913043 0.012 2024-11-07 21:47:49 2
#> 19: 0.20869565 0.012 2024-11-07 21:47:49 2
#> 20: 0.20000000 0.012 2024-11-07 21:47:49 2
#> 21: 0.20869565 0.035 2024-11-07 21:47:49 2
#> 22: 0.20869565 0.017 2024-11-07 21:47:49 2
#> 23: 0.21739130 0.015 2024-11-07 21:47:49 2
#> 24: 0.19130435 0.017 2024-11-07 21:47:49 2
#> 25: 0.20869565 0.017 2024-11-07 21:47:49 2
#> 26: 0.20869565 0.019 2024-11-07 21:47:49 2
#> 27: 0.20869565 0.019 2024-11-07 21:47:49 2
#> 28: 0.05217391 0.013 2024-11-07 21:47:50 3
#> 29: 0.05217391 0.012 2024-11-07 21:47:50 3
#> 30: 0.04347826 0.013 2024-11-07 21:47:50 3
#> 31: 0.05217391 0.021 2024-11-07 21:47:50 3
#> 32: 0.05217391 0.016 2024-11-07 21:47:50 3
#> 33: 0.05217391 0.012 2024-11-07 21:47:50 3
#> 34: 0.05217391 0.013 2024-11-07 21:47:50 3
#> 35: 0.05217391 0.012 2024-11-07 21:47:50 3
#> 36: 0.05217391 0.014 2024-11-07 21:47:50 3
#> 37: 0.05217391 0.014 2024-11-07 21:47:50 3
#> 38: 0.05217391 0.012 2024-11-07 21:47:50 3
#> 39: 0.05217391 0.012 2024-11-07 21:47:50 3
#> 40: 0.04347826 0.015 2024-11-07 21:47:50 4
#> 41: 0.04347826 0.024 2024-11-07 21:47:50 4
#> 42: 0.04347826 0.018 2024-11-07 21:47:50 4
#> 43: 0.04347826 0.014 2024-11-07 21:47:50 4
#> 44: 0.04347826 0.012 2024-11-07 21:47:50 4
#> 45: 0.04347826 0.013 2024-11-07 21:47:50 4
#> 46: 0.04347826 0.014 2024-11-07 21:47:50 4
#> 47: 0.04347826 0.014 2024-11-07 21:47:50 4
#> 48: 0.04347826 0.013 2024-11-07 21:47:50 4
#> 49: 0.04347826 0.012 2024-11-07 21:47:50 4
#> 50: 0.04347826 0.013 2024-11-07 21:47:50 4
#> classif.ce runtime_learners timestamp batch_nr
#> permuted__bill_depth permuted__bill_length permuted__body_mass
#> <lgcl> <lgcl> <lgcl>
#> 1: FALSE FALSE FALSE
#> 2: FALSE FALSE FALSE
#> 3: FALSE FALSE FALSE
#> 4: FALSE FALSE FALSE
#> 5: FALSE FALSE FALSE
#> 6: FALSE FALSE FALSE
#> 7: FALSE FALSE FALSE
#> 8: TRUE FALSE FALSE
#> 9: FALSE TRUE FALSE
#> 10: FALSE FALSE TRUE
#> 11: FALSE FALSE FALSE
#> 12: FALSE FALSE FALSE
#> 13: FALSE FALSE FALSE
#> 14: FALSE FALSE FALSE
#> 15: FALSE FALSE FALSE
#> 16: FALSE FALSE FALSE
#> 17: FALSE FALSE FALSE
#> 18: FALSE FALSE FALSE
#> 19: FALSE FALSE FALSE
#> 20: FALSE FALSE FALSE
#> 21: TRUE FALSE FALSE
#> 22: FALSE TRUE FALSE
#> 23: FALSE FALSE TRUE
#> 24: FALSE FALSE FALSE
#> 25: FALSE FALSE FALSE
#> 26: FALSE FALSE FALSE
#> 27: FALSE FALSE FALSE
#> 28: FALSE FALSE FALSE
#> 29: FALSE FALSE FALSE
#> 30: FALSE FALSE FALSE
#> 31: FALSE FALSE FALSE
#> 32: FALSE FALSE FALSE
#> 33: TRUE FALSE FALSE
#> 34: FALSE TRUE FALSE
#> 35: FALSE FALSE TRUE
#> 36: FALSE FALSE FALSE
#> 37: FALSE FALSE FALSE
#> 38: FALSE FALSE FALSE
#> 39: FALSE FALSE FALSE
#> 40: FALSE FALSE FALSE
#> 41: FALSE FALSE FALSE
#> 42: FALSE FALSE FALSE
#> 43: FALSE FALSE FALSE
#> 44: TRUE FALSE FALSE
#> 45: FALSE TRUE FALSE
#> 46: FALSE FALSE TRUE
#> 47: FALSE FALSE FALSE
#> 48: FALSE FALSE FALSE
#> 49: FALSE FALSE FALSE
#> 50: FALSE FALSE FALSE
#> permuted__bill_depth permuted__bill_length permuted__body_mass
#> permuted__flipper_length permuted__island permuted__sex permuted__year
#> <lgcl> <lgcl> <lgcl> <lgcl>
#> 1: FALSE FALSE FALSE FALSE
#> 2: FALSE FALSE FALSE FALSE
#> 3: FALSE FALSE FALSE FALSE
#> 4: FALSE FALSE FALSE FALSE
#> 5: FALSE FALSE FALSE FALSE
#> 6: FALSE FALSE FALSE FALSE
#> 7: FALSE FALSE FALSE FALSE
#> 8: FALSE FALSE FALSE FALSE
#> 9: FALSE FALSE FALSE FALSE
#> 10: FALSE FALSE FALSE FALSE
#> 11: TRUE FALSE FALSE FALSE
#> 12: FALSE TRUE FALSE FALSE
#> 13: FALSE FALSE TRUE FALSE
#> 14: FALSE FALSE FALSE TRUE
#> 15: FALSE FALSE FALSE FALSE
#> 16: FALSE FALSE FALSE FALSE
#> 17: FALSE FALSE FALSE FALSE
#> 18: FALSE FALSE FALSE FALSE
#> 19: FALSE FALSE FALSE FALSE
#> 20: FALSE FALSE FALSE FALSE
#> 21: FALSE FALSE FALSE FALSE
#> 22: FALSE FALSE FALSE FALSE
#> 23: FALSE FALSE FALSE FALSE
#> 24: TRUE FALSE FALSE FALSE
#> 25: FALSE TRUE FALSE FALSE
#> 26: FALSE FALSE TRUE FALSE
#> 27: FALSE FALSE FALSE TRUE
#> 28: FALSE FALSE FALSE FALSE
#> 29: FALSE FALSE FALSE FALSE
#> 30: FALSE FALSE FALSE FALSE
#> 31: FALSE FALSE FALSE FALSE
#> 32: FALSE FALSE FALSE FALSE
#> 33: FALSE FALSE FALSE FALSE
#> 34: FALSE FALSE FALSE FALSE
#> 35: FALSE FALSE FALSE FALSE
#> 36: TRUE FALSE FALSE FALSE
#> 37: FALSE TRUE FALSE FALSE
#> 38: FALSE FALSE TRUE FALSE
#> 39: FALSE FALSE FALSE TRUE
#> 40: FALSE FALSE FALSE FALSE
#> 41: FALSE FALSE FALSE FALSE
#> 42: FALSE FALSE FALSE FALSE
#> 43: FALSE FALSE FALSE FALSE
#> 44: FALSE FALSE FALSE FALSE
#> 45: FALSE FALSE FALSE FALSE
#> 46: FALSE FALSE FALSE FALSE
#> 47: TRUE FALSE FALSE FALSE
#> 48: FALSE TRUE FALSE FALSE
#> 49: FALSE FALSE TRUE FALSE
#> 50: FALSE FALSE FALSE TRUE
#> permuted__flipper_length permuted__island permuted__sex permuted__year
#> warnings errors features n_features
#> <int> <int> <list> <list>
#> 1: 0 0 bill_depth 1
#> 2: 0 0 bill_length 1
#> 3: 0 0 body_mass 1
#> 4: 0 0 flipper_length 1
#> 5: 0 0 island 1
#> 6: 0 0 sex 1
#> 7: 0 0 year 1
#> 8: 0 0 0
#> 9: 0 0 0
#> 10: 0 0 0
#> 11: 0 0 0
#> 12: 0 0 0
#> 13: 0 0 0
#> 14: 0 0 0
#> 15: 0 0 bill_depth,flipper_length 2
#> 16: 0 0 bill_length,flipper_length 2
#> 17: 0 0 body_mass,flipper_length 2
#> 18: 0 0 flipper_length,island 2
#> 19: 0 0 flipper_length,sex 2
#> 20: 0 0 flipper_length,year 2
#> 21: 0 0 flipper_length 1
#> 22: 0 0 flipper_length 1
#> 23: 0 0 flipper_length 1
#> 24: 0 0 flipper_length 1
#> 25: 0 0 flipper_length 1
#> 26: 0 0 flipper_length 1
#> 27: 0 0 flipper_length 1
#> 28: 0 0 bill_depth,bill_length,flipper_length 3
#> 29: 0 0 bill_length,body_mass,flipper_length 3
#> 30: 0 0 bill_length,flipper_length,island 3
#> 31: 0 0 bill_length,flipper_length,sex 3
#> 32: 0 0 bill_length,flipper_length,year 3
#> 33: 0 0 bill_length,flipper_length 2
#> 34: 0 0 bill_length,flipper_length 2
#> 35: 0 0 bill_length,flipper_length 2
#> 36: 0 0 bill_length,flipper_length 2
#> 37: 0 0 bill_length,flipper_length 2
#> 38: 0 0 bill_length,flipper_length 2
#> 39: 0 0 bill_length,flipper_length 2
#> 40: 0 0 bill_depth,bill_length,flipper_length,island 4
#> 41: 0 0 bill_length,body_mass,flipper_length,island 4
#> 42: 0 0 bill_length,flipper_length,island,sex 4
#> 43: 0 0 bill_length,flipper_length,island,year 4
#> 44: 0 0 bill_length,flipper_length,island 3
#> 45: 0 0 bill_length,flipper_length,island 3
#> 46: 0 0 bill_length,flipper_length,island 3
#> 47: 0 0 bill_length,flipper_length,island 3
#> 48: 0 0 bill_length,flipper_length,island 3
#> 49: 0 0 bill_length,flipper_length,island 3
#> 50: 0 0 bill_length,flipper_length,island 3
#> warnings errors features n_features
#> resample_result
#> <list>
#> 1: <ResampleResult>
#> 2: <ResampleResult>
#> 3: <ResampleResult>
#> 4: <ResampleResult>
#> 5: <ResampleResult>
#> 6: <ResampleResult>
#> 7: <ResampleResult>
#> 8: <ResampleResult>
#> 9: <ResampleResult>
#> 10: <ResampleResult>
#> 11: <ResampleResult>
#> 12: <ResampleResult>
#> 13: <ResampleResult>
#> 14: <ResampleResult>
#> 15: <ResampleResult>
#> 16: <ResampleResult>
#> 17: <ResampleResult>
#> 18: <ResampleResult>
#> 19: <ResampleResult>
#> 20: <ResampleResult>
#> 21: <ResampleResult>
#> 22: <ResampleResult>
#> 23: <ResampleResult>
#> 24: <ResampleResult>
#> 25: <ResampleResult>
#> 26: <ResampleResult>
#> 27: <ResampleResult>
#> 28: <ResampleResult>
#> 29: <ResampleResult>
#> 30: <ResampleResult>
#> 31: <ResampleResult>
#> 32: <ResampleResult>
#> 33: <ResampleResult>
#> 34: <ResampleResult>
#> 35: <ResampleResult>
#> 36: <ResampleResult>
#> 37: <ResampleResult>
#> 38: <ResampleResult>
#> 39: <ResampleResult>
#> 40: <ResampleResult>
#> 41: <ResampleResult>
#> 42: <ResampleResult>
#> 43: <ResampleResult>
#> 44: <ResampleResult>
#> 45: <ResampleResult>
#> 46: <ResampleResult>
#> 47: <ResampleResult>
#> 48: <ResampleResult>
#> 49: <ResampleResult>
#> 50: <ResampleResult>
#> resample_result
# subset the task and fit the final model
task$select(instance$result_feature_set)
learner$train(task)
# }