Feature Selection with Genetic Search
Source: R/FSelectorGeneticSearch.R
mlr_fselectors_genetic_search.Rd
Feature selection using the Genetic Algorithm from the package genalg.
Control Parameters
For the meaning of the control parameters, see genalg::rbga.bin().
genalg::rbga.bin() internally terminates after iters iterations.
We set iters = 100000 by default so that termination is controlled by our terminators rather than by the internal iteration limit.
If more iterations are needed, set iters to a higher value in the parameter set.
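As a minimal sketch of the last point, the internal iteration limit can be raised through the fselector's parameter set before running feature selection. This assumes mlr3fselect and genalg are installed; the value 200000 is an arbitrary illustration, not a recommendation.

```r
library(mlr3fselect)

# construct the genetic search fselector
fselector = fs("genetic_search")

# raise the internal iteration limit passed on to genalg::rbga.bin()
# (200000 is an arbitrary example value)
fselector$param_set$values$iters = 200000L

# inspect the value that is now set
fselector$param_set$values$iters
```

The terminator of the feature selection instance (for example term_evals) still decides when the search actually stops; iters only needs to be large enough that the internal limit is not reached first.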
Super class
mlr3fselect::FSelector
-> FSelectorGeneticSearch
Examples
# Feature Selection
# \donttest{
# retrieve task and load learner
task = tsk("penguins")
learner = lrn("classif.rpart")
# run feature selection on the Palmer Penguins data set
instance = fselect(
  fselector = fs("genetic_search"),
  task = task,
  learner = learner,
  resampling = rsmp("holdout"),
  measure = msr("classif.ce"),
  term_evals = 10
)
# best performing feature set
instance$result
#> bill_depth bill_length body_mass flipper_length island sex year
#> <lgcl> <lgcl> <lgcl> <lgcl> <lgcl> <lgcl> <lgcl>
#> 1: FALSE TRUE FALSE TRUE FALSE FALSE FALSE
#> features n_features classif.ce
#> <list> <int> <num>
#> 1: bill_length,flipper_length 2 0.06956522
# all evaluated feature sets
as.data.table(instance$archive)
#> bill_depth bill_length body_mass flipper_length island sex year
#> <lgcl> <lgcl> <lgcl> <lgcl> <lgcl> <lgcl> <lgcl>
#> 1: FALSE FALSE FALSE FALSE FALSE TRUE FALSE
#> 2: TRUE FALSE FALSE FALSE FALSE FALSE FALSE
#> 3: FALSE FALSE FALSE TRUE FALSE FALSE FALSE
#> 4: FALSE FALSE FALSE FALSE FALSE TRUE FALSE
#> 5: FALSE FALSE FALSE FALSE FALSE FALSE TRUE
#> 6: FALSE TRUE FALSE TRUE FALSE FALSE FALSE
#> 7: FALSE FALSE FALSE FALSE FALSE TRUE FALSE
#> 8: FALSE FALSE FALSE FALSE TRUE FALSE FALSE
#> 9: FALSE FALSE FALSE FALSE TRUE FALSE FALSE
#> 10: FALSE FALSE FALSE FALSE FALSE FALSE TRUE
#> classif.ce runtime_learners timestamp batch_nr warnings errors
#> <num> <num> <POSc> <int> <int> <int>
#> 1: 0.63478261 0.006 2024-03-09 11:41:38 1 0 0
#> 2: 0.25217391 0.005 2024-03-09 11:41:38 2 0 0
#> 3: 0.25217391 0.004 2024-03-09 11:41:38 3 0 0
#> 4: 0.63478261 0.005 2024-03-09 11:41:38 4 0 0
#> 5: 0.66956522 0.004 2024-03-09 11:41:38 5 0 0
#> 6: 0.06956522 0.005 2024-03-09 11:41:38 6 0 0
#> 7: 0.63478261 0.005 2024-03-09 11:41:38 7 0 0
#> 8: 0.32173913 0.005 2024-03-09 11:41:38 8 0 0
#> 9: 0.32173913 0.021 2024-03-09 11:41:38 9 0 0
#> 10: 0.66956522 0.004 2024-03-09 11:41:38 10 0 0
#> features n_features resample_result
#> <list> <list> <list>
#> 1: sex 1 <ResampleResult>
#> 2: bill_depth 1 <ResampleResult>
#> 3: flipper_length 1 <ResampleResult>
#> 4: sex 1 <ResampleResult>
#> 5: year 1 <ResampleResult>
#> 6: bill_length,flipper_length 2 <ResampleResult>
#> 7: sex 1 <ResampleResult>
#> 8: island 1 <ResampleResult>
#> 9: island 1 <ResampleResult>
#> 10: year 1 <ResampleResult>
# subset the task and fit the final model
task$select(instance$result_feature_set)
learner$train(task)
# }