
Feature Selection with Genetic Search
Source: R/FSelectorGeneticSearch.R, mlr_fselectors_genetic_search.Rd
Feature selection using the Genetic Algorithm from the package genalg.
Dictionary
This FSelector can be instantiated with the associated sugar function fs():

fs("genetic_search")
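
The same object can also be retrieved from the mlr_fselectors dictionary. A minimal sketch, assuming mlr3fselect is attached so that the dictionary is available:

library(mlr3fselect)

# retrieve the FSelector from the dictionary instead of using the sugar function
fselector = mlr_fselectors$get("genetic_search")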
Control Parameters
For the meaning of the control parameters, see genalg::rbga.bin().
genalg::rbga.bin() internally terminates after iters iterations.
We set iters = 100000 to allow termination via our terminators instead.
If more iterations are needed, set iters to a higher value in the parameter set, as sketched below.
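
A minimal sketch of both ways to raise the limit, assuming the parameter is passed through to genalg::rbga.bin() via the FSelector's parameter set:

library(mlr3fselect)

# pass a larger internal iteration budget when constructing the FSelector
fselector = fs("genetic_search", iters = 1000000)

# or adjust the value later via the parameter set
fselector$param_set$values$iters = 1000000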
Super class
mlr3fselect::FSelector -> FSelectorGeneticSearch
Examples
# Feature Selection
# \donttest{
# retrieve task and load learner
task = tsk("penguins")
learner = lrn("classif.rpart")
# run feature selection on the Palmer Penguins data set
instance = fselect(
  method = "genetic_search",
  task = task,
  learner = learner,
  resampling = rsmp("holdout"),
  measure = msr("classif.ce"),
  term_evals = 10
)
# best performing feature set
instance$result
#> bill_depth bill_length body_mass flipper_length island sex year
#> 1: TRUE TRUE FALSE FALSE FALSE FALSE FALSE
#> features classif.ce
#> 1: bill_depth,bill_length 0.1304348
# all evaluated feature sets
as.data.table(instance$archive)
#> bill_depth bill_length body_mass flipper_length island sex year
#> 1: TRUE FALSE FALSE FALSE FALSE FALSE FALSE
#> 2: FALSE FALSE TRUE FALSE FALSE FALSE FALSE
#> 3: TRUE TRUE FALSE FALSE FALSE FALSE FALSE
#> 4: FALSE FALSE FALSE FALSE FALSE FALSE TRUE
#> 5: FALSE TRUE FALSE FALSE FALSE FALSE FALSE
#> 6: FALSE FALSE FALSE FALSE FALSE TRUE FALSE
#> 7: FALSE FALSE FALSE FALSE FALSE TRUE FALSE
#> 8: FALSE FALSE TRUE FALSE FALSE FALSE FALSE
#> 9: TRUE FALSE FALSE FALSE FALSE FALSE FALSE
#> 10: TRUE FALSE FALSE FALSE FALSE FALSE FALSE
#> classif.ce runtime_learners timestamp batch_nr warnings errors
#> 1: 0.3043478 0.007 2023-01-26 18:34:00 1 0 0
#> 2: 0.2695652 0.007 2023-01-26 18:34:00 2 0 0
#> 3: 0.1304348 0.007 2023-01-26 18:34:00 3 0 0
#> 4: 0.5826087 0.005 2023-01-26 18:34:00 4 0 0
#> 5: 0.2695652 0.007 2023-01-26 18:34:00 5 0 0
#> 6: 0.5826087 0.006 2023-01-26 18:34:00 6 0 0
#> 7: 0.5826087 0.005 2023-01-26 18:34:00 7 0 0
#> 8: 0.2695652 0.007 2023-01-26 18:34:01 8 0 0
#> 9: 0.3043478 0.008 2023-01-26 18:34:01 9 0 0
#> 10: 0.3043478 0.006 2023-01-26 18:34:01 10 0 0
#> resample_result
#> 1: <ResampleResult[21]>
#> 2: <ResampleResult[21]>
#> 3: <ResampleResult[21]>
#> 4: <ResampleResult[21]>
#> 5: <ResampleResult[21]>
#> 6: <ResampleResult[21]>
#> 7: <ResampleResult[21]>
#> 8: <ResampleResult[21]>
#> 9: <ResampleResult[21]>
#> 10: <ResampleResult[21]>
# subset the task and fit the final model
task$select(instance$result_feature_set)
learner$train(task)
# }
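
As a follow-up sketch (not part of the shipped example), the refitted learner can predict on the reduced task; scoring on the training data here only illustrates the API, not an unbiased error estimate:

# predict with the final model and score the predictions
prediction = learner$predict(task)
prediction$score(msr("classif.ce"))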