Feature Selection with Genetic Search
Source: R/FSelectorBatchGeneticSearch.R
mlr_fselectors_genetic_search.Rd
Feature selection using the Genetic Algorithm from the package genalg.
Control Parameters
For the meaning of the control parameters, see genalg::rbga.bin().
genalg::rbga.bin() internally terminates after iters iterations.
We set iters = 100000 to allow termination via our terminators.
If more iterations are needed, set iters to a higher value in the parameter set.
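As a minimal sketch of the paragraph above, the following shows one way to raise iters and leave the actual stopping decision to a terminator. The values 200000, 500000 and the budget of 20 evaluations are arbitrary choices for illustration, and the example assumes that mlr3fselect, bbotk, genalg and palmerpenguins are installed.

```r
library(mlr3fselect)  # attaches mlr3 as well

# raise the internal iteration cap that is passed on to genalg::rbga.bin()
fselector = fs("genetic_search", iters = 200000L)

# the value can also be changed later through the parameter set
fselector$param_set$values$iters = 500000L

# the stopping rule itself is a terminator; 20 evaluations is an arbitrary budget
instance = fsi(
  task = tsk("penguins"),
  learner = lrn("classif.rpart"),
  resampling = rsmp("holdout"),
  measures = msr("classif.ce"),
  terminator = bbotk::trm("evals", n_evals = 20)
)

# the search stops when the terminator fires, well before iters is reached
fselector$optimize(instance)
```

With a small evaluation budget like this, the terminator should stop the search long before the internal cap matters, so a higher iters only becomes relevant for very long runs.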
Super classes
mlr3fselect::FSelector -> mlr3fselect::FSelectorBatch -> FSelectorBatchGeneticSearch
Examples
# Feature Selection
# \donttest{
# load mlr3fselect (attaches mlr3), then retrieve the task and the learner
library(mlr3fselect)
task = tsk("penguins")
learner = lrn("classif.rpart")
# run feature selection on the Palmer Penguins data set
instance = fselect(
  fselector = fs("genetic_search"),
  task = task,
  learner = learner,
  resampling = rsmp("holdout"),
  measure = msr("classif.ce"),
  term_evals = 10
)
# best performing feature set
instance$result
#>    bill_depth bill_length body_mass flipper_length island    sex   year
#>        <lgcl>      <lgcl>    <lgcl>         <lgcl> <lgcl> <lgcl> <lgcl>
#> 1:       TRUE        TRUE     FALSE          FALSE  FALSE  FALSE  FALSE
#>                  features n_features classif.ce
#>                    <list>      <int>      <num>
#> 1: bill_depth,bill_length          2  0.1130435
# all evaluated feature sets
as.data.table(instance$archive)
#>     bill_depth bill_length body_mass flipper_length island    sex   year
#>         <lgcl>      <lgcl>    <lgcl>         <lgcl> <lgcl> <lgcl> <lgcl>
#>  1:      FALSE       FALSE     FALSE          FALSE  FALSE  FALSE   TRUE
#>  2:      FALSE        TRUE     FALSE          FALSE  FALSE  FALSE  FALSE
#>  3:      FALSE       FALSE     FALSE          FALSE  FALSE  FALSE   TRUE
#>  4:      FALSE       FALSE      TRUE          FALSE  FALSE  FALSE  FALSE
#>  5:       TRUE       FALSE     FALSE          FALSE  FALSE  FALSE  FALSE
#>  6:      FALSE       FALSE      TRUE          FALSE  FALSE  FALSE   TRUE
#>  7:       TRUE        TRUE     FALSE          FALSE  FALSE  FALSE  FALSE
#>  8:       TRUE       FALSE     FALSE          FALSE  FALSE  FALSE  FALSE
#>  9:      FALSE       FALSE     FALSE           TRUE  FALSE  FALSE  FALSE
#> 10:       TRUE       FALSE      TRUE           TRUE  FALSE  FALSE  FALSE
#>     classif.ce runtime_learners           timestamp batch_nr warnings errors
#>          <num>            <num>              <POSc>    <int>    <int>  <int>
#>  1:  0.6869565            0.005 2024-11-07 21:50:19        1        0      0
#>  2:  0.2434783            0.004 2024-11-07 21:50:19        2        0      0
#>  3:  0.6869565            0.004 2024-11-07 21:50:19        3        0      0
#>  4:  0.2956522            0.005 2024-11-07 21:50:19        4        0      0
#>  5:  0.3043478            0.005 2024-11-07 21:50:19        5        0      0
#>  6:  0.2956522            0.026 2024-11-07 21:50:19        6        0      0
#>  7:  0.1130435            0.005 2024-11-07 21:50:20        7        0      0
#>  8:  0.3043478            0.005 2024-11-07 21:50:20        8        0      0
#>  9:  0.2260870            0.004 2024-11-07 21:50:20        9        0      0
#> 10:  0.2086957            0.005 2024-11-07 21:50:20       10        0      0
#>                                features n_features  resample_result
#>                                  <list>     <list>           <list>
#>  1:                                year          1 <ResampleResult>
#>  2:                         bill_length          1 <ResampleResult>
#>  3:                                year          1 <ResampleResult>
#>  4:                           body_mass          1 <ResampleResult>
#>  5:                          bill_depth          1 <ResampleResult>
#>  6:                      body_mass,year          2 <ResampleResult>
#>  7:              bill_depth,bill_length          2 <ResampleResult>
#>  8:                          bill_depth          1 <ResampleResult>
#>  9:                      flipper_length          1 <ResampleResult>
#> 10: bill_depth,body_mass,flipper_length          3 <ResampleResult>
# subset the task and fit the final model
task$select(instance$result_feature_set)
learner$train(task)
# }
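The archive shown above can also be ranked, which makes it easier to compare the best feature set with its runners-up. The following is a sketch only: it reruns the same search so that it is self-contained, and the seed and the helper names archive and ranked are arbitrary choices for illustration. Because the genetic search is stochastic, the selected features will generally differ from the output above.

```r
library(mlr3fselect)  # attaches mlr3 as well
library(data.table)

set.seed(1)  # arbitrary seed; results still depend on package versions

instance = fselect(
  fselector = fs("genetic_search"),
  task = tsk("penguins"),
  learner = lrn("classif.rpart"),
  resampling = rsmp("holdout"),
  measures = msr("classif.ce"),
  term_evals = 10
)

# rank all evaluated feature sets by classification error (lower is better)
archive = as.data.table(instance$archive)
ranked = archive[order(classif.ce), .(features, n_features, classif.ce)]

# the three best feature sets found within the evaluation budget
head(ranked, 3)
```

The first row of ranked should agree with instance$result on classif.ce, since both report the lowest error in the archive.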