Feature Selection with Genetic Search
Source: R/FSelectorBatchGeneticSearch.R
mlr_fselectors_genetic_search.Rd
Feature selection using the Genetic Algorithm from the package genalg.
Control Parameters
For the meaning of the control parameters, see genalg::rbga.bin().
genalg::rbga.bin() terminates internally after iters iterations.
We set iters = 100000 so that termination is controlled by our terminators instead.
If more iterations are needed, set iters to a higher value in the parameter set.
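As a minimal sketch of raising the iteration cap, the snippet below sets iters when the feature selector is constructed and then changes it again through the parameter set. The values 200000 and 500000 are arbitrary illustrations, not recommendations, and the snippet assumes the mlr3fselect and genalg packages are installed.

```r
library(mlr3fselect)

# construct the feature selector with a higher internal iteration cap
# (200000 is an illustrative value only)
fselector = fs("genetic_search", iters = 200000L)

# inspect the current value in the parameter set
fselector$param_set$values$iters

# the value can also be changed after construction
fselector$param_set$values$iters = 500000L
```

In either case the search is still expected to stop through the terminator attached to the feature selection instance; iters only bounds how long genalg::rbga.bin() may run internally.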
Super classes
mlr3fselect::FSelector
-> mlr3fselect::FSelectorBatch
-> FSelectorBatchGeneticSearch
Examples
# Feature Selection
# \donttest{
# load mlr3fselect (this also attaches mlr3)
library(mlr3fselect)
# retrieve task and load learner
task = tsk("penguins")
learner = lrn("classif.rpart")
# run feature selection on the Palmer Penguins data set
instance = fselect(
  fselector = fs("genetic_search"),
  task = task,
  learner = learner,
  resampling = rsmp("holdout"),
  measure = msr("classif.ce"),
  term_evals = 10
)
# best performing feature set
instance$result
#>    bill_depth bill_length body_mass flipper_length island    sex   year
#>        <lgcl>      <lgcl>    <lgcl>         <lgcl> <lgcl> <lgcl> <lgcl>
#> 1:      FALSE       FALSE     FALSE           TRUE  FALSE  FALSE  FALSE
#>          features n_features classif.ce
#>            <list>      <int>      <num>
#> 1: flipper_length          1  0.1913043
# all evaluated feature sets
as.data.table(instance$archive)
#>     bill_depth bill_length body_mass flipper_length island    sex   year
#>         <lgcl>      <lgcl>    <lgcl>         <lgcl> <lgcl> <lgcl> <lgcl>
#>  1:      FALSE       FALSE     FALSE          FALSE  FALSE   TRUE  FALSE
#>  2:      FALSE       FALSE     FALSE          FALSE  FALSE  FALSE   TRUE
#>  3:      FALSE       FALSE     FALSE          FALSE  FALSE  FALSE   TRUE
#>  4:      FALSE       FALSE     FALSE          FALSE  FALSE   TRUE  FALSE
#>  5:      FALSE       FALSE     FALSE          FALSE   TRUE  FALSE  FALSE
#>  6:      FALSE        TRUE     FALSE          FALSE  FALSE  FALSE  FALSE
#>  7:      FALSE        TRUE     FALSE          FALSE  FALSE  FALSE  FALSE
#>  8:      FALSE       FALSE     FALSE          FALSE   TRUE  FALSE  FALSE
#>  9:       TRUE       FALSE     FALSE          FALSE   TRUE  FALSE  FALSE
#> 10:      FALSE       FALSE     FALSE           TRUE  FALSE  FALSE  FALSE
#>     classif.ce runtime_learners           timestamp batch_nr warnings errors
#>          <num>            <num>              <POSc>    <int>    <int>  <int>
#>  1:  0.6000000            0.004 2024-12-10 11:07:05        1        0      0
#>  2:  0.6000000            0.004 2024-12-10 11:07:05        2        0      0
#>  3:  0.6000000            0.005 2024-12-10 11:07:05        3        0      0
#>  4:  0.6000000            0.004 2024-12-10 11:07:05        4        0      0
#>  5:  0.2347826            0.004 2024-12-10 11:07:05        5        0      0
#>  6:  0.2521739            0.005 2024-12-10 11:07:05        6        0      0
#>  7:  0.2521739            0.005 2024-12-10 11:07:05        7        0      0
#>  8:  0.2347826            0.004 2024-12-10 11:07:05        8        0      0
#>  9:  0.2260870            0.005 2024-12-10 11:07:05        9        0      0
#> 10:  0.1913043            0.005 2024-12-10 11:07:05       10        0      0
#>              features n_features  resample_result
#>                <list>     <list>           <list>
#>  1:               sex          1 <ResampleResult>
#>  2:              year          1 <ResampleResult>
#>  3:              year          1 <ResampleResult>
#>  4:               sex          1 <ResampleResult>
#>  5:            island          1 <ResampleResult>
#>  6:       bill_length          1 <ResampleResult>
#>  7:       bill_length          1 <ResampleResult>
#>  8:            island          1 <ResampleResult>
#>  9: bill_depth,island          2 <ResampleResult>
#> 10:    flipper_length          1 <ResampleResult>
# subset the task and fit the final model
task$select(instance$result_feature_set)
learner$train(task)
# }