
Feature Selection with Design Points
Source: R/FSelectorDesignPoints.R
mlr_fselectors_design_points.Rd
Feature selection using user-defined feature sets.
Details
The feature sets are evaluated in the order given.
The feature selection terminates itself when all feature sets have been evaluated. It is not necessary to set a termination criterion.
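For illustration, the design is a table with one logical column per feature of the task and one candidate feature set per row. A minimal sketch, assuming mlr3fselect is attached; the feature names x1, x2 and x3 are placeholders, not taken from a real task:

# one candidate feature set per row, one logical column per feature
design = data.table::data.table(
  x1 = c(TRUE, TRUE),
  x2 = c(FALSE, TRUE),
  x3 = c(TRUE, FALSE)
)

# both rows are evaluated in the order given; no terminator needs to be set
fselector = fs("design_points", design = design)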
Dictionary
This FSelector can be instantiated with the associated sugar function fs():

fs("design_points")
Parameters
batch_size
integer(1)
Maximum number of configurations to try in a batch.

design
data.table::data.table
Design points to try in search, one per row.
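Both parameters can be set through fs(). A minimal sketch, assuming mlr3fselect is attached; x1 and x2 are placeholder feature names, and batch_size = 2 evaluates two feature sets per batch:

# placeholder design with three candidate feature sets
design = data.table::data.table(x1 = c(TRUE, FALSE, TRUE), x2 = c(TRUE, TRUE, FALSE))

fselector = fs("design_points", design = design, batch_size = 2)

# inspect the configured parameter values
fselector$param_set$values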
Super classes
mlr3fselect::FSelector -> mlr3fselect::FSelectorFromOptimizer -> FSelectorDesignPoints
Examples
# Feature Selection
# \donttest{
# load required packages
library(mlr3)
library(mlr3fselect)

# retrieve task and load learner
task = tsk("pima")
learner = lrn("classif.rpart")
# create design
design = mlr3misc::rowwise_table(
~age, ~glucose, ~insulin, ~mass, ~pedigree, ~pregnant, ~pressure, ~triceps,
TRUE, FALSE, TRUE, TRUE, FALSE, TRUE, FALSE, TRUE,
TRUE, TRUE, FALSE, TRUE, FALSE, TRUE, FALSE, FALSE,
TRUE, FALSE, TRUE, TRUE, FALSE, TRUE, FALSE, FALSE,
TRUE, FALSE, TRUE, TRUE, FALSE, TRUE, TRUE, TRUE
)
# run feature selection on the Pima Indians diabetes data set
instance = fselect(
fselector = fs("design_points", design = design),
task = task,
learner = learner,
resampling = rsmp("holdout"),
measure = msr("classif.ce")
)
# best performing feature set
instance$result
#> age glucose insulin mass pedigree pregnant pressure triceps
#> 1: TRUE TRUE FALSE TRUE FALSE TRUE FALSE FALSE
#> features classif.ce
#> 1: age,glucose,mass,pregnant 0.2851562
# all evaluated feature sets
as.data.table(instance$archive)
#> age glucose insulin mass pedigree pregnant pressure triceps classif.ce
#> 1: TRUE FALSE TRUE TRUE FALSE TRUE FALSE TRUE 0.3007812
#> 2: TRUE TRUE FALSE TRUE FALSE TRUE FALSE FALSE 0.2851562
#> 3: TRUE FALSE TRUE TRUE FALSE TRUE FALSE FALSE 0.3007812
#> 4: TRUE FALSE TRUE TRUE FALSE TRUE TRUE TRUE 0.3007812
#> runtime_learners timestamp batch_nr warnings errors
#> 1: 0.015 2023-03-02 12:42:31 1 0 0
#> 2: 0.015 2023-03-02 12:42:31 2 0 0
#> 3: 0.023 2023-03-02 12:42:31 3 0 0
#> 4: 0.020 2023-03-02 12:42:31 4 0 0
#> features resample_result
#> 1: age,insulin,mass,pregnant,triceps <ResampleResult[21]>
#> 2: age,glucose,mass,pregnant <ResampleResult[21]>
#> 3: age,insulin,mass,pregnant <ResampleResult[21]>
#> 4: age,insulin,mass,pregnant,pressure,triceps <ResampleResult[21]>
# subset the task and fit the final model
task$select(instance$result_feature_set)
learner$train(task)
# }
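As a follow-up sketch, the model trained on the reduced feature set can be used for prediction in the usual way (shown on the training task only for brevity):

# predict with the final model and score the result
prediction = learner$predict(task)
prediction$score(msr("classif.ce"))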