Feature selection using user-defined feature sets.

Details

The feature sets are evaluated in the order in which they appear in the design.

The feature selection terminates on its own once all feature sets have been evaluated, so it is not necessary to set a termination criterion.
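
The two properties above can be checked with a short run. This is a minimal sketch, assuming the mlr3 and mlr3fselect versions this page documents (where `fselect()` accepts a `fselector` argument) and that the rpart package is installed. The three-row design is made up for illustration.

```r
library(mlr3)
library(mlr3fselect)

task = tsk("pima")

# illustrative design: three feature sets, one per row
design = data.table::data.table(
  age      = c(TRUE,  FALSE, TRUE),
  glucose  = c(FALSE, TRUE,  TRUE),
  insulin  = c(FALSE, FALSE, TRUE),
  mass     = c(TRUE,  TRUE,  TRUE),
  pedigree = c(FALSE, FALSE, TRUE),
  pregnant = c(FALSE, TRUE,  TRUE),
  pressure = c(FALSE, FALSE, TRUE),
  triceps  = c(FALSE, FALSE, TRUE)
)

# no terminator is passed; the run stops after the last design row
instance = fselect(
  fselector = fs("design_points", design = design),
  task = task,
  learner = lrn("classif.rpart"),
  resampling = rsmp("holdout"),
  measure = msr("classif.ce")
)

archive = as.data.table(instance$archive)

# one archive row per design row
n_evaluated = nrow(archive)

# archive rows follow the order of the design rows
same_order = all(vapply(names(design), function(col) {
  identical(archive[[col]], design[[col]])
}, logical(1)))

print(n_evaluated)
print(same_order)
```

If both checks hold, the archive can be joined back to the design by row position.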

Dictionary

This FSelector can be instantiated with the associated sugar function fs():

fs("design_points")
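
As a hedged illustration, the sugar function should be equivalent to looking up the key in the `mlr_fselectors` dictionary that mlr3fselect exports. The sketch below compares both routes and sets a parameter through the object's `param_set`. The variable names are arbitrary.

```r
library(mlr3fselect)

# sugar function
fselector = fs("design_points")

# dictionary lookup with the same key
fselector_dict = mlr_fselectors$get("design_points")

# both routes should yield objects of the same class
same_class = identical(class(fselector), class(fselector_dict))
print(same_class)

# parameters documented on this page are exposed through the param_set
param_ids = fselector$param_set$ids()
print(param_ids)

# parameters can also be set after construction
fselector$param_set$values$batch_size = 2
```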

Parameters

batch_size

integer(1)
Maximum number of feature sets to evaluate in a batch.

design

data.table::data.table
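Design points to evaluate, one feature set per row. Each column corresponds to a feature, and a logical value marks whether that feature is included.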
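
Writing the logical table by hand gets tedious when feature sets are easier to state by name. The following sketch builds a design from character vectors of feature names. `make_design()` is a hypothetical helper written for this illustration and is not part of mlr3fselect. The feature names are those of the Pima task used in the Examples.

```r
library(data.table)

# Hypothetical helper (not part of mlr3fselect): turn feature-name vectors
# into a logical design table with one row per feature set.
make_design = function(all_features, feature_sets) {
  rows = lapply(feature_sets, function(set) {
    # guard against misspelled feature names
    stopifnot(all(set %in% all_features))
    as.list(setNames(all_features %in% set, all_features))
  })
  rbindlist(rows)
}

all_features = c("age", "glucose", "insulin", "mass",
  "pedigree", "pregnant", "pressure", "triceps")

design = make_design(all_features, list(
  c("age", "insulin", "mass", "pregnant", "triceps"),
  c("age", "glucose", "mass", "pregnant")
))

print(design)
```

The resulting table can be passed as `design` in the same way as the hand-written table in the Examples.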

Super classes

mlr3fselect::FSelector -> mlr3fselect::FSelectorFromOptimizer -> FSelectorDesignPoints

Methods

Method new()

Creates a new instance of this R6 class.

Usage

FSelectorDesignPoints$new()

Method clone()

The objects of this class are cloneable with this method.

Usage

FSelectorDesignPoints$clone(deep = FALSE)

Arguments

deep

Whether to make a deep clone.

Examples

# Feature Selection
# \donttest{

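# attach the packages that provide tsk(), lrn(), rsmp(), msr(), fs() and fselect()
library(mlr3)
library(mlr3fselect)

# retrieve task and load learner
task = tsk("pima")
learner = lrn("classif.rpart")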

# create design
design = mlr3misc::rowwise_table(
  ~age, ~glucose, ~insulin, ~mass, ~pedigree, ~pregnant, ~pressure, ~triceps,
  TRUE, FALSE,    TRUE,     TRUE,  FALSE,     TRUE,       FALSE,    TRUE,
  TRUE, TRUE,     FALSE,    TRUE,  FALSE,     TRUE,       FALSE,    FALSE,
  TRUE, FALSE,    TRUE,     TRUE,  FALSE,     TRUE,       FALSE,    FALSE,
  TRUE, FALSE,    TRUE,     TRUE,  FALSE,     TRUE,       TRUE,     TRUE
)

# run feature selection on the Pima Indians diabetes data set
instance = fselect(
  fselector = fs("design_points", design = design),
  task = task,
  learner = learner,
  resampling = rsmp("holdout"),
  measure = msr("classif.ce")
)

# best performing feature set
instance$result
#>     age glucose insulin mass pedigree pregnant pressure triceps
#> 1: TRUE    TRUE   FALSE TRUE    FALSE     TRUE    FALSE   FALSE
#>                     features classif.ce
#> 1: age,glucose,mass,pregnant  0.2851562

# all evaluated feature sets
as.data.table(instance$archive)
#>     age glucose insulin mass pedigree pregnant pressure triceps classif.ce
#> 1: TRUE   FALSE    TRUE TRUE    FALSE     TRUE    FALSE    TRUE  0.3007812
#> 2: TRUE    TRUE   FALSE TRUE    FALSE     TRUE    FALSE   FALSE  0.2851562
#> 3: TRUE   FALSE    TRUE TRUE    FALSE     TRUE    FALSE   FALSE  0.3007812
#> 4: TRUE   FALSE    TRUE TRUE    FALSE     TRUE     TRUE    TRUE  0.3007812
#>    runtime_learners           timestamp batch_nr warnings errors
#> 1:            0.015 2023-03-02 12:42:31        1        0      0
#> 2:            0.015 2023-03-02 12:42:31        2        0      0
#> 3:            0.023 2023-03-02 12:42:31        3        0      0
#> 4:            0.020 2023-03-02 12:42:31        4        0      0
#>                                      features      resample_result
#> 1:          age,insulin,mass,pregnant,triceps <ResampleResult[21]>
#> 2:                  age,glucose,mass,pregnant <ResampleResult[21]>
#> 3:                  age,insulin,mass,pregnant <ResampleResult[21]>
#> 4: age,insulin,mass,pregnant,pressure,triceps <ResampleResult[21]>

# subset the task and fit the final model
task$select(instance$result_feature_set)
learner$train(task)
# }