Design points performs feature selection over feature sets specified by the user.

The feature sets are evaluated in the order given. The feature selection terminates once all feature sets have been evaluated, so it is not necessary to set a termination criterion.
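A design is a data.table with one logical column per feature of the task and one row per feature set; the rows are evaluated from top to bottom. A minimal sketch, assuming a task whose features are named x1, x2 and x3 (hypothetical names):

library(data.table)

# one feature set per row, evaluated in the given order
design = data.table(
  x1 = c(TRUE, TRUE, FALSE),
  x2 = c(FALSE, TRUE, TRUE),
  x3 = c(TRUE, TRUE, TRUE)
)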

Dictionary

This FSelector can be instantiated via the dictionary mlr_fselectors or with the associated sugar function fs():

mlr_fselectors$get("design_points")
fs("design_points")

Parameters

batch_size

integer(1)
Maximum number of configurations to try in a batch.

design

data.table::data.table
Design points to try in search, one per row.
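Both parameters can be passed directly to the sugar function or set afterwards through the parameter set; a brief sketch, reusing a design object constructed as above:

fselector = fs("design_points", design = design, batch_size = 2)

# parameters can also be changed after construction
fselector$param_set$values$batch_size = 1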

Super classes

mlr3fselect::FSelector -> mlr3fselect::FSelectorFromOptimizer -> FSelectorDesignPoints

Methods

Inherited methods


Method new()

Creates a new instance of this R6 class.

Usage

FSelectorDesignPoints$new()

Method clone()

The objects of this class are cloneable with this method.

Usage

FSelectorDesignPoints$clone(deep = FALSE)

Arguments

deep

Whether to make a deep clone.

Examples

library(mlr3fselect)
library(mlr3misc)

# retrieve task
task = tsk("pima")

# load learner
learner = lrn("classif.rpart")

# create design
design = rowwise_table(
  ~age, ~glucose, ~insulin, ~mass, ~pedigree, ~pregnant, ~pressure, ~triceps,
  TRUE, FALSE,    TRUE,     TRUE,  FALSE,     TRUE,      FALSE,     TRUE,
  TRUE, TRUE,     FALSE,    TRUE,  FALSE,     TRUE,      FALSE,     FALSE,
  TRUE, FALSE,    TRUE,     TRUE,  FALSE,     TRUE,      FALSE,     FALSE,
  TRUE, FALSE,    TRUE,     TRUE,  FALSE,     TRUE,      TRUE,      TRUE
)

# \donttest{
# feature selection on the pima indians diabetes data set
instance = fselect(
  method = "design_points",
  task = task,
  learner = learner,
  resampling = rsmp("cv", folds = 3),
  measure = msr("classif.ce"),
  design = design
)

# best performing feature subset
instance$result
#>     age glucose insulin mass pedigree pregnant pressure triceps
#> 1: TRUE    TRUE   FALSE TRUE    FALSE     TRUE    FALSE   FALSE
#>                     features classif.ce
#> 1: age,glucose,mass,pregnant  0.2682292

# all evaluated feature subsets
as.data.table(instance$archive)
#>     age glucose insulin mass pedigree pregnant pressure triceps classif.ce
#> 1: TRUE   FALSE    TRUE TRUE    FALSE     TRUE    FALSE    TRUE  0.3164062
#> 2: TRUE    TRUE   FALSE TRUE    FALSE     TRUE    FALSE   FALSE  0.2682292
#> 3: TRUE   FALSE    TRUE TRUE    FALSE     TRUE    FALSE   FALSE  0.3138021
#> 4: TRUE   FALSE    TRUE TRUE    FALSE     TRUE     TRUE    TRUE  0.3151042
#>    runtime_learners           timestamp batch_nr      resample_result
#> 1:            0.214 2022-08-25 10:40:25        1 <ResampleResult[21]>
#> 2:            0.216 2022-08-25 10:40:26        2 <ResampleResult[21]>
#> 3:            0.221 2022-08-25 10:40:26        3 <ResampleResult[21]>
#> 4:            0.221 2022-08-25 10:40:27        4 <ResampleResult[21]>

# subset the task and fit the final model
task$select(instance$result_feature_set)
learner$train(task)
# }
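
The learner trained on the selected feature subset can be used like any other fitted mlr3 learner; a short sketch, predicting on the training task purely for illustration:

# predict with the final model (on the training data, for illustration only)
prediction = learner$predict(task)
prediction$score(msr("classif.ce"))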