The FSelectInstanceMultiCrit specifies a feature selection problem for FSelectors. The function fsi() creates an FSelectInstanceMultiCrit, and the function fselect() creates an instance internally.

Analysis

To analyze the feature selection results, pass the archive to as.data.table(). The returned data table is joined with the benchmark result, which adds the mlr3::ResampleResult for each feature set.

The archive provides various getters (e.g. $learners()) to ease access. All getters extract by position (i) or unique hash (uhash). For a complete list of getters, see the methods section.

The benchmark result ($benchmark_result) lets you score the feature sets again on a different measure. Alternatively, measures can be supplied to as.data.table().
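The two routes described above can be sketched as follows. This is a minimal illustration that reuses the setup from the Examples section; the choice of classif.acc as the additional measure is an assumption made only for the example, and it relies on the benchmark result being stored (the default, store_benchmark_result = TRUE).

```r
library(mlr3)
library(mlr3fselect)

# Same setup as in the Examples section
instance = fsi(
  task = tsk("penguins"),
  learner = lrn("classif.rpart"),
  resampling = rsmp("cv", folds = 3),
  measures = msrs(c("classif.ce", "time_train")),
  terminator = trm("evals", n_evals = 4)
)
fs("random_search", batch_size = 2)$optimize(instance)

# Route 1: supply extra measures when converting the archive
# (classif.acc is an illustrative choice, not part of the optimization)
tab = as.data.table(instance$archive, measures = msrs("classif.acc"))

# Route 2: aggregate the stored benchmark result on the new measure,
# which yields one row per evaluated feature set
agg = instance$archive$benchmark_result$aggregate(msrs("classif.acc"))
```

Both tables contain one row per evaluated feature set; the first keeps the archive columns and appends the extra measure.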

Super classes

bbotk::OptimInstance -> bbotk::OptimInstanceMultiCrit -> FSelectInstanceMultiCrit

Active bindings

result_feature_set

(list of character())
Feature sets for task subsetting.
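As a rough sketch of what "task subsetting" can look like in practice, the snippet below restricts a fresh task to one of the returned feature sets and trains a learner on it. Picking the first element of the list is an arbitrary choice for illustration; in a multi-criteria setting every returned set is a trade-off between the measures, and which one to use is up to the user.

```r
library(mlr3)
library(mlr3fselect)

instance = fsi(
  task = tsk("penguins"),
  learner = lrn("classif.rpart"),
  resampling = rsmp("cv", folds = 3),
  measures = msrs(c("classif.ce", "time_train")),
  terminator = trm("evals", n_evals = 4)
)
fs("random_search", batch_size = 2)$optimize(instance)

# List with one character vector per optimal feature set
feature_sets = instance$result_feature_set

# Subset a fresh task to the first set (arbitrary pick for illustration)
task = tsk("penguins")
task$select(feature_sets[[1]])

# Train the learner on the reduced task
learner = lrn("classif.rpart")
learner$train(task)
```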

Methods

Inherited methods


Method new()

Creates a new instance of this R6 class.

Usage

FSelectInstanceMultiCrit$new(
  task,
  learner,
  resampling,
  measures,
  terminator,
  store_benchmark_result = TRUE,
  store_models = FALSE,
  check_values = FALSE
)

Arguments

task

(mlr3::Task)
Task to operate on.

learner

(mlr3::Learner)
Learner to optimize the feature subset for.

resampling

(mlr3::Resampling)
Resampling that is used to evaluate the performance of the feature subsets. Uninstantiated resamplings are instantiated during construction so that all feature subsets are evaluated on the same data splits. Already instantiated resamplings are kept unchanged.

measures

(list of mlr3::Measure)
Measures to optimize. If NULL, mlr3's default measure is used.

terminator

(Terminator)
Stop criterion of the feature selection.

store_benchmark_result

(logical(1))
Store benchmark result in archive?

store_models

(logical(1))
Store models in benchmark result?

check_values

(logical(1))
Check the parameters before the evaluation and the results for validity?
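For illustration, the constructor can also be called directly instead of going through fsi(). The sketch below assumes the package version documented on this page, in which FSelectInstanceMultiCrit$new() accepts the arguments listed above; the holdout resampling and the budget of four evaluations are arbitrary choices for the example.

```r
library(mlr3)
library(mlr3fselect)

# Direct construction with the documented arguments
instance = FSelectInstanceMultiCrit$new(
  task = tsk("penguins"),
  learner = lrn("classif.rpart"),
  resampling = rsmp("holdout"),
  measures = msrs(c("classif.ce", "time_train")),
  terminator = trm("evals", n_evals = 4),
  store_benchmark_result = TRUE,
  store_models = FALSE,
  check_values = FALSE
)

# Nothing has been evaluated yet, so the archive is empty
n_evals = instance$archive$n_evals
terminated = instance$is_terminated
```

In everyday use, fsi() is the shorter way to obtain the same kind of object.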


Method assign_result()

The FSelector object writes the best found feature subsets and estimated performance values here. For internal use.

Usage

FSelectInstanceMultiCrit$assign_result(xdt, ydt)

Arguments

xdt

(data.table::data.table())
x values as a data.table. Each row is one point. Contains the values in the search space of the FSelectInstanceMultiCrit object. Can contain additional columns for extra information.

ydt

(data.table::data.table())
Optimal outcomes, e.g. the Pareto front.
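To make "Pareto front" concrete, the following base R sketch filters the non-dominated rows from the objective values printed in the Examples section (both measures are minimized). The helper dominated_rows() is hypothetical and written only for this illustration; it is not part of the package and does not show how the FSelector computes the result internally.

```r
# Objective values copied from the archive printed in the Examples section
y = data.frame(
  classif.ce = c(0.06384439, 0.09593186, 0.09883041, 0.19181287),
  time_train = c(0.02966667, 0.02866667, 0.03000000, 0.03666667)
)

# Hypothetical helper: a row is dominated if some other row is no worse in
# every objective and strictly better in at least one (minimization)
dominated_rows = function(y) {
  m = as.matrix(y)
  vapply(seq_len(nrow(m)), function(i) {
    any(vapply(seq_len(nrow(m)), function(j) {
      j != i && all(m[j, ] <= m[i, ]) && any(m[j, ] < m[i, ])
    }, logical(1)))
  }, logical(1))
}

# Rows that no other row dominates form the Pareto front
pareto_rows = which(!dominated_rows(y))
pareto_rows
#> [1] 1 2
```

Rows 1 and 2 are the same two feature sets that the example run reports as its result; rows 3 and 4 are both dominated by row 1.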


Method print()

Printer.

Usage

FSelectInstanceMultiCrit$print(...)

Arguments

...

(ignored).


Method clone()

The objects of this class are cloneable with this method.

Usage

FSelectInstanceMultiCrit$clone(deep = FALSE)

Arguments

deep

Whether to make a deep clone.

Examples

# Feature selection on Palmer Penguins data set
task = tsk("penguins")

# Construct feature selection instance
instance = fsi(
  task = task,
  learner = lrn("classif.rpart"),
  resampling = rsmp("cv", folds = 3),
  measures = msrs(c("classif.ce", "time_train")),
  terminator = trm("evals", n_evals = 4)
)

# Choose optimization algorithm
fselector = fs("random_search", batch_size = 2)

# Run feature selection
fselector$optimize(instance)
#>    bill_depth bill_length body_mass flipper_length island  sex year
#> 1:       TRUE        TRUE     FALSE           TRUE  FALSE TRUE TRUE
#> 2:      FALSE        TRUE      TRUE          FALSE  FALSE TRUE TRUE
#>                                          features classif.ce time_train
#> 1: bill_depth,bill_length,flipper_length,sex,year 0.06384439 0.02966667
#> 2:                 bill_length,body_mass,sex,year 0.09593186 0.02866667

# Optimal feature sets
instance$result_feature_set
#> [[1]]
#> [1] "bill_depth"     "bill_length"    "flipper_length" "sex"           
#> [5] "year"          
#> 
#> [[2]]
#> [1] "bill_length" "body_mass"   "sex"         "year"       
#> 

# Inspect all evaluated sets
as.data.table(instance$archive)
#>    bill_depth bill_length body_mass flipper_length island   sex  year
#> 1:       TRUE        TRUE     FALSE           TRUE  FALSE  TRUE  TRUE
#> 2:      FALSE        TRUE      TRUE          FALSE  FALSE  TRUE  TRUE
#> 3:      FALSE        TRUE      TRUE          FALSE  FALSE FALSE FALSE
#> 4:       TRUE       FALSE      TRUE          FALSE   TRUE  TRUE  TRUE
#>    classif.ce time_train runtime_learners           timestamp batch_nr warnings
#> 1: 0.06384439 0.02966667            0.221 2022-11-25 12:09:24        1        0
#> 2: 0.09593186 0.02866667            0.176 2022-11-25 12:09:24        1        0
#> 3: 0.09883041 0.03000000            0.184 2022-11-25 12:09:24        2        0
#> 4: 0.19181287 0.03666667            0.204 2022-11-25 12:09:24        2        0
#>    errors      resample_result
#> 1:      0 <ResampleResult[21]>
#> 2:      0 <ResampleResult[21]>
#> 3:      0 <ResampleResult[21]>
#> 4:      0 <ResampleResult[21]>