Sequential search iteratively adds features to (or removes features from) the feature set.

Sequential forward selection (strategy = sfs) extends the feature set in each iteration with the feature that improves the model's performance the most. Sequential backward selection (strategy = sbs) follows the same idea but starts with all features and in each iteration removes the feature whose removal affects performance the least.

The feature selection terminates automatically when max_features (forward search) or min_features (backward search) is reached, so it is not necessary to set a termination criterion.
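The forward search is a greedy loop: start with an empty set and in each iteration keep the single feature whose addition scores best. An illustrative sketch in R (not the package implementation; score() is a hypothetical helper that resamples the learner on a candidate feature set and returns a value to maximize):

selected = character(0)
while (length(selected) < max_features) {
  candidates = setdiff(task$feature_names, selected)
  performance = sapply(candidates, function(f) score(c(selected, f)))
  selected = c(selected, candidates[which.max(performance)])
}

The backward search runs the same loop in reverse, starting from task$feature_names and dropping one feature per iteration until min_features is reached.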

Dictionary

This FSelector can be instantiated via the dictionary mlr_fselectors or with the associated sugar function fs():

mlr_fselectors$get("sequential")
fs("sequential")

Parameters

min_features

integer(1)
Minimum number of features. By default, 1.

max_features

integer(1)
Maximum number of features. By default, number of features in mlr3::Task.

strategy

character(1)
Search strategy: sfs (sequential forward search) or sbs (sequential backward search).
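
For example, a backward search that keeps at least two features can be configured via the sugar function, passing the parameters above as arguments:

fs("sequential", strategy = "sbs", min_features = 2)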

Super class

mlr3fselect::FSelector -> FSelectorSequential

Methods

Inherited methods


Method new()

Creates a new instance of this R6 class.

Usage

FSelectorSequential$new()


Method optimization_path()

Returns the optimization path.

Usage

FSelectorSequential$optimization_path(inst, include_uhash = FALSE)

Arguments

inst

(FSelectInstanceSingleCrit)
Instance optimized with FSelectorSequential.

include_uhash

(logical(1))
Include uhash column?
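
A sketch of a typical call (assuming an instance was optimized with a selector object created via fs(); the method is called on that object, not on the class generator):

fselector = fs("sequential")
fselector$optimize(inst)
fselector$optimization_path(inst)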


Method clone()

The objects of this class are cloneable with this method.

Usage

FSelectorSequential$clone(deep = FALSE)

Arguments

deep

Whether to make a deep clone.

Examples

# retrieve task
task = tsk("pima")

# load learner
learner = lrn("classif.rpart")

# \donttest{
# feature selection on the pima indians diabetes data set
instance = fselect(
  method = "sequential",
  task = task,
  learner = learner,
  resampling = rsmp("holdout"),
  measure = msr("classif.ce"),
  term_evals = 10
)

# best performing feature subset
instance$result
#>      age glucose insulin  mass pedigree pregnant pressure triceps
#> 1: FALSE    TRUE   FALSE FALSE    FALSE     TRUE    FALSE   FALSE
#>            features classif.ce
#> 1: glucose,pregnant  0.2382812

# all evaluated feature subsets
as.data.table(instance$archive)
#>       age glucose insulin  mass pedigree pregnant pressure triceps classif.ce
#>  1:  TRUE   FALSE   FALSE FALSE    FALSE    FALSE    FALSE   FALSE  0.3242188
#>  2: FALSE    TRUE   FALSE FALSE    FALSE    FALSE    FALSE   FALSE  0.2890625
#>  3: FALSE   FALSE    TRUE FALSE    FALSE    FALSE    FALSE   FALSE  0.5000000
#>  4: FALSE   FALSE   FALSE  TRUE    FALSE    FALSE    FALSE   FALSE  0.3203125
#>  5: FALSE   FALSE   FALSE FALSE     TRUE    FALSE    FALSE   FALSE  0.3554688
#>  6: FALSE   FALSE   FALSE FALSE    FALSE     TRUE    FALSE   FALSE  0.2890625
#>  7: FALSE   FALSE   FALSE FALSE    FALSE    FALSE     TRUE   FALSE  0.3398438
#>  8: FALSE   FALSE   FALSE FALSE    FALSE    FALSE    FALSE    TRUE  0.3593750
#>  9:  TRUE    TRUE   FALSE FALSE    FALSE    FALSE    FALSE   FALSE  0.2539062
#> 10: FALSE    TRUE    TRUE FALSE    FALSE    FALSE    FALSE   FALSE  0.2539062
#> 11: FALSE    TRUE   FALSE  TRUE    FALSE    FALSE    FALSE   FALSE  0.2578125
#> 12: FALSE    TRUE   FALSE FALSE     TRUE    FALSE    FALSE   FALSE  0.2773438
#> 13: FALSE    TRUE   FALSE FALSE    FALSE     TRUE    FALSE   FALSE  0.2382812
#> 14: FALSE    TRUE   FALSE FALSE    FALSE    FALSE     TRUE   FALSE  0.2460938
#> 15: FALSE    TRUE   FALSE FALSE    FALSE    FALSE    FALSE    TRUE  0.2656250
#>     runtime_learners           timestamp batch_nr      resample_result
#>  1:            0.068 2022-08-25 10:41:04        1 <ResampleResult[21]>
#>  2:            0.081 2022-08-25 10:41:04        1 <ResampleResult[21]>
#>  3:            0.082 2022-08-25 10:41:04        1 <ResampleResult[21]>
#>  4:            0.061 2022-08-25 10:41:04        1 <ResampleResult[21]>
#>  5:            0.062 2022-08-25 10:41:04        1 <ResampleResult[21]>
#>  6:            0.085 2022-08-25 10:41:04        1 <ResampleResult[21]>
#>  7:            0.060 2022-08-25 10:41:04        1 <ResampleResult[21]>
#>  8:            0.062 2022-08-25 10:41:04        1 <ResampleResult[21]>
#>  9:            0.063 2022-08-25 10:41:05        2 <ResampleResult[21]>
#> 10:            0.084 2022-08-25 10:41:05        2 <ResampleResult[21]>
#> 11:            0.069 2022-08-25 10:41:05        2 <ResampleResult[21]>
#> 12:            0.063 2022-08-25 10:41:05        2 <ResampleResult[21]>
#> 13:            0.081 2022-08-25 10:41:05        2 <ResampleResult[21]>
#> 14:            0.069 2022-08-25 10:41:05        2 <ResampleResult[21]>
#> 15:            0.061 2022-08-25 10:41:05        2 <ResampleResult[21]>

# subset the task and fit the final model
task$select(instance$result_feature_set)
learner$train(task)
# }