
Extracts the inner feature selection archives of a nested resampling. Implemented for mlr3::ResampleResult and mlr3::BenchmarkResult. The function iterates over the AutoFSelector objects and binds their archives into a single data.table::data.table(). Each AutoFSelector must be initialized with store_fselect_instance = TRUE, and resample() or benchmark() must be called with store_models = TRUE.

Usage

extract_inner_fselect_archives(x, unnest = NULL, exclude_columns = "uhash")

Arguments

x

(mlr3::ResampleResult | mlr3::BenchmarkResult).

unnest

(character())
Transforms list columns to separate columns. Set to NULL if no column should be unnested.

exclude_columns

(character())
Exclude columns from result table. Set to NULL if no column should be excluded.
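
The effect of exclude_columns can be sketched as follows. This is a minimal illustration, not output from the package authors: it reuses the auto_fselector() call style from the Examples section (other package versions may expect a different signature), and it assumes that "timestamp" and "uhash" are columns of the underlying archive. The names tab_default, tab_small and tab_full are made up for this sketch.

```r
library(mlr3)
library(mlr3fselect)

# Same setup as in the Examples section of this page.
at = auto_fselector(
  method = "random_search",
  learner = lrn("classif.rpart"),
  resampling = rsmp("holdout"),
  measure = msr("classif.ce"),
  term_evals = 4)

rr = resample(tsk("iris"), at, rsmp("cv", folds = 2), store_models = TRUE)

# Default: only the "uhash" column is dropped.
tab_default = extract_inner_fselect_archives(rr)

# Additionally drop the timestamps (assumed to be an archive column).
tab_small = extract_inner_fselect_archives(rr,
  exclude_columns = c("uhash", "timestamp"))

# NULL keeps every column, including "uhash".
tab_full = extract_inner_fselect_archives(rr, exclude_columns = NULL)

# Columns removed by the stricter exclusion.
setdiff(names(tab_default), names(tab_small))
```
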

Data structure

The returned data table has the following columns:

  • experiment (integer(1))
    Index giving the corresponding row number in the original benchmark grid.

  • iteration (integer(1))
    Iteration of the outer resampling.

  • One column for each feature of the task.

  • One column for each performance measure.

  • runtime_learners (numeric(1))
    Sum of the training and prediction times logged by the learners for each mlr3::ResampleResult, i.e. each evaluation. Potential overhead time is not included.

  • timestamp (POSIXct)
    Time stamp when the evaluation was logged into the archive.

  • batch_nr (integer(1))
    Feature sets are evaluated in batches. Each batch has a unique batch number.

  • resample_result (mlr3::ResampleResult)
    Resample result of the inner resampling.

  • task_id (character(1)).

  • learner_id (character(1)).

  • resampling_id (character(1)).
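
Because each feature is a logical column and each measure is a numeric column, the table lends itself to ordinary data.table operations. The sketch below picks, for every outer iteration, the evaluated feature set with the lowest inner classification error. It assumes the setup from the Examples section (including its auto_fselector() call style); best, feature_cols and the features column are hypothetical names introduced here.

```r
library(mlr3)
library(mlr3fselect)
library(data.table)

at = auto_fselector(
  method = "random_search",
  learner = lrn("classif.rpart"),
  resampling = rsmp("holdout"),
  measure = msr("classif.ce"),
  term_evals = 4)

rr = resample(tsk("iris"), at, rsmp("cv", folds = 2), store_models = TRUE)
tab = extract_inner_fselect_archives(rr)

# The feature columns carry the names of the task's features.
feature_cols = tsk("iris")$feature_names

# Per outer iteration, keep the first row with the smallest inner error.
best = tab[, .SD[which.min(classif.ce)], by = iteration]

# Collapse the TRUE/FALSE flags into a readable list of selected features.
best[, features := apply(.SD, 1, function(z) paste(feature_cols[z], collapse = ", ")),
  .SDcols = feature_cols]

best[, .(iteration, classif.ce, features)]
```

Ties are resolved by which.min(), which returns the first minimum, so the earliest evaluated feature set wins when several reach the same error.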

Examples

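# Attach the packages that provide lrn(), rsmp(), msr(), tsk() and auto_fselector()
library(mlr3)
library(mlr3fselect)

# Inner loop: random search over feature subsets, scored on a holdout split
at = auto_fselector(
  method = "random_search",
  learner = lrn("classif.rpart"),
  resampling = rsmp("holdout"),
  measure = msr("classif.ce"),
  term_evals = 4)

# Outer loop: 2-fold cross-validation; store_models = TRUE keeps the inner archives
resampling_outer = rsmp("cv", folds = 2)
rr = resample(tsk("iris"), at, resampling_outer, store_models = TRUE)

# One row per feature set evaluated in the inner loop
extract_inner_fselect_archives(rr)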
#>    iteration Petal.Length Petal.Width Sepal.Length Sepal.Width classif.ce
#> 1:         1         TRUE       FALSE         TRUE        TRUE       0.00
#> 2:         1         TRUE        TRUE        FALSE       FALSE       0.00
#> 3:         1         TRUE        TRUE         TRUE       FALSE       0.00
#> 4:         1        FALSE       FALSE         TRUE        TRUE       0.44
#> 5:         2        FALSE        TRUE        FALSE       FALSE       0.00
#> 6:         2        FALSE       FALSE         TRUE       FALSE       0.20
#> 7:         2        FALSE        TRUE        FALSE       FALSE       0.00
#> 8:         2         TRUE        TRUE         TRUE       FALSE       0.00
#>    runtime_learners           timestamp batch_nr      resample_result task_id
#> 1:            0.064 2022-08-25 10:40:16        1 <ResampleResult[21]>    iris
#> 2:            0.063 2022-08-25 10:40:17        2 <ResampleResult[21]>    iris
#> 3:            0.062 2022-08-25 10:40:17        3 <ResampleResult[21]>    iris
#> 4:            0.062 2022-08-25 10:40:17        4 <ResampleResult[21]>    iris
#> 5:            0.062 2022-08-25 10:40:17        1 <ResampleResult[21]>    iris
#> 6:            0.063 2022-08-25 10:40:18        2 <ResampleResult[21]>    iris
#> 7:            0.063 2022-08-25 10:40:18        3 <ResampleResult[21]>    iris
#> 8:            0.061 2022-08-25 10:40:18        4 <ResampleResult[21]>    iris
#>                 learner_id resampling_id
#> 1: classif.rpart.fselector            cv
#> 2: classif.rpart.fselector            cv
#> 3: classif.rpart.fselector            cv
#> 4: classif.rpart.fselector            cv
#> 5: classif.rpart.fselector            cv
#> 6: classif.rpart.fselector            cv
#> 7: classif.rpart.fselector            cv
#> 8: classif.rpart.fselector            cv
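
The function is also implemented for mlr3::BenchmarkResult. A minimal sketch of that case, assuming the same auto_fselector() call style as above and a hypothetical design with two outer resamplings, might look like this. The experiment column described under "Data structure" then distinguishes the rows of the benchmark grid.

```r
library(mlr3)
library(mlr3fselect)

at = auto_fselector(
  method = "random_search",
  learner = lrn("classif.rpart"),
  resampling = rsmp("holdout"),
  measure = msr("classif.ce"),
  term_evals = 4)

# Two experiments: the same task and learner under two outer resamplings.
design = benchmark_grid(
  tasks = tsk("iris"),
  learners = at,
  resamplings = list(rsmp("cv", folds = 2), rsmp("holdout")))

# store_models = TRUE is required so that the inner archives are kept.
bmr = benchmark(design, store_models = TRUE)

tab = extract_inner_fselect_archives(bmr)

# One line per experiment with the outer resampling it belongs to.
unique(tab[, c("experiment", "resampling_id")])
```
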