The ArchiveFSelect stores all evaluated feature sets and performance scores.
Details
The ArchiveFSelect is a container around a data.table::data.table().
Each row corresponds to a single evaluation of a feature set.
See the section on Data Structure for more information.
The archive additionally stores a mlr3::BenchmarkResult ($benchmark_result) that records the resampling experiments.
Each experiment corresponds to a single evaluation of a feature set.
The table ($data) and the benchmark result ($benchmark_result) are linked by the uhash column.
If the archive is passed to as.data.table(), both are joined automatically.
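The link between the table and the benchmark result can be inspected on a small run. The following is a minimal sketch, assuming a recent mlr3fselect version in which fselect() accepts a fselector argument; the task, learner, resampling, and budget are illustrative choices, not part of this documentation.

```r
library(mlr3)
library(mlr3fselect)
library(data.table)

# Run a small feature selection to obtain a populated archive
# (illustrative setup: random search on the iris task).
instance = fselect(
  fselector = fs("random_search"),
  task = tsk("iris"),
  learner = lrn("classif.rpart"),
  resampling = rsmp("holdout"),
  measures = msr("classif.ce"),
  term_evals = 4
)
archive = instance$archive

# Raw table: one row per evaluated feature set, including the uhash key.
head(archive$data)

# Resampling experiments recorded alongside the table.
archive$benchmark_result

# Joined view of both; uhash is dropped by default (exclude_columns = "uhash").
tab = as.data.table(archive)
names(tab)
```

Passing exclude_columns = NULL to as.data.table() keeps the uhash column in the joined table.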
Data structure
The table ($data) has the following columns:
One column for each feature of the task ($search_space).
One column for each performance measure ($codomain).
runtime_learners
(numeric(1))
Sum of training and predict times logged in learners per mlr3::ResampleResult / evaluation. This does not include potential overhead time.
timestamp
(POSIXct)
Time stamp when the evaluation was logged into the archive.
batch_nr
(integer(1))
Feature sets are evaluated in batches. Each batch has a unique batch number.
uhash
(character(1))
Connects each feature set to the resampling experiment stored in the mlr3::BenchmarkResult.
Analysis
For analyzing the feature selection results, it is recommended to pass the archive to as.data.table().
The returned data table is joined with the benchmark result, which adds the mlr3::ResampleResult for each feature set.
The archive provides various getters (e.g. $learners()) to ease access.
All getters extract by position (i) or unique hash (uhash).
For a complete list of all getters, see the methods section.
The benchmark result ($benchmark_result) allows scoring the feature sets again on a different measure.
Alternatively, measures can be supplied to as.data.table().
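Both ways of scoring on an additional measure can be sketched as follows. The setup is an illustrative assumption (recent mlr3fselect, random search on the iris task); classif.acc is simply an example of a second measure.

```r
library(mlr3)
library(mlr3fselect)
library(data.table)

# Illustrative run that optimizes the classification error.
instance = fselect(
  fselector = fs("random_search"),
  task = tsk("iris"),
  learner = lrn("classif.rpart"),
  resampling = rsmp("holdout"),
  measures = msr("classif.ce"),
  term_evals = 4
)
archive = instance$archive

# Option 1: score the stored resampling experiments on another measure.
scores = archive$benchmark_result$score(msr("classif.acc"))

# Option 2: let as.data.table() add the extra measure as a column.
tab = as.data.table(archive, measures = msrs("classif.acc"))
tab[, c("classif.ce", "classif.acc")]
```

The second option keeps the additional scores in the same row as the feature set they belong to, which is usually more convenient for analysis.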
S3 Methods
as.data.table.ArchiveFSelect(x, exclude_columns = "uhash", measures = NULL)
Returns a tabular view of all evaluated feature sets.
ArchiveFSelect -> data.table::data.table()
x
(ArchiveFSelect)
exclude_columns
(character())
Exclude columns from table. Set to NULL if no column should be excluded.
measures
(list of mlr3::Measure)
Score feature sets on additional measures.
Super class
bbotk::Archive -> ArchiveFSelect
Public fields
benchmark_result
(mlr3::BenchmarkResult)
Benchmark result.
Methods
Method new()
Creates a new instance of this R6 class.
Usage
ArchiveFSelect$new(
  search_space,
  codomain,
  check_values = TRUE,
  ties_method = "least_features"
)
Arguments
search_space
(paradox::ParamSet)
Search space. Internally created from provided mlr3::Task by instance.
codomain
(bbotk::Codomain)
Specifies codomain of objective function, i.e. a set of performance measures. Internally created from provided mlr3::Measures by instance.
check_values
(logical(1))
If TRUE (default), hyperparameter configurations are checked for validity.
ties_method
(character(1))
The method to break ties when selecting sets while optimizing and when selecting the best set. Can be "least_features" or "random". The option "least_features" (default) selects the feature set with the least features. If there are multiple best feature sets with the same number of features, one is selected randomly. The "random" method returns a random feature set from the best feature sets. Ignored if multiple measures are used.
Method learner()
Retrieve mlr3::Learner of the i-th evaluation, by position or by unique hash uhash.
i and uhash are mutually exclusive.
Learner does not contain a model. Use $learners() to get learners with models.
Method learners()
Retrieve list of trained mlr3::Learner objects of the i-th evaluation, by position or by unique hash uhash.
i and uhash are mutually exclusive.
Method predictions()
Retrieve list of mlr3::Prediction objects of the i-th evaluation, by position or by unique hash uhash.
i and uhash are mutually exclusive.
Method resample_result()
Retrieve mlr3::ResampleResult of the i-th evaluation, by position or by unique hash uhash.
i and uhash are mutually exclusive.
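The getters above can be tried out as in the following sketch. It assumes a recent mlr3fselect version and uses an illustrative setup; store_models = TRUE is set here on the assumption that the fitted models are wanted from $learners().

```r
library(mlr3)
library(mlr3fselect)

# Illustrative run; models are stored so that $learners() can return them.
instance = fselect(
  fselector = fs("random_search"),
  task = tsk("iris"),
  learner = lrn("classif.rpart"),
  resampling = rsmp("holdout"),
  measures = msr("classif.ce"),
  term_evals = 4,
  store_models = TRUE
)
archive = instance$archive

# Access by position: the first evaluated feature set.
learner_1 = archive$learner(i = 1)       # single learner without a model
learners_1 = archive$learners(i = 1)     # list of learners, one per resampling iteration
predictions_1 = archive$predictions(i = 1)
rr_1 = archive$resample_result(i = 1)

# Access by unique hash: look up the hash in the archive table first.
hash_1 = archive$data$uhash[1]
rr_by_hash = archive$resample_result(uhash = hash_1)
```

Supply either i or uhash, never both, since the two arguments are mutually exclusive.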
Method best()
Returns the best scoring feature sets.
Arguments
batch
(integer())
The batch number(s) to limit the best results to. Default is all batches.
ties_method
(character(1))
Method to handle ties. If NULL (default), the global ties method set during initialization is used. The default global ties method is "least_features", which selects the feature set with the least features. If there are multiple best feature sets with the same number of features, one is selected randomly. The "random" method returns a random feature set from the best feature sets.
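A short sketch of $best() with its two arguments follows. The setup is an illustrative assumption (recent mlr3fselect, random search on the iris task); with the default settings of random search, all evaluations here are expected to fall into the first batch.

```r
library(mlr3)
library(mlr3fselect)
library(data.table)

# Illustrative run with a single measure, so ties methods apply.
instance = fselect(
  fselector = fs("random_search"),
  task = tsk("iris"),
  learner = lrn("classif.rpart"),
  resampling = rsmp("holdout"),
  measures = msr("classif.ce"),
  term_evals = 4
)
archive = instance$archive

# Best feature set over all batches, using the global ties method.
best_all = archive$best()

# Restrict the search for the best feature set to the first batch.
best_batch_1 = archive$best(batch = 1)

# Override the global ties method for this call only.
best_random = archive$best(ties_method = "random")
```

Because "random" draws among tied feature sets, repeated calls with ties_method = "random" may return different rows when several feature sets share the best score.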