The ArchiveFSelect stores all evaluated feature sets and performance scores.
Details
The ArchiveFSelect is a container around a data.table::data.table().
Each row corresponds to a single evaluation of a feature set.
See the section on Data Structure for more information.
The archive additionally stores a mlr3::BenchmarkResult ($benchmark_result) that records the resampling experiments.
Each experiment corresponds to a single evaluation of a feature set.
The table ($data) and the benchmark result ($benchmark_result) are linked by the uhash column.
If the archive is passed to as.data.table(), both are joined automatically.
Data structure
The table ($data) has the following columns:
One column for each feature of the task ($search_space).
One column for each performance measure ($codomain).
runtime_learners(numeric(1))
Sum of training and predict times logged in learners per mlr3::ResampleResult / evaluation. This does not include potential overhead time.
timestamp(POSIXct)
Time stamp when the evaluation was logged into the archive.
batch_nr(integer(1))
Feature sets are evaluated in batches. Each batch has a unique batch number.
uhash(character(1))
Connects each feature set to the resampling experiment stored in the mlr3::BenchmarkResult.
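The columns can be inspected on a small run. The following is a minimal sketch, assuming mlr3 and mlr3fselect are installed; the task ("pima"), learner ("classif.rpart"), resampling, and evaluation budget are illustrative choices that do not come from this page.

```r
library(mlr3)
library(mlr3fselect)

set.seed(1)

# Illustrative setup (assumption): a short random search on the "pima" task
instance = fselect(
  fs("random_search", batch_size = 2),
  task = tsk("pima"),
  learner = lrn("classif.rpart"),
  resampling = rsmp("cv", folds = 3),
  measures = msr("classif.ce"),
  term_evals = 4
)
archive = instance$archive

# One column per feature, one per measure, plus the bookkeeping columns
names(archive$data)

# Every uhash in the table refers to a resample result in the benchmark result
linked = all(archive$data$uhash %in% archive$benchmark_result$uhashes)
```

If the link holds, `linked` is `TRUE`, which is what allows the table and the benchmark result to be joined.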
Analysis
For analyzing the feature selection results, it is recommended to pass the archive to as.data.table().
The returned data table is joined with the benchmark result which adds the mlr3::ResampleResult for each feature set.
The archive provides various getters (e.g. $learners()) to ease access.
All getters extract by position (i) or unique hash (uhash).
For a complete list of all getters see the methods section.
The benchmark result ($benchmark_result) allows scoring the feature sets again on a different measure.
Alternatively, measures can be supplied to as.data.table().
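Both routes to additional scores can be sketched as follows. This is an illustrative example under assumptions: the task, learner, budget, and the extra measure "classif.acc" are choices made here, not prescribed by this page.

```r
library(mlr3)
library(mlr3fselect)
library(data.table)

set.seed(1)

# Illustrative setup (assumption): a short random search on the "pima" task
instance = fselect(
  fs("random_search", batch_size = 2),
  task = tsk("pima"),
  learner = lrn("classif.rpart"),
  resampling = rsmp("cv", folds = 3),
  measures = msr("classif.ce"),
  term_evals = 4
)
archive = instance$archive

# Route 1: join table and benchmark result, scoring on an extra measure
tab = as.data.table(archive, exclude_columns = NULL, measures = msrs("classif.acc"))

# Route 2: rescore the stored resampling experiments directly
agg = archive$benchmark_result$aggregate(msr("classif.acc"))
```

Here `tab` should contain one row per evaluated feature set, including the `resample_result` column and the additional `classif.acc` column, while `agg` holds one aggregated score per resampling experiment.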
S3 Methods
as.data.table.ArchiveFSelect(x, exclude_columns = "uhash", measures = NULL)
Returns a tabular view of all evaluated feature sets.
ArchiveFSelect -> data.table::data.table()
x(ArchiveFSelect)
exclude_columns(character())
Exclude columns from table. Set to NULL if no column should be excluded.
measures(list of mlr3::Measure)
Score feature sets on additional measures.
Super class
bbotk::Archive -> ArchiveFSelect
Public fields
benchmark_result(mlr3::BenchmarkResult)
Benchmark result.
Methods
Method new()
Creates a new instance of this R6 class.
Usage
ArchiveFSelect$new(
  search_space,
  codomain,
  check_values = TRUE,
  ties_method = "least_features"
)
Arguments
search_space(paradox::ParamSet)
Search space. Internally created from the provided mlr3::Task by the instance.
codomain(bbotk::Codomain)
Specifies the codomain of the objective function, i.e. a set of performance measures. Internally created from the provided mlr3::Measures by the instance.
check_values(logical(1))
If TRUE (default), hyperparameter configurations are checked for validity.
ties_method(character(1))
The method to break ties when selecting sets while optimizing and when selecting the best set. Can be "least_features" or "random". The option "least_features" (default) selects the feature set with the fewest features. If there are multiple best feature sets with the same number of features, one is selected randomly. The "random" method returns a random feature set from the best feature sets. Ignored if multiple measures are used.
Method learner()
Retrieve mlr3::Learner of the i-th evaluation, by position or by unique hash uhash.
i and uhash are mutually exclusive.
The returned learner does not contain a model. Use $learners() to get learners with models.
Method learners()
Retrieve list of trained mlr3::Learner objects of the i-th evaluation,
by position or by unique hash uhash. i and uhash are mutually
exclusive.
Method predictions()
Retrieve list of mlr3::Prediction objects of the i-th evaluation, by
position or by unique hash uhash. i and uhash are mutually
exclusive.
Method resample_result()
Retrieve mlr3::ResampleResult of the i-th evaluation, by position
or by unique hash uhash. i and uhash are mutually exclusive.
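The getters above can be sketched as follows, once by position and once by hash. The setup is an assumption made for illustration (task, learner, budget). `store_models = TRUE` is passed to `fselect()` so that the learners returned by `$learners()` carry fitted models.

```r
library(mlr3)
library(mlr3fselect)

set.seed(1)

# Illustrative setup (assumption): keep the models so $learners() returns fitted learners
instance = fselect(
  fs("random_search", batch_size = 2),
  task = tsk("pima"),
  learner = lrn("classif.rpart"),
  resampling = rsmp("cv", folds = 3),
  measures = msr("classif.ce"),
  term_evals = 4,
  store_models = TRUE
)
archive = instance$archive

# Access by position
learners = archive$learners(i = 1)        # one trained learner per resampling fold
predictions = archive$predictions(i = 1)  # one prediction object per resampling fold

# Access by unique hash; i and uhash must not be given together
first_hash = archive$data$uhash[1]
rr = archive$resample_result(uhash = first_hash)
```

With 3-fold cross-validation, `learners` and `predictions` each contain three elements, and `rr$uhash` matches the hash that was passed in.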
Method best()
Returns the best scoring feature sets.
Arguments
batch(integer())
The batch number(s) to limit the best results to. Default is all batches.
ties_method(character(1))
Method to handle ties. If NULL (default), the global ties method set during initialization is used. The default global ties method is "least_features", which selects the feature set with the fewest features. If there are multiple best feature sets with the same number of features, one is selected randomly. The "random" method returns a random feature set from the best feature sets.
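A minimal sketch of how $best() might be called, with and without restricting batches or overriding the ties method. The setup (task, learner, budget) is an illustrative assumption, not part of this page.

```r
library(mlr3)
library(mlr3fselect)

set.seed(1)

# Illustrative setup (assumption): two batches of two feature sets each
instance = fselect(
  fs("random_search", batch_size = 2),
  task = tsk("pima"),
  learner = lrn("classif.rpart"),
  resampling = rsmp("cv", folds = 3),
  measures = msr("classif.ce"),
  term_evals = 4
)
archive = instance$archive

# Best feature set over all batches, using the global ties method
best_all = archive$best()

# Best feature set within the first batch only
best_first = archive$best(batch = 1)

# Override the ties method for this call
best_random = archive$best(ties_method = "random")
```

With a single measure, each call returns one row of the archive table. For `best_all`, that row has the lowest classification error among all evaluated feature sets.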
