Recursive feature elimination iteratively removes features with a low importance score.

The learner is trained on all features at the start and importance scores are calculated for each feature (see section on optional extractors in Learner). Then the least important feature is removed and the learner is trained on the reduced feature set. The importance scores are calculated again and the procedure is repeated until the desired number of features is reached. The non-recursive option (recursive = FALSE) only uses the importance scores calculated in the first iteration.

The feature selection terminates by itself once n_features is reached, so it is not necessary to set a termination criterion.
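The elimination loop described above can be sketched in a few lines of base R. This is a minimal illustration under stated assumptions, not the implementation used by FSelectorRFE: the helper names `importance_scores()` and `rfe_sketch()` are hypothetical, the importance score is simply the absolute coefficient of a linear model on standardized features (a stand-in for whatever importance a learner provides), and the built-in `mtcars` data set is used only for demonstration.

```r
# Hypothetical helper (not part of mlr3fselect): importance is taken to be
# the absolute coefficient of a linear model fitted on standardized features.
importance_scores = function(data, target, features) {
  x = scale(as.matrix(data[, features, drop = FALSE]))
  fit = lm(data[[target]] ~ x)
  scores = abs(coef(fit)[-1]) # drop the intercept
  names(scores) = features
  scores
}

# Hypothetical sketch of the procedure: remove the least important feature
# until only n_features remain.
rfe_sketch = function(data, target, n_features, recursive = TRUE) {
  features = setdiff(names(data), target)
  # first iteration: train on all features and compute importance scores
  scores = importance_scores(data, target, features)
  while (length(features) > n_features) {
    # remove the least important of the remaining features
    worst = names(which.min(scores[features]))
    features = setdiff(features, worst)
    # recursive = TRUE: retrain and recompute the scores on the reduced set;
    # recursive = FALSE: keep using the scores from the first iteration
    if (recursive) {
      scores = importance_scores(data, target, features)
    }
  }
  features
}

selected = rfe_sketch(mtcars, target = "mpg", n_features = 3)
print(selected)
```

The loop stops as soon as the number of remaining features equals n_features, which is why no separate termination criterion is needed in this sketch.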

## Dictionary

This FSelector can be instantiated via the dictionary mlr_fselectors or with the associated sugar function fs():

mlr_fselectors$get("rfe")
fs("rfe")

FSelectorRFE$clone(deep = FALSE)

#### Arguments

deep
Whether to make a deep clone.

## Examples

# retrieve task
task = tsk("pima")

# load learner
learner = lrn("classif.rpart")

# feature selection on the pima indians diabetes data set
instance = fselect(
  method = "rfe",
  task = task,
  learner = learner,
  resampling = rsmp("holdout"),
  measure = msr("classif.ce"),
  store_models = TRUE
)

# best performing feature subset
instance$result
#>     age glucose insulin mass pedigree pregnant pressure triceps
#> 1: TRUE    TRUE    TRUE TRUE     TRUE     TRUE     TRUE    TRUE
#>                                          features classif.ce
#> 1: age,glucose,insulin,mass,pedigree,pregnant,...  0.2617188

# all evaluated feature subsets
as.data.table(instance$archive)
#>     age glucose insulin mass pedigree pregnant pressure triceps classif.ce
#> 1: TRUE    TRUE    TRUE TRUE     TRUE     TRUE     TRUE    TRUE  0.2617188
#> 2: TRUE    TRUE   FALSE TRUE     TRUE    FALSE    FALSE   FALSE  0.2617188
#>    runtime_learners           timestamp batch_nr
#> 1:            0.103 2022-08-25 10:41:01        1
#> 2:            0.091 2022-08-25 10:41:02        2
#>                                                         importance
#> 1: 57.430135,14.047061,12.133200, 8.675049, 7.561779, 3.360167,...
#> 2:                             57.57762,14.89596,14.48716,11.46111
#>         resample_result
#> 1: <ResampleResult[21]>
#> 2: <ResampleResult[21]>

# subset the task and fit the final model
task$select(instance$result_feature_set)
learner$train(task)
