Measuring feature importance with ensemble partial least squares.
Usage
enpls.fs(
  x,
  y,
  maxcomp = NULL,
  cvfolds = 5L,
  reptimes = 500L,
  method = c("mc", "boot"),
  ratio = 0.8,
  parallel = 1L
)
Arguments
- x
Predictor matrix.
- y
Response vector.
- maxcomp
Maximum number of components included within each model. If not specified, will use the maximum number possible (considering cross-validation and special cases where n is smaller than p).
- cvfolds
Number of cross-validation folds used in each model for automatic parameter selection. Default is 5.
- reptimes
Number of models to build with Monte-Carlo resampling or bootstrapping.
- method
Resampling method: "mc" (Monte-Carlo resampling) or "boot" (bootstrapping). Default is "mc".
- ratio
Sampling ratio used when method = "mc".
- parallel
Integer. Number of CPU cores to use. Default is 1 (not parallelized).
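The difference between the two resampling methods can be sketched in base R. This is an illustration only, not the package's internal code: the helper name sample_indices is hypothetical, and the sketch assumes that "mc" draws a fraction ratio of the rows without replacement while "boot" draws as many rows as the data has, with replacement.

```r
# Hypothetical helper (not part of enpls): draw the training-row indices
# for each of `reptimes` models under the two resampling schemes.
sample_indices <- function(n, reptimes, method = c("mc", "boot"), ratio = 0.8) {
  method <- match.arg(method)
  lapply(seq_len(reptimes), function(i) {
    if (method == "mc") {
      # Monte-Carlo (assumed): a random subset of round(n * ratio) rows, no repeats
      sample.int(n, size = round(n * ratio), replace = FALSE)
    } else {
      # Bootstrap (assumed): n rows drawn with replacement, repeats are expected
      sample.int(n, size = n, replace = TRUE)
    }
  })
}

set.seed(42)
idx_mc <- sample_indices(n = 100, reptimes = 3, method = "mc", ratio = 0.8)
idx_boot <- sample_indices(n = 100, reptimes = 3, method = "boot")
lengths(idx_mc)    # 80 rows per model
lengths(idx_boot)  # 100 rows per model
```

Under these assumptions, ratio only affects the "mc" branch, which matches the argument description above.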
Value
A list containing two components:
variable.importance - a vector of variable importance
coefficient.matrix - original coefficient matrix
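One plausible way the two components relate is sketched below. It rests on assumptions this page does not confirm: that each row of the coefficient matrix holds the coefficients of one resampled model, and that importance is the absolute mean coefficient divided by its standard deviation across models. The matrix here is simulated, and the column names stable and noisy are made up for illustration.

```r
# Simulated stand-in for a coefficient matrix: one row per resampled model,
# one column per predictor (values are invented for illustration).
set.seed(1)
coefmat <- cbind(
  stable = rnorm(50, mean = 2, sd = 0.1),  # consistently large coefficient
  noisy  = rnorm(50, mean = 0, sd = 1)     # coefficient fluctuating around zero
)

# Assumed importance measure: |mean coefficient| / sd across models
importance <- abs(colMeans(coefmat)) / apply(coefmat, 2, sd)
sort(importance, decreasing = TRUE)
```

With a measure of this kind, a predictor whose coefficient is large and consistent across the resampled models ranks above one whose coefficient changes sign from model to model.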
Author
Nan Xiao <https://nanx.me>
Examples
data("alkanes")
x <- alkanes$x
y <- alkanes$y
set.seed(42)
fs <- enpls.fs(x, y, reptimes = 50)
print(fs)
#> Variable Importance by Ensemble Partial Least Squares
#> ---
#> Importance
#> MEDV.23 2.3438355
#> MEDV.33 2.1624571
#> Chi.P.4 2.1475160
#> Chi.C.3 2.0521822
#> Chi.P.5 1.4142498
#> Estate.1 1.2850053
#> MEDV.22 1.2822210
#> Chi.P.3 1.0533900
#> MEDV.12 1.0532281
#> MEDV.11 0.9379795
#> MEDV.13 0.8904436
#> Chi.PC.4 0.7798934
#> Estate.2 0.7473758
#> Chi.P.2 0.7221879
#> Kappa.3 0.7076264
#> Kappa.1 0.7043556
#> Kappa.2 0.4590357
#> Chi.P.1 0.4193437
#> Estate.3 0.3400528
#> Chi.P.0 0.2904166
#> Kappa.4 0.2455370
plot(fs)