Ensemble Partial Least Squares for Measuring Feature Importance

Measuring feature importance with ensemble partial least squares.

Usage

enpls.fs(
  x,
  y,
  maxcomp = NULL,
  cvfolds = 5L,
  reptimes = 500L,
  method = c("mc", "boot"),
  ratio = 0.8,
  parallel = 1L
)

Arguments

x: Predictor matrix.
y: Response vector.
maxcomp: Maximum number of components to include in each model. If not specified (NULL), the maximum possible number is used, accounting for cross-validation and for special cases where n is smaller than p.
cvfolds: Number of cross-validation folds used in each model for automatic parameter selection. Default is 5.
reptimes: Number of models to build with Monte-Carlo resampling or bootstrapping. Default is 500.
method: Resampling method. "mc" (Monte-Carlo resampling) or "boot" (bootstrapping). Default is "mc".
ratio: Sampling ratio used when method = "mc". Default is 0.8.
parallel: Integer. Number of CPU cores to use. Default is 1 (not parallelized).
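
The difference between the two resampling methods can be illustrated with a minimal base R sketch. The helper resample_idx() below is hypothetical and not part of enpls. The exact way the package draws its samples (for example, how n * ratio is rounded) is an assumption here.

```r
# Hypothetical helper, not part of enpls: draw the row indices used to
# fit one model in the ensemble.
resample_idx <- function(n, method = c("mc", "boot"), ratio = 0.8) {
  method <- match.arg(method)
  if (method == "mc") {
    # Monte-Carlo resampling: a random subset of about n * ratio rows,
    # drawn without replacement (rounding rule is an assumption)
    sample(n, round(n * ratio))
  } else {
    # Bootstrapping: n rows drawn with replacement, so repeats can occur
    sample(n, n, replace = TRUE)
  }
}

set.seed(42)
idx_mc <- resample_idx(100, "mc", ratio = 0.8)
idx_boot <- resample_idx(100, "boot")

length(idx_mc)         # size of the Monte-Carlo subset
anyDuplicated(idx_mc)  # 0 means no row was drawn twice
length(idx_boot)       # a bootstrap sample keeps the original size
```

Under this sketch, ratio only affects method = "mc", which matches the description of the ratio argument above.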

Value

A list containing two components:

variable.importance - a vector of variable importance
coefficient.matrix - original coefficient matrix
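
To give an intuition for how an importance score could be derived from a matrix of coefficients, the sketch below uses a simulated matrix with one row per resampled model and one column per predictor. Both the matrix layout and the score (absolute mean coefficient divided by its standard deviation across models) are assumptions made for illustration. They are not confirmed by this page as the formula enpls.fs uses.

```r
set.seed(1)

# Simulated stand-in for a coefficient matrix: 50 models (rows) by
# 3 predictors (columns). All values are made up for illustration.
coef_mat <- cbind(
  stable    = rnorm(50, mean = 2, sd = 0.2),  # large, consistent effect
  noisy     = rnorm(50, mean = 2, sd = 2),    # similar mean, unstable
  no_effect = rnorm(50, mean = 0, sd = 0.2)   # centered on zero
)

# Assumed score: |mean coefficient| / sd of the coefficient across models.
# A predictor ranks high when its effect is large and stable.
importance <- abs(colMeans(coef_mat)) / apply(coef_mat, 2, sd)

sort(importance, decreasing = TRUE)
```

With a score of this kind, a predictor whose coefficient varies widely between resampled models is ranked lower than one with the same average coefficient but less variation.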

Author

Nan Xiao <https://nanx.me>

Examples

data("alkanes")
x <- alkanes$x
y <- alkanes$y

set.seed(42)
fs <- enpls.fs(x, y, reptimes = 50)
print(fs)
#> Variable Importance by Ensemble Partial Least Squares
#> ---
#>          Importance
#> Chi.C.3   2.4683701
#> MEDV.23   2.3787962
#> MEDV.33   2.2815314
#> Chi.P.4   2.0315902
#> Chi.P.5   1.6766926
#> MEDV.13   1.4392556
#> MEDV.22   1.4159863
#> Chi.P.3   1.2510102
#> Estate.1  1.2467426
#> MEDV.12   1.2000973
#> MEDV.11   1.0926947
#> Chi.P.2   1.0155413
#> Chi.PC.4  0.8968577
#> Kappa.1   0.8588444
#> Kappa.3   0.8552669
#> Estate.2  0.6537099
#> Chi.P.1   0.6405966
#> Kappa.2   0.6140915
#> Kappa.4   0.5323162
#> Chi.P.0   0.4964470
#> Estate.3  0.4209405
plot(fs)
