Outlier detection with ensemble sparse partial least squares.
Arguments
- x
Predictor matrix.
- y
Response vector.
- maxcomp
Maximum number of components included within each model. If not specified, 5 is used by default.
- cvfolds
Number of cross-validation folds used in each model for automatic parameter selection. Default is 5.
- alpha
Parameter (grid) controlling sparsity of the model. If not specified, default is seq(0.2, 0.8, 0.2).
- reptimes
Number of models to build with Monte-Carlo resampling or bootstrapping.
- method
Resampling method: "mc" (Monte-Carlo resampling) or "boot" (bootstrapping). Default is "mc".
- ratio
Sampling ratio used when method = "mc".
- parallel
Integer. Number of CPU cores to use. Default is 1 (not parallelized).
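The arguments above can be put together in a call along the following lines. This is a minimal sketch, not an official example: the data are simulated, the two shifted responses are planted purely for illustration, and the values chosen for alpha, reptimes, and ratio are arbitrary assumptions rather than recommended settings. The call is guarded so that the block still runs when the enpls package is not installed.

```r
# Simulated data (hypothetical): 60 samples, 20 predictors,
# of which only the first three influence the response.
set.seed(42)
n <- 60
p <- 20
x <- matrix(rnorm(n * p), nrow = n, ncol = p)
y <- as.vector(x[, 1:3] %*% c(2, -1, 0.5) + rnorm(n, sd = 0.3))

# Plant two artificial outliers by shifting their responses.
y[c(5, 17)] <- y[c(5, 17)] + 8

# Run the outlier detection only if enpls is available.
if (requireNamespace("enpls", quietly = TRUE)) {
  od <- enpls::enspls.od(
    x, y,
    maxcomp  = 3,
    cvfolds  = 5,
    alpha    = c(0.3, 0.6, 0.9),  # illustrative sparsity grid
    reptimes = 20,                # kept small for speed; see the Note below
    method   = "mc",
    ratio    = 0.8,               # illustrative value, only used for "mc"
    parallel = 1
  )
  # Inspect the top-level structure of the returned list.
  str(od, max.level = 1)
}
```

A small reptimes keeps the sketch fast, but it also leaves some samples with few or no test-set predictions, so a real analysis would normally use a much larger value.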
Value
A list containing four components:
error.mean - mean prediction error of each sample (absolute value)
error.median - median prediction error of each sample
error.sd - standard deviation of the prediction error of each sample
predict.error.matrix - the original prediction error matrix
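To make the relation between the four components concrete, the sketch below derives the three summaries from a toy error matrix in base R. The layout used here (one row per resampling run, one column per sample, NA where a sample was not in the test set for that run) is an assumption made for illustration and may differ from how the package stores predict.error.matrix. The numbers are made up.

```r
# Toy prediction error matrix (hypothetical values):
# 4 resampling runs (rows) x 3 samples (columns).
# NA means the sample was not in the test set for that run.
err <- matrix(
  c( 0.1,  NA, 2.0,
    -0.3, 0.2,  NA,
      NA, 0.4, 2.4,
     0.5,  NA, 2.2),
  nrow = 4, byrow = TRUE
)

# Per-sample summaries, ignoring runs where the sample was not tested.
error.mean   <- abs(colMeans(err, na.rm = TRUE))
error.median <- apply(err, 2, median, na.rm = TRUE)
error.sd     <- apply(err, 2, sd, na.rm = TRUE)

# Collect the summaries in one table, one row per sample.
data.frame(sample = seq_len(ncol(err)), error.mean, error.median, error.sd)
```

In this toy case the third sample has a consistently large error across the runs in which it was tested, which is the kind of pattern that would mark it as a candidate outlier.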
Note
To maximize the probability that each observation is selected
in the test set at least once (so that its prediction uncertainty
can be measured), set reptimes to a large value.
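A rough calculation shows why reptimes matters. Assuming method = "mc", that each run keeps a fraction ratio of the samples for training and tests on the rest, and that runs are drawn independently, the chance that a given observation never lands in a test set is approximately ratio^reptimes. The ratio of 0.8 and the sample size of 100 below are illustrative assumptions, not values taken from the package.

```r
ratio    <- 0.8                 # assumed training fraction per run
reptimes <- c(5, 10, 50, 100)   # candidate numbers of resampling runs
n        <- 100                 # hypothetical number of observations

# Probability that one observation is in the training set in every run,
# so that it never receives a test-set prediction.
p_never_tested <- ratio^reptimes

# Expected number of observations without any test-set prediction.
expected_untested <- n * p_never_tested

data.frame(reptimes, p_never_tested, expected_untested)
```

Under these assumptions the probability shrinks geometrically with reptimes, so a few dozen runs already make it unlikely that any observation is left untested, while a handful of runs does not.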
See also
See enspls.fs for measuring feature importance with ensemble sparse partial least squares regressions.
See enspls.fit for fitting ensemble sparse partial least squares regression models.
Author
Nan Xiao <https://nanx.me>