Outlier detection with ensemble partial least squares.
Usage
enpls.od(
  x,
  y,
  maxcomp = NULL,
  cvfolds = 5L,
  reptimes = 500L,
  method = c("mc", "boot"),
  ratio = 0.8,
  parallel = 1L
)
Arguments
- x
Predictor matrix.
- y
Response vector.
- maxcomp
Maximum number of components included within each model. If not specified, will use the maximum number possible (considering cross-validation and special cases where n is smaller than p).
- cvfolds
Number of cross-validation folds used in each model for automatic parameter selection. Default is 5.
- reptimes
Number of models to build with Monte-Carlo resampling or bootstrapping.
- method
Resampling method: "mc" (Monte-Carlo resampling) or "boot" (bootstrapping). Default is "mc".
- ratio
Sampling ratio used when method = "mc".
- parallel
Integer. Number of CPU cores to use. Default is 1 (not parallelized).
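The difference between the two values of method can be illustrated with a minimal base R sketch. This is not the package's internal code: the helper resample_idx() is hypothetical, and it assumes that the observations not drawn for training form the test set of each model.

```r
# Hypothetical helper, not part of enpls: draw train/test indices for one model.
resample_idx <- function(n, method = c("mc", "boot"), ratio = 0.8) {
  method <- match.arg(method)
  if (method == "mc") {
    # Monte-Carlo resampling: a fraction `ratio` of the samples,
    # drawn without replacement, is used for training
    train <- sample.int(n, size = round(n * ratio))
  } else {
    # Bootstrapping: n samples drawn with replacement are used for training
    train <- sample.int(n, size = n, replace = TRUE)
  }
  # Assumption: everything not drawn for training is the test set
  test <- setdiff(seq_len(n), train)
  list(train = train, test = test)
}

set.seed(1)
idx_mc   <- resample_idx(207, "mc", ratio = 0.8)
idx_boot <- resample_idx(207, "boot")

length(idx_mc$train)   # size of the Monte-Carlo training set
length(idx_mc$test)    # size of the Monte-Carlo test set
length(idx_boot$test)  # number of out-of-bag samples, varies with the seed
```

Under this reading, ratio controls only the Monte-Carlo split, which is consistent with the argument being documented as relevant when method = "mc".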
Value
A list containing four components:
- error.mean
Error mean for all samples (absolute value).
- error.median
Error median for all samples.
- error.sd
Error standard deviation (SD) for all samples.
- predict.error.matrix
The original prediction error matrix.
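How the three summaries relate to the prediction error matrix can be sketched on a toy example. The layout used here is an assumption, not something stated on this page: one row per model, one column per sample, and NA wherever a sample was not in that model's test set. The variable names are illustrative only.

```r
# Toy stand-in for predict.error.matrix: 4 models (rows) x 3 samples (columns).
# NA marks models in which a sample was not in the test set (assumed layout).
errmat <- matrix(
  c( 1, NA, -2,
     3,  0, NA,
    NA,  2, -4,
     2,  4, -6),
  nrow = 4, byrow = TRUE
)

# Per-sample summaries over the models in which the sample was predicted
error_mean   <- abs(colMeans(errmat, na.rm = TRUE))  # absolute value of the mean
error_median <- apply(errmat, 2, median, na.rm = TRUE)
error_sd     <- apply(errmat, 2, sd, na.rm = TRUE)

error_mean
error_median
error_sd
```

In this sketch a sample with a large mean error is predicted poorly on average, while a large SD means its prediction error changes a lot from one resampled model to the next.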
Note
To maximize the probability that each observation is selected into
the test set at least once (so that its prediction uncertainty
can be measured), set reptimes to a large value.
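A rough calculation shows why reptimes matters. The sketch below assumes method = "mc" and that each model holds out a fraction 1 - ratio of the observations uniformly at random and independently across models; both helper functions are hypothetical and the figures are approximations, not guarantees from the package.

```r
# Approximate probability that a given observation is never in the test set
# across `reptimes` independent Monte-Carlo splits (assumed hold-out
# fraction per split: 1 - ratio).
p_never_tested <- function(reptimes, ratio = 0.8) ratio^reptimes

# Expected number of models in which a given observation is in the test set
expected_tests <- function(reptimes, ratio = 0.8) reptimes * (1 - ratio)

p_never_tested(10)    # noticeable chance of never being tested
p_never_tested(50)    # already very small
p_never_tested(500)   # negligible

expected_tests(50)    # test-set predictions per sample with reptimes = 50
expected_tests(500)   # test-set predictions per sample with the default
```

Beyond covering every observation, a larger reptimes also gives more test-set predictions per sample, so the per-sample mean and SD of the error are estimated from more values.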
Author
Nan Xiao <https://nanx.me>
Examples
data("alkanes")
x <- alkanes$x
y <- alkanes$y
set.seed(42)
od <- enpls.od(x, y, reptimes = 50)
print(od)
#> Outlier Detection by Ensemble Partial Least Squares
#> ---
#> Mean residual for each sample:
#> [1] 2.75921773 11.36519417 2.86997847 3.72358683 2.58864423 0.69950317
#> [7] 2.47046080 0.04090356 0.72374081 0.67761648 0.60657038 0.05203462
#> [13] 4.81214619 0.48047678 0.31426341 4.45439789 1.81680825 0.00486926
#> [19] 1.40762012 2.76733128 0.25816473 0.11014460 2.76375525 0.24664065
#> [25] 0.18795164 2.89461883 0.60293011 0.54016143 3.69778413 0.84421368
#> [31] 2.40482744 0.08033139 0.65464173 0.28846939 0.96347563 1.16604993
#> [37] 2.42467651 1.80624634 2.08209599 0.64821359 0.06219722 2.36510870
#> [43] 1.08902416 2.09141868 2.06910689 1.38871463 1.16310209 2.07153015
#> [49] 2.41222206 1.99819780 1.02797377 1.09358743 6.44005724 2.66972544
#> [55] 1.18031132 0.96520499 2.91887428 3.88938152 0.93689004 0.04277856
#> [61] 0.09383582 3.13664578 17.11229300 6.53632442 0.44617481 2.72369291
#> [67] 2.83683060 0.41927300 1.57410904 1.62112809 0.11859181 1.41609908
#> [73] 0.66340508 1.53271330 0.20100807 3.47461828 1.36388771 1.47275628
#> [79] 2.03870349 1.42598923 0.21947290 0.25046323 6.82282300 0.91802223
#> [85] 2.63298282 1.68299675 1.67646979 1.56930079 0.77600240 0.57822702
#> [91] 1.03364347 4.42455671 1.06359453 3.14333266 7.53801734 1.87522128
#> [97] 3.14718740 1.28679073 1.55473026 0.81210377 1.62480053 1.55781168
#> [103] 1.07627481 0.48852462 0.52741604 1.70822777 0.11695013 0.32088398
#> [109] 3.10218596 2.17914452 2.19815151 3.41522772 3.09238815 2.52872181
#> [115] 0.69580966 0.01307271 3.61323553 0.50675263 2.09524681 4.06314768
#> [121] 1.50053126 3.56251067 0.02603732 2.52175184 0.46881633 1.17158669
#> [127] 0.35500780 0.14109849 0.02583630 4.39330939 5.83211202 3.93926189
#> [133] 1.70661275 0.73174502 3.67211046 0.15568680 1.01687679 4.73752709
#> [139] 4.75850118 6.28985938 3.00910912 0.55664806 2.53554817 3.77837005
#> [145] 0.81599912 1.00770488 4.21709646 3.29187061 0.02806380 3.95040415
#> [151] 2.81500647 2.24470051 0.16533690 1.69112925 11.12871713 2.64622693
#> [157] 2.36619343 0.80239235 3.13253142 3.73983802 1.05180495 0.75120889
#> [163] 5.55274842 0.93965402 5.44634050 0.91052763 12.86047051 8.16501444
#> [169] 5.28568842 1.94774611 1.42523305 3.51056559 3.55810920 9.05314678
#> [175] 5.73253024 3.79497529 5.01317052 0.64924425 0.77081874 2.24917960
#> [181] 2.97614143 1.73162082 4.05394871 0.78070127 4.54159467 1.20684823
#> [187] 3.03099951 2.65903955 1.45986082 0.83727406 4.74072104 0.75646839
#> [193] 4.69534977 1.36865729 3.20102779 0.18763266 10.61033130 4.46728417
#> [199] 8.16220823 3.61534939 4.17841145 2.49200986 1.05991605 5.30421423
#> [205] 2.08243043 2.47518723 5.36082689
#> ---
#> Residual SD for each sample:
#> [1] 2.77082522 3.18518150 0.74165858 2.23594825 1.23169974 0.42687633
#> [7] 1.50625099 0.79975392 1.41280667 0.46584217 1.17084856 0.93096232
#> [13] 0.88984994 0.75917079 0.57559100 0.64465715 0.47396718 0.56017441
#> [19] 0.68267751 0.65140968 0.41747884 0.56358621 0.67966904 0.43427716
#> [25] 0.37536436 0.64008708 0.44791500 0.59112318 0.88170164 0.41258090
#> [31] 0.63087919 0.43212128 0.50448177 0.26486222 0.71583384 0.42491843
#> [37] 2.11640004 0.40073941 0.88944036 0.32170138 0.48828519 0.41031319
#> [43] 0.50217099 1.59534185 0.33617303 0.50260003 0.94105157 0.63972291
#> [49] 0.50587685 0.37551763 0.41497134 0.42771863 0.46207727 1.40571227
#> [55] 0.51642660 0.42393378 0.43527285 0.08752688 0.45192162 0.51780544
#> [61] 0.61052739 0.43776137 0.31398455 0.32995059 2.00011552 0.87425403
#> [67] 0.31162653 0.63105402 0.24649488 0.31711494 0.49936385 0.59978816
#> [73] 0.59653589 0.50852740 2.53110721 0.41851982 1.10184218 0.21710200
#> [79] 0.43937877 0.35242335 0.33627934 0.34095574 0.33341472 0.69920856
#> [85] 0.48311083 0.23781167 0.32765431 0.70053129 0.40945258 0.08585720
#> [91] 0.24244975 0.36514882 0.79025058 0.44402145 0.46959851 0.47803736
#> [97] 0.66037588 0.69332640 0.29156760 0.42604792 0.47781693 0.32571532
#> [103] 0.31530154 0.43322187 0.64879644 0.67484355 0.40328109 0.51379190
#> [109] 0.25584801 0.78167536 0.46379337 1.02646245 0.48038341 0.42723380
#> [115] 0.71363249 0.69063925 0.59453418 0.65063662 0.86515782 1.17342811
#> [121] 0.78617936 0.47367019 0.24439560 0.70316651 0.73449508 1.79527777
#> [127] 0.44260783 0.33808668 0.81899089 0.32662542 0.73804730 2.75960043
#> [133] 0.64233354 0.47841244 0.75136350 0.32979985 0.94649800 0.24178213
#> [139] 1.11533236 2.49301466 0.35001294 0.80453705 0.35011625 0.83446145
#> [145] 0.58759196 0.40320473 0.29256353 0.77961214 0.39795562 1.01770854
#> [151] 0.45468083 0.62303222 0.40692023 0.44973569 1.25950083 0.53901537
#> [157] 0.34171577 1.08900598 0.71282202 1.29978898 0.82451430 0.44599800
#> [163] 0.77330806 0.51100098 0.51146626 0.53756880 1.01862539 0.84995332
#> [169] 1.18180361 3.93102487 1.04156903 1.13106025 0.52150688 1.24405827
#> [175] 0.67447926 1.17640208 0.64074369 0.28467185 0.33626125 1.94356528
#> [181] 0.62001275 0.44932588 0.16634723 0.40215221 0.40529902 0.45155131
#> [187] 0.23942115 1.16054612 0.39605799 0.47297038 0.33345301 0.27533596
#> [193] 0.36658302 0.80698643 0.73877079 0.55089330 1.40387322 0.73798595
#> [199] 0.38540824 0.26297673 0.58085243 0.64409232 0.52326994 0.87634496
#> [205] 0.53404341 0.62252334 0.45961173
plot(od)
plot(od, criterion = "sd")