Ensemble Sparse Partial Least Squares for Model Applicability Domain Evaluation
Source: R/enspls.ad.R
enspls.ad.Rd
Model applicability domain evaluation with ensemble sparse partial least squares.
Arguments
- x
Predictor matrix of the training set.
- y
Response vector of the training set.
- xtest
List, with the i-th component being the i-th test set's predictor matrix (see example code below).
- ytest
List, with the i-th component being the i-th test set's response vector (see example code below).
- maxcomp
Maximum number of components included within each model. If not specified, will use 5 by default.
- cvfolds
Number of cross-validation folds used in each model for automatic parameter selection. Default is 5.
- alpha
Parameter (grid) controlling sparsity of the model. If not specified, default is seq(0.2, 0.8, 0.2).
- space
Space in which to apply the resampling method. Can be the sample space ("sample") or the variable space ("variable").
- method
Resampling method. "mc" (Monte-Carlo resampling) or "boot" (bootstrapping). Default is "mc". See the resampling sketch after this list.
- reptimes
Number of models to build with Monte-Carlo resampling or bootstrapping.
- ratio
Sampling ratio used when method = "mc".
- parallel
Integer. Number of CPU cores to use. Default is 1 (not parallelized).
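The two resampling methods differ in how rows are drawn for each of the reptimes models. A minimal sketch of that idea in plain base R (not code from the package; n and ratio are illustrative values):
n <- 300
ratio <- 0.8
# Monte-Carlo resampling ("mc"): draw a fraction `ratio` of the rows
# without replacement for each model
idx_mc <- sample(n, size = floor(n * ratio), replace = FALSE)
# Bootstrapping ("boot"): draw n rows with replacement instead;
# `ratio` is not used in this case
idx_boot <- sample(n, size = n, replace = TRUE)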
Value
A list containing:
tr.error.mean - absolute mean prediction error for training set
tr.error.median - absolute median prediction error for training set
tr.error.sd - prediction error SD for training set
tr.error.matrix - raw prediction error matrix for training set
te.error.mean - list of absolute mean prediction errors for test set(s)
te.error.median - list of absolute median prediction errors for test set(s)
te.error.sd - list of prediction error SDs for test set(s)
te.error.matrix - list of raw prediction error matrices for test set(s)
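A quick sketch of inspecting the returned list (component names as documented above), assuming ad is the fitted object from the Examples section below:
names(ad)                  # component names listed above
length(ad$tr.error.mean)   # one value per training set sample
str(ad$te.error.mean)      # one numeric vector per test set
str(ad$tr.error.matrix)    # raw per-model prediction errors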
Note
Note that for space = "variable", method can only be "mc", since bootstrapping in the variable space would create duplicated variables, which could cause problems.
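Bootstrapping is therefore only available in the sample space. A minimal sketch of such a call (reusing x.tr, y.tr, x.te, and y.te as constructed in the Examples section; parameter values are illustrative):
ad_boot <- enspls.ad(
  x.tr, y.tr, x.te, y.te,
  maxcomp = 3,
  space = "sample", method = "boot",
  reptimes = 10
)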
Author
Nan Xiao <https://nanx.me>
Examples
data("logd1k")
# remove low variance variables
x <- logd1k$x[, -c(17, 52, 59)]
y <- logd1k$y
# training set
x.tr <- x[1:300, ]
y.tr <- y[1:300]
# two test sets
x.te <- list(
"test.1" = x[301:400, ],
"test.2" = x[401:500, ]
)
y.te <- list(
"test.1" = y[301:400],
"test.2" = y[401:500]
)
set.seed(42)
ad <- enspls.ad(
x.tr, y.tr, x.te, y.te,
maxcomp = 3, alpha = c(0.3, 0.6, 0.9),
space = "variable", method = "mc",
ratio = 0.8, reptimes = 10
)
print(ad)
#> Model Applicability Domain Evaluation by ENSPLS
#> ---
#> Absolute mean prediction error for each training set sample:
#> [1] 0.422632298 0.815523615 0.648213606 0.870831328 0.257203074 0.490832846
#> [7] 0.701719681 0.429048791 0.749324907 0.638404365 0.843665524 0.687892385
#> [13] 1.017169872 0.763573955 0.759053761 0.871500956 0.236640925 0.088100279
#> [19] 0.472965624 0.634421708 0.735632351 0.091059839 1.594026249 1.189337666
#> [25] 0.340769966 0.604866100 1.303871973 0.961142660 0.092884283 0.100520976
#> [31] 0.059779983 0.412777313 0.135195202 2.478026712 0.397403818 0.005011676
#> [37] 0.877181685 0.145017597 0.023745363 0.280530155 0.058398364 0.568238661
#> [43] 0.251556521 0.494641002 0.163135341 0.263345408 0.331201449 0.257333694
#> [49] 0.210855050 0.221268755 0.057240160 0.384076723 1.179395348 0.390537444
#> [55] 0.726689121 0.528659319 0.212638767 0.161500350 0.362637352 0.256825754
#> [61] 0.732640202 2.477624629 0.191345044 0.378929068 1.560290268 1.490810969
#> [67] 0.296412512 1.550013282 0.055909397 0.480447843 0.022057600 0.205219974
#> [73] 0.533021269 0.061863659 0.325187195 0.792693950 0.171503502 1.268409310
#> [79] 0.095289006 0.145745742 0.258202295 0.186118843 0.072360285 0.001806612
#> [85] 0.053576611 0.542751777 0.122393517 0.701052137 0.336353318 0.176058594
#> [91] 0.684119003 1.427365586 0.699280144 0.406684185 0.611234923 0.156924933
#> [97] 0.562503385 1.119304806 0.471931939 0.291409291 1.002496298 0.915919886
#> [103] 1.371940567 0.159471334 0.458710686 0.352588032 0.662182370 0.092416830
#> [109] 0.381089817 0.777267167 0.723549214 0.086011861 0.504880794 0.442446365
#> [115] 2.068971171 0.677454033 0.275265965 0.146845629 0.145894573 0.380305366
#> [121] 0.175955943 0.495523895 0.095822143 0.108049721 0.421334995 0.406392115
#> [127] 0.427064993 0.475260473 0.390757565 0.386203722 0.389891939 0.638576864
#> [133] 0.514533894 0.702536133 1.016492859 0.036428252 0.068006633 0.158541714
#> [139] 0.358508958 0.126570666 0.383104094 0.316770984 0.315860883 0.367288327
#> [145] 0.005705279 0.511450671 0.776060683 0.717761225 1.248239275 0.315547146
#> [151] 0.678562080 0.436041833 0.302940412 0.333968661 0.335712517 0.490020503
#> [157] 0.027968631 0.131459236 1.330434909 0.534146052 0.054247912 0.470021073
#> [163] 0.230859468 1.113192328 0.934546316 0.167146638 1.009885792 0.124310929
#> [169] 1.056397529 1.023604883 0.225133982 0.442940412 0.583332596 0.126985869
#> [175] 0.069347790 0.653485515 0.970596830 0.195304683 0.801994753 1.199706935
#> [181] 0.864077011 0.954600519 0.260503102 1.575187384 0.241392038 0.640383510
#> [187] 0.281017068 0.421082178 0.252594496 0.580144400 0.215357014 0.462556069
#> [193] 0.224788390 0.029300831 0.277304150 0.320589756 0.214429238 0.237467730
#> [199] 0.235394840 0.414850699 0.719519720 0.354494279 0.072602051 0.007889332
#> [205] 0.390031470 0.789599200 0.266620727 0.428384211 1.065651077 0.372088798
#> [211] 0.937549606 0.090203048 0.090203048 0.256837822 1.342746701 0.574286923
#> [217] 0.789007735 0.426308453 0.412019834 0.713326957 0.394331204 0.857297967
#> [223] 0.843878184 0.347792353 0.039652246 0.200532641 0.096114491 0.544931258
#> [229] 0.337727332 0.401999859 0.059209044 1.022239155 1.165625666 0.685192754
#> [235] 0.007973348 0.462237983 1.251230872 0.166910524 0.257766196 0.709494592
#> [241] 1.283898665 0.724810413 0.141464864 1.025924894 0.848892347 0.710976980
#> [247] 0.315861373 0.185796431 0.270266859 0.106863048 0.951448651 0.890710089
#> [253] 0.900610648 1.294977014 0.980769268 0.238059821 0.849570466 0.035763040
#> [259] 1.324239041 1.380289181 0.400634228 0.560836939 1.519693134 0.926202509
#> [265] 0.911714531 0.877294549 0.059123701 1.193649120 0.360798335 0.837650147
#> [271] 1.082083058 0.422692742 0.417876957 0.839002022 0.944522512 0.136016291
#> [277] 0.965552453 1.197557941 0.999223117 1.102941147 0.749923726 1.160063450
#> [283] 1.995596507 0.876923724 0.527334455 1.275112467 0.126439904 1.522104432
#> [289] 0.730701343 0.652148077 0.484660849 1.146757330 0.118813278 0.802484059
#> [295] 1.043295173 1.189620988 0.894921760 1.299672426 0.025476701 0.518866931
#> ---
#> Prediction error SD for each training set sample:
#> [1] 0.17682492 0.13442655 0.10556516 0.11149364 0.14528873 0.12758414
#> [7] 0.12065164 0.07926194 0.08193757 0.06919434 0.15949739 0.11983705
#> [13] 0.11835357 0.20540697 0.10198030 0.11554543 0.13944593 0.15980624
#> [19] 0.08477330 0.13461282 0.23612843 0.15600606 0.21791446 0.19736729
#> [25] 0.15016986 0.09344745 0.21864433 0.21790961 0.25336042 0.15941805
#> [31] 0.09284126 0.16197923 0.09572618 0.08856759 0.17542166 0.14772163
#> [37] 0.19002387 0.09668287 0.14297462 0.11884520 0.18137131 0.09604546
#> [43] 0.13793796 0.19055507 0.11347154 0.07725716 0.12906606 0.07764695
#> [49] 0.08125217 0.07245476 0.06363757 0.11119898 0.16865215 0.10577447
#> [55] 0.21110265 0.09781289 0.11084679 0.14933465 0.07757903 0.22813192
#> [61] 0.09875702 0.10747134 0.13641438 0.16742019 0.27061775 0.23712666
#> [67] 0.13959506 0.28623757 0.13528523 0.12462349 0.05603801 0.09715490
#> [73] 0.10030104 0.08554512 0.16769181 0.10523460 0.09182054 0.25599278
#> [79] 0.19716493 0.14627301 0.15356982 0.09900334 0.08128891 0.20611741
#> [85] 0.17204179 0.25688856 0.12334344 0.09244068 0.20538923 0.10928922
#> [91] 0.29713211 0.34794542 0.13578921 0.12893068 0.15385871 0.09169239
#> [97] 0.13315023 0.21538582 0.18799248 0.13418807 0.19680767 0.19982114
#> [103] 0.26388690 0.19550901 0.20320043 0.06773431 0.17550233 0.21529091
#> [109] 0.13784328 0.20284702 0.19120467 0.15362606 0.16542940 0.18575542
#> [115] 0.23085802 0.15447198 0.11138301 0.09840187 0.25844855 0.17223314
#> [121] 0.15344701 0.13833191 0.14289706 0.13234967 0.11556001 0.11592906
#> [127] 0.10315136 0.08660667 0.09449550 0.16854420 0.18856090 0.15430410
#> [133] 0.09661260 0.28032529 0.21735301 0.12557330 0.20184812 0.09335452
#> [139] 0.18736053 0.19047223 0.20598213 0.14763848 0.13510977 0.12023468
#> [145] 0.20625454 0.18448368 0.14956688 0.16068554 0.10170993 0.18631713
#> [151] 0.15860217 0.09685768 0.31021140 0.07256171 0.18358676 0.11571334
#> [157] 0.11897874 0.28213843 0.17354708 0.54050221 0.21115082 0.16023562
#> [163] 0.21092455 0.15063153 0.16800965 0.25836353 0.20453967 0.20524870
#> [169] 0.18909853 0.15239066 0.19057706 0.31021140 0.17806749 0.18825405
#> [175] 0.23845054 0.16207528 0.14468703 0.22773153 0.09435823 0.20591807
#> [181] 0.09285024 0.14829706 0.18164147 0.11016136 0.12613775 0.13083017
#> [187] 0.15856212 0.14654668 0.11192022 0.09052024 0.18558041 0.14932045
#> [193] 0.19117145 0.16532975 0.23524530 0.17242759 0.16421283 0.15853904
#> [199] 0.18032470 0.27398573 0.21185651 0.08873293 0.17034044 0.20631894
#> [205] 0.17913107 0.27390014 0.14980278 0.10650226 0.12233789 0.15860333
#> [211] 0.07790451 0.17097513 0.17097513 0.16639106 0.13218349 0.19943419
#> [217] 0.14607979 0.09014777 0.16541099 0.21108869 0.21867079 0.13422304
#> [223] 0.15043220 0.27229967 0.14221377 0.22646860 0.12034653 0.12950380
#> [229] 0.12548953 0.45557358 0.11447949 0.12563839 0.22438920 0.25630906
#> [235] 0.09629856 0.13508804 0.16355792 0.18598939 0.36341332 0.18273404
#> [241] 0.12908144 0.11288943 0.19083150 0.16612828 0.11700099 0.13332849
#> [247] 0.12906516 0.24467925 0.17511908 0.19636436 0.13706344 0.09350376
#> [253] 0.11156260 0.07681448 0.10086700 0.18217069 0.08211883 0.18818291
#> [259] 0.12189647 0.12451514 0.28735398 0.14521698 0.23059541 0.04790616
#> [265] 0.06013623 0.22800985 0.18774418 0.27971046 0.13806851 0.03846176
#> [271] 0.14729532 0.24749122 0.21319807 0.18974506 0.20515873 0.22254970
#> [277] 0.13604209 0.22385117 0.09036809 0.08643619 0.17599821 0.11058212
#> [283] 0.22266788 0.90792902 0.25203713 0.09633983 0.21948222 0.14396076
#> [289] 0.26004314 0.21105243 0.25559727 0.18210567 0.13009828 0.38864150
#> [295] 0.16652945 0.11959406 0.36849624 0.19887907 0.24674728 0.18786913
#> ---
#> Absolute mean prediction error for each test set sample:
#> [[1]]
#> [1] 0.639812710 1.728652076 0.802728738 1.068664782 0.509884475 0.443834120
#> [7] 1.778795719 1.707951484 0.581364417 1.779236657 0.486854157 0.376468936
#> [13] 0.715406535 1.634078337 0.054522076 1.161650799 1.740578337 0.389893595
#> [19] 1.538714709 0.031311347 0.865520787 1.488199902 0.731535368 0.111052248
#> [25] 0.104213904 0.920199720 1.072970056 1.557923146 1.589685511 1.981304775
#> [31] 1.790737280 1.535657507 2.257757940 0.805144341 1.425035142 1.489361200
#> [37] 0.855622407 1.129081107 1.986525920 1.243347015 0.498346851 0.748741536
#> [43] 0.880363834 1.199213789 0.218156034 1.496401694 2.301150689 0.904462346
#> [49] 0.622586100 0.027112612 0.143570274 1.441423599 1.224283639 0.109178041
#> [55] 1.073204962 0.088209298 1.315967949 2.806981476 1.963810759 1.595818003
#> [61] 0.119306515 1.890562369 1.068519090 2.586166344 0.104238944 1.284283639
#> [67] 2.502535164 0.714377910 0.361077210 0.004964891 1.190182265 0.269064444
#> [73] 0.522665841 0.287735142 0.334417859 1.651955323 1.872898952 1.710504262
#> [79] 0.860576242 0.312156724 2.056917661 1.268912438 1.552370417 1.492865920
#> [85] 1.423811794 0.304902997 0.943973923 1.007302003 0.268239164 0.077852604
#> [91] 0.397227200 1.574999201 2.439927322 1.469443326 0.268010248 0.082140833
#> [97] 1.652266880 1.309732275 2.231149721 0.002849509
#>
#> [[2]]
#> [1] 0.3761254 1.0686362 0.9822736 0.1525924 0.8961379 2.2723739 2.0714944
#> [8] 1.4591463 1.7583215 1.2206085 1.7216143 2.1029318 0.9041916 1.9164734
#> [15] 1.6555303 1.5212896 0.1713064 1.6826830 0.4845723 1.0504444 0.5601090
#> [22] 0.5565668 2.0394634 1.7641510 1.1248075 2.1042120 1.1213167 1.3636992
#> [29] 0.3338557 1.8086985 1.3070499 1.9044063 0.1954925 2.1071527 0.2531252
#> [36] 1.8888268 1.5903581 2.2373363 2.1912922 2.2373363 2.4847163 1.7504916
#> [43] 0.8062721 0.9878200 1.8567469 1.6537348 1.8232545 1.8941021 2.0012004
#> [50] 2.0299914 2.0463856 6.8750855 2.0265552 2.1496461 2.0326761 2.0341021
#> [57] 2.2483376 2.2209414 1.4387582 2.1071983 2.2770486 2.3744691 1.9340489
#> [64] 2.2387601 0.3949453 2.3930403 2.9662227 1.7043344 1.9380597 2.2208547
#> [71] 1.8652574 2.0613709 2.6124837 2.2107478 2.7153936 1.8238502 2.3515942
#> [78] 0.4086407 2.4985282 2.5847385 1.7424177 1.9049894 0.8165998 1.1007456
#> [85] 3.2720013 2.5444853 2.7490399 1.9278361 2.6088437 1.0389593 2.3979256
#> [92] 2.1731800 2.3825046 6.6641455 2.7777074 2.7732829 0.8122010 9.6228069
#> [99] 1.0692105 2.0402506
#>
#> ---
#> Prediction error SD for each test set sample:
#> [[1]]
#> [1] 0.22083328 0.17054322 0.24153236 0.10511800 0.18976851 0.15932889
#> [7] 0.23968000 0.11121425 0.16555398 0.24991955 0.21038877 0.36450188
#> [13] 0.27819462 0.14341038 0.23339580 0.06099678 0.11001288 0.25935198
#> [19] 0.27807784 0.12768484 0.39398892 0.13011395 0.25895425 0.41446740
#> [25] 0.43185608 0.14108293 0.06591888 0.14312233 0.15275010 0.14042106
#> [31] 0.13693109 0.25348023 0.12806220 0.17876048 0.17077105 1.34738600
#> [37] 0.38800930 0.12158288 0.40278045 0.20050773 0.16542774 0.18043944
#> [43] 0.19275518 0.10633509 0.13346924 0.25936149 0.13643158 0.21946839
#> [49] 0.36433642 0.51283676 0.45020045 0.20389847 0.10552798 0.44284456
#> [55] 0.22361746 0.28826144 0.31265941 0.28732216 0.21618896 0.22854181
#> [61] 0.45194419 0.25694764 0.20225071 0.12415881 0.44521550 0.10552798
#> [67] 0.30201059 0.21971458 0.43446870 0.30375351 0.38239730 0.53411721
#> [73] 0.18460227 0.47744807 0.48077148 0.18288978 0.13848477 0.36567907
#> [79] 0.28023087 0.46569848 0.15442597 0.42189035 0.18716995 0.26155751
#> [85] 0.10627129 0.42819113 0.27761377 0.15701903 0.41389201 0.54313727
#> [91] 0.22618794 0.33465260 0.12879614 1.58229002 0.25411121 0.20038697
#> [97] 0.21076612 0.22063300 0.14627048 0.38272326
#>
#> [[2]]
#> [1] 0.49533369 0.21790280 0.21132863 0.50853899 0.12991737 0.21393065
#> [7] 0.37625454 0.15044426 0.18557630 0.19134102 0.31622632 0.13592326
#> [13] 0.31393839 0.18071248 0.16602440 0.12783851 0.41975925 0.24059637
#> [19] 0.50266192 0.16308872 0.52542505 0.14443274 0.11613127 0.21903495
#> [25] 0.22144174 0.10634338 0.23837872 0.87641235 0.26053470 0.16830009
#> [31] 0.26533128 0.23475609 0.19978981 0.14754964 0.38769917 0.20811497
#> [37] 0.19311046 0.21805196 0.11777125 0.21805196 0.20156731 0.22503077
#> [43] 0.16719951 0.18271725 0.36899922 0.19970142 0.22984328 0.16888852
#> [49] 0.15772463 0.14029879 0.38300912 6.37876528 0.19456094 0.05585043
#> [55] 0.36750728 0.16888852 0.16055091 0.18454729 0.15758583 0.21966845
#> [61] 0.19576505 0.36832384 0.29600309 0.15629316 0.11750815 0.14283549
#> [67] 0.34619970 0.28154974 0.21481076 0.20259712 0.21311527 0.28408400
#> [73] 0.31254966 0.15693373 0.14694409 0.21384984 0.14189051 0.15298061
#> [79] 0.16228782 0.19340977 2.36340422 0.19607719 1.25635858 0.15177587
#> [85] 0.32172107 0.11149303 0.07074771 0.21218978 0.21013675 0.14548796
#> [91] 0.18829682 0.20021486 0.29721012 6.70765888 0.16740331 0.17198017
#> [97] 0.25960303 7.99712131 0.16270696 0.10508260
#>
plot(ad)
# the interactive plot requires an HTML viewer
if (FALSE) {
plot(ad, type = "interactive")
}
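In the output above, a few samples in the second test set have much larger mean errors (roughly 6.7 to 9.6, versus a training-set maximum of about 2.5), which is the kind of pattern the applicability domain evaluation is meant to surface. A minimal sketch for flagging such samples (the 95th-percentile cutoff is illustrative, not prescribed by the package):
cutoff <- quantile(ad$tr.error.mean, 0.95)
which(ad$te.error.mean[["test.2"]] > cutoff)
# if list names are not preserved, index by position instead:
# which(ad$te.error.mean[[2]] > cutoff)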