Model applicability domain evaluation with ensemble sparse partial least squares.

Usage

enspls.ad(
  x,
  y,
  xtest,
  ytest,
  maxcomp = 5L,
  cvfolds = 5L,
  alpha = seq(0.2, 0.8, 0.2),
  space = c("sample", "variable"),
  method = c("mc", "boot"),
  reptimes = 500L,
  ratio = 0.8,
  parallel = 1L
)

Arguments

x

Predictor matrix of the training set.

y

Response vector of the training set.

xtest

List, with the i-th component being the i-th test set's predictor matrix (see example code below).

ytest

List, with the i-th component being the i-th test set's response vector (see example code below).

maxcomp

Maximum number of components included in each model. Default is 5.

cvfolds

Number of cross-validation folds used in each model for automatic parameter selection. Default is 5.

alpha

Grid of candidate values for the parameter controlling the sparsity of the model. Default is seq(0.2, 0.8, 0.2).

space

Space in which to apply the resampling method. Can be the sample space ("sample") or the variable space ("variable").

method

Resampling method. "mc" (Monte-Carlo resampling) or "boot" (bootstrapping). Default is "mc".

reptimes

Number of models to build with Monte-Carlo resampling or bootstrapping. Default is 500.

ratio

Sampling ratio used when method = "mc". Default is 0.8.

parallel

Integer. Number of CPU cores to use. Default is 1 (not parallelized).
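To make the method, ratio, and space arguments more concrete, here is a minimal base R sketch of how the two resampling schemes could draw indices. It is an illustration, not the package's internal code: the helper draw_index() is hypothetical, and it assumes that Monte-Carlo resampling draws floor(n * ratio) indices without replacement while bootstrapping draws n indices with replacement.

```r
# Hypothetical helper (not part of the package): draw one resample of indices.
draw_index <- function(n, method = c("mc", "boot"), ratio = 0.8) {
  method <- match.arg(method)
  if (method == "mc") {
    # Monte-Carlo: a subset without replacement (size assumed to be floor(n * ratio))
    sample(n, size = floor(n * ratio), replace = FALSE)
  } else {
    # Bootstrap: n draws with replacement, so repeated indices are expected
    sample(n, size = n, replace = TRUE)
  }
}

set.seed(1)
n_samples <- 300 # rows of x: what is resampled when space = "sample"
n_vars <- 50     # columns of x: what is resampled when space = "variable"

idx_mc_rows <- draw_index(n_samples, "mc", ratio = 0.8)
idx_boot_rows <- draw_index(n_samples, "boot")
idx_mc_cols <- draw_index(n_vars, "mc", ratio = 0.8)

length(idx_mc_rows)        # number of rows kept in this Monte-Carlo resample
anyDuplicated(idx_mc_rows) # 0 means no row appears twice
length(idx_boot_rows)      # bootstrap keeps n draws, some of them repeats
```

In this sketch the same helper serves both spaces: only the meaning of n changes (number of samples versus number of variables).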

Value

A list containing:

  • tr.error.mean - absolute mean prediction error for each training set sample

  • tr.error.median - absolute median prediction error for each training set sample

  • tr.error.sd - standard deviation of the prediction error for each training set sample

  • tr.error.matrix - raw prediction error matrix for the training set

  • te.error.mean - list of absolute mean prediction errors, one component per test set

  • te.error.median - list of absolute median prediction errors, one component per test set

  • te.error.sd - list of prediction error standard deviations, one component per test set

  • te.error.matrix - list of raw prediction error matrices, one component per test set
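The summary components can be read as per-sample reductions of the raw error matrix. The base R sketch below shows one way such summaries could be computed. It rests on assumptions not stated in this page: rows index samples, columns index the resampled models, and the absolute value is taken after averaging. The numbers are made up for illustration and are not package output.

```r
# Toy error matrix (made-up numbers): 3 samples (rows) x 4 resampled models (columns).
# Assumption: entries are signed prediction errors.
err <- matrix(
  c( 0.5,  0.3,  0.4,  0.4,
    -1.0,  1.0, -1.0,  1.0,
     2.0,  2.2,  1.8,  2.0),
  nrow = 3, byrow = TRUE
)

# Per-sample summaries, assuming the absolute value is applied to the mean/median
error_mean <- abs(rowMeans(err, na.rm = TRUE))
error_median <- abs(apply(err, 1, median, na.rm = TRUE))
error_sd <- apply(err, 1, sd, na.rm = TRUE)

# Sample 2 has a zero mean error but the largest spread across models
round(cbind(error_mean, error_median, error_sd), 3)
```

Under these assumptions, the mean and the standard deviation carry different information: a sample whose errors cancel out across models can still be predicted inconsistently, which is why both are reported.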

Note

For space = "variable", method can only be "mc": bootstrapping in the variable space would create duplicated variables, which could cause problems.
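The duplication issue can be seen on a toy matrix with base R alone. The sketch below is illustrative only: x_toy and the hand-picked column indices are hypothetical, and the rank deficiency it shows is one plausible kind of problem, since this page does not say which problems are meant.

```r
set.seed(1)
# Toy predictor matrix: 5 samples x 4 uniquely named variables
x_toy <- matrix(rnorm(20), nrow = 5, dimnames = list(NULL, paste0("v", 1:4)))

# Monte-Carlo in the variable space: a subset of columns, each used at most once
mc_cols <- sample(ncol(x_toy), size = 3, replace = FALSE)
anyDuplicated(mc_cols) # 0 means no variable was selected twice

# A resample with a repeated column, as drawing with replacement can produce
# (indices are fixed by hand here so that the repeat is guaranteed)
boot_cols <- c(1, 3, 3, 4)
x_boot <- x_toy[, boot_cols]

colnames(x_boot) # "v3" appears twice
qr(x_boot)$rank  # lower than ncol(x_boot): the repeated column adds no information
```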

Author

Nan Xiao <https://nanx.me>

Examples

library("enpls")

data("logd1k")
# remove low variance variables
x <- logd1k$x[, -c(17, 52, 59)]
y <- logd1k$y

# training set
x.tr <- x[1:300, ]
y.tr <- y[1:300]

# two test sets
x.te <- list(
  "test.1" = x[301:400, ],
  "test.2" = x[401:500, ]
)
y.te <- list(
  "test.1" = y[301:400],
  "test.2" = y[401:500]
)

set.seed(42)
ad <- enspls.ad(
  x.tr, y.tr, x.te, y.te,
  maxcomp = 3, alpha = c(0.3, 0.6, 0.9),
  space = "variable", method = "mc",
  ratio = 0.8, reptimes = 10
)
print(ad)
#> Model Applicability Domain Evaluation by ENSPLS
#> ---
#> Absolute mean prediction error for each training set sample:
#>   [1] 0.422632298 0.815523615 0.648213606 0.870831328 0.257203074 0.490832846
#>   [7] 0.701719681 0.429048791 0.749324907 0.638404365 0.843665524 0.687892385
#>  [13] 1.017169872 0.763573955 0.759053761 0.871500956 0.236640925 0.088100279
#>  [19] 0.472965624 0.634421708 0.735632351 0.091059839 1.594026249 1.189337666
#>  [25] 0.340769966 0.604866100 1.303871973 0.961142660 0.092884283 0.100520976
#>  [31] 0.059779983 0.412777313 0.135195202 2.478026712 0.397403818 0.005011676
#>  [37] 0.877181685 0.145017597 0.023745363 0.280530155 0.058398364 0.568238661
#>  [43] 0.251556521 0.494641002 0.163135341 0.263345408 0.331201449 0.257333694
#>  [49] 0.210855050 0.221268755 0.057240160 0.384076723 1.179395348 0.390537444
#>  [55] 0.726689121 0.528659319 0.212638767 0.161500350 0.362637352 0.256825754
#>  [61] 0.732640202 2.477624629 0.191345044 0.378929068 1.560290268 1.490810969
#>  [67] 0.296412512 1.550013282 0.055909397 0.480447843 0.022057600 0.205219974
#>  [73] 0.533021269 0.061863659 0.325187195 0.792693950 0.171503502 1.268409310
#>  [79] 0.095289006 0.145745742 0.258202295 0.186118843 0.072360285 0.001806612
#>  [85] 0.053576611 0.542751777 0.122393517 0.701052137 0.336353318 0.176058594
#>  [91] 0.684119003 1.427365586 0.699280144 0.406684185 0.611234923 0.156924933
#>  [97] 0.562503385 1.119304806 0.471931939 0.291409291 1.002496298 0.915919886
#> [103] 1.371940567 0.159471334 0.458710686 0.352588032 0.662182370 0.092416830
#> [109] 0.381089817 0.777267167 0.723549214 0.086011861 0.504880794 0.442446365
#> [115] 2.068971171 0.677454033 0.275265965 0.146845629 0.145894573 0.380305366
#> [121] 0.175955943 0.495523895 0.095822143 0.108049721 0.421334995 0.406392115
#> [127] 0.427064993 0.475260473 0.390757565 0.386203722 0.389891939 0.638576864
#> [133] 0.514533894 0.702536133 1.016492859 0.036428252 0.068006633 0.158541714
#> [139] 0.358508958 0.126570666 0.383104094 0.316770984 0.315860883 0.367288327
#> [145] 0.005705279 0.511450671 0.776060683 0.717761225 1.248239275 0.315547146
#> [151] 0.678562080 0.436041833 0.302940412 0.333968661 0.335712517 0.490020503
#> [157] 0.027968631 0.131459236 1.330434909 0.534146052 0.054247912 0.470021073
#> [163] 0.230859468 1.113192328 0.934546316 0.167146638 1.009885792 0.124310929
#> [169] 1.056397529 1.023604883 0.225133982 0.442940412 0.583332596 0.126985869
#> [175] 0.069347790 0.653485515 0.970596830 0.195304683 0.801994753 1.199706935
#> [181] 0.864077011 0.954600519 0.260503102 1.575187384 0.241392038 0.640383510
#> [187] 0.281017068 0.421082178 0.252594496 0.580144400 0.215357014 0.462556069
#> [193] 0.224788390 0.029300831 0.277304150 0.320589756 0.214429238 0.237467730
#> [199] 0.235394840 0.414850699 0.719519720 0.354494279 0.072602051 0.007889332
#> [205] 0.390031470 0.789599200 0.266620727 0.428384211 1.065651077 0.372088798
#> [211] 0.937549606 0.090203048 0.090203048 0.256837822 1.342746701 0.574286923
#> [217] 0.789007735 0.426308453 0.412019834 0.713326957 0.394331204 0.857297967
#> [223] 0.843878184 0.347792353 0.039652246 0.200532641 0.096114491 0.544931258
#> [229] 0.337727332 0.401999859 0.059209044 1.022239155 1.165625666 0.685192754
#> [235] 0.007973348 0.462237983 1.251230872 0.166910524 0.257766196 0.709494592
#> [241] 1.283898665 0.724810413 0.141464864 1.025924894 0.848892347 0.710976980
#> [247] 0.315861373 0.185796431 0.270266859 0.106863048 0.951448651 0.890710089
#> [253] 0.900610648 1.294977014 0.980769268 0.238059821 0.849570466 0.035763040
#> [259] 1.324239041 1.380289181 0.400634228 0.560836939 1.519693134 0.926202509
#> [265] 0.911714531 0.877294549 0.059123701 1.193649120 0.360798335 0.837650147
#> [271] 1.082083058 0.422692742 0.417876957 0.839002022 0.944522512 0.136016291
#> [277] 0.965552453 1.197557941 0.999223117 1.102941147 0.749923726 1.160063450
#> [283] 1.995596507 0.876923724 0.527334455 1.275112467 0.126439904 1.522104432
#> [289] 0.730701343 0.652148077 0.484660849 1.146757330 0.118813278 0.802484059
#> [295] 1.043295173 1.189620988 0.894921760 1.299672426 0.025476701 0.518866931
#> ---
#> Prediction error SD for each training set sample:
#>   [1] 0.17682492 0.13442655 0.10556516 0.11149364 0.14528873 0.12758414
#>   [7] 0.12065164 0.07926194 0.08193757 0.06919434 0.15949739 0.11983705
#>  [13] 0.11835357 0.20540697 0.10198030 0.11554543 0.13944593 0.15980624
#>  [19] 0.08477330 0.13461282 0.23612843 0.15600606 0.21791446 0.19736729
#>  [25] 0.15016986 0.09344745 0.21864433 0.21790961 0.25336042 0.15941805
#>  [31] 0.09284126 0.16197923 0.09572618 0.08856759 0.17542166 0.14772163
#>  [37] 0.19002387 0.09668287 0.14297462 0.11884520 0.18137131 0.09604546
#>  [43] 0.13793796 0.19055507 0.11347154 0.07725716 0.12906606 0.07764695
#>  [49] 0.08125217 0.07245476 0.06363757 0.11119898 0.16865215 0.10577447
#>  [55] 0.21110265 0.09781289 0.11084679 0.14933465 0.07757903 0.22813192
#>  [61] 0.09875702 0.10747134 0.13641438 0.16742019 0.27061775 0.23712666
#>  [67] 0.13959506 0.28623757 0.13528523 0.12462349 0.05603801 0.09715490
#>  [73] 0.10030104 0.08554512 0.16769181 0.10523460 0.09182054 0.25599278
#>  [79] 0.19716493 0.14627301 0.15356982 0.09900334 0.08128891 0.20611741
#>  [85] 0.17204179 0.25688856 0.12334344 0.09244068 0.20538923 0.10928922
#>  [91] 0.29713211 0.34794542 0.13578921 0.12893068 0.15385871 0.09169239
#>  [97] 0.13315023 0.21538582 0.18799248 0.13418807 0.19680767 0.19982114
#> [103] 0.26388690 0.19550901 0.20320043 0.06773431 0.17550233 0.21529091
#> [109] 0.13784328 0.20284702 0.19120467 0.15362606 0.16542940 0.18575542
#> [115] 0.23085802 0.15447198 0.11138301 0.09840187 0.25844855 0.17223314
#> [121] 0.15344701 0.13833191 0.14289706 0.13234967 0.11556001 0.11592906
#> [127] 0.10315136 0.08660667 0.09449550 0.16854420 0.18856090 0.15430410
#> [133] 0.09661260 0.28032529 0.21735301 0.12557330 0.20184812 0.09335452
#> [139] 0.18736053 0.19047223 0.20598213 0.14763848 0.13510977 0.12023468
#> [145] 0.20625454 0.18448368 0.14956688 0.16068554 0.10170993 0.18631713
#> [151] 0.15860217 0.09685768 0.31021140 0.07256171 0.18358676 0.11571334
#> [157] 0.11897874 0.28213843 0.17354708 0.54050221 0.21115082 0.16023562
#> [163] 0.21092455 0.15063153 0.16800965 0.25836353 0.20453967 0.20524870
#> [169] 0.18909853 0.15239066 0.19057706 0.31021140 0.17806749 0.18825405
#> [175] 0.23845054 0.16207528 0.14468703 0.22773153 0.09435823 0.20591807
#> [181] 0.09285024 0.14829706 0.18164147 0.11016136 0.12613775 0.13083017
#> [187] 0.15856212 0.14654668 0.11192022 0.09052024 0.18558041 0.14932045
#> [193] 0.19117145 0.16532975 0.23524530 0.17242759 0.16421283 0.15853904
#> [199] 0.18032470 0.27398573 0.21185651 0.08873293 0.17034044 0.20631894
#> [205] 0.17913107 0.27390014 0.14980278 0.10650226 0.12233789 0.15860333
#> [211] 0.07790451 0.17097513 0.17097513 0.16639106 0.13218349 0.19943419
#> [217] 0.14607979 0.09014777 0.16541099 0.21108869 0.21867079 0.13422304
#> [223] 0.15043220 0.27229967 0.14221377 0.22646860 0.12034653 0.12950380
#> [229] 0.12548953 0.45557358 0.11447949 0.12563839 0.22438920 0.25630906
#> [235] 0.09629856 0.13508804 0.16355792 0.18598939 0.36341332 0.18273404
#> [241] 0.12908144 0.11288943 0.19083150 0.16612828 0.11700099 0.13332849
#> [247] 0.12906516 0.24467925 0.17511908 0.19636436 0.13706344 0.09350376
#> [253] 0.11156260 0.07681448 0.10086700 0.18217069 0.08211883 0.18818291
#> [259] 0.12189647 0.12451514 0.28735398 0.14521698 0.23059541 0.04790616
#> [265] 0.06013623 0.22800985 0.18774418 0.27971046 0.13806851 0.03846176
#> [271] 0.14729532 0.24749122 0.21319807 0.18974506 0.20515873 0.22254970
#> [277] 0.13604209 0.22385117 0.09036809 0.08643619 0.17599821 0.11058212
#> [283] 0.22266788 0.90792902 0.25203713 0.09633983 0.21948222 0.14396076
#> [289] 0.26004314 0.21105243 0.25559727 0.18210567 0.13009828 0.38864150
#> [295] 0.16652945 0.11959406 0.36849624 0.19887907 0.24674728 0.18786913
#> ---
#> Absolute mean prediction error for each test set sample:
#> [[1]]
#>   [1] 0.639812710 1.728652076 0.802728738 1.068664782 0.509884475 0.443834120
#>   [7] 1.778795719 1.707951484 0.581364417 1.779236657 0.486854157 0.376468936
#>  [13] 0.715406535 1.634078337 0.054522076 1.161650799 1.740578337 0.389893595
#>  [19] 1.538714709 0.031311347 0.865520787 1.488199902 0.731535368 0.111052248
#>  [25] 0.104213904 0.920199720 1.072970056 1.557923146 1.589685511 1.981304775
#>  [31] 1.790737280 1.535657507 2.257757940 0.805144341 1.425035142 1.489361200
#>  [37] 0.855622407 1.129081107 1.986525920 1.243347015 0.498346851 0.748741536
#>  [43] 0.880363834 1.199213789 0.218156034 1.496401694 2.301150689 0.904462346
#>  [49] 0.622586100 0.027112612 0.143570274 1.441423599 1.224283639 0.109178041
#>  [55] 1.073204962 0.088209298 1.315967949 2.806981476 1.963810759 1.595818003
#>  [61] 0.119306515 1.890562369 1.068519090 2.586166344 0.104238944 1.284283639
#>  [67] 2.502535164 0.714377910 0.361077210 0.004964891 1.190182265 0.269064444
#>  [73] 0.522665841 0.287735142 0.334417859 1.651955323 1.872898952 1.710504262
#>  [79] 0.860576242 0.312156724 2.056917661 1.268912438 1.552370417 1.492865920
#>  [85] 1.423811794 0.304902997 0.943973923 1.007302003 0.268239164 0.077852604
#>  [91] 0.397227200 1.574999201 2.439927322 1.469443326 0.268010248 0.082140833
#>  [97] 1.652266880 1.309732275 2.231149721 0.002849509
#> 
#> [[2]]
#>   [1] 0.3761254 1.0686362 0.9822736 0.1525924 0.8961379 2.2723739 2.0714944
#>   [8] 1.4591463 1.7583215 1.2206085 1.7216143 2.1029318 0.9041916 1.9164734
#>  [15] 1.6555303 1.5212896 0.1713064 1.6826830 0.4845723 1.0504444 0.5601090
#>  [22] 0.5565668 2.0394634 1.7641510 1.1248075 2.1042120 1.1213167 1.3636992
#>  [29] 0.3338557 1.8086985 1.3070499 1.9044063 0.1954925 2.1071527 0.2531252
#>  [36] 1.8888268 1.5903581 2.2373363 2.1912922 2.2373363 2.4847163 1.7504916
#>  [43] 0.8062721 0.9878200 1.8567469 1.6537348 1.8232545 1.8941021 2.0012004
#>  [50] 2.0299914 2.0463856 6.8750855 2.0265552 2.1496461 2.0326761 2.0341021
#>  [57] 2.2483376 2.2209414 1.4387582 2.1071983 2.2770486 2.3744691 1.9340489
#>  [64] 2.2387601 0.3949453 2.3930403 2.9662227 1.7043344 1.9380597 2.2208547
#>  [71] 1.8652574 2.0613709 2.6124837 2.2107478 2.7153936 1.8238502 2.3515942
#>  [78] 0.4086407 2.4985282 2.5847385 1.7424177 1.9049894 0.8165998 1.1007456
#>  [85] 3.2720013 2.5444853 2.7490399 1.9278361 2.6088437 1.0389593 2.3979256
#>  [92] 2.1731800 2.3825046 6.6641455 2.7777074 2.7732829 0.8122010 9.6228069
#>  [99] 1.0692105 2.0402506
#> 
#> ---
#> Prediction error SD for each test set sample:
#> [[1]]
#>   [1] 0.22083328 0.17054322 0.24153236 0.10511800 0.18976851 0.15932889
#>   [7] 0.23968000 0.11121425 0.16555398 0.24991955 0.21038877 0.36450188
#>  [13] 0.27819462 0.14341038 0.23339580 0.06099678 0.11001288 0.25935198
#>  [19] 0.27807784 0.12768484 0.39398892 0.13011395 0.25895425 0.41446740
#>  [25] 0.43185608 0.14108293 0.06591888 0.14312233 0.15275010 0.14042106
#>  [31] 0.13693109 0.25348023 0.12806220 0.17876048 0.17077105 1.34738600
#>  [37] 0.38800930 0.12158288 0.40278045 0.20050773 0.16542774 0.18043944
#>  [43] 0.19275518 0.10633509 0.13346924 0.25936149 0.13643158 0.21946839
#>  [49] 0.36433642 0.51283676 0.45020045 0.20389847 0.10552798 0.44284456
#>  [55] 0.22361746 0.28826144 0.31265941 0.28732216 0.21618896 0.22854181
#>  [61] 0.45194419 0.25694764 0.20225071 0.12415881 0.44521550 0.10552798
#>  [67] 0.30201059 0.21971458 0.43446870 0.30375351 0.38239730 0.53411721
#>  [73] 0.18460227 0.47744807 0.48077148 0.18288978 0.13848477 0.36567907
#>  [79] 0.28023087 0.46569848 0.15442597 0.42189035 0.18716995 0.26155751
#>  [85] 0.10627129 0.42819113 0.27761377 0.15701903 0.41389201 0.54313727
#>  [91] 0.22618794 0.33465260 0.12879614 1.58229002 0.25411121 0.20038697
#>  [97] 0.21076612 0.22063300 0.14627048 0.38272326
#> 
#> [[2]]
#>   [1] 0.49533369 0.21790280 0.21132863 0.50853899 0.12991737 0.21393065
#>   [7] 0.37625454 0.15044426 0.18557630 0.19134102 0.31622632 0.13592326
#>  [13] 0.31393839 0.18071248 0.16602440 0.12783851 0.41975925 0.24059637
#>  [19] 0.50266192 0.16308872 0.52542505 0.14443274 0.11613127 0.21903495
#>  [25] 0.22144174 0.10634338 0.23837872 0.87641235 0.26053470 0.16830009
#>  [31] 0.26533128 0.23475609 0.19978981 0.14754964 0.38769917 0.20811497
#>  [37] 0.19311046 0.21805196 0.11777125 0.21805196 0.20156731 0.22503077
#>  [43] 0.16719951 0.18271725 0.36899922 0.19970142 0.22984328 0.16888852
#>  [49] 0.15772463 0.14029879 0.38300912 6.37876528 0.19456094 0.05585043
#>  [55] 0.36750728 0.16888852 0.16055091 0.18454729 0.15758583 0.21966845
#>  [61] 0.19576505 0.36832384 0.29600309 0.15629316 0.11750815 0.14283549
#>  [67] 0.34619970 0.28154974 0.21481076 0.20259712 0.21311527 0.28408400
#>  [73] 0.31254966 0.15693373 0.14694409 0.21384984 0.14189051 0.15298061
#>  [79] 0.16228782 0.19340977 2.36340422 0.19607719 1.25635858 0.15177587
#>  [85] 0.32172107 0.11149303 0.07074771 0.21218978 0.21013675 0.14548796
#>  [91] 0.18829682 0.20021486 0.29721012 6.70765888 0.16740331 0.17198017
#>  [97] 0.25960303 7.99712131 0.16270696 0.10508260
#> 
plot(ad)

# the interactive plot requires an HTML viewer
if (FALSE) {
  plot(ad, type = "interactive")
}