Distribution coefficients at pH 7.4 (logD7.4) dataset from Wang et al.

Usage

data(logd1k)

Format

A list with 2 components:

  • x - data frame with 1,000 rows (samples) and 80 columns (predictors)

  • y - numeric vector of length 1,000 (response)

The data consist of the first 1,000 compounds in the original dataset.
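
The layout described above can be verified programmatically. The following is a minimal sketch, not part of the package: `check_xy_layout()` is a hypothetical helper name, and its default dimensions are taken from the description in this section.

```r
# Hypothetical helper (not provided by the package): checks that an object
# follows the documented layout, a list with a data frame `x` of predictors
# and a numeric vector `y` holding the response.
check_xy_layout <- function(d, n = 1000L, p = 80L) {
  stopifnot(
    is.list(d),
    all(c("x", "y") %in% names(d)),
    is.data.frame(d$x),
    is.numeric(d$y),
    nrow(d$x) == n,    # one row per compound
    ncol(d$x) == p,    # one column per descriptor
    length(d$y) == n   # one response value per compound
  )
  invisible(TRUE)
}

# Usage, assuming the package that ships the dataset is attached:
# data(logd1k)
# check_xy_layout(logd1k)
```

The helper stops with an error at the first condition that fails, so it can serve as a guard at the top of an analysis script.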

Details

This dataset contains distribution coefficients at pH 7.4 (logD7.4) for 1,000 compounds, and 80 molecular descriptors computed with RDKit.
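
As a first look at how the descriptors relate to the response, one option is to rank them by absolute Pearson correlation with logD7.4. The sketch below is illustrative only: `rank_by_correlation()` is a hypothetical helper, not a function of the package, and it assumes the predictors are all numeric with no missing values.

```r
# Hypothetical helper: rank predictors by absolute Pearson correlation
# with the response. Assumes numeric columns and no missing values.
rank_by_correlation <- function(x, y, top = 5L) {
  x <- as.matrix(x)
  # Drop constant columns, for which the correlation is undefined.
  keep <- apply(x, 2, function(col) stats::sd(col) > 0)
  r <- stats::cor(x[, keep, drop = FALSE], y)[, 1]
  # Return the `top` largest absolute correlations, named by descriptor.
  head(sort(abs(r), decreasing = TRUE), top)
}

# Usage, assuming the dataset has been loaded with data(logd1k):
# rank_by_correlation(logd1k$x, logd1k$y)
```

Marginal correlations ignore interactions and the strong collinearity among some descriptors (for example `MolWt` and `ExactMolWt`), so the ranking is a quick diagnostic rather than a variable selection method.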

References

Jian-Bing Wang, Dong-Sheng Cao, Min-Feng Zhu, Yong-Huan Yun, Nan Xiao, and Yi-Zeng Liang. "In silico evaluation of logD7.4 and comparison with other prediction methods." Journal of Chemometrics 29, no. 7 (2015): 389–398.

Examples

data(logd1k)
str(logd1k)
#> List of 2
#>  $ x:'data.frame':	1000 obs. of  80 variables:
#>   ..$ BalabanJ           : num [1:1000] 1.95 1.97 2.97 2.05 2.72 ...
#>   ..$ BertzCT            : num [1:1000] 883 782 343 1134 437 ...
#>   ..$ Chi0               : num [1:1000] 16.84 15.9 9.85 19.84 12.13 ...
#>   ..$ Chi0n              : num [1:1000] 13.09 13.2 7.53 15.41 9.49 ...
#>   ..$ Chi0v              : num [1:1000] 13.09 14.02 7.53 15.41 9.49 ...
#>   ..$ Chi1               : num [1:1000] 11.56 10.39 6.16 13.47 7.56 ...
#>   ..$ Chi1n              : num [1:1000] 8.13 7.38 4.16 9.11 5.22 ...
#>   ..$ Chi1v              : num [1:1000] 8.13 8.78 4.16 9.11 5.22 ...
#>   ..$ Chi2n              : num [1:1000] 6.37 6 2.88 6.89 3.51 ...
#>   ..$ Chi2v              : num [1:1000] 6.37 7.96 2.88 6.89 3.51 ...
#>   ..$ Chi3n              : num [1:1000] 4.61 4.16 1.97 5.13 2.37 ...
#>   ..$ Chi3v              : num [1:1000] 4.61 6.08 1.97 5.13 2.37 ...
#>   ..$ Chi4n              : num [1:1000] 3.31 2.77 1.15 3.67 1.41 ...
#>   ..$ Chi4v              : num [1:1000] 3.31 4.11 1.15 3.67 1.41 ...
#>   ..$ EState_VSA1        : num [1:1000] 17.21 10.21 0 17.21 5.43 ...
#>   ..$ EState_VSA10       : num [1:1000] 19.09 8.42 9.9 19.09 14.7 ...
#>   ..$ EState_VSA11       : num [1:1000] 0 0 0 0 0 4.39 0 0 0 0 ...
#>   ..$ EState_VSA2        : num [1:1000] 17 0 17.8 10.9 11.7 ...
#>   ..$ EState_VSA3        : num [1:1000] 24.3 13.1 18.7 24.3 25.2 ...
#>   ..$ EState_VSA4        : num [1:1000] 25.9 29.6 0 30.4 0 ...
#>   ..$ EState_VSA5        : num [1:1000] 12.26 14.17 6.07 6.2 6.07 ...
#>   ..$ EState_VSA6        : num [1:1000] 6.07 7.05 17.69 10.63 17.69 ...
#>   ..$ EState_VSA7        : num [1:1000] 9.47 38.49 0 36.09 6.92 ...
#>   ..$ EState_VSA8        : num [1:1000] 5.32 9.88 0 5.32 5.32 ...
#>   ..$ EState_VSA9        : num [1:1000] 0 0 5.11 0 0 ...
#>   ..$ ExactMolWt         : num [1:1000] 331 322 183 381 224 ...
#>   ..$ FractionCSP3       : num [1:1000] 0.412 0.467 0.444 0.238 0.455 0.474 0.529 0.4 0.5 0.286 ...
#>   ..$ HallKierAlpha      : num [1:1000] -2.41 -1.57 -1.29 -3.19 -1.78 -2.61 -1.41 -1.78 -1.29 -1.25 ...
#>   ..$ HeavyAtomCount     : int [1:1000] 24 22 13 28 16 27 24 15 14 10 ...
#>   ..$ HeavyAtomMolWt     : num [1:1000] 313 300 170 361 208 ...
#>   ..$ Ipc                : num [1:1000] 672747 115307 826 2613182 3288 ...
#>   ..$ Kappa1             : num [1:1000] 15.14 15.33 9.79 18.19 12.29 ...
#>   ..$ Kappa2             : num [1:1000] 5.59 5.57 4.09 7.09 5.34 ...
#>   ..$ Kappa3             : num [1:1000] 2.45 2.6 2.09 3.03 2.85 ...
#>   ..$ LabuteASA          : num [1:1000] 137 131 76.1 160.4 93.7 ...
#>   ..$ MaxAbsEStateIndex  : num [1:1000] 14.6 12.3 11 14.9 11.2 ...
#>   ..$ MaxEStateIndex     : num [1:1000] 14.6 12.3 11 14.9 11.2 ...
#>   ..$ MinAbsEStateIndex  : num [1:1000] 0.137 0.502 0.104 0.055 0.042 0.045 0.385 0.208 0.253 0.162 ...
#>   ..$ MinEStateIndex     : num [1:1000] -1.275 -3.361 -0.363 -1.333 -0.397 ...
#>   ..$ MolMR              : num [1:1000] 88.5 89.5 48.8 106.3 60.4 ...
#>   ..$ MolWt              : num [1:1000] 331 322 183 381 224 ...
#>   ..$ NumValenceElectrons: int [1:1000] 126 120 72 144 88 144 132 82 78 54 ...
#>   ..$ PEOE_VSA1          : num [1:1000] 19.89 9.88 14.78 19.89 14.99 ...
#>   ..$ PEOE_VSA10         : num [1:1000] 11.4 0 0 11.4 0 ...
#>   ..$ PEOE_VSA11         : num [1:1000] 0 0 5.75 0 5.75 ...
#>   ..$ PEOE_VSA12         : num [1:1000] 5.43 0 5.43 5.43 11.34 ...
#>   ..$ PEOE_VSA13         : num [1:1000] 0 0 0 0 0 ...
#>   ..$ PEOE_VSA14         : num [1:1000] 5.97 10.21 0 5.97 0 ...
#>   ..$ PEOE_VSA2          : num [1:1000] 4.79 4.3 4.79 4.79 9.59 ...
#>   ..$ PEOE_VSA3          : num [1:1000] 9.19 0 0 9.19 0 ...
#>   ..$ PEOE_VSA4          : num [1:1000] 0 12.7 0 0 0 ...
#>   ..$ PEOE_VSA5          : num [1:1000] 0 0 0 0 0 0 0 0 0 0 ...
#>   ..$ PEOE_VSA6          : num [1:1000] 0 0 0 18.2 0 ...
#>   ..$ PEOE_VSA7          : num [1:1000] 25 44.3 13.3 30.7 13.8 ...
#>   ..$ PEOE_VSA8          : num [1:1000] 43.8 43.8 25.4 43.4 31.8 ...
#>   ..$ PEOE_VSA9          : num [1:1000] 11.2 5.69 5.69 11.2 5.69 ...
#>   ..$ SMR_VSA1           : num [1:1000] 14.29 8.42 10.21 14.29 9.9 ...
#>   ..$ SMR_VSA10          : num [1:1000] 22.56 26.8 0 22.56 5.91 ...
#>   ..$ SMR_VSA2           : num [1:1000] 0 0 0 0 0 0 0 0 0 0 ...
#>   ..$ SMR_VSA3           : num [1:1000] 9.88 14.19 4.57 9.88 9.88 ...
#>   ..$ SMR_VSA4           : num [1:1000] 0 0 0 0 0 ...
#>   ..$ SMR_VSA5           : num [1:1000] 18.88 6.42 19.89 6.92 26.81 ...
#>   ..$ SMR_VSA6           : num [1:1000] 31.08 45.08 6.61 31.08 6.54 ...
#>   ..$ SMR_VSA7           : num [1:1000] 39.9 30 28.2 69.8 28.2 ...
#>   ..$ SMR_VSA9           : num [1:1000] 0 0 5.75 5.69 5.75 ...
#>   ..$ SlogP_VSA1         : num [1:1000] 15.64 4.3 5.43 15.64 10.75 ...
#>   ..$ SlogP_VSA10        : num [1:1000] 10.08 5.69 0 10.08 0 ...
#>   ..$ SlogP_VSA11        : num [1:1000] 0 0 5.75 0 5.75 5.75 0 5.75 5.75 5.75 ...
#>   ..$ SlogP_VSA12        : num [1:1000] 0 0 0 0 0 0 0 0 0 0 ...
#>   ..$ SlogP_VSA2         : num [1:1000] 41.8 63.4 21.4 41.8 22.1 ...
#>   ..$ SlogP_VSA3         : num [1:1000] 0 16.63 6.54 0 11.34 ...
#>   ..$ SlogP_VSA4         : num [1:1000] 5.82 0 6.92 12.74 6.92 ...
#>   ..$ SlogP_VSA5         : num [1:1000] 29.24 5.56 12.12 15.92 19.04 ...
#>   ..$ SlogP_VSA6         : num [1:1000] 23.1 24.4 17.1 47.4 17.1 ...
#>   ..$ SlogP_VSA7         : num [1:1000] 0 0 0 0 0 0 0 0 0 0 ...
#>   ..$ SlogP_VSA8         : num [1:1000] 10.9 10.9 0 16.6 0 ...
#>   ..$ TPSA               : num [1:1000] 74.6 59.6 62.5 74.6 71.3 ...
#>   ..$ VSA_EState10       : num [1:1000] 0 -3.36 0 0 0 ...
#>   ..$ VSA_EState8        : num [1:1000] 16.43 27.48 1.74 16.54 1.69 ...
#>   ..$ VSA_EState9        : num [1:1000] 46.1 26 34.8 55 42 ...
#>  $ y: num [1:1000] -0.96 -0.92 -0.9 -0.83 -0.82 -0.79 -0.78 -0.77 -0.77 -0.77 ...