
This function calculates scales-based descriptors from molecular descriptor sets computed by Dragon, Discovery Studio, and MOE. Users can select specific molecular descriptors from one of these descriptor sets by specifying their numerical or character index within the set.

Usage

extractDescScales(
  x,
  propmat,
  index = NULL,
  pc,
  lag,
  scale = TRUE,
  silent = TRUE
)

Arguments

x

A character vector containing the input protein sequence.

propmat

The matrix containing the descriptor set for the amino acids, given by name as a character string (as in the Examples). It can be chosen from AAMOE2D, AAMOE3D, AACPSA, AADescAll, AA2DACOR, AA3DMoRSE, AAACF, AABurden, AAConn, AAConst, AAEdgeAdj, AAEigIdx, AAFGC, AAGeom, AAGETAWAY, AAInfo, AAMolProp, AARandic, AARDF, AATopo, AATopoChg, AAWalk, and AAWHIM.

index

Integer vector or character vector. Selects molecular descriptors from the chosen descriptor set by their numerical or character index within that set. Default is NULL, which selects all the molecular descriptors in the descriptor set.

pc

Integer. The maximum dimension of the space in which the data are to be represented. Must be no greater than the number of amino acid properties provided.

lag

The lag parameter. Must be less than the number of amino acids in the sequence.

scale

Logical. Whether to auto-scale the property matrix (propmat) before the principal component analysis. Default is TRUE.

silent

Logical. If FALSE, prints the standard deviation, proportion of variance, and cumulative proportion of the selected principal components. Default is TRUE, which suppresses this output.
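
How index, scale, pc, and silent interact can be sketched in base R. This is only an illustration under stated assumptions: that index behaves like ordinary matrix column indexing, and that the scales come from prcomp(), as the printed summary in the Examples suggests. The matrix toy_propmat and its descriptor names are hypothetical and not part of protr.

```r
# Hypothetical property matrix: 20 amino acids x 6 made-up descriptors.
set.seed(1)
aa <- strsplit("ACDEFGHIKLMNPQRSTVWY", "")[[1]]
toy_propmat <- matrix(
  rnorm(20 * 6),
  nrow = 20,
  dimnames = list(aa, paste0("desc", 1:6))
)

# Assumption: `index` selects columns, either by position or by name.
by_position <- toy_propmat[, c(1, 3, 5), drop = FALSE]
by_name <- toy_propmat[, c("desc1", "desc3", "desc5"), drop = FALSE]
identical(by_position, by_name)

# Assumption: the scales are principal components of the selected columns;
# `scale. = TRUE` plays the role of auto-scaling the property matrix.
pca <- prcomp(by_position, scale. = TRUE)

# The three summary rows, restricted to the first `pc` components,
# which is the kind of table shown when silent = FALSE.
pc <- 2
importance <- summary(pca)$importance[, 1:pc, drop = FALSE]
print(importance)
```

Because only three columns are selected here, pc could be at most 3 in this toy setup, which mirrors the constraint stated for pc.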

Value

A named vector of length lag * p^2, where p is the number of scales selected.
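
The length lag * p^2 is what one would expect if the scales are summarised by auto and cross covariances: for each lag there is one term for every ordered pair of scales, including each scale paired with itself. The sketch below computes such terms for a hypothetical score matrix. The helper toy_acc, its normalisation, and its naming scheme are assumptions made for illustration, not protr's implementation.

```r
# Hypothetical helper: auto and cross covariances of the rows of `scores`
# (p scales x n residues) for lags 1..lag.
toy_acc <- function(scores, lag) {
  p <- nrow(scores)
  n <- ncol(scores)
  stopifnot(lag < n)  # the lag must be shorter than the sequence

  # Center each scale (row) around its mean along the sequence.
  centered <- scores - rowMeans(scores)

  out <- numeric(0)
  for (l in seq_len(lag)) {
    head_part <- centered[, 1:(n - l), drop = FALSE]
    tail_part <- centered[, (1 + l):n, drop = FALSE]
    # p x p matrix: entry (j, k) pairs scale j at position i
    # with scale k at position i + l.
    cov_l <- head_part %*% t(tail_part) / (n - l)
    vals <- as.vector(cov_l)  # column-major: j varies fastest
    names(vals) <- paste0(
      "scl", rep(1:p, times = p),
      ".scl", rep(1:p, each = p),
      ".lag", l
    )
    out <- c(out, vals)
  }
  out
}

set.seed(2)
scores <- matrix(rnorm(5 * 30), nrow = 5)  # p = 5 scales, 30 residues
res <- toy_acc(scores, lag = 7)
length(res)  # 7 * 5^2 = 175
```

With pc = 5 and lag = 7, the same arithmetic gives a vector of 175 elements.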

Author

Nan Xiao <https://nanx.me>

Examples

x <- readFASTA(system.file("protseq/P00750.fasta", package = "protr"))[[1]]
descscales <- extractDescScales(
  x,
  propmat = "AATopo", index = c(37:41, 43:47),
  pc = 5, lag = 7, silent = FALSE
)
#> Summary of the first 5 principal components: 
#>                             PC1      PC2       PC3       PC4        PC5
#> Standard deviation     2.581537 1.754133 0.4621854 0.1918666 0.08972087
#> Proportion of Variance 0.666430 0.307700 0.0213600 0.0036800 0.00080000
#> Cumulative Proportion  0.666430 0.974130 0.9954900 0.9991700 0.99998000