Scales-Based Descriptors derived by Principal Components Analysis
Source: R/pcm-01-extractScales.R
extractScales.Rd
This function calculates scales-based descriptors derived by Principal Components Analysis (PCA). Users can supply a custom amino acid property matrix. The function implements the core computation behind two families of descriptors in the protr package: scales-based descriptors derived from amino acid properties (AAindex), and scales-based descriptors derived from 20+ classes of 2D and 3D molecular descriptors (Topological, WHIM, VHSE, etc.).
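As a rough illustration of the idea (not protr's actual implementation), the base-R sketch below derives scales from a small property matrix with `prcomp()` and maps a sequence onto them. The matrix `toy_props`, its arbitrary values, and the helper `sketch_scales()` are hypothetical names introduced only for this example.

```r
# Hypothetical toy property matrix: 5 amino acids (rows) x 3 properties (columns).
# The values are arbitrary and chosen only for illustration.
toy_props <- matrix(
  c(1, 10, 3,
    2,  8, 7,
    4, 12, 1,
    3,  9, 5,
    5, 11, 2),
  nrow = 5, byrow = TRUE,
  dimnames = list(c("A", "R", "N", "D", "C"), c("prop1", "prop2", "prop3"))
)

# Hypothetical helper: run PCA on the property matrix and keep the scores of
# the first `pc` principal components as per-amino-acid scales.
sketch_scales <- function(propmat, pc, scale = TRUE) {
  stopifnot(pc <= ncol(propmat))
  pca <- prcomp(propmat, center = TRUE, scale. = scale)
  pca$x[, seq_len(pc), drop = FALSE]
}

aa_scales <- sketch_scales(toy_props, pc = 2)

# Look up the scales for each residue of a sequence via the one-letter row names.
seq_chars <- strsplit("ARNDCA", "")[[1]]
seq_scales <- aa_scales[seq_chars, , drop = FALSE]
dim(seq_scales)
```

The row-name lookup in the last step is why the property matrix needs one-letter row names.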
Arguments
- x
A character vector, as the input protein sequence.
- propmat
A matrix containing the properties for the amino acids. Each row represents one amino acid type, and each column represents one property. Note that one-letter row names must be provided, because they are used to look up the properties for each AA type.
- pc
Integer. Use the first pc principal components as the scales. Must be no greater than the number of AA properties provided.
- lag
The lag parameter. Must be less than the number of amino acids in the sequence.
- scale
Logical. Should we auto-scale the property matrix (propmat) before PCA? Default is TRUE.
- silent
Logical. If FALSE, print the standard deviation, proportion of variance, and cumulative proportion of the selected principal components; if TRUE, suppress this summary. Default is TRUE.
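The constraints stated in the argument list can be checked before calling the function. The sketch below is a hypothetical pre-flight helper, `check_scales_args()`, which is not part of protr. It assumes `x` is a single character string, and it also assumes that every residue in `x` should have a matching row in `propmat`, which the documentation implies but does not state outright.

```r
# Hypothetical pre-flight check; not part of protr.
# Returns a character vector of problems (empty if none were found).
check_scales_args <- function(x, propmat, pc, lag) {
  residues <- unique(strsplit(x, "")[[1]])
  problems <- character(0)
  if (is.null(rownames(propmat)) || any(nchar(rownames(propmat)) != 1)) {
    problems <- c(problems, "propmat needs one-letter row names")
  } else if (!all(residues %in% rownames(propmat))) {
    # Assumption: each residue in x must be present among the row names.
    problems <- c(problems, "some residues in x have no row in propmat")
  }
  if (pc > ncol(propmat)) {
    problems <- c(problems, "pc exceeds the number of properties")
  }
  if (lag >= nchar(x)) {
    problems <- c(problems, "lag must be less than the sequence length")
  }
  problems
}

# Toy property matrix: 20 amino acids x 4 random "properties".
toy <- matrix(rnorm(20 * 4), nrow = 20,
              dimnames = list(strsplit("ARNDCEQGHILKMFPSTWYV", "")[[1]], NULL))

ok  <- check_scales_args("ARNDCEQ", toy, pc = 3, lag = 2)   # valid settings
bad <- check_scales_args("ARNDCEQ", toy, pc = 5, lag = 10)  # pc and lag too large
```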
See also
See extractDescScales for scales descriptors based on 20+ classes of molecular descriptors, and extractProtFP for amino acid property based scales descriptors (protein fingerprint).
Author
Nan Xiao <https://nanx.me>
Examples
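library(protr)
# Read the example protein sequence shipped with protr
x <- readFASTA(system.file("protseq/P00750.fasta", package = "protr"))[[1]]
# Build the property matrix: amino acids as rows, AAindex properties as columns
data(AAindex)
AAidxmat <- t(na.omit(as.matrix(AAindex[, 7:26])))
scales <- extractScales(x, propmat = AAidxmat, pc = 5, lag = 7, silent = FALSE)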
#> Summary of the first 5 principal components:
#>                             PC1      PC2      PC3      PC4      PC5
#> Standard deviation     13.71695 8.924017 7.698803 6.110576 5.413655
#> Proportion of Variance  0.35434 0.149980 0.111620 0.070320 0.055190
#> Cumulative Proportion   0.35434 0.504320 0.615940 0.686260 0.741450
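To help read the three rows of the printed summary, the sketch below shows how they relate to one another under the usual `prcomp()` convention: the proportion of variance is each squared standard deviation divided by the sum of all squared standard deviations, and the cumulative proportion is its running total. It uses random toy data rather than AAindex, and it assumes the summary printed above follows the same convention.

```r
set.seed(1)
toy <- matrix(rnorm(20 * 6), nrow = 20)  # hypothetical data: 20 rows x 6 properties
pca <- prcomp(toy, scale. = TRUE)

# Variance explained by each component, and its running total
prop_var <- pca$sdev^2 / sum(pca$sdev^2)
cum_prop <- cumsum(prop_var)

# The same three rows as in the printed summary, for the first 3 components
pc_summary <- rbind(
  "Standard deviation"     = pca$sdev,
  "Proportion of Variance" = prop_var,
  "Cumulative Proportion"  = cum_prop
)[, 1:3]
pc_summary
```

The cumulative proportion for the chosen `pc` indicates how much of the total variance in the property matrix the retained scales capture.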