Amino Acid Properties Based Scales Descriptors (Protein Fingerprint) with Gap Support
Source: R/pcm-03-extractProtFPGap.R
This function calculates amino acid properties based scales descriptors (protein fingerprint) with gap support. Users can choose which properties to use from the AAindex database by specifying their numerical or character index.
Arguments
- x
A character vector, as the input protein sequence. Use '-' to represent gaps in the sequence.
- index
Integer vector or character vector. Specifies which properties to select from the AAindex database, by their numerical or character index. Default is NULL, which selects all the AA properties in the AAindex database.
- pc
Integer. Use the first pc principal components as the scales. Must be no greater than the number of AA properties provided.
- lag
The lag parameter. Must be less than the number of amino acids in the sequence.
- scale
Logical. Whether to auto-scale the property matrix before PCA. Default is TRUE.
- silent
Logical. Whether to suppress printing of the standard deviation, proportion of variance, and cumulative proportion of the selected principal components. Default is TRUE.
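The constraints on pc and lag listed above can be checked before calling the function. The following is a minimal sketch in base R, not part of protr: the helper name `check_fp_args` and its `n_props` argument are hypothetical, and it assumes the sequence length counts every character, gaps included, which the documentation does not state.

```r
# Hypothetical helper, not part of protr: checks the documented constraints
# on pc and lag before calling extractProtFPGap().
check_fp_args <- function(x, n_props, pc, lag) {
  # Assumption: the sequence length counts every character, gaps ('-') included
  seq_len_x <- nchar(x)
  if (pc > n_props) {
    stop("pc must be no greater than the number of selected AA properties")
  }
  if (lag >= seq_len_x) {
    stop("lag must be less than the sequence length")
  }
  invisible(TRUE)
}

# Number of properties selected by index = c(160:165, 258:296)
n_props <- length(c(160:165, 258:296))

# Toy gapped sequence, made up for illustration
check_fp_args("MKV--LAAGIVGLLLAQ", n_props = n_props, pc = 5, lag = 7)
```

Failing early with a clear message may be easier to diagnose than an error raised deep inside the PCA step.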
Author
Nan Xiao <https://nanx.me>
Examples
# amino acid sequence with gaps
x <- readFASTA(system.file("protseq/align.fasta", package = "protr"))$`IXI_235`
fp <- extractProtFPGap(x, index = c(160:165, 258:296), pc = 5, lag = 7, silent = FALSE)
#> Summary of the first 5 principal components:
#>                             PC1      PC2      PC3      PC4     PC5
#> Standard deviation     4.398253 2.620509 2.267688 1.756102 1.52816
#> Proportion of Variance 0.429880 0.152600 0.114280 0.068530 0.05189
#> Cumulative Proportion  0.429880 0.582480 0.696760 0.765290 0.81718
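The printed summary has the same three rows that base R reports for a principal component analysis. As a rough illustration of how such a table can arise, the sketch below runs `prcomp()` on a toy property matrix. The matrix values are random stand-ins, not AAindex data, the names `props`, `pc_summary`, and `scales` are made up for this example, and it is an assumption that the function's internal PCA behaves like `prcomp()`.

```r
# Toy stand-in for a property matrix: 20 amino acids x 6 properties.
# The values are random and purely illustrative, not real AAindex data.
set.seed(42)
aa <- strsplit("ARNDCQEGHILKMFPSTWYV", "")[[1]]
props <- matrix(
  rnorm(20 * 6),
  nrow = 20,
  dimnames = list(aa, paste0("prop", 1:6))
)

pc <- 3  # number of principal components to keep
stopifnot(pc <= ncol(props))  # pc must not exceed the number of properties

# scale. = TRUE auto-scales each property to unit variance before PCA
pca <- prcomp(props, center = TRUE, scale. = TRUE)

# Rows: standard deviation, proportion of variance, cumulative proportion
pc_summary <- summary(pca)$importance[, 1:pc]
print(pc_summary)

# Per-residue scales: one row per amino acid, one column per retained component
scales <- pca$x[, 1:pc]
```

Here `scales` holds one row per amino acid, which is the kind of per-residue score that "principal components as the scales" refers to in the pc argument.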