Pseudo Amino Acid Composition Descriptor
Usage
extractProtPAAC(
x,
props = c("Hydrophobicity", "Hydrophilicity", "SideChainMass"),
lambda = 30,
w = 0.05,
customprops = NULL
)
Arguments
- x
A character vector, as the input protein sequence.
- props
A character vector, specifying the properties used. 3 properties are used by default, as listed below:
'Hydrophobicity'
Hydrophobicity value of the 20 amino acids
'Hydrophilicity'
Hydrophilicity value of the 20 amino acids
'SideChainMass'
Side-chain mass of the 20 amino acids
- lambda
The lambda parameter for the PAAC descriptors, default is 30.
- w
The weighting factor, default is 0.05.
- customprops
An n x 21 named data frame containing n customized properties. Each row contains one property. The column order for the amino acid types is 'AccNo', 'A', 'R', 'N', 'D', 'C', 'E', 'Q', 'G', 'H', 'I', 'L', 'K', 'M', 'F', 'P', 'S', 'T', 'W', 'Y', 'V', and the columns must be named exactly like this. The AccNo column contains the property names. Users should then explicitly specify these properties by name in the argument props. See the examples below for a demonstration. The default value for customprops is NULL.
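The column layout required for customprops can be checked before calling the function. The following is a minimal base R sketch; check_customprops is a hypothetical helper written for illustration and is not part of Rcpi. The property values are the MyProp1 values from the Examples section.

```r
# Hypothetical helper (not part of Rcpi): verifies that a data frame follows
# the n x 21 layout described for the customprops argument.
check_customprops <- function(df) {
  expected <- c("AccNo", "A", "R", "N", "D", "C", "E", "Q", "G", "H", "I",
                "L", "K", "M", "F", "P", "S", "T", "W", "Y", "V")
  stopifnot(
    is.data.frame(df),
    identical(names(df), expected),                # exact names, exact order
    !anyDuplicated(as.character(df$AccNo)),        # property names are unique
    all(vapply(df[-1], is.numeric, logical(1)))    # 20 numeric value columns
  )
  invisible(TRUE)
}

# Build a one-property data frame in the required column order.
aa <- c("A", "R", "N", "D", "C", "E", "Q", "G", "H", "I",
        "L", "K", "M", "F", "P", "S", "T", "W", "Y", "V")
vals <- c(0.62, -2.53, -0.78, -0.9, 0.29, -0.74, -0.85, 0.48, -0.4, 1.38,
          1.06, -1.5, 0.64, 1.19, 0.12, -0.18, -0.05, 0.81, 0.26, 1.08)
one_prop <- cbind(data.frame(AccNo = "MyProp1"),
                  as.data.frame(t(setNames(vals, aa))))

check_customprops(one_prop)
```

A data frame whose columns are misnamed or out of order makes the helper stop with an error, which is easier to diagnose than a failure inside the descriptor calculation.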
Details
This function calculates the Pseudo Amino Acid Composition (PAAC) descriptor (Dim: 20 + lambda, default is 50).
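To illustrate where the 20 + lambda dimension comes from, here is a rough base R sketch of type 1 pseudo amino acid composition following the formulation in Chou (2001). It is not the Rcpi source code: paac_sketch, the toy sequence, and the choice of two property rows (values copied from myprops in the Examples) are assumptions made for illustration, and the scaling of the composition terms in the package's printed output differs from this sketch.

```r
# Illustrative sketch of type 1 PseAAC (Chou, 2001); NOT the Rcpi implementation.
aa <- c("A", "R", "N", "D", "C", "E", "Q", "G", "H", "I",
        "L", "K", "M", "F", "P", "S", "T", "W", "Y", "V")

# Two illustrative property rows (values taken from `myprops` in the Examples).
props <- rbind(
  MyProp1 = c(0.62, -2.53, -0.78, -0.9, 0.29, -0.74, -0.85, 0.48, -0.4, 1.38,
              1.06, -1.5, 0.64, 1.19, 0.12, -0.18, -0.05, 0.81, 0.26, 1.08),
  MyProp2 = c(-0.5, 3, 0.2, 3, -1, 3, 0.2, 0, -0.5, -1.8,
              -1.8, 3, -1.3, -2.5, 0, 0.3, -0.4, -3.4, -2.3, -1.5)
)
colnames(props) <- aa

paac_sketch <- function(x, props, lambda = 30, w = 0.05) {
  res <- strsplit(x, "")[[1]]
  n <- length(res)
  stopifnot(n > lambda, all(res %in% colnames(props)))

  # Standardize each property over the 20 amino acids
  # (zero mean, unit population standard deviation).
  ctr <- props - rowMeans(props)
  z <- ctr / sqrt(rowMeans(ctr^2))

  # theta[k]: squared property difference between residues k positions apart,
  # averaged over the properties and then over all such residue pairs.
  theta <- vapply(seq_len(lambda), function(k) {
    d <- z[, res[(1 + k):n], drop = FALSE] - z[, res[1:(n - k)], drop = FALSE]
    mean(colMeans(d^2))
  }, numeric(1))

  # 20 composition terms followed by lambda weighted sequence-order terms.
  f <- as.vector(table(factor(res, levels = colnames(props)))) / n
  out <- c(f, w * theta) / (sum(f) + w * sum(theta))
  names(out) <- c(colnames(props), paste0("lambda.", seq_len(lambda)))
  out
}

desc <- paac_sketch("MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ", props, lambda = 5)
length(desc)  # 20 + lambda elements
```

The sketch stops when the sequence is not longer than lambda, because the lag-k terms need at least one residue pair at every lag up to lambda.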
Note
The default 20 * 3 property values are already built into the function. Users can also specify other properties (up to 544) by their accession numbers in the AAindex data, with or without the three default properties. In either case, the properties to use must be listed explicitly in props.
References
Kuo-Chen Chou. Prediction of Protein Cellular Attributes Using Pseudo-Amino Acid Composition. PROTEINS: Structure, Function, and Genetics, 2001, 43: 246-255.
Type 1 pseudo amino acid composition. http://www.csbio.sjtu.edu.cn/bioinf/PseAAC/type1.htm
Kuo-Chen Chou. Using Amphiphilic Pseudo Amino Acid Composition to Predict Enzyme Subfamily Classes. Bioinformatics, 2005, 21: 10-19.
C. Tanford. JACS, 1962, 84: 4240-4246. (The hydrophobicity data)
T.P. Hopp & K.R. Woods. PNAS, 1981, 78: 3824-3828. (The hydrophilicity data)
CRC Handbook of Chemistry and Physics, 66th ed., CRC Press, Boca Raton, Florida (1985). (The side-chain mass data)
R.M.C. Dawson, D.C. Elliott, W.H. Elliott, K.M. Jones, Data for Biochemical Research 3rd ed., Clarendon Press Oxford (1986). (The side-chain mass data)
See also
See extractProtAPAAC for the amphiphilic pseudo amino acid composition descriptor.
Examples
x = readFASTA(system.file('protseq/P00750.fasta', package = 'Rcpi'))[[1]]
extractProtPAAC(x)
#> Xc1.A Xc1.R Xc1.N Xc1.D Xc1.C
#> 9.07025432 10.07806035 5.54293319 7.30659376 9.57415734
#> Xc1.E Xc1.Q Xc1.G Xc1.H Xc1.I
#> 6.80269074 6.80269074 11.58976941 4.28317565 5.03903018
#> Xc1.L Xc1.K Xc1.M Xc1.F Xc1.P
#> 10.83391488 5.54293319 1.76366056 4.53512716 7.55854527
#> Xc1.S Xc1.T Xc1.W Xc1.Y Xc1.V
#> 12.59757544 6.29878772 3.27536961 6.04683621 7.05464225
#> Xc2.lambda.1 Xc2.lambda.2 Xc2.lambda.3 Xc2.lambda.4 Xc2.lambda.5
#> 0.02514092 0.02500357 0.02527773 0.02553159 0.02445265
#> Xc2.lambda.6 Xc2.lambda.7 Xc2.lambda.8 Xc2.lambda.9 Xc2.lambda.10
#> 0.02561910 0.02486131 0.02506656 0.02553952 0.02437663
#> Xc2.lambda.11 Xc2.lambda.12 Xc2.lambda.13 Xc2.lambda.14 Xc2.lambda.15
#> 0.02491262 0.02533803 0.02351915 0.02479912 0.02548431
#> Xc2.lambda.16 Xc2.lambda.17 Xc2.lambda.18 Xc2.lambda.19 Xc2.lambda.20
#> 0.02478210 0.02513770 0.02457224 0.02543046 0.02500889
#> Xc2.lambda.21 Xc2.lambda.22 Xc2.lambda.23 Xc2.lambda.24 Xc2.lambda.25
#> 0.02476967 0.02342389 0.02431684 0.02610300 0.02626722
#> Xc2.lambda.26 Xc2.lambda.27 Xc2.lambda.28 Xc2.lambda.29 Xc2.lambda.30
#> 0.02457082 0.02343049 0.02588823 0.02490463 0.02451951
myprops = data.frame(AccNo = c("MyProp1", "MyProp2", "MyProp3"),
                     A = c(0.62, -0.5, 15), R = c(-2.53, 3, 101),
                     N = c(-0.78, 0.2, 58), D = c(-0.9, 3, 59),
                     C = c(0.29, -1, 47), E = c(-0.74, 3, 73),
                     Q = c(-0.85, 0.2, 72), G = c(0.48, 0, 1),
                     H = c(-0.4, -0.5, 82), I = c(1.38, -1.8, 57),
                     L = c(1.06, -1.8, 57), K = c(-1.5, 3, 73),
                     M = c(0.64, -1.3, 75), F = c(1.19, -2.5, 91),
                     P = c(0.12, 0, 42), S = c(-0.18, 0.3, 31),
                     T = c(-0.05, -0.4, 45), W = c(0.81, -3.4, 130),
                     Y = c(0.26, -2.3, 107), V = c(1.08, -1.5, 43))
# Use 3 default properties, 4 properties in the AAindex database,
# and 3 customized properties
extractProtPAAC(x, customprops = myprops,
                props = c('Hydrophobicity', 'Hydrophilicity', 'SideChainMass',
                          'CIDH920105', 'BHAR880101',
                          'CHAM820101', 'CHAM820102',
                          'MyProp1', 'MyProp2', 'MyProp3'))
#> Xc1.A Xc1.R Xc1.N Xc1.D Xc1.C
#> 9.12536927 10.13929919 5.57661456 7.35099191 9.63233423
#> Xc1.E Xc1.Q Xc1.G Xc1.H Xc1.I
#> 6.84402695 6.84402695 11.66019407 4.30920216 5.06964960
#> Xc1.L Xc1.K Xc1.M Xc1.F Xc1.P
#> 10.89974663 5.57661456 1.77437736 4.56268464 7.60447439
#> Xc1.S Xc1.T Xc1.W Xc1.Y Xc1.V
#> 12.67412399 6.33706199 3.29527224 6.08357951 7.09750943
#> Xc2.lambda.1 Xc2.lambda.2 Xc2.lambda.3 Xc2.lambda.4 Xc2.lambda.5
#> 0.02472188 0.02515055 0.02559236 0.02588471 0.02419172
#> Xc2.lambda.6 Xc2.lambda.7 Xc2.lambda.8 Xc2.lambda.9 Xc2.lambda.10
#> 0.02570312 0.02514005 0.02462544 0.02544711 0.02427250
#> Xc2.lambda.11 Xc2.lambda.12 Xc2.lambda.13 Xc2.lambda.14 Xc2.lambda.15
#> 0.02462431 0.02510916 0.02335959 0.02501099 0.02525138
#> Xc2.lambda.16 Xc2.lambda.17 Xc2.lambda.18 Xc2.lambda.19 Xc2.lambda.20
#> 0.02491325 0.02527924 0.02448639 0.02542024 0.02498247
#> Xc2.lambda.21 Xc2.lambda.22 Xc2.lambda.23 Xc2.lambda.24 Xc2.lambda.25
#> 0.02473118 0.02329787 0.02470748 0.02592993 0.02557742
#> Xc2.lambda.26 Xc2.lambda.27 Xc2.lambda.28 Xc2.lambda.29 Xc2.lambda.30
#> 0.02469289 0.02360989 0.02570375 0.02473739 0.02436325