
This function calculates the Distribution descriptor of the CTD (Composition, Transition, Distribution) descriptors, using a customized amino acid classification.

Usage

extractCTDDClass(x, aagroup1, aagroup2, aagroup3)

Arguments

x

A character vector, as the input protein sequence.

aagroup1

A named list containing the first group of each customized amino acid classification, with one element per property. See the example below.

aagroup2

A named list containing the second group of each customized amino acid classification, with one element per property. See the example below.

aagroup3

A named list containing the third group of each customized amino acid classification, with one element per property. See the example below.
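The three lists are easy to get out of step when typed by hand, so a quick consistency check before calling the function can help. The sketch below is only an illustration: check_aagroups is a hypothetical helper, not part of protr, and it assumes that the three lists should share the same property names in the same order and that, for each property, the three groups should split the 20 standard amino acids without overlap (as the example classification on this page does). The function itself may not enforce either condition.

```r
# The 20 standard amino acids, one letter each
standard_aa <- strsplit("ACDEFGHIKLMNPQRSTVWY", "")[[1]]

# Hypothetical helper (not part of protr): returns one TRUE/FALSE per property.
# Assumption: for every property the three groups partition the 20 residues.
check_aagroups <- function(aagroup1, aagroup2, aagroup3) {
  props <- names(aagroup1)
  stopifnot(
    !is.null(props),
    identical(props, names(aagroup2)),
    identical(props, names(aagroup3))
  )
  vapply(props, function(p) {
    all_res <- c(aagroup1[[p]], aagroup2[[p]], aagroup3[[p]])
    # no residue listed twice, and every standard residue listed once
    !anyDuplicated(all_res) && setequal(all_res, standard_aa)
  }, logical(1))
}

g1 <- list(hydrophobicity = c("R", "K", "E", "D", "Q", "N"))
g2 <- list(hydrophobicity = c("G", "A", "S", "T", "P", "H", "Y"))
g3 <- list(hydrophobicity = c("C", "L", "V", "I", "M", "F", "W"))
ok <- check_aagroups(g1, g2, g3)

# Dropping "W" from the third group leaves one residue unassigned
g3_missing <- list(hydrophobicity = c("C", "L", "V", "I", "M", "F"))
bad <- check_aagroups(g1, g2, g3_missing)
```

Here ok is TRUE for hydrophobicity and bad is FALSE, because "W" no longer belongs to any group.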

Value

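A named vector of length k * 15, where k is the number of amino acid properties used. Each property contributes 15 values: for each of its three groups, the positions at 0, 25, 50, 75 and 100 percent of that group's residues.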
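To make the 15 values per property concrete, here is a minimal base R sketch of the Distribution idea for a single property on a toy sequence. For each group it locates the first residue, the residues at 25, 50 and 75 percent of the group's occurrences, and the last residue, then expresses each position as a percentage of the sequence length. ctdd_sketch is a hypothetical helper, not the package's implementation, and the rounding rule (floor, with a minimum index of 1) is an assumption that may differ from protr's internals.

```r
# Hypothetical helper (not part of protr): Distribution values for ONE
# property, given its three residue groups as a list of character vectors.
ctdd_sketch <- function(protein, classes) {
  residues <- strsplit(protein, "")[[1]]
  n <- length(residues)
  marks <- c(0, 25, 50, 75, 100)
  unlist(lapply(seq_along(classes), function(g) {
    pos <- which(residues %in% classes[[g]])  # positions held by this group
    m <- length(pos)
    values <- if (m == 0) {
      rep(0, 5)  # group absent from the sequence
    } else {
      # occurrence index of the first, 25%, 50%, 75% and last residue
      # (assumed rounding: floor, never below 1)
      idx <- c(1, pmax(1, floor(m * c(0.25, 0.50, 0.75))), m)
      pos[idx] * 100 / n  # position as a percentage of sequence length
    }
    setNames(values, paste0("G", g, ".residue", marks))
  }))
}

toy <- ctdd_sketch("AKGLKAGLKA", list(c("K", "R"), c("A", "G"), c("L", "V")))
```

In the toy sequence the first group (K, R) sits at positions 2, 5 and 9 of 10, so its five values are 20, 20, 20, 50 and 90. With three groups, toy has 15 elements, which is the per-property block size described above.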

Note

For this descriptor type, users should evaluate the underlying details of the descriptors carefully rather than applying this function to their data blindly. Where relevant, negative and positive control comparisons can help guide interpretation of the results.

References

Inna Dubchak, Ilya Muchnik, Stephen R. Holbrook and Sung-Hou Kim. Prediction of protein folding class using global description of amino acid sequence. Proceedings of the National Academy of Sciences USA, 1995, 92, 8700-8704.

Inna Dubchak, Ilya Muchnik, Christopher Mayor, Igor Dralyuk and Sung-Hou Kim. Recognition of a Protein Fold in the Context of the SCOP classification. Proteins: Structure, Function and Genetics, 1999, 35, 401-407.

See also

See extractCTDCClass and extractCTDTClass for Composition and Transition of the CTD descriptors with customized amino acid classification support.

Author

Nan Xiao <https://nanx.me>

Examples

x <- readFASTA(system.file("protseq/P00750.fasta", package = "protr"))[[1]]

# Use five customized amino acid property classifications
group1 <- list(
  "hydrophobicity" = c("R", "K", "E", "D", "Q", "N"),
  "normwaalsvolume" = c("G", "A", "S", "T", "P", "D", "C"),
  "polarizability" = c("G", "A", "S", "D", "T"),
  "secondarystruct" = c("E", "A", "L", "M", "Q", "K", "R", "H"),
  "solventaccess" = c("A", "L", "F", "C", "G", "I", "V", "W")
)

group2 <- list(
  "hydrophobicity" = c("G", "A", "S", "T", "P", "H", "Y"),
  "normwaalsvolume" = c("N", "V", "E", "Q", "I", "L"),
  "polarizability" = c("C", "P", "N", "V", "E", "Q", "I", "L"),
  "secondarystruct" = c("V", "I", "Y", "C", "W", "F", "T"),
  "solventaccess" = c("R", "K", "Q", "E", "N", "D")
)

group3 <- list(
  "hydrophobicity" = c("C", "L", "V", "I", "M", "F", "W"),
  "normwaalsvolume" = c("M", "H", "K", "F", "R", "Y", "W"),
  "polarizability" = c("K", "M", "H", "F", "R", "Y", "W"),
  "secondarystruct" = c("G", "N", "P", "S", "D"),
  "solventaccess" = c("M", "S", "P", "T", "H", "Y")
)

extractCTDDClass(x, aagroup1 = group1, aagroup2 = group2, aagroup3 = group3)
#>   prop1.G1.residue0  prop1.G1.residue25  prop1.G1.residue50  prop1.G1.residue75 
#>           0.3558719          23.1316726          50.1779359          73.8434164 
#> prop1.G1.residue100   prop1.G2.residue0  prop1.G2.residue25  prop1.G2.residue50 
#>          99.8220641           0.5338078          27.4021352          47.3309609 
#>  prop1.G2.residue75 prop1.G2.residue100   prop1.G3.residue0  prop1.G3.residue25 
#>          75.2669039         100.0000000           0.1779359          19.5729537 
#>  prop1.G3.residue50  prop1.G3.residue75 prop1.G3.residue100   prop2.G1.residue0 
#>          51.7793594          75.6227758          99.6441281           0.3558719 
#>  prop2.G1.residue25  prop2.G1.residue50  prop2.G1.residue75 prop2.G1.residue100 
#>          25.6227758          48.0427046          75.4448399         100.0000000 
#>   prop2.G2.residue0  prop2.G2.residue25  prop2.G2.residue50  prop2.G2.residue75 
#>           1.4234875          23.3096085          54.4483986          76.3345196 
#> prop2.G2.residue100   prop2.G3.residue0  prop2.G3.residue25  prop2.G3.residue50 
#>          99.4661922           0.1779359          22.7758007          48.9323843 
#>  prop2.G3.residue75 prop2.G3.residue100   prop3.G1.residue0  prop3.G1.residue25 
#>          69.5729537          99.8220641           0.3558719          26.5124555 
#>  prop3.G1.residue50  prop3.G1.residue75 prop3.G1.residue100   prop3.G2.residue0 
#>          48.3985765          76.1565836          99.2882562           1.4234875 
#>  prop3.G2.residue25  prop3.G2.residue50  prop3.G2.residue75 prop3.G2.residue100 
#>          21.5302491          51.4234875          75.8007117         100.0000000 
#>   prop3.G3.residue0  prop3.G3.residue25  prop3.G3.residue50  prop3.G3.residue75 
#>           0.1779359          22.7758007          48.9323843          69.5729537 
#> prop3.G3.residue100   prop4.G1.residue0  prop4.G1.residue25  prop4.G1.residue50 
#>          99.8220641           0.1779359          22.9537367          50.8896797 
#>  prop4.G1.residue75 prop4.G1.residue100   prop4.G2.residue0  prop4.G2.residue25 
#>          74.3772242          99.8220641           1.6014235          21.5302491 
#>  prop4.G2.residue50  prop4.G2.residue75 prop4.G2.residue100   prop4.G3.residue0 
#>          49.2882562          70.8185053          98.9323843           0.3558719 
#>  prop4.G3.residue25  prop4.G3.residue50  prop4.G3.residue75 prop4.G3.residue100 
#>          29.0035587          48.2206406          77.4021352         100.0000000 
#>   prop5.G1.residue0  prop5.G1.residue25  prop5.G1.residue50  prop5.G1.residue75 
#>           0.5338078          23.4875445          50.0000000          74.5551601 
#> prop5.G1.residue100   prop5.G2.residue0  prop5.G2.residue25  prop5.G2.residue50 
#>          98.9323843           0.3558719          23.1316726          50.1779359 
#>  prop5.G2.residue75 prop5.G2.residue100   prop5.G3.residue0  prop5.G3.residue25 
#>          73.8434164          99.8220641           0.1779359          27.2241993 
#>  prop5.G3.residue50  prop5.G3.residue75 prop5.G3.residue100 
#>          48.0427046          75.4448399         100.0000000
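The output above labels properties as prop1 to prop5 rather than by the names given in the lists. A small post-processing step can restore readable labels. This sketch assumes that propN refers to the N-th element of the input lists; that is consistent with the output shown here (prop5.G2 repeats the prop1.G1 values, matching the identical residue sets for solventaccess group 2 and hydrophobicity group 1), but it is not stated in the documentation. The vector old_names is a stand-in for names() of the real result.

```r
# Property names in the order used when building group1, group2 and group3
prop_names <- c(
  "hydrophobicity", "normwaalsvolume", "polarizability",
  "secondarystruct", "solventaccess"
)

# Stand-in for names(result); with protr loaded, use the real names instead
old_names <- c("prop1.G1.residue0", "prop3.G2.residue50", "prop5.G3.residue100")

# Pull out the property index, then swap the "propN" prefix for the name
# (assumption: propN maps to the N-th list element)
prop_index <- as.integer(sub("^prop([0-9]+)\\..*$", "\\1", old_names))
new_names <- paste0(prop_names[prop_index], sub("^prop[0-9]+", "", old_names))
```

Assigning new_names back with names(result) <- new_names would then give labels such as hydrophobicity.G1.residue0.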