This function calculates the Composition descriptor of the CTD (Composition, Transition, Distribution) descriptors, with support for customized amino acid classifications.

Usage

extractCTDCClass(x, aagroup1, aagroup2, aagroup3)

Arguments

x

A character vector containing the input protein sequence.

aagroup1

A named list containing the first group of the customized amino acid classification, with one element per property. See the example below.

aagroup2

A named list containing the second group of the customized amino acid classification, with one element per property. See the example below.

aagroup3

A named list containing the third group of the customized amino acid classification, with one element per property. See the example below.
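
In the Examples section, each property splits the 20 standard amino acids into three non-overlapping groups. Assuming that is the intended shape of the input, a quick check along the following lines may catch typos in a custom classification before it is passed to the function. This is only a sketch: `check_partition` and `aa20` are hypothetical names and are not part of protr.

```r
# The 20 standard amino acids (one-letter codes).
aa20 <- strsplit("ARNDCQEGHILKMFPSTWYV", "")[[1]]

# Hypothetical helper (not part of protr): for every property, TRUE if the
# three groups are disjoint and together cover all 20 standard residues.
check_partition <- function(g1, g2, g3) {
  vapply(names(g1), function(p) {
    members <- c(g1[[p]], g2[[p]], g3[[p]])
    !anyDuplicated(members) && setequal(members, aa20)
  }, logical(1))
}

# One property taken from the Examples section.
g1 <- list(hydrophobicity = c("R", "K", "E", "D", "Q", "N"))
g2 <- list(hydrophobicity = c("G", "A", "S", "T", "P", "H", "Y"))
g3 <- list(hydrophobicity = c("C", "L", "V", "I", "M", "F", "W"))

partition_ok <- check_partition(g1, g2, g3)
print(partition_ok)
```

The helper returns one logical value per property name, so a `FALSE` entry points directly at the classification that needs attention.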

Value

A named vector of length k * 3, where k is the number of amino acid properties used.
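
To make the returned values concrete: for one property, the Composition descriptor can be read as the fraction of residues in the sequence that fall into each of the three groups. The following base R sketch illustrates that idea on a toy sequence. It is not protr's implementation; `composition` is a hypothetical helper and `"RKGAC"` is a made-up input.

```r
# Hypothetical helper (not protr's implementation): fraction of residues in
# `seq` that belong to each of the three groups of a single property.
composition <- function(seq, g1, g2, g3) {
  residues <- strsplit(seq, "")[[1]]
  c(
    G1 = mean(residues %in% g1),
    G2 = mean(residues %in% g2),
    G3 = mean(residues %in% g3)
  )
}

# Toy sequence with the hydrophobicity groups from the Examples section.
comp <- composition(
  "RKGAC",
  g1 = c("R", "K", "E", "D", "Q", "N"),
  g2 = c("G", "A", "S", "T", "P", "H", "Y"),
  g3 = c("C", "L", "V", "I", "M", "F", "W")
)
print(comp)
#>  G1  G2  G3
#> 0.4 0.4 0.2
```

Repeating this for k properties and concatenating the results gives a vector of length k * 3.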

Note

For this descriptor type, users need to intelligently evaluate the underlying details of the descriptors provided, instead of using this function with their data blindly. It would be wise to use some negative and positive control comparisons where relevant to help guide interpretation of the results.

References

Inna Dubchak, Ilya Muchnik, Stephen R. Holbrook and Sung-Hou Kim. Prediction of protein folding class using global description of amino acid sequence. Proceedings of the National Academy of Sciences USA, 1995, 92, 8700-8704.

Inna Dubchak, Ilya Muchnik, Christopher Mayor, Igor Dralyuk and Sung-Hou Kim. Recognition of a Protein Fold in the Context of the SCOP classification. Proteins: Structure, Function and Genetics, 1999, 35, 401-407.

See also

See extractCTDTClass and extractCTDDClass for the Transition and Distribution descriptors of the CTD descriptors, with customized amino acid classification support.

Author

Nan Xiao <https://nanx.me>

Examples

x <- readFASTA(system.file("protseq/P00750.fasta", package = "protr"))[[1]]

# using five customized amino acid property classifications
group1 <- list(
  "hydrophobicity" = c("R", "K", "E", "D", "Q", "N"),
  "normwaalsvolume" = c("G", "A", "S", "T", "P", "D", "C"),
  "polarizability" = c("G", "A", "S", "D", "T"),
  "secondarystruct" = c("E", "A", "L", "M", "Q", "K", "R", "H"),
  "solventaccess" = c("A", "L", "F", "C", "G", "I", "V", "W")
)

group2 <- list(
  "hydrophobicity" = c("G", "A", "S", "T", "P", "H", "Y"),
  "normwaalsvolume" = c("N", "V", "E", "Q", "I", "L"),
  "polarizability" = c("C", "P", "N", "V", "E", "Q", "I", "L"),
  "secondarystruct" = c("V", "I", "Y", "C", "W", "F", "T"),
  "solventaccess" = c("R", "K", "Q", "E", "N", "D")
)

group3 <- list(
  "hydrophobicity" = c("C", "L", "V", "I", "M", "F", "W"),
  "normwaalsvolume" = c("M", "H", "K", "F", "R", "Y", "W"),
  "polarizability" = c("K", "M", "H", "F", "R", "Y", "W"),
  "secondarystruct" = c("G", "N", "P", "S", "D"),
  "solventaccess" = c("M", "S", "P", "T", "H", "Y")
)

extractCTDCClass(x, aagroup1 = group1, aagroup2 = group2, aagroup3 = group3)
#>  prop1.G1  prop1.G2  prop1.G3  prop2.G1  prop2.G2  prop2.G3  prop3.G1  prop3.G2 
#> 0.2971530 0.4056940 0.2971530 0.4519573 0.2971530 0.2508897 0.3309609 0.4181495 
#>  prop3.G3  prop4.G1  prop4.G2  prop4.G3  prop5.G1  prop5.G2  prop5.G3 
#> 0.2508897 0.3896797 0.2953737 0.3149466 0.4306050 0.2971530 0.2722420
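
The flat output above can be easier to read as a table with one row per property and one column per group. The sketch below copies the printed values into a vector and reshapes them with base R. It assumes that `prop1` to `prop5` follow the order of the names in the group lists (hydrophobicity first, solventaccess last); the output itself does not confirm this mapping.

```r
# Values copied from the printed output above.
out <- c(
  prop1.G1 = 0.2971530, prop1.G2 = 0.4056940, prop1.G3 = 0.2971530,
  prop2.G1 = 0.4519573, prop2.G2 = 0.2971530, prop2.G3 = 0.2508897,
  prop3.G1 = 0.3309609, prop3.G2 = 0.4181495, prop3.G3 = 0.2508897,
  prop4.G1 = 0.3896797, prop4.G2 = 0.2953737, prop4.G3 = 0.3149466,
  prop5.G1 = 0.4306050, prop5.G2 = 0.2971530, prop5.G3 = 0.2722420
)

# One row per property, one column per group.
m <- matrix(
  out,
  ncol = 3, byrow = TRUE,
  dimnames = list(paste0("prop", 1:5), c("G1", "G2", "G3"))
)
print(m)

# Each row should sum to 1 (up to print rounding) when the three groups
# cover every residue in the sequence.
row_totals <- rowSums(m)
print(all(abs(row_totals - 1) < 1e-6))
```

A row total noticeably below 1 would suggest that some residues in the sequence are not assigned to any group for that property.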