Protein Sequence Segmentation

Usage

segProt(
  x,
  aa = c("A", "R", "N", "D", "C", "E", "Q", "G", "H", "I", "L", "K", "M", "F", "P", "S",
    "T", "W", "Y", "V"),
  k = 7
)

Arguments

x

A character vector containing the input protein sequence.

aa

A character string specifying the amino acid type; one of 'A', 'R', 'N', 'D', 'C', 'E', 'Q', 'G', 'H', 'I', 'L', 'K', 'M', 'F', 'P', 'S', 'T', 'W', 'Y', 'V'.

k

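A positive integer specifying the half-window size, i.e. the number of residues kept on each side of the specified amino acid, so that each segment contains 2k + 1 residues. Default is 7.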

Value

A named list. Each component contains one segment (a character string), and the names of the list components are the positions of the specified amino acid in the sequence.

Details

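This function extracts segments from the protein sequence. Each segment is a window centered on one occurrence of the specified amino acid and extends k residues to each side.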
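
The windowing described above can be illustrated with a minimal base R sketch. It is not the Rcpi implementation: `seg_sketch` is a hypothetical helper, and its handling of occurrences near the sequence ends (dropping any occurrence that lacks k residues on either side) is an assumption. The toy input is the first segment from the example output with one extra 'R' appended.

```r
# Hypothetical helper for illustration only; not the Rcpi source code.
seg_sketch <- function(x, aa = "R", k = 7) {
  stopifnot(is.character(x), length(x) == 1, k >= 1)
  n <- nchar(x)
  # Positions of every occurrence of the target residue
  pos <- which(strsplit(x, "")[[1]] == aa)
  # Assumption: keep only occurrences with k residues available on both sides
  pos <- pos[pos > k & pos <= n - k]
  # Cut a window of 2k + 1 residues centered on each remaining position
  segs <- lapply(pos, function(p) substr(x, p - k, p + k))
  names(segs) <- as.character(pos)
  segs
}

# Toy sequence: 'R' occurs at positions 6 and 12
toy <- "MDAMKRGLCCVR"
res <- seg_sketch(toy, aa = "R", k = 5)
```

Under the stated assumption, only the occurrence at position 6 has five residues on both sides, so `res` holds a single component named `6`.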

Examples

x = readFASTA(system.file('protseq/P00750.fasta', package = 'Rcpi'))[[1]]
segProt(x, aa = 'R', k = 5)
#> $`6`
#> [1] "MDAMKRGLCCV"
#> 
#> $`29`
#> [1] "QEIHARFRRGA"
#> 
#> $`31`
#> [1] "IHARFRRGARS"
#> 
#> $`32`
#> [1] "HARFRRGARSY"
#> 
#> $`35`
#> [1] "FRRGARSYQVI"
#> 
#> $`42`
#> [1] "YQVICRDEKTQ"
#> 
#> $`58`
#> [1] "HQSWLRPVLRS"
#> 
#> $`62`
#> [1] "LRPVLRSNRVE"
#> 
#> $`65`
#> [1] "VLRSNRVEYCW"
#> 
#> $`75`
#> [1] "WCNSGRAQCHS"
#> 
#> $`90`
#> [1] "SCSEPRCFNGG"
#> 
#> $`124`
#> [1] "CEIDTRATCYE"
#> 
#> $`136`
#> [1] "QGISYRGTWST"
#> 
#> $`164`
#> [1] "KPYSGRRPDAI"
#> 
#> $`165`
#> [1] "PYSGRRPDAIR"
#> 
#> $`170`
#> [1] "RPDAIRLGLGN"
#> 
#> $`180`
#> [1] "NHNYCRNPDRD"
#> 
#> $`184`
#> [1] "CRNPDRDSKPW"
#> 
#> $`224`
#> [1] "NGSAYRGTHSL"
#> 
#> $`268`
#> [1] "KHNYCRNPDGD"
#> 
#> $`284`
#> [1] "HVLKNRRLTWE"
#> 
#> $`285`
#> [1] "VLKNRRLTWEY"
#> 
#> $`302`
#> [1] "STCGLRQYSQP"
#> 
#> $`310`
#> [1] "SQPQFRIKGGL"
#> 
#> $`333`
#> [1] "IFAKHRRSPGE"
#> 
#> $`334`
#> [1] "FAKHRRSPGER"
#> 
#> $`339`
#> [1] "RSPGERFLCGG"
#> 
#> $`362`
#> [1] "HCFQERFPPHH"
#> 
#> $`374`
#> [1] "TVILGRTYRVV"
#> 
#> $`377`
#> [1] "LGRTYRVVPGE"
#> 
#> $`418`
#> [1] "KSDSSRCAQES"
#> 
#> $`427`
#> [1] "ESSVVRTVCLP"
#> 
#> $`462`
#> [1] "PFYSERLKEAH"
#> 
#> $`469`
#> [1] "KEAHVRLYPSS"
#> 
#> $`475`
#> [1] "LYPSSRCTSQH"
#> 
#> $`484`
#> [1] "QHLLNRTVTDN"
#> 
#> $`497`
#> [1] "CAGDTRSGGPQ"
#> 
#> $`524`
#> [1] "CLNDGRMTLVG"
#> 
#> $`557`
#> [1] "YLDWIRDNMRP"
#>
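
Because the positions are stored as list names, a result like the one above may be easier to inspect as a table. The following sketch assumes nothing beyond base R; `segs` is rebuilt by hand from the first three components of the output so that the snippet runs without Rcpi, and the variable names are illustrative.

```r
# First three components of the output above, copied by hand
segs <- list(`6` = "MDAMKRGLCCV", `29` = "QEIHARFRRGA", `31` = "IHARFRRGARS")
k <- 5

# One row per segment, with the position converted from the list name
seg_df <- data.frame(
  position = as.integer(names(segs)),
  segment = unlist(segs, use.names = FALSE),
  stringsAsFactors = FALSE
)

# Sanity checks: each segment spans 2k + 1 residues with 'R' at its center
stopifnot(all(nchar(seg_df$segment) == 2 * k + 1))
stopifnot(all(substr(seg_df$segment, k + 1, k + 1) == "R"))
```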