Parallelized Protein Sequence Similarity Calculation based on Sequence Alignment

Usage

calcParProtSeqSim(protlist, cores = 2, type = "local", submat = "BLOSUM62")

Arguments

protlist: A list of length n containing n protein sequences. Each component of the list is a character string storing one protein sequence. Unknown sequences should be represented as ''.
cores: Integer. The number of CPU cores to use for parallel execution. Default is 2. Use the detectCores() function in the parallel package to check how many cores are available.
type: Type of alignment, either 'global' (Needleman-Wunsch global alignment) or 'local' (Smith-Waterman local alignment). Default is 'local'.
submat: Substitution matrix. One of 'BLOSUM45', 'BLOSUM50', 'BLOSUM62', 'BLOSUM80', 'BLOSUM100', 'PAM30', 'PAM40', 'PAM70', 'PAM120', 'PAM250'. Default is 'BLOSUM62'.
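
A value for the cores argument can be derived from detectCores(). The snippet below is a minimal sketch: the variable names n_avail and n_use are illustrative, and leaving one core free is a common convention rather than a requirement of this function.

```r
library(parallel)

# detectCores() reports the number of CPU cores; it can return NA
# on platforms where the count cannot be determined.
n_avail <- detectCores()
if (is.na(n_avail)) n_avail <- 1L

# Assumption: keep one core free for the rest of the system,
# but never request fewer than one.
n_use <- max(1L, n_avail - 1L)
```

The resulting n_use could then be passed as cores = n_use.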

Value

An n x n similarity matrix.

Details

This function implements a parallelized version of protein sequence similarity calculation based on sequence alignment.
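
The general pattern of a parallel pairwise computation can be sketched as follows. This is not the package's implementation: toy_sim() and pairwise_sim() are hypothetical helpers, and the Jaccard index of residue sets stands in for a real alignment score so that the sketch runs with base R and the parallel package only.

```r
library(parallel)

# Hypothetical stand-in for an alignment-based score:
# Jaccard index of the sets of residue letters in two sequences.
toy_sim <- function(a, b) {
  ra <- unique(strsplit(a, "")[[1]])
  rb <- unique(strsplit(b, "")[[1]])
  length(intersect(ra, rb)) / length(union(ra, rb))
}

# Score every unordered pair once, then mirror into an n x n matrix.
pairwise_sim <- function(seqs, cores = 1L) {
  n <- length(seqs)
  idx <- combn(n, 2)  # 2 x choose(n, 2) matrix of index pairs
  # mclapply() relies on forking, which is unavailable on Windows,
  # so fall back to a single core there.
  if (.Platform$OS.type == "windows") cores <- 1L
  scores <- mclapply(seq_len(ncol(idx)), function(k) {
    toy_sim(seqs[[idx[1, k]]], seqs[[idx[2, k]]])
  }, mc.cores = cores)
  m <- diag(n)  # self-similarity on the diagonal is 1
  for (k in seq_len(ncol(idx))) {
    i <- idx[1, k]
    j <- idx[2, k]
    m[i, j] <- m[j, i] <- scores[[k]]
  }
  m
}

toy_seqs <- list("MKTAYIAK", "MKTAYLAK", "GGSSGG")
simmat <- pairwise_sim(toy_seqs, cores = 2L)
```

Only the n(n - 1)/2 unordered pairs are scored, because the similarity is symmetric. Each pair is independent of the others, which is what makes the work easy to distribute across cores.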

Examples

s1 = readFASTA(system.file('protseq/P00750.fasta', package = 'Rcpi'))[[1]]
s2 = readFASTA(system.file('protseq/P08218.fasta', package = 'Rcpi'))[[1]]
s3 = readFASTA(system.file('protseq/P10323.fasta', package = 'Rcpi'))[[1]]
s4 = readFASTA(system.file('protseq/P20160.fasta', package = 'Rcpi'))[[1]]
s5 = readFASTA(system.file('protseq/Q9NZP8.fasta', package = 'Rcpi'))[[1]]
plist = list(s1, s2, s3, s4, s5)
# \donttest{
psimmat = calcParProtSeqSim(plist, cores = 2, type = 'local',
                            submat = 'BLOSUM62')
#> Error: The package "pwalign" is required. Please install it from Bioconductor.
print(psimmat)# }
#> Error: object 'psimmat' not found
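
The error above indicates that the pwalign package was not installed when the example was rendered. A minimal sketch of a guard, assuming only base R's requireNamespace(); the variable name has_pwalign is illustrative:

```r
# Check whether the required Bioconductor package is installed
# before attempting the similarity calculation.
has_pwalign <- requireNamespace("pwalign", quietly = TRUE)

if (!has_pwalign) {
  # One way to install it, assuming the BiocManager package is available:
  #   BiocManager::install("pwalign")
  message("Package 'pwalign' is not installed; skipping the example.")
}
```

With such a check in place, the call to calcParProtSeqSim() can be wrapped in if (has_pwalign) { ... } so the example is skipped instead of failing.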
