
Protein sequence descriptors

Calculate protein and peptide sequence descriptors.

extractAAC()
Amino Acid Composition Descriptor
extractDC()
Dipeptide Composition Descriptor
extractTC()
Tripeptide Composition Descriptor
extractMoreauBroto()
Normalized Moreau-Broto Autocorrelation Descriptor
extractMoran()
Moran Autocorrelation Descriptor
extractGeary()
Geary Autocorrelation Descriptor
extractCTDC()
CTD Descriptors - Composition
extractCTDCClass()
CTD Descriptors - Composition (with customized amino acid classification support)
extractCTDDClass()
CTD Descriptors - Distribution (with customized amino acid classification support)
extractCTDTClass()
CTD Descriptors - Transition (with customized amino acid classification support)
extractCTDT()
CTD Descriptors - Transition
extractCTDD()
CTD Descriptors - Distribution
extractCTriad()
Conjoint Triad Descriptor
extractCTriadClass()
Conjoint Triad Descriptor (with customized amino acid classification support)
extractSOCN()
Sequence-Order-Coupling Numbers
extractQSO()
Quasi-Sequence-Order (QSO) Descriptor
extractPAAC()
Pseudo Amino Acid Composition (PseAAC) Descriptor
extractAPAAC()
Amphiphilic Pseudo Amino Acid Composition (APseAAC) Descriptor
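A minimal sketch of how several of the sequence descriptors above might be combined into one feature vector. It assumes protr is installed and uses the example FASTA file bundled with the package (UniProt P00750). The variable names are illustrative only.

```r
library("protr")

# Example sequence bundled with protr (UniProt P00750).
x <- readFASTA(system.file("protseq/P00750.fasta", package = "protr"))[[1]]

# Descriptor functions expect only the 20 standard amino acid types.
stopifnot(protcheck(x))

aac <- extractAAC(x)  # amino acid composition
dc  <- extractDC(x)   # dipeptide composition
ctd <- c(extractCTDC(x), extractCTDT(x), extractCTDD(x))  # CTD descriptors

# Combine the named numeric vectors into a single feature vector.
features <- c(aac, dc, ctd)
length(features)
```

Each extractor returns a named numeric vector, so the results can be concatenated with `c()` and stacked row-wise across proteins to build a design matrix.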

PSSM descriptors

Calculate PSSM (profile-based) protein and peptide sequence descriptors.

extractPSSM()
Compute PSSM (Position-Specific Scoring Matrix) for given protein sequence
extractPSSMAcc()
Profile-based protein representation derived by PSSM (Position-Specific Scoring Matrix) and auto cross covariance
extractPSSMFeature()
Profile-based protein representation derived by PSSM (Position-Specific Scoring Matrix)
acc()
Auto Cross Covariance (ACC) for Generating Scales-Based Descriptors of the Same Length

PCM descriptors

Calculate PCM (proteochemometric modeling) descriptors.

extractScales()
Scales-Based Descriptors derived by Principal Components Analysis
extractScalesGap()
Scales-Based Descriptors derived by Principal Components Analysis (with Gap Support)
extractProtFP()
Amino Acid Properties Based Scales Descriptors (Protein Fingerprint)
extractProtFPGap()
Amino Acid Properties Based Scales Descriptors (Protein Fingerprint) with Gap Support
extractDescScales()
Scales-Based Descriptors with 20+ classes of Molecular Descriptors
extractFAScales()
Scales-Based Descriptors derived by Factor Analysis
extractMDSScales()
Scales-Based Descriptors derived by Multidimensional Scaling
extractBLOSUM()
BLOSUM and PAM Matrix-Derived Descriptors
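As an illustration of the scales-based PCM descriptors above, the sketch below derives a fixed-length representation from BLOSUM62 with `extractBLOSUM()`. It assumes protr is installed. The values `k = 5` and `lag = 7` are arbitrary choices for the example, not recommendations.

```r
library("protr")

x <- readFASTA(system.file("protseq/P00750.fasta", package = "protr"))[[1]]

# Keep the first k = 5 scales derived from the BLOSUM62 matrix, then
# summarize them with auto cross covariance up to lag = 7 so that
# sequences of different lengths map to vectors of the same length.
blosum <- extractBLOSUM(
  x, submat = "AABLOSUM62", k = 5, lag = 7, scale = TRUE, silent = TRUE
)
length(blosum)
```

The other `extract*Scales()` functions follow the same pattern but take a user-supplied property matrix instead of a named substitution matrix.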

Similarity measures between proteins

Calculate protein sequence alignment based similarity measures and GO-based semantic similarity measures.

parSeqSim()
Parallel Protein Sequence Similarity Calculation Based on Sequence Alignment (In-Memory Version)
parSeqSimDisk()
Parallel Protein Sequence Similarity Calculation Based on Sequence Alignment (Disk-Based Version)
crossSetSim()
Parallel Protein Sequence Similarity Calculation Between Two Sets Based on Sequence Alignment (In-Memory Version)
crossSetSimDisk()
Parallel Protein Sequence Similarity Calculation Between Two Sets Based on Sequence Alignment (Disk-Based Version)
twoSeqSim()
Protein Sequence Alignment for Two Protein Sequences
parGOSim()
Protein Similarity Calculation based on Gene Ontology (GO) Similarity (for a List of Proteins)
twoGOSim()
Protein Similarity Calculation based on Gene Ontology (GO) Similarity (for Two Proteins)

Pre-process protein sequences

Helper functions for pre-processing protein sequences.

getUniProt()
Retrieve Protein Sequences from UniProt by Protein ID
readFASTA()
Read Protein Sequences in FASTA Format
readPDB()
Read Protein Sequences in PDB Format
protcheck()
Protein sequence amino acid type sanity check
protseg()
Protein Sequence Segmentation/Partition
removeGaps()
Remove or replace gaps from protein sequences.
protr-package
protr: Generating Various Numerical Representation Schemes for Protein Sequences
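A small sketch of a typical pre-processing step, assuming protr is installed: filtering out sequences that contain non-standard amino acid types before computing descriptors. The two peptides are made-up inputs for illustration.

```r
library("protr")

# Hypothetical input: two made-up peptides; the second contains "X",
# which is not one of the 20 standard amino acid types.
seqs <- list(ok = "MKTAYIAKQRQISFVKSHFSRQ", bad = "MKTAYIAKQRXISFVK")

# protcheck() returns TRUE only when every residue is a standard amino acid.
valid <- vapply(seqs, protcheck, logical(1))

# Keep only the sequences that passed the check.
clean <- seqs[valid]
names(clean)
```

The same filter can be applied to the list returned by `readFASTA()` or `getUniProt()`.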

Precomputed molecular descriptors

Precomputed molecular descriptors for the 20 amino acids.

AAindex
AAindex Data of 544 Physicochemical and Biological Properties for 20 Amino Acids
AAMetaInfo
Meta Information for the 20 Amino Acids
OptAA3d
OptAA3d.sdf - 20 Amino Acids Optimized with MOE 2011.10 (Semiempirical AM1)
AA2DACOR
2D Autocorrelations Descriptors for 20 Amino Acids calculated by Dragon
AA3DMoRSE
3D-MoRSE Descriptors for 20 Amino Acids calculated by Dragon
AAACF
Atom-Centred Fragments Descriptors for 20 Amino Acids calculated by Dragon
AABurden
Burden Eigenvalues Descriptors for 20 Amino Acids calculated by Dragon
AAConn
Connectivity Indices Descriptors for 20 Amino Acids calculated by Dragon
AAConst
Constitutional Descriptors for 20 Amino Acids calculated by Dragon
AACPSA
CPSA Descriptors for 20 Amino Acids calculated by Discovery Studio
AADescAll
All 2D Descriptors for 20 Amino Acids calculated by Dragon
AAEdgeAdj
Edge Adjacency Indices Descriptors for 20 Amino Acids calculated by Dragon
AAEigIdx
Eigenvalue-Based Indices Descriptors for 20 Amino Acids calculated by Dragon
AAFGC
Functional Group Counts Descriptors for 20 Amino Acids calculated by Dragon
AAGeom
Geometrical Descriptors for 20 Amino Acids calculated by Dragon
AAGETAWAY
GETAWAY Descriptors for 20 Amino Acids calculated by Dragon
AAInfo
Information Indices Descriptors for 20 Amino Acids calculated by Dragon
AAMOE2D
2D Descriptors for 20 Amino Acids calculated by MOE 2011.10
AAMOE3D
3D Descriptors for 20 Amino Acids calculated by MOE 2011.10
AAMolProp
Molecular Properties Descriptors for 20 Amino Acids calculated by Dragon
AARandic
Randic Molecular Profiles Descriptors for 20 Amino Acids calculated by Dragon
AARDF
RDF Descriptors for 20 Amino Acids calculated by Dragon
AATopo
Topological Descriptors for 20 Amino Acids calculated by Dragon
AATopoChg
Topological Charge Indices Descriptors for 20 Amino Acids calculated by Dragon
AAWalk
Walk and Path Counts Descriptors for 20 Amino Acids calculated by Dragon
AAWHIM
WHIM Descriptors for 20 Amino Acids calculated by Dragon

BLOSUM and PAM matrices

BLOSUM and PAM matrices for the 20 amino acids.

AABLOSUM45
BLOSUM45 Matrix for 20 Amino Acids
AABLOSUM50
BLOSUM50 Matrix for 20 Amino Acids
AABLOSUM62
BLOSUM62 Matrix for 20 Amino Acids
AABLOSUM80
BLOSUM80 Matrix for 20 Amino Acids
AABLOSUM100
BLOSUM100 Matrix for 20 Amino Acids
AAPAM30
PAM30 Matrix for 20 Amino Acids
AAPAM40
PAM40 Matrix for 20 Amino Acids
AAPAM70
PAM70 Matrix for 20 Amino Acids
AAPAM120
PAM120 Matrix for 20 Amino Acids
AAPAM250
PAM250 Matrix for 20 Amino Acids
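The precomputed data sets and matrices listed above ship with the package and can be loaded with `data()`. A brief sketch, assuming protr is installed:

```r
library("protr")

# Load two of the bundled data sets by name.
data("AABLOSUM62", package = "protr")
data("AAindex", package = "protr")

dim(AABLOSUM62)  # substitution matrix over the 20 amino acids
nrow(AAindex)    # one row per AAindex property
```

These objects can be passed to the scales-based descriptor functions, either by name (for example `submat = "AABLOSUM62"` in `extractBLOSUM()`) or as a property matrix after selecting the amino acid columns.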