
Protein sequence retrieval

Functions for retrieving protein sequence data from online databases.

getProt()
Retrieve Protein Sequence in various Formats from Databases
getFASTAFromUniProt()
Retrieve Protein Sequence in FASTA Format from the UniProt Database
getFASTAFromKEGG()
Retrieve Protein Sequence in FASTA Format from the KEGG Database
getPDBFromRCSBPDB()
Retrieve Protein Sequence in PDB Format from RCSB PDB
getSeqFromUniProt()
Retrieve Protein Sequence from the UniProt Database
getSeqFromKEGG()
Retrieve Protein Sequence from the KEGG Database
getSeqFromRCSBPDB()
Retrieve Protein Sequence from RCSB PDB

Drug molecular data retrieval

Functions for retrieving drug molecular data from online databases.

getDrug()
Retrieve Drug Molecules in MOL and SMILES Format from Databases
getMolFromDrugBank()
Retrieve Drug Molecules in MOL Format from the DrugBank Database
getMolFromPubChem()
Retrieve Drug Molecules in MOL Format from the PubChem Database
getMolFromChEMBL()
Retrieve Drug Molecules in MOL Format from the ChEMBL Database
getMolFromKEGG()
Retrieve Drug Molecules in MOL Format from the KEGG Database
getMolFromCAS()
Retrieve Drug Molecules in InChI Format from the CAS Database
getSmiFromDrugBank()
Retrieve Drug Molecules in SMILES Format from the DrugBank Database
getSmiFromPubChem()
Retrieve Drug Molecules in SMILES Format from the PubChem Database
getSmiFromChEMBL()
Retrieve Drug Molecules in SMILES Format from the ChEMBL Database
getSmiFromKEGG()
Retrieve Drug Molecules in SMILES Format from the KEGG Database

Protein sequence descriptors

Functions and datasets for computing commonly used protein sequence derived descriptors.

extractProtAAC()
Amino Acid Composition Descriptor
extractProtDC()
Dipeptide Composition Descriptor
extractProtTC()
Tripeptide Composition Descriptor
extractProtMoreauBroto()
Normalized Moreau-Broto Autocorrelation Descriptor
extractProtMoran()
Moran Autocorrelation Descriptor
extractProtGeary()
Geary Autocorrelation Descriptor
extractProtCTDC()
CTD Descriptors - Composition
extractProtCTDT()
CTD Descriptors - Transition
extractProtCTDD()
CTD Descriptors - Distribution
extractProtCTriad()
Conjoint Triad Descriptor
extractProtSOCN()
Sequence-Order-Coupling Numbers
extractProtQSO()
Quasi-Sequence-Order (QSO) Descriptor
extractProtPAAC()
Pseudo Amino Acid Composition Descriptor
extractProtAPAAC()
Amphiphilic Pseudo Amino Acid Composition Descriptor
AAindex
AAindex Data of 544 Physicochemical and Biological Properties for 20 Amino Acids
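
As an illustration of what the simplest descriptor in this group computes, the following is a minimal base R sketch of amino acid composition: the fraction of each of the 20 standard residue types in a sequence. It is not the implementation behind extractProtAAC(); the helper name aac(), the residue ordering, and the toy sequence are assumptions made for this example.

```r
# The 20 standard amino acids in one-letter code.
# The ordering used here is an assumption for this sketch.
aa <- c("A", "R", "N", "D", "C", "E", "Q", "G", "H", "I",
        "L", "K", "M", "F", "P", "S", "T", "W", "Y", "V")

# Amino acid composition: fraction of each residue type in the sequence.
aac <- function(seq) {
  residues <- strsplit(seq, "")[[1]]
  # Refuse sequences that contain anything outside the 20 standard types.
  stopifnot(all(residues %in% aa))
  counts <- table(factor(residues, levels = aa))
  setNames(as.numeric(counts) / length(residues), aa)
}

x <- "MKVLAAGIVALLLA"   # hypothetical toy sequence, 14 residues
comp <- aac(x)
print(round(comp, 3))
```

The result is a named numeric vector of length 20 whose entries sum to 1, so sequences of different lengths map to feature vectors of the same size.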

Profile-based protein sequence descriptors

Functions for generating profile-based protein representations.

extractProtPSSM()
Compute PSSM (Position-Specific Scoring Matrix) for given protein sequence
extractProtPSSMFeature()
Profile-based protein representation derived by PSSM (Position-Specific Scoring Matrix)
extractProtPSSMAcc()
Profile-based protein representation derived by PSSM (Position-Specific Scoring Matrix) and auto cross covariance

Scales-based descriptors for proteochemometrics modeling

Functions for generating scales-based descriptors for proteochemometrics (PCM) modeling.

extractPCMScales()
Generalized Scales-Based Descriptors derived by Principal Components Analysis
extractPCMPropScales()
Generalized AA-Properties Based Scales Descriptors
extractPCMDescScales()
Scales-Based Descriptors with 20+ classes of Molecular Descriptors
extractPCMFAScales()
Generalized Scales-Based Descriptors derived by Factor Analysis
extractPCMMDSScales()
Generalized Scales-Based Descriptors derived by Multidimensional Scaling
extractPCMBLOSUM()
Generalized BLOSUM and PAM Matrix-Derived Descriptors
acc()
Auto Cross Covariance (ACC) for Generating Scales-Based Descriptors of the Same Length
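
The idea behind auto cross covariance can be sketched in base R. The sketch assumes the input is a p x n matrix (p scales as rows, n sequence positions as columns) that has already been centered and scaled, and it uses the textbook definition: for each ordered pair of scales and each lag, average the products of values that lie that many positions apart. The function name acc_sketch() is hypothetical, and the ordering and normalization of the output may differ from what acc() returns.

```r
# Auto cross covariance over a p x n matrix (p scales, n positions).
# For every ordered pair of scales (j, k) and every lag l in 1..lag:
#   ACC[j, k, l] = sum_{i = 1}^{n - l} mat[j, i] * mat[k, i + l] / (n - l)
acc_sketch <- function(mat, lag) {
  p <- nrow(mat)
  n <- ncol(mat)
  stopifnot(lag >= 1, lag < n)
  out <- array(0, dim = c(p, p, lag))
  for (l in seq_len(lag)) {
    head_part <- mat[, 1:(n - l), drop = FALSE]   # positions 1 .. n - l
    tail_part <- mat[, (1 + l):n, drop = FALSE]   # positions 1 + l .. n
    out[, , l] <- head_part %*% t(tail_part) / (n - l)
  }
  as.vector(out)
}

set.seed(1)
short <- matrix(rnorm(3 * 10), nrow = 3)   # 3 scales, 10 positions
long  <- matrix(rnorm(3 * 25), nrow = 3)   # 3 scales, 25 positions

# Both sequences yield vectors of length p^2 * lag = 3^2 * 4 = 36.
len_short <- length(acc_sketch(short, lag = 4))
len_long  <- length(acc_sketch(long, lag = 4))
print(c(len_short, len_long))
```

The output length depends only on the number of scales and the maximum lag, which is what makes the transform useful for turning variable-length sequences into fixed-length descriptors.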

Molecular descriptor sets for generating scales-based descriptors

Molecular descriptor sets of the 20 amino acids for generating scales-based descriptors.

OptAA3d
OptAA3d.sdf - 20 Amino Acids Optimized with MOE 2011.10 (Semiempirical AM1)
AAMetaInfo
Meta Information for the 20 Amino Acids
AA2DACOR
2D Autocorrelations Descriptors for 20 Amino Acids calculated by Dragon
AA3DMoRSE
3D-MoRSE Descriptors for 20 Amino Acids calculated by Dragon
AAACF
Atom-Centred Fragments Descriptors for 20 Amino Acids calculated by Dragon
AABurden
Burden Eigenvalues Descriptors for 20 Amino Acids calculated by Dragon
AAConn
Connectivity Indices Descriptors for 20 Amino Acids calculated by Dragon
AAConst
Constitutional Descriptors for 20 Amino Acids calculated by Dragon
AAEdgeAdj
Edge Adjacency Indices Descriptors for 20 Amino Acids calculated by Dragon
AAEigIdx
Eigenvalue-Based Indices Descriptors for 20 Amino Acids calculated by Dragon
AAFGC
Functional Group Counts Descriptors for 20 Amino Acids calculated by Dragon
AAGeom
Geometrical Descriptors for 20 Amino Acids calculated by Dragon
AAGETAWAY
GETAWAY Descriptors for 20 Amino Acids calculated by Dragon
AAInfo
Information Indices Descriptors for 20 Amino Acids calculated by Dragon
AAMolProp
Molecular Properties Descriptors for 20 Amino Acids calculated by Dragon
AARandic
Randic Molecular Profiles Descriptors for 20 Amino Acids calculated by Dragon
AARDF
RDF Descriptors for 20 Amino Acids calculated by Dragon
AATopo
Topological Descriptors for 20 Amino Acids calculated by Dragon
AATopoChg
Topological Charge Indices Descriptors for 20 Amino Acids calculated by Dragon
AAWalk
Walk and Path Counts Descriptors for 20 Amino Acids calculated by Dragon
AAWHIM
WHIM Descriptors for 20 Amino Acids calculated by Dragon
AACPSA
CPSA Descriptors for 20 Amino Acids calculated by Discovery Studio
AADescAll
All 2D Descriptors for 20 Amino Acids calculated by Dragon
AAMOE2D
2D Descriptors for 20 Amino Acids calculated by MOE 2011.10
AAMOE3D
3D Descriptors for 20 Amino Acids calculated by MOE 2011.10
AABLOSUM45
BLOSUM45 Matrix for 20 Amino Acids
AABLOSUM50
BLOSUM50 Matrix for 20 Amino Acids
AABLOSUM62
BLOSUM62 Matrix for 20 Amino Acids
AABLOSUM80
BLOSUM80 Matrix for 20 Amino Acids
AABLOSUM100
BLOSUM100 Matrix for 20 Amino Acids
AAPAM30
PAM30 Matrix for 20 Amino Acids
AAPAM40
PAM40 Matrix for 20 Amino Acids
AAPAM70
PAM70 Matrix for 20 Amino Acids
AAPAM120
PAM120 Matrix for 20 Amino Acids
AAPAM250
PAM250 Matrix for 20 Amino Acids

Molecular descriptors

Functions for computing commonly used molecular descriptors.

extractDrugAIO()
Calculate All Molecular Descriptors in Rcpi at Once
extractDrugALOGP()
Calculate Atom Additive logP and Molar Refractivity Values Descriptor
extractDrugAminoAcidCount()
Calculate the Number of Amino Acids Descriptor
extractDrugApol()
Calculate the Sum of the Atomic Polarizabilities Descriptor
extractDrugAromaticAtomsCount()
Calculate the Number of Aromatic Atoms Descriptor
extractDrugAromaticBondsCount()
Calculate the Number of Aromatic Bonds Descriptor
extractDrugAtomCount()
Calculate the Number of Atoms Descriptor
extractDrugAutocorrelationCharge()
Calculate the Moreau-Broto Autocorrelation Descriptors using Partial Charges
extractDrugAutocorrelationMass()
Calculate the Moreau-Broto Autocorrelation Descriptors using Atomic Weight
extractDrugAutocorrelationPolarizability()
Calculate the Moreau-Broto Autocorrelation Descriptors using Polarizability
extractDrugBCUT()
BCUT -- Eigenvalue Based Descriptor
extractDrugBondCount()
Calculate the Descriptor Based on the Number of Bonds of a Certain Bond Order
extractDrugBPol()
Calculate the Descriptor that Describes the Sum of the Absolute Value of the Difference between Atomic Polarizabilities of All Bonded Atoms in the Molecule
extractDrugCarbonTypes()
Topological Descriptor Characterizing the Carbon Connectivity in Terms of Hybridization
extractDrugChiChain()
Calculate the Kier and Hall Chi Chain Indices of Orders 3, 4, 5, 6 and 7
extractDrugChiCluster()
Calculate the Kier and Hall Chi Cluster Indices of Orders 3, 4, 5 and 6
extractDrugChiPath()
Calculate the Kier and Hall Chi Path Indices of Orders 0 to 7
extractDrugChiPathCluster()
Calculate the Kier and Hall Chi Path Cluster Indices of Orders 4, 5 and 6
extractDrugCPSA()
A Variety of Descriptors Combining Surface Area and Partial Charge Information
extractDrugDescOB()
Calculate Molecular Descriptors Provided by OpenBabel
extractDrugECI()
Calculate the Eccentric Connectivity Index Descriptor
extractDrugFMF()
Calculate the FMF Descriptor
extractDrugFragmentComplexity()
Calculate Complexity of a System
extractDrugGravitationalIndex()
Descriptor Characterizing the Mass Distribution of the Molecule
extractDrugHBondAcceptorCount()
Number of Hydrogen Bond Acceptors
extractDrugHBondDonorCount()
Number of Hydrogen Bond Donors
extractDrugHybridizationRatio()
Descriptor that Characterizes Molecular Complexity in Terms of Carbon Hybridization States
extractDrugIPMolecularLearning()
Calculate the Descriptor that Evaluates the Ionization Potential
extractDrugKappaShapeIndices()
Descriptor that Calculates Kier and Hall Kappa Molecular Shape Indices
extractDrugKierHallSmarts()
Descriptor that Counts the Number of Occurrences of the E-State Fragments
extractDrugLargestChain()
Descriptor that Calculates the Number of Atoms in the Largest Chain
extractDrugLargestPiSystem()
Descriptor that Calculates the Number of Atoms in the Largest Pi Chain
extractDrugLengthOverBreadth()
Calculate the Ratio of Length to Breadth Descriptor
extractDrugLongestAliphaticChain()
Descriptor that Calculates the Number of Atoms in the Longest Aliphatic Chain
extractDrugMannholdLogP()
Descriptor that Calculates the LogP Based on a Simple Equation Using the Number of Carbons and Hetero Atoms
extractDrugMDE()
Calculate Molecular Distance Edge (MDE) Descriptors for C, N and O
extractDrugMomentOfInertia()
Descriptor that Calculates the Principal Moments of Inertia and Ratios of the Principal Moments
extractDrugPetitjeanNumber()
Descriptor that Calculates the Petitjean Number of a Molecule
extractDrugPetitjeanShapeIndex()
Descriptor that Calculates the Petitjean Shape Indices
extractDrugRotatableBondsCount()
Descriptor that Calculates the Number of Rotatable Bonds in a Molecule
extractDrugRuleOfFive()
Descriptor that Calculates the Number of Failures of Lipinski's Rule of Five
extractDrugTPSA()
Descriptor of Topological Polar Surface Area Based on Fragment Contributions (TPSA)
extractDrugVABC()
Descriptor that Calculates the Volume of A Molecule
extractDrugVAdjMa()
Descriptor that Calculates the Vertex Adjacency Information of A Molecule
extractDrugWeight()
Descriptor that Calculates the Total Weight of Atoms
extractDrugWeightedPath()
Descriptor that Calculates the Weighted Path (Molecular ID)
extractDrugWHIM()
Calculate Holistic Descriptors Described by Todeschini et al.
extractDrugWienerNumbers()
Descriptor that Calculates Wiener Path Number and Wiener Polarity Number
extractDrugXLogP()
Descriptor that Calculates the Prediction of logP Based on the Atom-Type Method Called XLogP
extractDrugZagrebIndex()
Descriptor that Calculates the Sum of the Squared Atom Degrees of All Heavy Atoms
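
Several of the topological descriptors above reduce to simple graph arithmetic. As a worked example, the Zagreb index listed last (the sum of the squared atom degrees of all heavy atoms) can be computed by hand for n-butane in base R. The adjacency matrix is built manually for illustration; the package functions operate on parsed molecule objects rather than on a matrix like this.

```r
# Heavy-atom graph of n-butane: C1-C2-C3-C4 (hydrogens omitted).
adj <- matrix(0, nrow = 4, ncol = 4)
bonds <- rbind(c(1, 2), c(2, 3), c(3, 4))
adj[bonds] <- 1             # fill entries (i, j) for each bond
adj[bonds[, 2:1]] <- 1      # fill the symmetric entries (j, i)

# Degree of each heavy atom = number of heavy-atom neighbours.
degree <- rowSums(adj)

# Zagreb index: sum of squared degrees, here 1 + 4 + 4 + 1 = 10.
zagreb <- sum(degree^2)
print(zagreb)
```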

Molecular fingerprints

Functions for computing commonly used molecular fingerprints.

extractDrugStandard()
Calculate the Standard Molecular Fingerprints (in Compact Format)
extractDrugStandardComplete()
Calculate the Standard Molecular Fingerprints (in Complete Format)
extractDrugExtended()
Calculate the Extended Molecular Fingerprints (in Compact Format)
extractDrugExtendedComplete()
Calculate the Extended Molecular Fingerprints (in Complete Format)
extractDrugGraph()
Calculate the Graph Molecular Fingerprints (in Compact Format)
extractDrugGraphComplete()
Calculate the Graph Molecular Fingerprints (in Complete Format)
extractDrugHybridization()
Calculate the Hybridization Molecular Fingerprints (in Compact Format)
extractDrugHybridizationComplete()
Calculate the Hybridization Molecular Fingerprints (in Complete Format)
extractDrugMACCS()
Calculate the MACCS Molecular Fingerprints (in Compact Format)
extractDrugMACCSComplete()
Calculate the MACCS Molecular Fingerprints (in Complete Format)
extractDrugEstate()
Calculate the E-State Molecular Fingerprints (in Compact Format)
extractDrugEstateComplete()
Calculate the E-State Molecular Fingerprints (in Complete Format)
extractDrugPubChem()
Calculate the PubChem Molecular Fingerprints (in Compact Format)
extractDrugPubChemComplete()
Calculate the PubChem Molecular Fingerprints (in Complete Format)
extractDrugKR()
Calculate the KR (Klekota and Roth) Molecular Fingerprints (in Compact Format)
extractDrugKRComplete()
Calculate the KR (Klekota and Roth) Molecular Fingerprints (in Complete Format)
extractDrugShortestPath()
Calculate the Shortest Path Molecular Fingerprints (in Compact Format)
extractDrugShortestPathComplete()
Calculate the Shortest Path Molecular Fingerprints (in Complete Format)
extractDrugOBFP2()
Calculate the FP2 Molecular Fingerprints
extractDrugOBFP3()
Calculate the FP3 Molecular Fingerprints
extractDrugOBFP4()
Calculate the FP4 Molecular Fingerprints
extractDrugOBMACCS()
Calculate the MACCS Molecular Fingerprints
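
Most fingerprint functions above come in a compact and a complete variant. The sketch below shows one common reading of that distinction, stated here as an assumption: the compact form stores only the positions of the bits that are set, while the complete form stores the full 0/1 vector. The helper names and the 16-bit length are hypothetical; real fingerprints are much longer.

```r
n_bits  <- 16L                    # hypothetical fingerprint length
compact <- c(2L, 5L, 6L, 11L)     # positions of the bits that are set

# Compact -> complete: expand on-bit positions into a 0/1 vector.
to_complete <- function(on_bits, n_bits) {
  fp <- integer(n_bits)
  fp[on_bits] <- 1L
  fp
}

# Complete -> compact: recover the positions of the set bits.
to_compact <- function(fp) which(fp == 1L)

complete <- to_complete(compact, n_bits)
print(complete)
print(to_compact(complete))
```

The compact form is convenient for set-based similarity measures, while the complete form can be stacked row by row into a feature matrix for modeling.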

PPI and CPI descriptors

Functions for computing protein-protein and compound-protein interaction descriptors.

getPPI()
Generating Protein-Protein Interaction Descriptors
getCPI()
Generating Compound-Protein Interaction Descriptors
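
An interaction descriptor combines the descriptor vector of each partner in a pair into one vector. The base R sketch below shows two common strategies, assumed here for illustration: plain concatenation and the row-wise tensor (Kronecker) product. The matrices are toy data and the code is not the implementation of getCPI() or getPPI(); the set of combination types those functions support should be checked in their help pages.

```r
# Row i of each matrix describes the i-th compound-protein pair.
drugmat <- matrix(1:6, nrow = 2)             # 2 pairs x 3 drug descriptors
protmat <- matrix(c(1, 0, 2, 1), nrow = 2)   # 2 pairs x 2 protein descriptors

# Strategy 1: concatenate the two descriptor vectors (3 + 2 = 5 columns).
combine <- cbind(drugmat, protmat)

# Strategy 2: row-wise tensor product (3 * 2 = 6 columns).
# as.vector(outer(y, x)) equals the Kronecker product of x with y.
tensor <- t(sapply(seq_len(nrow(drugmat)), function(i) {
  as.vector(outer(protmat[i, ], drugmat[i, ]))
}))

print(dim(combine))
print(tensor)
```

Concatenation keeps the feature count additive, whereas the tensor product grows multiplicatively but exposes every pairwise product of a drug feature with a protein feature to the downstream model.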

Similarity and similarity searching

Functions for computing sequence/molecular similarities and similarity searching.

calcDrugFPSim()
Calculate Drug Molecule Similarity Derived by Molecular Fingerprints
calcDrugMCSSim()
Calculate Drug Molecule Similarity Derived by Maximum Common Substructure Search
searchDrug()
Parallelized Drug Molecule Similarity Search by Molecular Fingerprints Similarity or Maximum Common Substructure Search
calcTwoProtSeqSim()
Protein Sequence Alignment for Two Protein Sequences
calcParProtSeqSim()
Parallelized Protein Sequence Similarity Calculation based on Sequence Alignment
calcTwoProtGOSim()
Protein Similarity Calculation based on Gene Ontology (GO) Similarity
calcParProtGOSim()
Protein Sequence Similarity Calculation based on Gene Ontology (GO) Similarity
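
Fingerprint-based similarity is most often reported as the Tanimoto coefficient. A minimal base R sketch follows, assuming each fingerprint is given as the set of its on-bit positions; the function name and the two toy fingerprints are hypothetical, and calcDrugFPSim() may offer other metrics and input formats.

```r
# Tanimoto coefficient for two fingerprints given as on-bit positions:
# shared on-bits divided by the number of distinct on-bits in either one.
tanimoto <- function(a, b) {
  length(intersect(a, b)) / length(union(a, b))
}

fp1 <- c(1L, 4L, 7L, 9L)    # hypothetical fingerprint 1
fp2 <- c(4L, 7L, 12L)       # hypothetical fingerprint 2

sim <- tanimoto(fp1, fp2)   # 2 shared / 5 distinct = 0.4
print(sim)
```

The value ranges from 0 (no shared bits) to 1 (identical bit sets). If both fingerprints are empty the ratio is 0/0, so real code would need to handle that case explicitly.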

Protein sequence data processing

Functions for processing protein sequence data.

readFASTA()
Read Protein Sequences in FASTA Format
readPDB()
Read Protein Sequences in PDB Format
segProt()
Protein Sequence Segmentation
checkProt()
Check if the protein sequence's amino acid types are the 20 default types
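
The two most common preprocessing steps, reading FASTA records and checking that a sequence uses only the 20 standard amino acid types, can be sketched in base R as follows. This is an illustrative sketch, not the code behind readFASTA() or checkProt(): the helper names and the example records are hypothetical, and the parser assumes every header line is followed by at least one sequence line.

```r
aa <- c("A", "R", "N", "D", "C", "E", "Q", "G", "H", "I",
        "L", "K", "M", "F", "P", "S", "T", "W", "Y", "V")

# Hypothetical FASTA content, one element per line of the file.
fasta_lines <- c(">seq1 hypothetical protein", "MKVLAAGIV", "ALLLA",
                 ">seq2 contains a non-standard residue", "MKXLA")

# Join the sequence lines that follow each header into one string.
parse_fasta <- function(lines) {
  lines <- lines[nzchar(lines)]
  is_header <- startsWith(lines, ">")
  group <- cumsum(is_header)            # record number of every line
  seqs <- tapply(lines[!is_header], group[!is_header],
                 paste, collapse = "")
  # Use the first whitespace-delimited token of the header as the ID.
  ids <- sub("^>([^[:space:]]+).*$", "\\1", lines[is_header])
  setNames(as.character(seqs), ids)
}

# TRUE only if every residue is one of the 20 standard types.
is_standard <- function(seq) all(strsplit(seq, "")[[1]] %in% aa)

seqs <- parse_fasta(fasta_lines)
ok <- vapply(seqs, is_standard, logical(1))
print(seqs)
print(ok)
```

Running such a check before descriptor calculation helps catch sequences with ambiguous residue codes such as X, which the composition-based descriptors are not defined for.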

Molecular data processing

Functions for processing molecular data.

readMolFromSDF()
Read Molecules from SDF Files and Return Parsed Java Molecular Object
readMolFromSmi()
Read Molecules from SMILES Files and Return Parsed Java Molecular Object or Plain Text List
convMolFormat()
Chemical File Formats Conversion
Rcpi Rcpi-package
Rcpi: Molecular Informatics Toolkit for Compound-Protein Interaction in Drug Discovery