Parallelized Drug Molecule Similarity Search by Molecular Fingerprints Similarity or Maximum Common Substructure Search

Usage

searchDrug(
  mol,
  moldb,
  cores = 2,
  method = c("fp", "mcs"),
  fptype = c("standard", "extended", "graph", "hybrid", "maccs", "estate", "pubchem",
    "kr", "shortestpath", "fp2", "fp3", "fp4", "obmaccs"),
  fpsim = c("tanimoto", "euclidean", "cosine", "dice", "hamming"),
  mcssim = c("tanimoto", "overlap"),
  ...
)

Arguments

mol

The query molecule. The path to an SDF file containing one molecule.

moldb

The molecule database. The path to an SDF file containing all the molecules to search against.

cores

Integer. The number of CPU cores to use for the parallel search. Default is 2. Use the detectCores() function in the parallel package to see how many cores are available.
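A minimal sketch of picking a value for cores from the machine's core count. Leaving one core free is an illustrative convention here, not something searchDrug requires.

```r
library(parallel)

# detectCores() may return NA on platforms where the count is unknown
n_avail <- detectCores()

# Assumption for illustration: keep one core free, but never drop below 1
cores <- if (is.na(n_avail)) 1L else max(1L, n_avail - 1L)
print(cores)
```

The resulting value can then be passed as the cores argument.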

method

'fp' or 'mcs'. Search by molecular fingerprints or by maximum common substructure searching.

fptype

The fingerprint type, only available when method = 'fp'. Rcpi supports 13 types of fingerprints, including 'standard', 'extended', 'graph', 'hybrid', 'maccs', 'estate', 'pubchem', 'kr', 'shortestpath', 'fp2', 'fp3', 'fp4', 'obmaccs'.

fpsim

The similarity measure for fingerprints, only available when method = 'fp'. One of 'tanimoto', 'euclidean', 'cosine', 'dice' or 'hamming'. See calcDrugFPSim for details.
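To make the measures concrete, the sketch below computes three of them on two toy binary fingerprints using their common textbook definitions. The helper fp_sim is hypothetical and is not part of Rcpi; the exact formulas Rcpi uses, including those for 'euclidean' and 'hamming', are documented in calcDrugFPSim and may differ in detail.

```r
# Hypothetical helper, for illustration only (not an Rcpi function).
# x and y are binary fingerprints of equal length.
fp_sim <- function(x, y, measure = c("tanimoto", "dice", "cosine")) {
  measure <- match.arg(measure)
  n_x <- sum(x)            # bits set in x
  n_y <- sum(y)            # bits set in y
  n_both <- sum(x & y)     # bits set in both
  switch(measure,
    tanimoto = n_both / (n_x + n_y - n_both),
    dice     = 2 * n_both / (n_x + n_y),
    cosine   = n_both / sqrt(n_x * n_y)
  )
}

# Toy fingerprints (made-up data)
x <- c(1, 1, 0, 1, 0, 0, 1, 0)
y <- c(1, 0, 0, 1, 1, 0, 1, 0)

fp_sim(x, y, "tanimoto")  # 3 shared bits / 5 bits in the union = 0.6
fp_sim(x, y, "dice")      # 2 * 3 / (4 + 4) = 0.75
fp_sim(x, y, "cosine")    # 3 / sqrt(4 * 4) = 0.75
```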

mcssim

The similarity measure for maximum common substructure search, only available when method = 'mcs'. One of 'tanimoto' or 'overlap'.

...

Other parameters for maximum common substructure search. See calcDrugMCSSim for available options.

Value

A named numeric vector of similarity values for the molecules in the database, sorted in decreasing order.
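As a sketch of how such a result could be handled, the snippet below uses a made-up named numeric vector as a stand-in for the return value. The names, scores, and the 0.6 cutoff are all hypothetical.

```r
# Made-up stand-in for a searchDrug() result (names and values are hypothetical)
sims <- c(mol_3 = 0.42, mol_1 = 0.91, mol_2 = 0.67)

# Sort in decreasing order, matching the ordering described above
ranked <- sort(sims, decreasing = TRUE)

# The two most similar molecules
top2 <- head(ranked, 2)

# Names of molecules at or above an arbitrary similarity cutoff
hits <- names(ranked)[ranked >= 0.6]

print(top2)
print(hits)
```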

Details

This function performs a compound similarity search for a single query compound against a set of molecules. Similarity is computed either from molecular fingerprints, with a choice of fingerprint type and similarity measure, or by maximum common substructure search.

Examples

mol = system.file('compseq/DB00530.sdf', package = 'Rcpi')
# DrugBank ID DB00530: Erlotinib
moldb = system.file('compseq/tyrphostin.sdf', package = 'Rcpi')
# Database composed by searching 'tyrphostin' in PubChem and filtered by Lipinski's Rule of Five
searchDrug(mol, moldb, cores = 4, method = 'fp', fptype = 'maccs', fpsim = 'hamming')
#> Error in loadMolecules(mol): The package "rcdk" is required to load molecular structures
searchDrug(mol, moldb, cores = 4, method = 'fp', fptype = 'fp2', fpsim = 'tanimoto')
#> Error: Must install the `ChemmineOB` package first.
searchDrug(mol, moldb, cores = 4, method = 'mcs', mcssim = 'tanimoto')
#> Error: Must install the `ChemmineR` package to use the 'mcs' method.
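The errors above come from optional packages that were not installed when the examples were rendered. A hedged sketch of checking for them before calling searchDrug follows. The package names are taken from the error messages; the assumption that rcdk alone is enough for the 'maccs' fingerprint search is an inference from those messages and has not been verified against the package source.

```r
# Optional packages named in the error messages above
backends <- c(rcdk = "rcdk", ChemmineOB = "ChemmineOB", ChemmineR = "ChemmineR")

# TRUE for each package whose namespace can be loaded
available <- vapply(backends, requireNamespace, logical(1), quietly = TRUE)
print(available)

if (requireNamespace("Rcpi", quietly = TRUE) && available[["rcdk"]]) {
  mol <- system.file("compseq/DB00530.sdf", package = "Rcpi")
  moldb <- system.file("compseq/tyrphostin.sdf", package = "Rcpi")
  # Report any remaining failure as a message rather than stopping the script
  res <- tryCatch(
    Rcpi::searchDrug(mol, moldb, cores = 2, method = "fp",
                     fptype = "maccs", fpsim = "hamming"),
    error = function(e) {
      message("searchDrug failed: ", conditionMessage(e))
      NULL
    }
  )
  if (!is.null(res)) print(head(res))
} else {
  message("Rcpi and/or rcdk not installed; skipping the fingerprint search.")
}
```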