Introduction
The tidychem package offers a lightweight R interface to RDKit via the RDKit Python API.
Load, parse, and write chemical data
Chemical data format intro: SMI and SDF/MOL.
Reading. Parsing (error handling example: the result will be NULL). Writing.
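As a rough illustration of handling NULL results after parsing, the base R sketch below separates failed entries from valid ones. It assumes that parse results come back as a list with NULL in place of each input that failed; that layout is an assumption made here, and the `parsed` list holds placeholder strings rather than real molecule objects.

```r
# Hypothetical parse results: a list with NULL where parsing failed.
# The strings are placeholders standing in for molecule objects.
smiles <- c("CCO", "not-a-smiles", "c1ccccc1")
parsed <- list("mol: CCO", NULL, "mol: c1ccccc1")

# Flag the entries that came back as NULL
failed <- vapply(parsed, is.null, logical(1))

# Inputs that could not be parsed
smiles[failed]

# Keep only the successfully parsed entries
valid <- parsed[!failed]
```

Keeping the logical vector `failed` around makes it easy to drop the matching rows from any accompanying data, such as measured properties.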
Calculate chemical fingerprints
mols <- "smi-multiple.smi" |>
  tidychem_example() |>
  read_smiles()
# ECFP4
mols |> fp_morgan()
# similarity
# mols |> fp_morgan() |> sim_tanimoto()
# matrix
mols |> fp_morgan(explicit = TRUE)
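To make the similarity step more concrete, here is a minimal base R sketch of pairwise Tanimoto similarity on an explicit fingerprint matrix. It assumes a 0/1 matrix with one row per molecule; the helper `tanimoto_matrix()` and the toy matrix `fp` are hypothetical and not part of tidychem.

```r
# Hypothetical helper, not part of tidychem: pairwise Tanimoto similarity
# for a binary (0/1) fingerprint matrix with one row per molecule.
tanimoto_matrix <- function(fp) {
  fp <- as.matrix(fp)
  storage.mode(fp) <- "double"
  n_common <- tcrossprod(fp)                        # bits set in both molecules
  n_on <- rowSums(fp)                               # bits set per molecule
  n_union <- outer(n_on, n_on, "+") - n_common      # bits set in either molecule
  sim <- n_common / n_union
  sim[n_union == 0] <- 0   # convention chosen here for two empty fingerprints
  sim
}

# Toy fingerprints (made up for illustration)
fp <- rbind(
  a = c(1, 1, 0, 0),
  b = c(1, 0, 1, 0),
  c = c(1, 1, 0, 0)
)
sim <- tanimoto_matrix(fp)
sim
```

Molecules `a` and `c` share all of their set bits, so their similarity is 1, while `a` and `b` share one bit out of three set in either, giving 1/3.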
Calculate chemical descriptors
Both 2D and 3D descriptors are available.
The 3D descriptors follow a common workflow: 3D conformation -> descriptor…
If the molecules are already optimized and have 3D coordinates, load them with parse_sdf or read_sdf directly, then compute the 3D descriptors with the vanilla option.
library(readr)  # provides read_tsv()
df <- "logd74.tsv" |>
  tidychem_example() |>
  read_tsv()
y <- df$logD7.4
mols <- df$SMILES |> parse_smiles()
mols
# matrix of 2D descriptors
x <- mols |> desc_2d()
x
# replace missing descriptor values with 0
x[is.na(x)] <- 0
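Before overwriting missing values, it can help to see how many each descriptor has. The base R sketch below shows this on a toy matrix; `x_toy` and its column names are made up for illustration and only stand in for a descriptor matrix like the one computed above.

```r
# Toy descriptor matrix (made-up values), one column per descriptor
x_toy <- matrix(
  c(1.2, NA, 3.4,
    0.5, 0.7, NA),
  nrow = 3,
  dimnames = list(NULL, c("desc_a", "desc_b"))
)

# Count missing values per descriptor before overwriting them
n_missing <- colSums(is.na(x_toy))
n_missing

# Replace every missing value with 0
x_toy[is.na(x_toy)] <- 0
x_toy
```

Replacing with 0 is the simplest choice; a descriptor with many missing values may be better dropped than filled, which is a modelling decision left to the reader.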