Project

Winter 2024 challenge: rejected after 1 year

Chemical Space Docking And Optimization Of Anticancer Drugs Against CRC Targets

Ashok Palaniappan, SASTRA Deemed University, Thanjavur, India

We set out to discover novel anticancer agents for the treatment of colorectal cancer (CRC). We used an original protocol for target discovery, building on our earlier efforts in biomarker discovery. Choosing natural products as the source of small-molecule libraries to compare against FDA-approved molecules, we followed three lines of investigation in the pursuit of BioSolveIT-aided drug discovery: (I) SeeSAR-based docking studies to identify multi-target ligands, which could be optimized for their ADME profiles; (II) infiniSee-based chemical docking studies, which combined FlexX docking, HYDE scoring, FastGrow, Chemical Space Docking, and FTrees in a computational pipeline to deliver optimized new chemical entities (NCEs) with improved ADMET properties for each target; and (III) creation of a novel Chemical Space for natural products using CoLibri. We identified six targets upregulated in colorectal cancer using a multimodal analysis of the biomarkers identified in our earlier work, COADREADx (PMID: 39484215). This analysis applied network reconstruction, statistical modeling, and database cross-referencing, followed by robust rank aggregation. Crystal structures were available for four targets, namely GABBR2, AGT6, FUT8, and GABRD. The remaining two targets, ESM1 and GNG4, were modelled using AlphaFold3 and validated. The binding site for each target was identified using CASTp and Fpocket quality metrics, and validated with SeeSAR. The small-molecule libraries of interest were COCONUT (695,128), LOTUS (276,518), IMPPAT phytochemicals (1,335), and Dr. Duke's phytochemicals. Since Dr. Duke's was available only as IUPAC or common names, we manually converted the names to SMILES, yielding 4,355 compounds. The investigations identified promising, high-confidence leads for some of the targets; these remain to be validated using molecular dynamics and experimental studies.
Application of the Chemical Spaces to drug discovery is a work in progress.
After 1 year, Ashok has achieved the following goals:
  1. Target discovery and docking studies using SeeSAR: Local docking with BioSolveIT was performed with the DrugBank v5 FDA-approved dataset (~2,600 compounds). The top 30 ligands (based on binding affinity) for each target with LLE ≥ 5 and IC50 < 1 nM were identified, and the ligand space was analyzed. We found 13 ligands shared by ≥ 3 targets. We found that ligands binding three targets could be optimized for their ADMET properties using a custom RL-based MCTS model, yielding: (I) OC[C@@H]([NH2+]CC[NH2+][C@H](CO)CC)CC , optimized to CC[C@@H](CO)[NH2+]CC#[PdH2+][C@@H](CC)CO ; (II) P([O-])([O-])(=O)C(P([O-])([O-])=O)(O)CC[N@H+](CCCCC)C , optimized to *C=CCC[N@H+](C)CCC(O)(P(=O)([O-])[O-])P(=O)([O-])[O-] ; (III) SC[C@H]([NH2+]CC[NH2+][C@H](C(=O)OCC)CS)C(=O)OCC , optimized to CCOC(=O)[C@H](C=S)[NH2+]CC[NH2+][C@@H](CC)C(=O)OCC ; and (IV) O=C(N1CCOCC1)NCC[NH2+]C[C@H](O)COc2ccc(O)cc2 , optimized to O=C(*CC[NH2+]C[C@H](O)COc1c#cc(O)cc1)N1CCOCC1 .
  2. Chemical Space Docking studies using infiniSee: We applied a Tanimoto cutoff of 0.85 to the pooled dataset of LOTUS, IMPPAT, Dr. Duke's, and FDA-approved DrugBank compounds, yielding 27k compounds for virtual screening. We restricted our attention to four targets with defined binding pockets: GABBR2, FUT8, ESM1, and GNG4. FlexX docking of the 27k library against the four targets yielded the best candidates, ranked on docking score, predicted affinity, LLE, and drug-likeness estimates. The top-ranked ligands all came from Dr. Duke's. They were subjected to fragment-based optimization with FastGrow. Using these ligands as three-dimensional query templates, we executed Chemical Space Docking (CSD) for analogue expansion with ChemRiya, and validated the results with FTrees pharmacophore similarity: (1) FUT8 showed a high-confidence lead, N6-Ethyladenine nucleotide, with affinity 4.0 µM, LLE 4.78, docking score -32.92, and FTrees score 0.906, appropriate as an ATP mimic. (2) GNG4 showed a promising hit, although its LLE indicated excessive lipophilicity.
  3. Creation of Chemical Space for Natural Products using CoLibri: We used all NP databases of interest, including COCONUT, along with FDA-approved drugs in DrugBank v5, and followed a three-phase protocol to identify the building blocks: (1) intra-database deduplication, while preserving regioisomers and stereoisomers; (2) inter-database clustering by Butina on the Morgan fingerprints at Tanimoto threshold = 0.85; and (3) building block extraction using the cluster centroids, following a BRICS and RECAP fallback strategy to ensure complete fragmentation. The process yielded 813,662 compounds, 362,013 cluster representatives, and 114,035 building blocks in the respective phases. These were combined with 105 reaction SMIRKS in 8 categories (from the eMolecules Cookbook and Hartenfeller et al., JCIM, 2011) using CoLibri to yield a ~2 GB Chemical Space file in a proprietary Tree-based Feature Space, enumerable into ~2B compounds. This resource could be used in infiniSee for synthetically accessible CSD powered by HPSee.
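
The multi-target selection described in goal 1 can be illustrated with a minimal sketch. It assumes the common definition LLE = pIC50 - logP; the source does not state how the LLE values were computed, so this is an assumption. The ligand names, affinities, and logP values are hypothetical placeholders, not results from the study, and the function names are introduced here purely for illustration.

```python
import math
from collections import defaultdict

def lle(affinity_molar, logp):
    """Assumed definition: LLE = -log10(affinity in mol/L) - logP."""
    return -math.log10(affinity_molar) - logp

def passes_filter(affinity_molar, logp, min_lle=5.0, max_affinity=1e-9):
    """Keep ligands with affinity below 1 nM and LLE of at least 5."""
    return affinity_molar < max_affinity and lle(affinity_molar, logp) >= min_lle

def shared_ligands(hits_by_target, min_targets=3):
    """Return the ligands that appear in the hit lists of >= min_targets targets."""
    targets_by_ligand = defaultdict(set)
    for target, ligands in hits_by_target.items():
        for ligand in ligands:
            targets_by_ligand[ligand].add(target)
    return {
        ligand: sorted(targets)
        for ligand, targets in targets_by_ligand.items()
        if len(targets) >= min_targets
    }

# Hypothetical docking output: target -> {ligand: (affinity in mol/L, logP)}
docking = {
    "GABBR2": {"lig_A": (5e-10, 1.2), "lig_B": (8e-10, 5.5)},
    "FUT8": {"lig_A": (2e-10, 1.2), "lig_C": (3e-9, 0.5)},
    "ESM1": {"lig_A": (9e-10, 1.2), "lig_B": (1e-10, 5.5)},
}

# Apply the affinity/LLE filter per target, then look for multi-target ligands.
hits_by_target = {
    target: [name for name, (aff, logp) in ligands.items() if passes_filter(aff, logp)]
    for target, ligands in docking.items()
}
multi_target = shared_ligands(hits_by_target)
```

In this toy data only `lig_A` survives the filter for all three targets. Under the same assumed definition, the FUT8 lead reported in goal 2 (affinity 4.0 µM, LLE 4.78) would correspond to a logP of roughly 0.62.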
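
The clustering phase in goal 3 can be sketched in pure Python. This is a simplified illustration of Butina-style clustering at a Tanimoto similarity threshold of 0.85, not the pipeline actually used: fingerprints are represented as plain sets of "on" bit indices (in practice these would be Morgan fingerprints from a cheminformatics toolkit), and the toy fingerprints below are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    if not a and not b:
        return 1.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)

def butina(fingerprints, sim_threshold=0.85):
    """Butina-style clustering; the first index of each cluster is its centroid."""
    n = len(fingerprints)
    # Build neighbor lists: pairs whose similarity meets the threshold.
    neighbors = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i):
            if tanimoto(fingerprints[i], fingerprints[j]) >= sim_threshold:
                neighbors[i].add(j)
                neighbors[j].add(i)
    # Visit molecules from most to fewest neighbors (ties broken by index).
    order = sorted(range(n), key=lambda i: (-len(neighbors[i]), i))
    assigned = set()
    clusters = []
    for i in order:
        if i in assigned:
            continue
        # The centroid claims all of its neighbors that are still unassigned.
        members = [i] + sorted(neighbors[i] - assigned)
        assigned.update(members)
        clusters.append(tuple(members))
    return clusters

# Hypothetical toy fingerprints: two families of near-identical bit sets.
toy_fps = [
    set(range(20)),
    set(range(19)),
    set(range(1, 21)),
    set(range(100, 120)),
    set(range(100, 119)),
]
toy_clusters = butina(toy_fps)
centroids = [cluster[0] for cluster in toy_clusters]
```

The centroids play the role of the cluster representatives passed on to building block extraction. Toolkits that cluster on distance rather than similarity would take the equivalent cutoff of 1 - 0.85 = 0.15. The pairwise loop is quadratic in the number of compounds, so a dataset of the size reported above would need a more scalable neighbor search than this sketch provides.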