We have identified six targets for colorectal cancer through an aggregate analysis of biomarkers identified in COADREADx, combining network reconstruction, statistical modeling, machine learning, and established literature. Two targets were modelled using AlphaFold3 and validated. An unambiguous binding site for each target was identified using CASTp and validated with SeeSAR. Local docking was performed against the FDA-approved DrugBankv5 dataset. Analyzing the top ligands from this procedure, we found thirteen ligands shared by at least three of the six targets. These ligands are being used in template-based Chemical Space Docking (CSD) with CHEMriya. Concurrently, we are using the DrugAI deep learning model for lead optimization. In future, we will extend the datasets to DrDukes and IMPPATv2, and CSD to the infiniSEE xREAL space. Further, we will apply the FastGrow and FTrees tools in conjunction with other techniques to obtain New Chemical Entities with improved ADMET properties.
After 3 months, Ashok has achieved the following milestones:
- Using the seven biomarkers from COADREADx as a seed set, target discovery was performed with a novel protocol comprising expression analysis, network reconstruction, and calculation of network centrality measures. Tools such as DrugnomeAI and DepMap yielded additional scores for each candidate target. Finally, robust rank aggregation was used to identify the six most recurrent targets for drug discovery. The binding sites of each target were evaluated using CASTp and SeeSAR based on DoG and pocket volume, and the most druggable, most voluminous pocket was chosen for local docking (with SeeSAR) against FDA-approved DrugBankv5 moieties. The top 30 ligands (ranked by binding affinity) with LLE ≥ 5 and IC50 < 1 nM were identified for each target. Analyzing the ligand space across all targets identified one ligand shared by all six targets, two shared by five targets, two shared by four targets, and eight shared by exactly three targets, for a total of thirteen. Local docking jobs have been slow to complete.
- We are using these 13 ligands as templates for chemical space docking (CSD). We are also working with the DrDukes and IMPPAT databases to identify more shared ligands. We have completed the conversion of the DrDukes IUPAC dataset to SMILES, yielding ~6000 compounds. Together with IMPPAT and DrugBank, our combined curated small-molecule library contains about 17,000 compounds. Template-based CSD is being executed using CHEMriya and infiniSEE xREAL. This is expected to yield novel optimized leads, assessed using Hyde scoring. Favourable leads (Hyde score < -8 kcal/mol) would be optimized using FastGrow for improved ADMET properties without changing the binding pose geometry or affinity. We are also exploring pharmacophore-driven lead optimization via scaffold hopping with FTrees. Both local docking and chemical space docking have been compute-intensive so far. By running on multiple machines 24/7, we have been able to conduct pilot studies and obtain results.
- Chemical Space Docking generates synthesizable outcomes: New Chemical Entities (NCEs). We have identified alternative methods to validate such NCEs: (i) Deep learning models such as DrugAI, which improve the QED (quantitative estimate of drug-likeness) of potential leads via reinforcement learning coupled with Monte Carlo tree search. We are adapting the model for better results; some of the consensus ligands have shown higher QED after DrugAI optimization. For example, OC[C@@H]([NH2+]CC[NH2+][C@H](CO)CC)CC, a potential lead with QED 0.32, has been optimized to CC[C@@H](CO)[NH2+]CC#[PdH2+][C@@H](CC)CO with an improved QED of 0.55. (ii) Molecular dynamics simulations of the drug-target complex in an aqueous bath for up to 50 ns to ascertain the residence time distribution of the drug in the docked pocket. (iii) ChemProp-based models trained on ClinTox to evaluate toxicity profiles.
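
The ligand-selection step in the first milestone (filter by LLE and IC50, keep the top 30 per target, then count ligands shared across targets) can be illustrated with a minimal Python sketch. It is not the SeeSAR workflow itself: the function names, the tuple layout, and the toy ligand and target identifiers are hypothetical, and it assumes the standard definition LLE = pIC50 - logP, with ligands ranked by estimated IC50 as a proxy for binding affinity.

```python
import math
from collections import Counter


def lle(ic50_nm, logp):
    """Lipophilic ligand efficiency, LLE = pIC50 - logP (IC50 in nM, must be > 0)."""
    pic50 = 9.0 - math.log10(ic50_nm)  # pIC50 = -log10(IC50 in mol/L)
    return pic50 - logp


def select_top(hits, n=30, min_lle=5.0, max_ic50_nm=1.0):
    """Return up to n ligand IDs passing both filters, most potent first.

    hits: iterable of (ligand_id, ic50_nm, logp) tuples (hypothetical layout).
    """
    passing = [
        (lig, ic50, logp)
        for lig, ic50, logp in hits
        if ic50 < max_ic50_nm and lle(ic50, logp) >= min_lle
    ]
    passing.sort(key=lambda h: h[1])  # lower IC50 = stronger estimated binding
    return [lig for lig, _, _ in passing[:n]]


def shared_ligands(top_by_target, min_targets=3):
    """Map each ligand to the number of targets whose top list contains it,
    keeping only ligands found for at least min_targets targets."""
    counts = Counter(
        lig for ligs in top_by_target.values() for lig in set(ligs)
    )
    return {lig: k for lig, k in counts.items() if k >= min_targets}


if __name__ == "__main__":
    # Toy docking results; all identifiers and values are made up.
    docked = {
        "T1": [("ligA", 0.5, 2.0), ("ligB", 0.8, 5.0), ("ligC", 2.0, 1.0)],
        "T2": [("ligA", 0.3, 2.0), ("ligD", 0.9, 1.5)],
        "T3": [("ligA", 0.7, 2.0), ("ligD", 0.4, 1.5)],
    }
    top = {target: select_top(hits) for target, hits in docked.items()}
    print(top)
    print(shared_ligands(top, min_targets=3))
```

Grouping the returned dictionary by its count values gives the kind of breakdown reported above (ligands shared by three, four, five, or all six targets).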