The majority of signal transduction in eukaryotic cells is mediated by protein kinases, one of the largest protein families in humans. Kinase dysregulation is often linked to cancer, autoimmune disorders, or Alzheimer’s disease, highlighting the pivotal role of kinases as drug targets. In my Bachelor’s thesis, we developed an automated in silico pipeline to generate potential kinase inhibitors using a fragment-based approach. This pipeline leverages KinFragLib, a kinase-focused fragment library derived from the fragmentation of co-crystallized kinase-ligand complexes. The approach employs FlexX (template) docking and HYDE scoring to iteratively grow ligands within a protein kinase binding pocket of interest, guided by a subpocket-constrained beam search strategy. Drawing on extensive prior knowledge of kinase ligands and functionally relevant subpockets, the search is tailored so that distinct fragments are placed in specific regions of the kinase binding pocket.
Over the past 12 months, we have further refined the pipeline to increase molecular diversity during each fragment-growing iteration and updated KinFragLib using the latest version of KLIFS. The pipeline has now been finalized, and we have published its first release on GitHub to facilitate application to other kinase targets.
As a case study, we applied the pipeline to protein kinase A (PKA; PDB code: 5n1f), an in-house target available through our collaboration partners in AK Diederich, Philipps University Marburg. This application generated 59,255 molecules, of which 33,758 were successfully docked. As the initial post-filtering step, we removed compounds with a HYDE-estimated affinity weaker than 1 µM, narrowing the dataset to 1166 candidates for further analysis. We then assessed the drug-likeness of these candidates by evaluating relevant chemical properties, molecular diversity, and chemical novelty. Remarkably, 99.9% of the compounds were novel compared to ChEMBL33. To simplify downstream synthesis, we further excluded compounds with chiral centers. After that, 815 compounds remained, from which the 100 top-ranked compounds (by HYDE-estimated affinity) were prioritized. Together with our collaboration partners, we selected 16 compounds for experimental testing. To date, 11 of these selected compounds have been synthesized, and preliminary results from the PKA binding assay show promising activity in the low micromolar range. More precisely, four of these compounds reduced the kinase activity to less than 50%, two of them to less than 10%, at a compound concentration of 100 µM.
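The post-filtering funnel described above (affinity cutoff, removal of chiral compounds, top-100 ranking) can be sketched in a few lines of Python. This is only an illustrative sketch, not the code from the published pipeline: the `Candidate` record, its field names, and the `post_filter` function are hypothetical, and the example molecules are toy data rather than pipeline output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one successfully docked molecule."""
    name: str
    hyde_affinity_nm: float  # HYDE-estimated affinity in nM (lower = stronger)
    n_chiral_centers: int    # assumed to be precomputed by a cheminformatics toolkit


def post_filter(candidates, max_affinity_nm=1000.0, top_n=100):
    """Apply the three post-filtering steps in the order given in the text."""
    # Step 1: keep molecules whose estimated affinity is 1 µM (1000 nM) or stronger.
    potent = [c for c in candidates if c.hyde_affinity_nm <= max_affinity_nm]
    # Step 2: drop molecules with chiral centers to simplify synthesis.
    achiral = [c for c in potent if c.n_chiral_centers == 0]
    # Step 3: rank by estimated affinity and keep the top_n strongest binders.
    return sorted(achiral, key=lambda c: c.hyde_affinity_nm)[:top_n]


# Toy input (made-up values, for illustration only).
docked = [
    Candidate("mol_b", 50.0, 0),
    Candidate("mol_a", 5.0, 0),
    Candidate("mol_c", 2.0, 1),     # chiral center: removed in step 2
    Candidate("mol_d", 2500.0, 0),  # weaker than 1 µM: removed in step 1
]
shortlist = post_filter(docked, top_n=100)
print([c.name for c in shortlist])  # ['mol_a', 'mol_b']
```

Keeping the three steps as separate list comprehensions makes it easy to report the number of compounds surviving each stage, as done in the text.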
In summary, our work demonstrates the potential of fragment-based in silico pipelines to generate chemically diverse and synthesizable novel kinase inhibitors, with promising early experimental validation supporting their relevance as potential drug candidates.
After 1 year, Katharina has achieved the following goals:
- Our primary goal was to refine the pipeline and finalize it so that the code could be shared on GitHub. To achieve this, we refactored the code to facilitate the integration of new features, enhancing its modularity and usability. A key methodological development was the implementation of a cluster-based compound selection feature, which increases molecular diversity during each fragment-growing iteration. This feature balances the trade-off between exploitation (optimizing high-scoring compounds) and exploration (identifying diverse candidates). Additionally, we updated KinFragLib by incorporating and fragmenting complexes from the newest KLIFS version. This expanded the library from 7486 to 9505 fragments. The resulting automated pipeline is available at https://github.com/volkamerlab/KinFragLib_PocketEnum.
- Our second major goal was to apply the refined pipeline to the hamster PKA (PDB code: 5n1f), which is available for testing through AK Diederich, Marburg. The pipeline yielded 33,758 successfully docked compounds, of which 1166 were identified as potential PKA binders based on their promising HYDE-estimated binding affinity. To facilitate in-depth analysis, we developed three Jupyter notebooks, one for each of three key aspects: (1) chemical properties, (2) molecular diversity, and (3) chemical novelty. The chemical properties analysis revealed that 96% of the candidates exhibit drug-like molecular properties, satisfying Lipinski’s Rule of Five criteria. The chemical space analysis highlighted a high degree of molecular diversity within the generated compounds, with a mean Tanimoto distance of 0.79. To assess chemical novelty, we compared the post-filtered compounds with ChEMBL33. Only 1.2% of the compounds had a Tanimoto similarity of ≥ 0.9 to any ChEMBL compound.
- Our third goal was to validate the pipeline by selecting, synthesizing, and experimentally testing promising compounds. As a first step, we removed compounds with chiral centers (27% of candidates) to avoid potential challenges in synthesis. From the remaining pool, we prioritized the 100 compounds with the strongest HYDE-estimated binding affinities (0.83 to 73.88 nM). Together with our collaboration partners, we selected 16 diverse candidates for synthesis. This selection was guided by their expertise, diversity considerations, and the availability and cost of the necessary starting materials. To date, 11 of these compounds have been successfully synthesized (minor modifications were made to the molecules to simplify the synthesis). To experimentally validate the compounds, we measured the relative kinase activity in the presence of each synthesized compound at a concentration of 100 µM. Notably, 4 out of 11 compounds reduced the kinase activity to less than 50%, and 2 of these to less than 10%.
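
The diversity and novelty measures reported above, and the idea behind cluster-based compound selection, can be illustrated with a minimal Python sketch. Several assumptions apply: fingerprints are represented as plain sets of on-bit indices (in practice they would come from a cheminformatics toolkit), greedy leader clustering stands in for the clustering method actually used in the pipeline (which is not specified here), and all function names, the `cluster_radius` parameter, and the toy data are hypothetical.

```python
from itertools import combinations


def tanimoto_similarity(fp_a, fp_b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    union = len(fp_a | fp_b)
    if union == 0:
        return 1.0  # two empty fingerprints are treated as identical
    return len(fp_a & fp_b) / union


def mean_pairwise_distance(fps):
    """Mean Tanimoto distance (1 - similarity) over all unordered pairs."""
    pairs = list(combinations(fps, 2))
    if not pairs:
        return 0.0
    return sum(1.0 - tanimoto_similarity(a, b) for a, b in pairs) / len(pairs)


def fraction_similar_to_reference(fps, reference_fps, threshold=0.9):
    """Fraction of molecules with similarity >= threshold to any reference molecule."""
    if not fps:
        return 0.0
    hits = sum(
        any(tanimoto_similarity(fp, ref) >= threshold for ref in reference_fps)
        for fp in fps
    )
    return hits / len(fps)


def select_diverse(scored_fps, beam_width, cluster_radius=0.5):
    """Greedy leader clustering: keep the best-scoring molecule of each cluster.

    scored_fps is a list of (score, fingerprint) pairs; lower scores are better.
    A molecule opens a new cluster only if its Tanimoto distance to every
    already selected leader exceeds cluster_radius (an assumed parameter).
    """
    leaders = []
    for score, fp in sorted(scored_fps, key=lambda item: item[0]):
        if len(leaders) >= beam_width:
            break
        if all(1.0 - tanimoto_similarity(fp, lead) > cluster_radius
               for _, lead in leaders):
            leaders.append((score, fp))
    return leaders


# Toy fingerprints (made-up values, for illustration only).
toy_fps = [{1, 2, 3, 4}, {1, 2, 3, 5}, {7, 8, 9}]
print(mean_pairwise_distance(toy_fps))
print(fraction_similar_to_reference(toy_fps, [{1, 2, 3, 4}]))

# The second molecule is close to the first, so it is skipped in favor of
# the structurally different third one, despite its better score.
selected = select_diverse(list(zip([1.0, 2.0, 3.0], toy_fps)), beam_width=2)
print([score for score, _ in selected])  # [1.0, 3.0]
```

In this sketch, a smaller `cluster_radius` shifts the selection toward exploitation (more near-duplicates of top scorers survive), while a larger value shifts it toward exploration.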