The idea behind predicting ADME properties is relatively straightforward: Starting with a large dataset of data points that link specific parameters to the 2D structure of a compound, patterns and correlations are identified. For instance, lipophilic groups increase the logP value, hydrophilic groups raise the logS value and enhance excretion while lowering the half-life, and so on. The models generated can vary in complexity, ranging from incremental calculations based on the presence or absence of structural elements (e.g., if something is present, parameter x increases; if absent, parameter y changes) to advanced machine learning algorithms that can detect more complex relationships.
The accuracy of predictions improves with the size of the dataset and the chemical diversity of the compounds covered. Today, predicting ADME properties using appropriate models is a core component of the R&D process. In most cases, it is integrated into the automated decision-making system that determines which molecules should be synthesized. For instance, if the model predicts that a molecule has highly unfavorable parameters, it is flagged accordingly and is only approved for synthesis in exceptional cases.
Given that this approach has already resulted in significant cost savings, several pharmaceutical companies have joined forces under the
Mellody project to share in-house testing data to expand their dataset.