In line with our main goal, we carried out a ligand-based and chemometric study using six in-house databases of ABC, RND and MFS substrates. Calculations with FTrees helped us understand the scaffold similarities among the molecules that are substrates of ABC, RND and MFS pumps. Next, building a 2D model by logistic regression allowed us to identify the physico-chemical properties that can be used to classify molecules as substrates of these pumps. These models also allow us to screen for molecules that may be efflux pump substrates.
After 1 year, Vinicius has achieved the following goals:
- First, our team concentrated its efforts on compiling six solid in-house databases containing the largest possible number of distinct bacterial efflux pump substrates (ABC, RND and MFS), together with new antibiotic drugs. The databases were built from molecules reported in the medicinal chemistry, chemoinformatics and microbiology literature. To balance the number of active and inactive molecules, we used the Dark Chemical Matter small-molecule database. This database consists of molecules that have never shown any substantial biological activity despite having been tested in hundreds of high-throughput assays; we therefore treat them as our non-active molecules. Although the underlying principle is the same, the published studies do not evaluate the activity of the derivatives under the same conditions. We therefore divided the derivatives into two classes: Class 0, compounds with no affinity for the efflux pump; and Class 1, molecules that are substrates of the bacterial efflux pumps.
- Second, we wanted to observe the chemical similarities between the molecules in our databases. To this end, ligand-based and chemometric techniques were employed. For example, using FTrees we observed that the most frequent scaffold among the ABC substrates was the beta-lactam ring, present in more than 60% of the derivatives. We then turned our attention to the physico-chemical similarities. To this end, 1875 physico-chemical properties of the compounds were calculated with the PaDEL software, and we developed a KNIME workflow to evaluate the databases with a logistic regression model. The columns were normalized to z-scores and the variables were selected by forward feature selection. The models were validated by 10-fold cross-validation and a confusion matrix, both of which gave good scores.
- Third, we identified the physico-chemical parameters that allow us to correctly classify the molecules in our in-house databases as bacterial efflux pump substrates or non-substrates, with a good AUC (area under the curve). These parameters describe the lipophilicity, molecular weight, hydrogen bonds, electronic properties, scaffold and topology of the derivatives. With these models, we screened another in-house database of new antibiotics, synthesized to circumvent bacterial resistance, and observed that 72% of the proposed molecules could be substrates of ABC efflux pumps, 63% substrates of RND efflux pumps and 23% substrates of MFS efflux pumps.
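
For readers who want to see the validation scheme in concrete form, the following is a minimal Python sketch of the steps named in the second and third points: z-score normalization, a logistic model separating Class 1 from Class 0, 10-fold cross-validation, a confusion matrix and the AUC. It is not the KNIME workflow used in this work. The data are synthetic, the two descriptor columns are hypothetical stand-ins for PaDEL descriptors, all function names are illustrative, and forward feature selection is left out for brevity. The descriptors are also normalized once before splitting, which is a simplification; a stricter protocol would normalize inside each fold.

```python
import math
import random
from statistics import mean, pstdev


def zscore_columns(rows):
    """Rescale each descriptor column to zero mean and unit standard deviation."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]


def predict_proba(x, w, b):
    """Logistic model: estimated probability that a compound is Class 1."""
    z = b + sum(wj * xj for wj, xj in zip(w, x))
    z = max(-30.0, min(30.0, z))  # clamp so exp() cannot overflow
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=300):
    """Fit weights and bias by batch gradient descent on the mean log-loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(xi, w, b) - yi
            grad_b += err
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def cross_validate(X, y, k=10, seed=0):
    """Return out-of-fold Class 1 probabilities from k-fold cross-validation."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    scores = [0.0] * len(X)
    for f in range(k):
        held_out = set(idx[f::k])
        train = [i for i in idx if i not in held_out]
        w, b = fit_logistic([X[i] for i in train], [y[i] for i in train])
        for i in held_out:
            scores[i] = predict_proba(X[i], w, b)
    return scores


def confusion_matrix(scores, labels, threshold=0.5):
    """Count true/false positives and negatives at the given threshold."""
    cm = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for s, l in zip(scores, labels):
        pred = 1 if s >= threshold else 0
        key = ("t" if pred == l else "f") + ("p" if pred == 1 else "n")
        cm[key] += 1
    return cm


def roc_auc(scores, labels):
    """AUC as the fraction of (Class 1, Class 0) pairs ranked correctly."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Synthetic toy data (NOT from the in-house databases): two hypothetical
# descriptors per compound, 60 "substrates" and 60 "non-substrates".
rng = random.Random(42)
raw = [[rng.gauss(3.0, 1.0), rng.gauss(3.0, 1.0)] for _ in range(60)]   # Class 1
raw += [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(60)]  # Class 0
y = [1] * 60 + [0] * 60

X = zscore_columns(raw)
scores = cross_validate(X, y, k=10)
cm = confusion_matrix(scores, y)
accuracy = (cm["tp"] + cm["tn"]) / len(y)
auc = roc_auc(scores, y)
print(cm, round(accuracy, 3), round(auc, 3))
```

Pooling the out-of-fold probabilities before computing the confusion matrix and AUC gives a single summary over all ten folds; averaging per-fold values is an equally common alternative.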