TY - JOUR
T1 - RS-Predictor models augmented with SMARTCyp reactivities
T2 - robust metabolic regioselectivity predictions for nine CYP isozymes
AU - Zaretzki, Jed
AU - Rydberg, Patrik
AU - Bergeron, Charles
AU - Bennett, Kristin P
AU - Olsen, Lars
AU - Breneman, Curt Mark
PY - 2012
Y1 - 2012
N2 - RS-Predictor is a tool for creating pathway-independent, isozyme-specific site of metabolism (SOM) prediction models using any set of known cytochrome P450 substrates and metabolites. Until now, the RS-Predictor method was only trained and validated on CYP 3A4 data, but in the present study we report on the versatility of the RS-Predictor modeling paradigm by creating and testing regioselectivity models for substrates of the nine most important CYP isozymes. Through curation of source literature, we have assembled 680 substrates distributed among CYPs 1A2, 2A6, 2B6, 2C19, 2C8, 2C9, 2D6, 2E1 and 3A4, which we believe is the largest publicly accessible collection of P450 ligands and metabolites ever released. A comprehensive investigation into the importance of different descriptor classes for predicting the regioselectivity of each isozyme is made through the generation of multiple independent RS-Predictor models for each set of isozyme substrates. Two of these models include a DFT reactivity descriptor derived from SMARTCyp. Optimal combinations of RS-Predictor and SMARTCyp are shown to have stronger performance than either method alone, while also exceeding the accuracy of the commercial regioselectivity prediction methods distributed by StarDrop and Schrödinger, correctly identifying a large proportion of the metabolites in each substrate set within the top two rank-positions: 1A2(83.0%), 2A6(85.7%), 2B6(82.1%), 2C19(86.2%), 2C8(83.8%), 2C9(84.5%), 2D6(85.9%), 2E1(82.8%), 3A4(82.3%) and merged(86.0%). Comprehensive data mining of each substrate set and careful statistical analyses of the predictions made by the different models revealed new insights into molecular features that control metabolic regioselectivity and enable accurate prospective prediction of likely SOMs.
AB - RS-Predictor is a tool for creating pathway-independent, isozyme-specific site of metabolism (SOM) prediction models using any set of known cytochrome P450 substrates and metabolites. Until now, the RS-Predictor method was only trained and validated on CYP 3A4 data, but in the present study we report on the versatility of the RS-Predictor modeling paradigm by creating and testing regioselectivity models for substrates of the nine most important CYP isozymes. Through curation of source literature, we have assembled 680 substrates distributed among CYPs 1A2, 2A6, 2B6, 2C19, 2C8, 2C9, 2D6, 2E1 and 3A4, which we believe is the largest publicly accessible collection of P450 ligands and metabolites ever released. A comprehensive investigation into the importance of different descriptor classes for predicting the regioselectivity of each isozyme is made through the generation of multiple independent RS-Predictor models for each set of isozyme substrates. Two of these models include a DFT reactivity descriptor derived from SMARTCyp. Optimal combinations of RS-Predictor and SMARTCyp are shown to have stronger performance than either method alone, while also exceeding the accuracy of the commercial regioselectivity prediction methods distributed by StarDrop and Schrödinger, correctly identifying a large proportion of the metabolites in each substrate set within the top two rank-positions: 1A2(83.0%), 2A6(85.7%), 2B6(82.1%), 2C19(86.2%), 2C8(83.8%), 2C9(84.5%), 2D6(85.9%), 2E1(82.8%), 3A4(82.3%) and merged(86.0%). Comprehensive data mining of each substrate set and careful statistical analyses of the predictions made by the different models revealed new insights into molecular features that control metabolic regioselectivity and enable accurate prospective prediction of likely SOMs.
KW - Former Faculty of Pharmaceutical Sciences
U2 - 10.1021/ci300009z
DO - 10.1021/ci300009z
M3 - Journal article
C2 - 22524152
VL - 52
SP - 1637
EP - 1659
JO - Journal of Chemical Information and Modeling
JF - Journal of Chemical Information and Modeling
SN - 1549-9596
IS - 6
ER -