
pKalculator: A pKa predictor for C–H bonds

Rasmus M. Borup, Nicolai Ree, Jan H. Jensen*

*Corresponding author for this work

Research output: Contribution to journal › Journal article › Research › peer-review

4 Citations (Scopus)
37 Downloads (Pure)

Abstract

Determining the pKa values of various C–H sites in organic molecules offers valuable insights for synthetic chemists in predicting reaction sites. As molecular complexity increases, this task becomes more challenging. This paper introduces pKalculator, a quantum chemistry (QM)-based workflow for automatic computations of C–H pKa values, which is used to generate a training dataset for a machine learning (ML) model. The QM workflow is benchmarked against 695 experimentally determined C–H pKa values in DMSO. The ML model is trained on a diverse dataset of 775 molecules with 3910 C–H sites. Our ML model predicts C–H pKa values with a mean absolute error (MAE) and a root mean squared error (RMSE) of 1.24 and 2.15 pKa units, respectively. Furthermore, we employ our model on 1043 pKa-dependent reactions (aldol, Claisen, and Michael) and successfully indicate the reaction sites with a Matthews correlation coefficient (MCC) of 0.82.
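For readers unfamiliar with the three metrics quoted in the abstract, the following is a minimal sketch of how MAE, RMSE, and MCC are conventionally computed. It is not taken from the pKalculator code base: the function names and the toy numbers are hypothetical and chosen only for illustration, and the sketch assumes reaction-site prediction is scored as a binary label per C–H site.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error between reference and predicted pKa values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error; penalizes large deviations more than MAE."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mcc(labels, predictions):
    """Matthews correlation coefficient for binary labels (1 = reaction site)."""
    tp = sum(1 for l, p in zip(labels, predictions) if l and p)
    tn = sum(1 for l, p in zip(labels, predictions) if not l and not p)
    fp = sum(1 for l, p in zip(labels, predictions) if not l and p)
    fn = sum(1 for l, p in zip(labels, predictions) if l and not p)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Toy values for illustration only; these are not data from the paper.
reference_pka = [30.0, 25.0]
predicted_pka = [31.0, 23.0]
print(mae(reference_pka, predicted_pka), rmse(reference_pka, predicted_pka))

true_sites = [1, 1, 0, 0]       # hypothetical: which C-H sites actually react
predicted_sites = [1, 0, 0, 0]  # hypothetical: which sites a model flags
print(mcc(true_sites, predicted_sites))
```

An RMSE noticeably larger than the MAE, as in the reported 2.15 versus 1.24 pKa units, generally points to a minority of predictions with large errors. MCC ranges from -1 to 1 and remains informative when reacting sites are much rarer than non-reacting ones.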

Original language: English
Journal: Beilstein Journal of Organic Chemistry
Volume: 20
Pages (from-to): 1614-1622
Number of pages: 9
ISSN: 2195-951X
Publication status: Published - 2024

Bibliographical note

Funding Information:
This work was funded by the Independent Research Foundation Denmark (DFF; grant number 1032-00129B).

Publisher Copyright:
© 2024 Borup et al.

Keywords

  • pKa predictor
  • C–H pKa values
