Prediction of aqueous solubility from SCRATCH

Parijat Jain, Samuel H. Yalkowsky

Research output: Contribution to journal › Article › peer-review

20 Scopus citations


This study proposes the SCRATCH model for estimating the aqueous solubility of a compound directly from its structure. The algorithm uses predicted melting points and predicted aqueous activity coefficients. It relies on two additive, constitutive molecular descriptors (enthalpy of melting and aqueous activity coefficient) and two non-additive molecular descriptors (symmetry and flexibility); the latter two are used to determine the entropy of melting. The melting point prediction is trained on over 2200 compounds, whereas the aqueous activity coefficient prediction is trained on about 1640 compounds, making the model very rigorous and robust. The aqueous solubility prediction is validated using 10-fold cross-validation on a dataset of 883 compounds. A comparison with the general solubility equation (GSE) suggests that the SCRATCH-predicted aqueous solubilities have a slightly greater average absolute error. This could result from the fact that SCRATCH uses two predicted parameters, whereas the GSE uses one measured property, the melting point. Although the GSE is simpler to use, its drawback of requiring an experimental melting point is overcome in SCRATCH, which can predict the aqueous solubility of a compound based solely on its structure, with no experimental values.
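The abstract does not state the equations, so the following is only a minimal sketch of how the pieces it names could fit together. It assumes the standard thermodynamic relations: the melting point as the ratio of enthalpy to entropy of melting, and a crystal term of the form -ΔSm(Tm - T)/(2.303RT) combined with a log activity coefficient. The GSE is written in its commonly cited form, log Sw = 0.5 - 0.01(MP - 25) - log Kow. All function names, parameter names, and numeric inputs are hypothetical illustrations, not the authors' implementation, and the descriptor-based prediction of ΔHm, ΔSm, and the activity coefficient is not reproduced here.

```python
import math

R = 8.314        # gas constant, J/(mol K)
T_ROOM = 298.15  # reference temperature, K (25 °C)


def melting_point_k(dh_m, ds_m):
    """Melting point (K) as enthalpy of melting (J/mol) over entropy of melting (J/(mol K))."""
    return dh_m / ds_m


def crystal_term(ds_m, tm_k, t_k=T_ROOM):
    """log10 of the ideal (crystal) solubility term; zero for compounds that are liquid at t_k."""
    if tm_k <= t_k:
        return 0.0
    return -ds_m * (tm_k - t_k) / (math.log(10) * R * t_k)


def log_solubility_from_structure(dh_m, ds_m, log_gamma_w):
    """Assumed combination: crystal term minus log10 aqueous activity coefficient.

    The result is on the same concentration scale as the activity coefficient
    (mole fraction if log_gamma_w is mole-fraction based).
    """
    tm_k = melting_point_k(dh_m, ds_m)
    return crystal_term(ds_m, tm_k) - log_gamma_w


def gse_log_sw(mp_celsius, log_kow):
    """General solubility equation (molar scale), using a measured melting point in °C."""
    return 0.5 - 0.01 * max(mp_celsius - 25.0, 0.0) - log_kow


def average_absolute_error(predicted, observed):
    """Mean of |predicted - observed|, the error measure named in the abstract."""
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(predicted)


if __name__ == "__main__":
    # Hypothetical inputs chosen only to exercise the functions.
    print(melting_point_k(28250.0, 56.5))
    print(log_solubility_from_structure(28250.0, 56.5, 3.0))
    print(gse_log_sw(125.0, 2.0))
```

With an entropy of melting of 56.5 J/(mol K), the crystal term comes out close to -0.01 per degree above 25 °C, which is how the two approaches relate under these assumptions: the GSE fixes the entropy of melting and takes a measured melting point, while a structure-only scheme has to predict both quantities.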

Original language: English (US)
Pages (from-to): 1-5
Number of pages: 5
Journal: International Journal of Pharmaceutics
Issue number: 1-2
State: Published - Jan 29 2010


Keywords

  • Activity coefficient
  • Cross-validation
  • GSE
  • Melting point
  • Model
  • Prediction
  • Solubility

ASJC Scopus subject areas

  • Pharmaceutical Science

