Quantitative structure-property relationship analysis of critical properties of individual organic compounds
DOI:
https://doi.org/10.15276/opu.1.71.2025.24
Keywords:
critical properties, simplex approach, informational field of molecules, random forest method, QSPR
Abstract
This work presents the development and analysis of robust QSPR models for three critical properties: pressure (PC), temperature (TC), and volume (VC). The object of the study is a database of 399 organic compounds, including saturated and unsaturated hydrocarbons, aromatic hydrocarbons, heterocyclic compounds, alcohols, ethers, esters, and their derivatives. The database covers carbon-, halogen-, oxygen-, nitrogen-, and sulfur-containing compounds. The molecular structures of the investigated compounds were first modeled, standardized, and validated with respect to connectivity and uniqueness. Structural descriptors were calculated at the 2D level of molecular modeling using the simplex approach. Simplex vertices were differentiated not only by atom type but also by other atomic characteristics, including van der Waals interactions (from the universal force field) and potentials of informational fields weighted by atomic properties such as partial charge, polarizability, electronegativity, and lipophilicity. In addition to simplexes, smaller fragments of two and three atoms were also used for each compound in the database. In total, 4,939 2D structural descriptors were calculated. The Random Forest (RF) method was applied to establish the relationships between the structural descriptors and the critical properties (PC, TC, VC). The resulting 2D RF QSPR models showed high approximation accuracy (R² = 0.99) and predictive ability (R²oob = 0.90–0.97). Physicochemical interpretation revealed that electrostatic factors have the greatest impact on the critical properties. For compounds lacking experimental data on the studied properties, predictions obtained with the developed RF models are presented, and the applicability domain of the QSPR models was evaluated.
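The abstract reports two different figures of merit: R² for the fit to the training data and R²oob for out-of-bag (OOB) prediction. The sketch below illustrates, under stated assumptions, how the two can differ for a bagged ensemble. It is not the authors' model: the data are synthetic (one hypothetical descriptor instead of 4,939), and a toy 1-nearest-neighbor learner stands in for the decision trees of a Random Forest. All function and variable names are illustrative.

```python
import random


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def nearest_neighbor_predict(train, x):
    """Toy base learner (assumption: replaces a decision tree).
    Returns the target of the training point closest to x."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]


def bagged_r2(xs, ys, n_models=200, seed=0):
    """Return (R2 of the fit, R2 out-of-bag) for a bagged ensemble."""
    rng = random.Random(seed)
    n = len(xs)
    fit_sum = [0.0] * n   # predictions summed over all models
    oob_sum = [0.0] * n   # predictions summed over models that did not see i
    oob_cnt = [0] * n
    for _ in range(n_models):
        # Bootstrap sample: n draws with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        in_bag = set(idx)
        train = [(xs[i], ys[i]) for i in idx]
        for i in range(n):
            pred = nearest_neighbor_predict(train, xs[i])
            fit_sum[i] += pred
            if i not in in_bag:
                oob_sum[i] += pred
                oob_cnt[i] += 1
    fitted = [s / n_models for s in fit_sum]
    # Score OOB only on points left out by at least one model.
    kept = [i for i in range(n) if oob_cnt[i] > 0]
    oob_pred = [oob_sum[i] / oob_cnt[i] for i in kept]
    return r_squared(ys, fitted), r_squared([ys[i] for i in kept], oob_pred)


# Synthetic data: one descriptor x, property y = 3x + Gaussian noise.
data_rng = random.Random(42)
xs = [i / 10 for i in range(60)]
ys = [3.0 * x + data_rng.gauss(0.0, 0.3) for x in xs]

r2_fit, r2_oob = bagged_r2(xs, ys)
print(f"R2 (fit) = {r2_fit:.3f}, R2 (oob) = {r2_oob:.3f}")
```

The OOB score uses, for each point, only the ensemble members whose bootstrap sample did not contain that point, so it acts as a built-in estimate of predictive ability and is expected to be lower than the fit R².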

