QMugs 1.1: Quantum mechanical properties of organic compounds commonly encountered in reactivity datasets

Cited by: 6
Authors
Neeser, Rebecca M. [1 ,2 ]
Isert, Clemens [1 ]
Stuyver, Thijs [2 ,3 ]
Schneider, Gisbert [1 ,4 ]
Coley, Connor W. [2 ,5 ]
Affiliations
[1] Swiss Fed Inst Technol, Inst Pharmaceut Sci, Vladimir Prelog Weg 4, CH-8093 Zurich, Switzerland
[2] MIT, Dept Chem Engn, 77 Massachusetts Ave, Cambridge, MA 02139 USA
[3] Univ PSL, Ecole Natl Super Chim Paris, Inst Chem Life & Hlth Sci, CNRS, F-75005 Paris, France
[4] ETH Singapore SEC Ltd, 1 CREATE Way,06-01 CREATE Tower, Singapore 138602, Singapore
[5] MIT, Dept Elect Engn, 77 Massachusetts Ave, Cambridge, MA 02139 USA
Source
Chemical Data Collections | 2023 / Volume 46
Funding
Swiss National Science Foundation;
Keywords
DFT; Quantum chemistry; High-throughput screening; Machine learning; Chemical reactivity; DESIGN; CHEMBL; GRAPH;
DOI
10.1016/j.cdc.2023.101040
Chinese Library Classification
O6 [Chemistry];
Discipline classification code
0703;
Abstract
Here, the Quantum Mechanical Properties of Drug-like Molecules (QMugs) dataset is expanded to facilitate its use as training data for surrogate machine learning models to predict quantum mechanical properties for tasks related to chemical reactivity. Small molecules from reaction databases as well as charged and boron-containing compounds from ChEMBL were added. Each of these compounds was passed through a pipeline of MMFF94s/UFF conformer generation, followed by GFN2-xTB optimization and finally a density functional theory single-point calculation at the ωB97X-D/def2-SVP level of theory. In total, 71,632 new molecules were evaluated in this manner. Steric (SASA) and dispersion (P_int) descriptors were computed at the semiempirical GFN2-xTB level of theory for the lowest energy conformer of all species in the enlarged QMugs dataset. The expanded dataset aims to facilitate the construction of surrogate models of much broader scope than the original QMugs dataset, which was limited to biologically active compounds.
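The staged workflow described in the abstract can be pictured as an ordered list of steps followed by a lowest-energy conformer selection. The sketch below is a minimal, standard-library illustration under stated assumptions: the names `Stage`, `PIPELINE`, and `lowest_energy_conformer` are hypothetical, the example energies are invented placeholders, and no quantum chemistry back-end is called. It is not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str    # hypothetical label for the step
    method: str  # level of theory quoted in the abstract


# Order of the steps as stated in the abstract.
PIPELINE = (
    Stage("conformer_generation", "MMFF94s/UFF"),
    Stage("geometry_optimization", "GFN2-xTB"),
    Stage("single_point", "ωB97X-D/def2-SVP"),
)


def lowest_energy_conformer(energies):
    """Return the ID of the conformer with the minimum energy.

    `energies` maps a conformer ID to an energy; all values must
    be expressed in the same unit.
    """
    if not energies:
        raise ValueError("no conformers supplied")
    return min(energies, key=energies.get)


# Placeholder energies, invented for illustration only (not dataset values).
example = {"conf_0": -12.30, "conf_1": -12.45, "conf_2": -12.41}
best = lowest_energy_conformer(example)
```

In this sketch, `best` identifies the conformer for which descriptors such as SASA and P_int would then be computed; how the authors actually rank conformers is not specified in the abstract.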
Pages: 7