Treat Molecular Linear Notations as Sentences: Accurate Quantitative Structure-Property Relationship Modeling via a Natural Language Processing Approach
Cited by: 7
Authors:
Zhou, Zhengtao [1]
Eden, Mario [2]
Shen, Weifeng [1]
Affiliations:
[1] Chongqing Univ, Sch Chem & Chem Engn, Chongqing 400044, Peoples R China
[2] Auburn Univ, Dept Chem Engn, Auburn, AL 36849 USA
Keywords: WATER PARTITION-COEFFICIENTS; ORGANIC-COMPOUNDS; DRUG DISCOVERY; PREDICTION; SMILES; QSPRS
DOI: 10.1021/acs.iecr.2c04070
Chinese Library Classification number: TQ [Chemical Industry]
Discipline classification code: 0817
Abstract:
Quantitative structure-property relationship (QSPR) modeling estimates molecular properties from structural information and is widely applied in the search for new solvents, pharmaceuticals, and materials with desired properties. In QSPR modeling, the "simplified molecular input line-entry system" (SMILES) is a popular molecular representation with its own vocabulary and syntax. Herein, SMILES is considered a chemical language, and each SMILES notation is treated as a sentence. A deep pyramid convolutional neural network architecture is constructed to extract information from SMILES "sentences", and a feed-forward neural network is used for the property correlation. A case study predicting the logarithm of the octanol-water partition coefficient is conducted to demonstrate the effectiveness of the proposed approach. Compared with a precedent reference model, the developed QSPR models perform better, which offers insights into applying natural language processing technologies to molecular information mining and to the exploration of chemical property space.
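To make the "SMILES notation as a sentence" idea concrete, the following is a minimal Python sketch of how a SMILES string might be split into tokens ("words") and mapped to a padded integer sequence of the kind a convolutional text model takes as input. The regular expression, the `<pad>` and `<unk>` tokens, the function names, and the three example molecules are all assumptions chosen for illustration; the abstract does not specify the tokenization actually used in the paper.

```python
import re

# Assumed token pattern (not taken from the paper): bracket atoms, two-letter
# halogens, organic-subset atoms, aromatic atoms, two-digit ring closures,
# single-digit ring closures, then bond, branch, and miscellaneous symbols.
SMILES_TOKEN = re.compile(
    r"\[[^\]]+\]|Br|Cl|[BCNOPSFI]|[bcnops]|%\d{2}|\d|[=#\-\+\\/\(\)\.:~@\$\*]"
)


def tokenize(smiles):
    """Split a SMILES string into tokens, rejecting unrecognized characters."""
    tokens = SMILES_TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"unrecognized characters in SMILES: {smiles!r}")
    return tokens


def build_vocab(corpus):
    """Assign an integer id to every token seen in the corpus."""
    vocab = {"<pad>": 0, "<unk>": 1}  # hypothetical special tokens
    for smiles in corpus:
        for tok in tokenize(smiles):
            vocab.setdefault(tok, len(vocab))
    return vocab


def encode(smiles, vocab, max_len):
    """Map a SMILES string to a fixed-length list of token ids."""
    ids = [vocab.get(tok, vocab["<unk>"]) for tok in tokenize(smiles)]
    ids = ids[:max_len]  # truncate long "sentences"
    return ids + [vocab["<pad>"]] * (max_len - len(ids))  # pad short ones


# Illustrative molecules: ethanol, benzene, acetyl chloride.
corpus = ["CCO", "c1ccccc1", "CC(=O)Cl"]
vocab = build_vocab(corpus)
encoded = [encode(s, vocab, max_len=10) for s in corpus]
```

Matching `Cl` and `Br` before single-letter atoms keeps chlorine and bromine from being read as carbon or boron followed by a stray letter, which is the main way character-level splitting of SMILES goes wrong. The resulting id sequences would typically be passed through an embedding layer before any convolutional feature extraction.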