Compound Classification and Consideration of Correlation with Chemical Descriptors from Articles on Antioxidant Capacity Using Natural Language Processing

Authors
Matsumoto, Yuto [1]
Gotoh, Hiroaki [1]
Affiliations
[1] Yokohama Natl Univ, Dept Chem & Life Sci, Yokohama 2408501, Japan
Keywords
SOAC VALUES; EXTRACTION; QUERCETIN
DOI
10.1021/acs.jcim.3c01826
Chinese Library Classification number
R914 [Medicinal Chemistry]
Discipline classification code
100701
Abstract
In recent times, there has been a substantial increase in the number of articles focusing on antioxidants. However, the development of a comprehensive estimator for antioxidant capacity remains elusive due to the challenge of integrating information from these articles. Furthermore, the complexity of the antioxidant mechanism, which involves a multitude of factors, makes it difficult to establish a simple equation or correlation. Hence, there is a pressing need for a model that can effectively interpret the collective knowledge from these articles, especially from a chemistry perspective. In this research, we employed natural language processing techniques, specifically Word2Vec, to analyze articles related to antioxidant capacity. We extracted representation vectors of compound names from these documents and organized them into 10 distinct clusters. In our investigation of two of these clusters, we unveiled that the majority of the compounds in question were flavonoids and flavonoid glycosides. To establish a link between the descriptors and clusters, we utilized kernel density estimation and generated scatter plots to visualize their similarity. These visualizations clearly indicated a strong relationship between the descriptors and clusters, affirming that a tangible connection exists between word vectors and compound descriptors through a document analysis conducted with natural language processing techniques. This study represents a pioneering approach that utilizes document analysis to shed light on the field of antioxidant capacity research, marking a significant advancement in this domain.
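The clustering and density-estimation steps described in the abstract can be illustrated with a small sketch. The abstract does not name the clustering algorithm or the kernel settings, so the code below assumes plain k-means and a Gaussian kernel with a fixed bandwidth. It also uses synthetic 2-D vectors as a stand-in for Word2Vec compound-name embeddings, and the descriptor values are made up for illustration; none of this is the authors' data or code.

```python
import math
import random


def kmeans(vectors, k, iters=50, seed=0):
    """Plain Lloyd's k-means (an assumed choice); returns one label per vector."""
    rng = random.Random(seed)
    centers = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assign each vector to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centers[c])))
            for v in vectors
        ]
        # Move each center to the mean of its members; empty clusters stay put.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the 1-D density of `samples` at x."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Synthetic stand-ins (hypothetical): two groups of "embedding" vectors and a
# made-up scalar descriptor whose distribution differs between the groups.
rng = random.Random(42)
group_a = [[rng.gauss(0.0, 0.1), rng.gauss(0.0, 0.1)] for _ in range(20)]
group_b = [[rng.gauss(5.0, 0.1), rng.gauss(5.0, 0.1)] for _ in range(20)]
vectors = group_a + group_b
descriptor = ([rng.gauss(300.0, 10.0) for _ in range(20)]
              + [rng.gauss(450.0, 10.0) for _ in range(20)])

# Cluster the vectors, then estimate the descriptor density inside each cluster.
labels = kmeans(vectors, k=2)
by_cluster = {}
for lab, value in zip(labels, descriptor):
    by_cluster.setdefault(lab, []).append(value)
densities = {lab: gaussian_kde(vals, bandwidth=15.0)
             for lab, vals in by_cluster.items()}

for lab in sorted(densities):
    print(lab, densities[lab](300.0), densities[lab](450.0))
```

If the clusters found in the vector space also separate the descriptor values, the per-cluster density curves peak in different regions, which is the kind of relationship the abstract reports between word vectors and compound descriptors.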
Pages: 119-127
Page count: 9