Text Mining and Analysis of Treatise on Febrile Diseases Based on Natural Language Processing

Authors
Kai Zhao [1]
Na Shi [1]
Zhen Sa [1]
Hua-Xing Wang [1]
Chun-Hua Lu [2]
Xiao-Ying Xu [1]
Affiliations
[1] School of Traditional Chinese Medicine, Beijing University of Chinese Medicine
[2] School of Life Science, Beijing University of Chinese Medicine
Keywords
Knowledge discovery; natural language processing; text mining; traditional Chinese medicine literature; Treatise on Febrile Diseases
DOI
Not available
Chinese Library Classification (CLC) number
R441.3 [Fever]; TP391.1 [Text information processing]
Discipline classification codes
081203; 0835; 100208
Abstract
Objective: This paper uses natural language processing (NLP) to analyze and process the text of the "Treatise on Febrile Diseases" (TFDs) in order to extract important information, as an attempt to apply NLP to text mining of traditional Chinese medicine (TCM) literature. Materials and Methods: The experiment was implemented in Python and used NLP toolkits including Jieba, nltk, gensim, and sklearn, together with Excel and Word. The text of TFDs was cleaned, segmented, and stripped of stop words in sequence; word frequency statistics and analysis, keyword extraction, named entity recognition (NER), and other operations were then performed, and finally text similarity was calculated. Results: Jieba accurately identified the herbal names in TFDs. Word frequency statistics based on the word segmentation showed that "warm therapy" is an important treatment in TFDs and that Guizhi decoction is the main prescription, and five core decoctions were identified. Keyword extraction based on the term frequency-inverse document frequency (TF-IDF) algorithm performed well. The accuracy of NER on TFDs was about 86%. In the similarity calculation with a latent semantic indexing model, "Understanding of Synopsis of Golden Chamber (SGC)" was much more similar to SGC than to TFDs. These results met expectations. Conclusions: This work lays a research foundation for applying NLP to text mining of unstructured TCM literature. Combined with deep learning, NLP, as an important branch of artificial intelligence, should have broader application prospects in text mining of TCM literature, construction of TCM knowledge graphs, and TCM knowledge services.
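The steps named in the abstract (stop-word removal, word frequency statistics, TF-IDF keyword weighting, and text similarity) can be illustrated with a minimal, standard-library-only Python sketch. It is not the authors' code. The documents, tokens, and stop-word list are hypothetical toy data, the tokens are assumed to be already segmented (the paper uses Jieba for that step), and plain cosine similarity over TF-IDF vectors stands in for the latent semantic indexing model used in the paper.

```python
import math
from collections import Counter

# Hypothetical, pre-segmented toy documents (NOT taken from the treatise).
# In the paper's pipeline a segmenter such as Jieba would produce the tokens.
docs = {
    "doc_a": ["guizhi", "decoction", "warm", "therapy", "the", "guizhi"],
    "doc_b": ["guizhi", "decoction", "sweat", "the", "fever"],
    "doc_c": ["golden", "chamber", "synopsis", "the", "decoction"],
}
STOPWORDS = {"the"}  # illustrative stop-word list, an assumption


def remove_stopwords(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOPWORDS]


def tf_idf(corpus):
    """Return {doc_id: {term: weight}} with weight = tf * log(N / df)."""
    n_docs = len(corpus)
    df = Counter()
    for tokens in corpus.values():
        df.update(set(tokens))  # count each term once per document
    weights = {}
    for doc_id, tokens in corpus.items():
        counts = Counter(tokens)
        total = len(tokens)
        weights[doc_id] = {
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        }
    return weights


def cosine(u, v):
    """Cosine similarity between two sparse term-weight dictionaries."""
    dot = sum(u[t] * v[t] for t in set(u) & set(v))
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


cleaned = {name: remove_stopwords(tokens) for name, tokens in docs.items()}
word_freq = Counter(t for tokens in cleaned.values() for t in tokens)
weights = tf_idf(cleaned)
sim_ab = cosine(weights["doc_a"], weights["doc_b"])
sim_ac = cosine(weights["doc_a"], weights["doc_c"])
```

In this toy corpus a term that occurs in every document ("decoction") receives a TF-IDF weight of zero, so it contributes nothing to keyword ranking or similarity; doc_a and doc_b are related only through the rarer shared term "guizhi".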
Pages: 67-73
Page count: 7