Text Value and Linguistic Characterization in Chinese Language Literature Based on Text Mining Techniques

Cited by: 0
Authors
Liu M. [1]
Hu S. [1]
Qing W. [1]
Affiliations
[1] Department of Teacher Education, Nanchong Vocational and Technical College, Sichuan, Nanchong
Keywords
Chinese language; Labeled-LDA; literature; Semantic network; Textual disambiguation; Vector space modeling
DOI
10.2478/amns-2024-0486
Abstract
This study applies text mining techniques to analyze the text value and linguistic features of Chinese language and literature. It combines textual disambiguation, vector space modeling, semantic networks, and the Labeled-LDA model. Taking the novels of Yu Hua and Ge Fei as examples, it reveals differences between the two writers in linguistic features such as punctuation use, average word length, and sentence dispersion. The study also assigns each novel a comprehensive heat score based on three dimensions: reading base group, reading gain, and reading discussion. The results show that the frequency of period use is dispersed across Yu Hua’s works, whereas it is more concentrated in Ge Fei’s. Ge Fei’s average word length is slightly higher, indicating a tendency to use multi-syllabic words. Novel popularity and heat scores follow a power-law distribution, consistent with the Pareto rule that 80% of the popularity is concentrated in the hottest 20% of novels. By applying text mining technology, the study offers a new perspective on Chinese language and literature, and its methods and tools can improve the effectiveness and efficiency of teaching. © 2023 published by Sciendo.
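The abstract does not state how the measures were computed, so the following is only a minimal sketch of three quantities it mentions: period frequency, average word length, and the share of total heat held by the top 20% of novels. The function names, the per-1,000-character normalization, and the sample data are all assumptions made for illustration, and the tokens are assumed to be already word-segmented.

```python
import re


def period_rate(text: str) -> float:
    """Full-width periods (。) per 1,000 characters of text."""
    if not text:
        return 0.0
    return 1000 * text.count("。") / len(text)


def average_word_length(tokens: list[str]) -> float:
    """Mean length, in characters, of tokens that contain a word character."""
    # Punctuation-only tokens such as "。" are skipped.
    words = [t for t in tokens if re.search(r"\w", t)]
    return sum(len(w) for w in words) / len(words) if words else 0.0


def top_share(scores: list[float], fraction: float = 0.2) -> float:
    """Share of the total score held by the top `fraction` of items."""
    total = sum(scores)
    if not scores or total == 0:
        return 0.0
    k = max(1, int(len(scores) * fraction))  # at least one item
    return sum(sorted(scores, reverse=True)[:k]) / total


# Hypothetical, pre-segmented sample; a real pipeline would run a word segmenter.
sample_text = "他走了。她来了。"
sample_tokens = ["他", "走", "了", "。", "她", "来", "了", "。"]
sample_scores = [80.0, 5.0, 5.0, 5.0, 5.0]  # invented heat scores, not from the paper

print(period_rate(sample_text))
print(average_word_length(sample_tokens))
print(top_share(sample_scores))
```

Comparing `period_rate` across an author's novels would give a rough view of how dispersed or concentrated period use is, and a `top_share` value near 0.8 would match the Pareto pattern the abstract reports.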
Related papers
50 in total
  • [41] A Framework of Chinese Semantic Text Mining Based on Ontology Learning
    Zhang, Yu-feng
    Hu, Feng
    FOURTH INTERNATIONAL CONFERENCE ON MACHINE VISION (ICMV 2011): MACHINE VISION, IMAGE PROCESSING, AND PATTERN ANALYSIS, 2012, 8349
  • [42] Traditional Chinese Medicine Prescription Mining Based on Abstract Text
    Xie, Dan
    Pei, Wei
    Zhu, Weiwei
    Li, Xiaodong
    2017 IEEE 19TH INTERNATIONAL CONFERENCE ON E-HEALTH NETWORKING, APPLICATIONS AND SERVICES (HEALTHCOM), 2017
  • [43] Analysis on Chinese quantitative stylistic features based on text mining
    Hou, Renkui
    Jiang, Minghu
    DIGITAL SCHOLARSHIP IN THE HUMANITIES, 2016, 31 (02) : 357 - 367
  • [44] EMBERT: A Pre-trained Language Model for Chinese Medical Text Mining
    Cai, Zerui
    Zhang, Taolin
    Wang, Chengyu
    He, Xiaofeng
    WEB AND BIG DATA, APWEB-WAIM 2021, PT I, 2021, 12858 : 242 - 257
  • [45] Decarbonization of Turkey: Text Mining Based Topic Modeling for the Literature
    Yilmaz, Selin
    Yesil, Ercem
    Kaya, Tolga
    INTELLIGENT AND FUZZY SYSTEMS: DIGITAL ACCELERATION AND THE NEW NORMAL, INFUS 2022, VOL 2, 2022, 505 : 372 - 379
  • [46] Improving Literature-Based Discovery with Advanced Text Mining
    Korhonen, Anna
    Guo, Yufan
    Baker, Simon
    Yetisgen-Yildiz, Meliha
    Stenius, Ulla
    Narita, Masashi
    Lio, Pietro
    COMPUTATIONAL INTELLIGENCE METHODS FOR BIOINFORMATICS AND BIOSTATISTICS, CIBB 2014, 2015, 8623 : 89 - 98
  • [47] NATURAL PRODUCTS FOR DIABETIC KIDNEY DISEASE: TEXT MINING OF THE CHINESE HISTORICAL LITERATURE
    Zhang, La
    Yang, Lihong
    Shergis, Johannah
    Zhang, Anthony Lin
    Xue, Charlie Changli
    Liu, Xusheng
    Guo, Xinfeng
    Zhang, Lei
    AMERICAN JOURNAL OF KIDNEY DISEASES, 2020, 75 (04) : 662 - 662
  • [48] Irony Recognition in Chinese Text Based on Linguistic Features and Attention Mechanism
    Qiu, Xiaofeng
    CHINESE LEXICAL SEMANTICS, CLSW 2022, PT II, 2023, 13496 : 351 - 363
  • [49] Mining and Application of Tourism Online Review Text Based on Natural Language Processing and Text Classification Technology
    Xu, Hongsheng
    Lv, Yanqing
    WIRELESS COMMUNICATIONS & MOBILE COMPUTING, 2022, 2022
  • [50] Characterization of CSR, ESG, and Corporate Citizenship through a Text Mining-Based Review of Literature
    Park, Jong Gyu
    Park, Kijung
    Noh, Heena
    Kim, Yong Geun
    SUSTAINABILITY, 2023, 15 (05)