Text Value and Linguistic Characterization in Chinese Language Literature Based on Text Mining Techniques

Cited by: 0
Authors
Liu M. [1]
Hu S. [1]
Qing W. [1]
Affiliations
[1] Department of Teacher Education, Nanchong Vocational and Technical College, Sichuan, Nanchong
Keywords
Chinese language; Labeled-LDA; literature; Semantic network; Textual disambiguation; Vector space modeling
DOI
10.2478/amns-2024-0486
Abstract
This study applies text mining techniques to analyze the text value and linguistic features of Chinese language and literature. It combines textual disambiguation, vector space modeling, semantic networks, and the Labeled-LDA model. Taking the novels of Yu Hua and Ge Fei as examples, it reveals differences between the two writers in linguistic features such as punctuation use, average word length, and degree of sentence dispersion. The study also assigns each novel a comprehensive heat score based on three dimensions: reading base group, reading gain, and reading discussion. The results show that the frequency of period use is dispersed across Yu Hua’s works and more concentrated in Ge Fei’s. Ge Fei’s average word length is slightly higher, indicating a tendency to use multi-syllabic words. Novel popularity and heat scores follow a power-law distribution, consistent with the Pareto rule that 80% of the popularity is concentrated in 20% of the hot novels. By applying text mining technology, the study offers a new perspective on Chinese language and literature, and its methods and tools can improve the effectiveness and efficiency of teaching. © 2023 published by Sciendo.
Related papers
50 in total
  • [21] Working with Text. Tools, Techniques and Approaches for Text Mining
    Savoy, Jacques
    JOURNAL OF THE ASSOCIATION FOR INFORMATION SCIENCE AND TECHNOLOGY, 2018, 69 (01) : 181 - 184
  • [22] Status of text-mining techniques applied to biomedical text
    Erhardt, RAA
    Schneider, R
    Blaschke, C
    DRUG DISCOVERY TODAY, 2006, 11 (7-8) : 315 - 325
  • [23] Professional language in Swedish clinical text: Linguistic characterization and comparative studies
    Smith, Kelly
    Megyesi, Beata
    Velupillai, Sumithra
    Kvist, Maria
    NORDIC JOURNAL OF LINGUISTICS, 2014, 37 (02) : 297 - 323
  • [24] Compression techniques for Chinese text
    Vines, P
    Zobel, J
    SOFTWARE-PRACTICE & EXPERIENCE, 1998, 28 (12): 1299 - 1314
  • [25] Compression techniques for Chinese text
    Vines, Phil
    Zobel, Justin
    Software - Practice and Experience, 1998, 28 (12): 1299 - 1314
  • [26] Translation of Metaphorical Information in Japanese Literature Combined with Text Mining Techniques
    Wang, Tingting
    Applied Mathematics and Nonlinear Sciences, 2024, 9 (01)
  • [27] A Combination of Text Mining Techniques for Relevant Literature Search and Extractive Summarization
    Phongwattana, Thiptanawat
    Chan, Jonathan H.
    PROCEEDINGS OF THE 2018 2ND INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND INFORMATION RETRIEVAL (NLPIR 2018), 2018: 7 - 11
  • [28] Cognitive and linguistic factors affecting alphasyllabary language users comprehending Chinese text
    Shum, Mark Shiu Kee
    Ki, Wing Wah
    Leong, Che Kan
    READING IN A FOREIGN LANGUAGE, 2014, 26 (01): 153 - 175
  • [29] Text-mining Techniques and Tools for Systematic Literature Reviews: A Systematic Literature Review
    Feng, Luyi
    Chiam, Yin Kia
    Lo, Sin Kuang
    2017 24TH ASIA-PACIFIC SOFTWARE ENGINEERING CONFERENCE (APSEC 2017), 2017: 41 - 50
  • [30] TEXT-MESS: Intelligent, Interactive and Multilingual Text Mining based on Human Language Technologies
    Martinez-Barco, Patricio
    Palomar, Manuel
    Pla, Ferran
    Rosso, Paolo
    Gonzalo, Julio
    Penas, Anselmo
    Ageno, Alicia
    Turmo, Jordi
    Urena-Lopez, L. Alfonso
    Martin-Valdivia, Ma Teresa
    Marti, M. Antonia
    Taule, Mariona
    PROCESAMIENTO DEL LENGUAJE NATURAL, 2008, (41): 317 - 318