The research of estimation model for the correlativity between words in Chinese text

Cited by: 0
Authors
Zhang, YS [1]
Cao, YD [1]
Chen, LC [1]
Affiliations
[1] Beijing Inst Technol, Dept Comp Sci & Engn, Beijing 100081, Peoples R China
Keywords
context correlativity; language environment related model; bi-orderly-neighbor model; mutual information;
DOI
Not available
Chinese Library Classification number
TH7 [Instruments and meters];
Discipline classification codes
0804 ; 080401 ; 081102 ;
Abstract
This article discusses the statistical analysis and use of relations between words in Chinese text. Having established the importance of inter-word relations, we investigate the characteristics of measures such as mutual information and related degree, and construct a cascade estimation model that describes the bi-orderly-neighbor relation between neighboring words in Chinese. Then, based on the characteristics of Chinese text proofreading, distance information between words and context information of the current word are combined with the N-gram model, yielding a novel model of correlativity between words based on the language environment. Finally, the two models are applied to automatic proofreading of Chinese text, and the corresponding experimental data and results are presented. The results show that the language environment related model describes the relation between words better than other measures such as mutual information or related degree, and that it performs well in the automatic detection of errors in Chinese text.
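As an illustration of the baseline measure named in the abstract, the sketch below computes pointwise mutual information for ordered pairs of adjacent words. It is not the authors' cascade or language environment model, which this record does not specify. The function name `adjacent_pmi`, the toy corpus, and the assumption that the text is already segmented into words are all hypothetical.

```python
import math
from collections import Counter


def adjacent_pmi(sentences):
    """Return pointwise mutual information for each ordered adjacent word pair.

    sentences: list of sentences, each a list of already-segmented words
    (segmentation is assumed to have been done elsewhere).
    """
    unigrams = Counter()
    bigrams = Counter()
    for words in sentences:
        unigrams.update(words)
        # Pair each word with its right-hand neighbor; order is preserved.
        bigrams.update(zip(words, words[1:]))

    total_unigrams = sum(unigrams.values())
    total_bigrams = sum(bigrams.values())

    pmi = {}
    for (left, right), count in bigrams.items():
        p_pair = count / total_bigrams
        p_left = unigrams[left] / total_unigrams
        p_right = unigrams[right] / total_unigrams
        # PMI = log2( P(left, right) / (P(left) * P(right)) )
        pmi[(left, right)] = math.log2(p_pair / (p_left * p_right))
    return pmi


# Hypothetical toy corpus, for illustration only.
corpus = [
    ["我们", "学习", "中文"],
    ["我们", "学习", "数学"],
    ["他们", "学习", "中文"],
]
scores = adjacent_pmi(corpus)
```

In a proofreading setting, one plausible use (an assumption, not a claim about the paper) is to flag adjacent pairs that never occur in the reference corpus or whose score falls below a chosen threshold.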
Pages: 1174-1178
Page count: 5