English-Vietnamese cross-language paraphrase identification using hybrid feature classes

Cited by: 3
Authors
Dien Dinh [1]
Nguyen Le Thanh [1]
Affiliations
[1] VNU HCM, Univ Sci, Ho Chi Minh City, Vietnam
Keywords
Paraphrase identification; Semantic similarity; Cross-language; BabelNet; Vietnamese
DOI
10.1007/s10732-019-09411-2
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Paraphrase identification plays an important role in many natural language processing applications, such as machine translation, bilingual information retrieval, and plagiarism detection. With the growth of information technology and the Internet, texts need to be compared not only within a single language but also across many language pairs. In Vietnam in particular, detecting paraphrases in English-Vietnamese sentence pairs is in high demand because English is one of the most popular foreign languages in the country. However, in-depth studies on cross-language paraphrase identification between English and Vietnamese are still limited. In this paper, we therefore propose a method for identifying English-Vietnamese cross-language paraphrases using hybrid feature classes. These classes are computed using a fuzzy-based method and a Siamese recurrent model, and are then combined with a mathematical formula to produce the final result. The experimental results show that our model achieves an F-measure of 87.4%.
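As a rough illustration of the last two steps mentioned in the abstract (combining two scores and reporting an F-measure), a minimal Python sketch follows. The weighted average, the weight `alpha`, the `threshold`, and every function name here are assumptions made for illustration only; the abstract does not state the authors' actual combination formula or evaluation code.

```python
def combine_scores(fuzzy_score, siamese_score, alpha=0.5):
    """Hypothetical combination: weighted average of two scores in [0, 1].

    This is NOT the paper's formula, which the abstract does not give.
    """
    return alpha * fuzzy_score + (1.0 - alpha) * siamese_score


def is_paraphrase(fuzzy_score, siamese_score, alpha=0.5, threshold=0.5):
    """Label a sentence pair as a paraphrase if the combined score
    reaches an assumed decision threshold."""
    return combine_scores(fuzzy_score, siamese_score, alpha) >= threshold


def f_measure(predicted, actual):
    """Balanced F-measure (F1) over parallel lists of boolean labels."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)        # true positives
    fp = sum(1 for p, a in pairs if p and not a)    # false positives
    fn = sum(1 for p, a in pairs if not p and a)    # false negatives
    if tp == 0:
        return 0.0  # precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

In practice, `alpha` and `threshold` would be tuned on held-out data; the values shown are placeholders.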
Pages: 193-209
Number of pages: 17