English-Vietnamese cross-language paraphrase identification using hybrid feature classes

Cited by: 3
Authors
Dien Dinh [1]
Nguyen Le Thanh [1]
Affiliations
[1] VNU HCM, Univ Sci, Ho Chi Minh City, Vietnam
Keywords
Paraphrase identification; Semantic similarity; Cross-language; BabelNet; Vietnamese
DOI
10.1007/s10732-019-09411-2
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Paraphrase identification plays an important role in natural language processing applications such as machine translation, bilingual information retrieval, and plagiarism detection. With the development of information technology and the Internet, text comparison is required not only within a single language but also across many language pairs. In Vietnam in particular, detecting paraphrases in English-Vietnamese sentence pairs is in high demand because English is one of the most popular foreign languages in the country. However, in-depth studies on cross-language paraphrase identification between English and Vietnamese are still limited. Therefore, in this paper, we propose a method to identify English-Vietnamese cross-language paraphrases using hybrid feature classes. These classes are calculated using a fuzzy-based method and a siamese recurrent model, and are then combined with a mathematical formula to obtain the final result. The experimental results show that our model achieves an F-measure of 87.4%.
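The abstract does not state the combining formula or how the F-measure was computed, so the following Python sketch is only an illustration under stated assumptions. It assumes the two feature classes each yield a similarity score in [0, 1] and combines them with a weighted average, a stand-in for the paper's unspecified formula. The names `combine_scores`, `f_measure`, and the parameter `weight` are hypothetical and do not come from the paper.

```python
from typing import Sequence


def combine_scores(fuzzy_score: float, siamese_score: float, weight: float = 0.5) -> float:
    """Weighted average of two similarity scores in [0, 1].

    Assumption: a weighted average stands in for the paper's
    combining formula, which the abstract does not specify.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    return weight * fuzzy_score + (1.0 - weight) * siamese_score


def f_measure(predicted: Sequence[bool], gold: Sequence[bool]) -> float:
    """Balanced F-measure (F1) for binary paraphrase labels."""
    pairs = list(zip(predicted, gold))
    true_pos = sum(1 for p, g in pairs if p and g)
    false_pos = sum(1 for p, g in pairs if p and not g)
    false_neg = sum(1 for p, g in pairs if g and not p)
    if true_pos == 0:
        # No correct positive predictions: define F1 as 0 to avoid dividing by zero.
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)
```

A sentence pair would be labelled a paraphrase when `combine_scores(...)` exceeds a chosen threshold, and `f_measure` then compares those labels against a gold standard. Both the threshold and the weight would need tuning on held-out data.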
Pages: 193 - 209
Number of pages: 17
Related papers
50 items in total
  • [31] CROSS-LANGUAGE CONCEPTUAL PRIMING IN ENGLISH-SPANISH BILINGUALS
    FRANCIS, WS
    BJORK, RA
    BULLETIN OF THE PSYCHONOMIC SOCIETY, 1992, 30 (06) : 479 - 479
  • [32] A framework for cross-language information access: Application to English and Japanese
    Jones, G
    Collier, N
    Sakai, T
    Sumita, K
    Hirakawa, H
    COMPUTERS AND THE HUMANITIES, 2001, 35 (04) : 371 - 388
  • [33] VOICE TIMING - CROSS-LANGUAGE EXPERIMENTS IN IDENTIFICATION AND DISCRIMINATION
    ABRAMSON, AS
    LISKER, L
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1968, 44 (01) : 377 - &
  • [34] Using feature films in language classes
    Seferoglu, Golge
    EDUCATIONAL STUDIES, 2008, 34 (01) : 1 - 9
  • [35] A Hybrid Cross-Language Name Matching Technique using Novel Modified Levenshtein Distance
    Medhat, Doaa
    Hassan, Ahmed
    Salama, Cherif
    2015 TENTH INTERNATIONAL CONFERENCE ON COMPUTER ENGINEERING & SYSTEMS (ICCES), 2015, : 204 - 209
  • [36] Cross-language French-English question answering using the DLT system at CLEF 2004
    Sutcliffe, RFE
    Gabbay, I
    Mulcahy, M
    O'Gorman, A
    MULTILINGUAL INFORMATION ACCESS FOR TEXT, SPEECH AND IMAGES, 2005, 3491 : 404 - 410
  • [37] Cross-language French-English question answering using the DLT system at CLEF 2003
    Sutcliffe, RFE
    Gabbay, I
    O'Gorman, A
    COMPARATIVE EVALUATION OF MULTILINGUAL INFORMATION ACCESS SYSTEMS, 2003, 3237 : 572 - 580
  • [38] Cross-language French-English question answering using the DLT system at CLEF 2005
    Sutcliffe, Richard F. E.
    Mulcahy, Michael
    Gabbay, Igal
    O'Gorman, Aoife
    Slattery, Darina
    ACCESSING MULTILINGUAL INFORMATION REPOSITORIES, 2006, 4022 : 502 - 509
  • [39] Applying content and language integrated learning in legal English classes: A Vietnamese perspective
    Nhac, Thanh-Huong
    ISSUES IN EDUCATIONAL RESEARCH, 2023, 33 (04) : 1513 - 1531
  • [40] Compiling Cross-Language Network Programs Into Hybrid Data Plane
    Li, Hao
    Zhang, Peng
    Sun, Guangda
    Cao, Wanyue
    Hu, Chengchen
    Shan, Danfeng
    Pan, Tian
    Fu, Qiang
    IEEE-ACM TRANSACTIONS ON NETWORKING, 2022, 30 (03) : 1088 - 1103