A method of Chinese and Thai cross-lingual query expansion based on comparable corpus

Authors
Tang P. [1]
Zhao J. [2]
Yu Z. [1]
Wang Z. [1]
Xian Y. [1]
Affiliations
[1] Intelligent Information Processing Laboratory of Yunnan Province, Key Laboratory in Regional University of Yunnan Province, Kunming University of Science and Technology, Yunnan
[2] National University of Defense Technology, Hunan
Source
Yu, Zhengtao (ztyu@hotmail.com) | 2017 / Korea Information Processing Society / No. 13
Keywords
Comparable corpus; Cross-language information retrieval; Cross-language query expansion; Words relationship
DOI
10.3745/JIPS.04.0039
Chinese Library Classification (CLC) number
G252.7 [Document retrieval]; G354 [Information retrieval]
Abstract
Cross-lingual query expansion is usually based on relationships among monolingual words. A bilingual comparable corpus contains relationships among bilingual words. This paper therefore proposes a query expansion method based on these bilingual relationships. First, word vectors that characterize the bilingual words are trained on a Chinese-Thai bilingual comparable corpus. Next, the correlation between each Chinese query word and the Thai words is computed from these word vectors, and Thai candidate expansion terms are selected according to the correlation values. Then, multiple groups of Thai query expansion sentences are built from the Thai candidate expansion words on the basis of the Chinese query sentence. Finally, the optimal sentence is obtained using the Chinese and Thai query expansion method and is used to perform the Thai query expansion. Experimental results show that the proposed cross-lingual query expansion method can effectively improve the accuracy of Chinese-Thai cross-language information retrieval. © 2017 KIPS.
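The candidate-selection step described in the abstract can be illustrated with a minimal Python sketch. It rests on assumptions that the abstract does not confirm: the correlation measure is taken to be cosine similarity, the vectors are made-up toy values in a shared space, and the names `cosine`, `candidate_expansion_terms`, `zh_query_word`, and `th_word_a` through `th_word_c` are hypothetical placeholders rather than anything from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed correlation measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # a zero vector carries no direction, so treat it as uncorrelated
    return dot / (norm_u * norm_v)


def candidate_expansion_terms(query_vec, thai_vectors, k=2):
    """Return the k Thai words most correlated with one Chinese query word."""
    scored = [(word, cosine(query_vec, vec)) for word, vec in thai_vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy vectors, invented for illustration only. In the paper's setting they would
# come from training on the Chinese-Thai comparable corpus.
zh_vectors = {"zh_query_word": [0.9, 0.1, 0.0]}
thai_vectors = {
    "th_word_a": [0.8, 0.2, 0.1],
    "th_word_b": [0.0, 1.0, 0.0],
    "th_word_c": [0.7, 0.0, 0.3],
}

top_terms = candidate_expansion_terms(zh_vectors["zh_query_word"], thai_vectors, k=2)
for word, score in top_terms:
    print(f"{word}\t{score:.3f}")
```

With these toy values, `th_word_a` and `th_word_c` rank above `th_word_b`. The later steps in the abstract (building several candidate Thai query sentences and choosing the optimal one) are not covered by this sketch.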
Pages: 805-817
Page count: 12