A method of Chinese and Thai cross-lingual query expansion based on comparable corpus

Authors
Tang P. [1 ]
Zhao J. [2 ]
Yu Z. [1 ]
Wang Z. [1 ]
Xian Y. [1 ]
Affiliations
[1] Intelligent Information Processing Laboratory of Yunnan Province, Key Laboratory in Regional University of Yunnan Province, Kunming University of Science and Technology, Yunnan
[2] National University of Defense Technology, Hunan
Source
Yu, Zhengtao (ztyu@hotmail.com) | Year 2017 / Volume: Korea Information Processing Society / Issue 13
Keywords
Comparable corpus; Cross-language information retrieval; Cross-language query expansion; Words relationship
DOI
10.3745/JIPS.04.0039
Chinese Library Classification number
G252.7 [Document retrieval]; G354 [Information retrieval]
Discipline classification code
Abstract
Cross-lingual query expansion is usually based on relationships among monolingual words. A bilingual comparable corpus contains relationships among bilingual words. This paper therefore proposes a query expansion method based on these bilingual relationships. First, word vectors that characterize the bilingual words are trained on a Chinese-Thai bilingual comparable corpus. Next, the correlation between Chinese query words and Thai words is computed from these word vectors, and Thai candidate expansion terms are selected according to the correlation value. Then, multiple groups of Thai query expansion sentences are built from the Thai candidate expansion words, based on the Chinese query sentence. Finally, the optimal sentence is obtained using the Chinese and Thai query expansion method and is used to perform the Thai query expansion. Experimental results show that the proposed cross-lingual query expansion method can effectively improve the accuracy of Chinese-Thai cross-language information retrieval. © 2017 KIPS.
Pages: 805-817
Number of pages: 12
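
The candidate-selection step described in the abstract (scoring Thai words against a Chinese query word using bilingual word vectors, then keeping the highest-scoring terms) might be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not specify the correlation measure, so cosine similarity is assumed here, and the function names `cosine` and `top_k_expansions`, the cutoff `k`, and the three-dimensional toy vectors are all hypothetical, not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (assumed correlation measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # a zero vector has no direction; treat as uncorrelated
    return dot / (norm_u * norm_v)


def top_k_expansions(query_vec, thai_vectors, k=2):
    """Return the k Thai words most similar to the query vector, best first."""
    scored = [(word, cosine(query_vec, vec)) for word, vec in thai_vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy vectors, invented for illustration; real vectors would be trained
# on a Chinese-Thai comparable corpus so both languages share one space.
zh_vectors = {"大象": [0.9, 0.1, 0.0]}  # "elephant"
th_vectors = {
    "ช้าง": [0.8, 0.2, 0.1],  # "elephant"
    "งา": [0.6, 0.1, 0.5],    # "tusk"
    "แมว": [0.0, 0.9, 0.3],   # "cat"
}

candidates = top_k_expansions(zh_vectors["大象"], th_vectors, k=2)
for word, score in candidates:
    print(word, round(score, 3))
```

With these toy vectors the two Thai words nearest to the query are kept as candidate expansion terms. The later steps in the abstract (building several expansion sentences and choosing the optimal one) are not covered by this sketch.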