A method of Chinese and Thai cross-lingual query expansion based on comparable corpus

Cited by: 0
Authors
Tang P. [1]
Zhao J. [2]
Yu Z. [1]
Wang Z. [1]
Xian Y. [1]
Affiliations
[1] Intelligent Information Processing Laboratory of Yunnan Province, Key Laboratory in Regional University of Yunnan Province, Kunming University of Science and Technology, Yunnan
[2] National University of Defense Technology, Hunan
Source
Yu, Zhengtao (ztyu@hotmail.com) | 2017 / Korea Information Processing Society, Vol. / No. 13
Keywords
Comparable corpus; Cross-language information retrieval; Cross-language query expansion; Words relationship
DOI
10.3745/JIPS.04.0039
Chinese Library Classification number
G252.7 [Document retrieval]; G354 [Information retrieval]
Discipline classification code
Abstract
Cross-lingual query expansion is usually based on the relationships among monolingual words. A bilingual comparable corpus, however, contains relationships among bilingual words. This paper therefore proposes a query expansion method based on these bilingual relationships. First, word vectors that characterize the bilingual words are trained on a Chinese and Thai bilingual comparable corpus. Then, the correlation between Chinese query words and Thai words is computed from these word vectors, and Thai candidate expansion terms are selected according to the correlation values. Next, multiple groups of Thai query expansion sentences are built from the Thai candidate expansion words based on the Chinese query sentence. Finally, the optimal sentence is obtained using the Chinese and Thai query expansion method, and the Thai query expansion is performed. Experimental results show that the proposed cross-lingual query expansion method can effectively improve the accuracy of Chinese and Thai cross-language information retrieval. © 2017 KIPS.
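The candidate-selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which correlation measure the authors use, so cosine similarity is assumed here; the three-dimensional vectors, the example words, and the function names `cosine` and `top_k_expansions` are all hypothetical and serve only to show the idea of ranking Thai words against a Chinese query word in a shared vector space.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_k_expansions(query_vec, thai_vectors, k=2):
    """Rank Thai words by similarity to the query vector and keep the best k."""
    scored = [(word, cosine(query_vec, vec)) for word, vec in thai_vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy vectors invented for illustration; real vectors would come from
# training on the Chinese and Thai comparable corpus.
zh_vectors = {"大象": [0.9, 0.1, 0.0]}  # "elephant"
th_vectors = {
    "ช้าง": [0.8, 0.2, 0.1],    # "elephant"
    "สัตว์": [0.5, 0.5, 0.1],   # "animal"
    "รถยนต์": [0.0, 0.1, 0.9],  # "car"
}

candidates = top_k_expansions(zh_vectors["大象"], th_vectors, k=2)
for word, score in candidates:
    print(word, round(score, 3))
```

With these toy values the two animal-related Thai words rank above the unrelated one, which is the behavior the selection step relies on. Building and scoring the expanded Thai query sentences is a separate step that this sketch does not cover.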
Pages: 805 - 817
Page count: 12