Fractional Similarity: Cross-Lingual Feature Selection for Search

Cited by: 0
Authors
Jagarlamudi, Jagadeesh [1]
Bennett, Paul N. [2]
Affiliations
[1] Univ Maryland, College Pk, MD 20742 USA
[2] Microsoft Res, Redmond, WA 98052 USA
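Source
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract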
Training data as well as supplementary data such as usage-based click behavior may abound in one search market (i.e., a particular region, domain, or language) and be much scarcer in another market. Transfer methods attempt to improve performance in these resource-scarce markets by leveraging data across markets. However, differences in feature distributions across markets can change the optimal model. We introduce a method called Fractional Similarity, which uses query-based variance within a market to obtain more reliable estimates of feature deviations across markets. An empirical analysis demonstrates that using this scoring method as a feature selection criterion in cross-lingual transfer improves relevance ranking in the foreign language and compares favorably to a baseline based on KL divergence.
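The abstract names a KL-divergence baseline but does not say how it is computed. The following is only a minimal sketch of one plausible form of such a baseline, not the authors' method and not Fractional Similarity itself: each feature's values in two markets are binned into smoothed histograms, and features are ranked by how little their distributions diverge. The function names, the bin count, the add-one smoothing, and the toy feature names (`bm25`, `clicks`) are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Sequence


def histogram(values: Sequence[float], lo: float, hi: float, bins: int) -> List[float]:
    """Normalized histogram over [lo, hi] with add-one smoothing (assumed choice)."""
    counts = [1.0] * bins  # smoothing keeps every bin non-zero, so log() is defined
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 when all values are equal
    for v in values:
        idx = int((v - lo) / width)
        counts[max(0, min(idx, bins - 1))] += 1.0  # clamp into the valid bin range
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """KL(p || q) for two discrete distributions over the same bins."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def rank_features_by_kl(
    source: Dict[str, List[float]],
    target: Dict[str, List[float]],
    bins: int = 10,
) -> List[str]:
    """Return feature names ordered from least to most divergent across markets."""
    scores = {}
    for name in source:
        # Use a shared value range so both histograms have identical bin edges.
        lo = min(min(source[name]), min(target[name]))
        hi = max(max(source[name]), max(target[name]))
        p = histogram(source[name], lo, hi, bins)
        q = histogram(target[name], lo, hi, bins)
        scores[name] = kl_divergence(p, q)
    return sorted(scores, key=scores.get)


# Hypothetical toy data: "bm25" behaves identically in both markets,
# while "clicks" is shifted in the target market.
source_market = {"bm25": [0.1, 0.2, 0.3, 0.4, 0.5], "clicks": [0.1, 0.2, 0.3, 0.4, 0.5]}
target_market = {"bm25": [0.1, 0.2, 0.3, 0.4, 0.5], "clicks": [5.1, 5.2, 5.3, 5.4, 5.5]}
ranking = rank_features_by_kl(source_market, target_market)
```

In this sketch, features at the front of `ranking` would be the candidates to keep when transferring a ranker across markets. A pooled histogram of this kind ignores which query each value came from, which is the per-query variance that the abstract says Fractional Similarity takes into account.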
Pages: 226 / +
Page count: 3