Fractional Similarity: Cross-Lingual Feature Selection for Search

Cited by: 0
Authors
Jagarlamudi, Jagadeesh [1]
Bennett, Paul N. [2]
Affiliations
[1] Univ Maryland, College Pk, MD 20742 USA
[2] Microsoft Res, Redmond, WA 98052 USA
Source
Keywords
DOI
Not available
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Training data as well as supplementary data such as usage-based click behavior may abound in one search market (i.e., a particular region, domain, or language) and be much scarcer in another market. Transfer methods attempt to improve performance in these resource-scarce markets by leveraging data across markets. However, differences in feature distributions across markets can change the optimal model. We introduce a method called Fractional Similarity, which uses query-based variance within a market to obtain more reliable estimates of feature deviations across markets. An empirical analysis demonstrates that using this scoring method as a feature selection criterion in cross-lingual transfer improves relevance ranking in the foreign language and compares favorably to a baseline based on KL divergence.
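As an illustration of the KL-divergence baseline mentioned in the abstract, the following is a minimal sketch of ranking features by how much their value distributions differ between two markets. It is not the Fractional Similarity method, whose formula the abstract does not give. The function names, the histogram binning, the add-one smoothing, the choice to rank the most similar features first, and the feature names `bm25` and `clicks` are all assumptions made for this example.

```python
import math
from typing import Dict, List, Sequence


def histogram(values: Sequence[float], lo: float, hi: float, bins: int) -> List[float]:
    """Return an add-one-smoothed probability histogram of `values` over [lo, hi]."""
    counts = [1.0] * bins  # smoothing keeps every bin non-zero so the log is defined
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 when all values are equal
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)
        counts[max(idx, 0)] += 1.0
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """KL(p || q) for two discrete distributions with strictly positive entries."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def rank_features_by_kl(
    source: Dict[str, Sequence[float]],
    target: Dict[str, Sequence[float]],
    bins: int = 10,
) -> List[str]:
    """Order feature names from most to least similar across the two markets.

    Assumption for this sketch: a lower divergence means the feature is
    more likely to transfer, so it is ranked first.
    """
    scores = {}
    for name in source:
        both = list(source[name]) + list(target[name])
        lo, hi = min(both), max(both)
        p = histogram(source[name], lo, hi, bins)
        q = histogram(target[name], lo, hi, bins)
        scores[name] = kl_divergence(p, q)
    return sorted(scores, key=scores.get)


# Hypothetical feature values for two markets (illustrative data only).
source_market = {"bm25": [0.1, 0.2, 0.3, 0.4], "clicks": [0.0, 0.0, 0.1, 0.1]}
target_market = {"bm25": [0.1, 0.2, 0.3, 0.4], "clicks": [0.9, 1.0, 0.9, 1.0]}
ranking = rank_features_by_kl(source_market, target_market)
```

A selection step would then keep some prefix of `ranking`. Note that this pooled comparison ignores per-query variation within a market, which is the information the abstract says Fractional Similarity uses.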
Pages: 226 / +
Page count: 3