Character N-grams translation in cross-language information retrieval

Authors
Vilares, Jesus [1]
Oakes, Michael P. [2]
Vilares, Manuel [3]
Affiliations
[1] Univ A Coruna, Dept Comp Sci, Campus Elvinas S-N, La Coruna 15071, Spain
[2] Univ Sunderland, Sch Comp & Technol, Sunderland SR6 0DD, Durham, England
[3] Univ Vigo, Dept Comp Sci, Orense 32004, Spain
Source
NATURAL LANGUAGE PROCESSING AND INFORMATION SYSTEMS, PROCEEDINGS | 2007 / Vol. 4592
Keywords
Cross-Language Information Retrieval; character N-grams; translation algorithms; alignment algorithms; association measures;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper describes a new technique for the direct translation of character n-grams for use in Cross-Language Information Retrieval systems. This solution avoids the need for word normalization during indexing or translation, and it can also deal with out-of-vocabulary words. This knowledge-light approach does not rely on language-specific processing, and it can be used with languages of very different natures even when linguistic information and resources are scarce or unavailable. Our proposal also tries to achieve a higher speed during the n-gram alignment process with respect to previous approaches.
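For readers unfamiliar with the unit the abstract refers to, a character n-gram is an overlapping substring of n characters taken from a word. The following is a minimal sketch of n-gram extraction only, not the paper's translation or alignment method; the function name `char_ngrams`, the default n = 4, the lowercasing, and the whitespace tokenization are all assumptions made for illustration.

```python
def char_ngrams(text: str, n: int = 4) -> list[str]:
    """Split text on whitespace and return overlapping character n-grams.

    Hypothetical helper for illustration; tokens no longer than n
    characters are kept whole (an assumption, not taken from the paper).
    """
    grams: list[str] = []
    for token in text.lower().split():
        if len(token) <= n:
            grams.append(token)
        else:
            # Slide a window of width n across the token, one character at a time.
            grams.extend(token[i:i + n] for i in range(len(token) - n + 1))
    return grams
```

Under these assumptions, `char_ngrams("tomato")` yields `toma`, `omat`, `mato`. Because such units are produced without stemming or dictionaries, they can be extracted from any word, including out-of-vocabulary ones.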
Pages: 217 / +
Page count: 2
Related papers
50 in total
  • [1] On the feasibility of character n-grams pseudo-translation for Cross-Language Information Retrieval tasks
    Vilares, Jesus
    Vilares, Manuel
    Alonso, Miguel A.
    Oakes, Michael P.
    COMPUTER SPEECH AND LANGUAGE, 2016, 36 : 136 - 164
  • [2] Translation Techniques in Cross-Language Information Retrieval
    Zhou, Dong
    Truran, Mark
    Brailsford, Tim
    Wade, Vincent
    Ashman, Helen
    ACM COMPUTING SURVEYS, 2012, 45 (01)
  • [3] Translation Ambiguity in Cross-Language Information Retrieval
    Sadat, Fatiha
    BUSINESS TRANSFORMATION THROUGH INNOVATION AND KNOWLEDGE MANAGEMENT: AN ACADEMIC PERSPECTIVE, VOLS 1-2, 2010 : 301 - 303
  • [4] Fast document translation for cross-language information retrieval
    McCarley, JS
    Roukos, S
    MACHINE TRANSLATION AND THE INFORMATION SOUP, 1998, 1529 : 150 - 157
  • [5] Part of speech n-grams and Information Retrieval
    Lioma, Christina
    van Rijsbergen, C. J. Keith
    REVUE FRANCAISE DE LINGUISTIQUE APPLIQUEE, 2008, 13 (01) : 9 - 22
  • [6] N-grams for translation and retrieval in CL-SDR
    McNamee, P
    Mayfield, J
    COMPARATIVE EVALUATION OF MULTILINGUAL INFORMATION ACCESS SYSTEMS, 2003, 3237 : 658 - 663
  • [7] s-grams: Defining generalized n-grams for information retrieval
    Jarvelin, Anni
    Jarvelin, Antti
    Jarvelin, Kalervo
    INFORMATION PROCESSING & MANAGEMENT, 2007, 43 (04) : 1005 - 1019
  • [8] A Comparative Study on Translation Disambiguation for Cross-Language Information Retrieval
    Sadat, Fatiha
    KNOWLEDGE MANAGEMENT AND INNOVATION IN ADVANCING ECONOMIES-ANALYSES & SOLUTIONS, VOLS 1-3, 2009 : 1326 - 1337
  • [9] Cross-language information retrieval
    Nie J.-Y.
    Synthesis Lectures on Human Language Technologies, 2010, 3 (01) : 1 - 142
  • [10] Statistical query translation models for cross-language information retrieval
    Microsoft Research
    Unknown
    ACM Trans. Asian Lang. Inf. Process., 2006, 4 (323-359) : 323 - 359