Using KCCA for Japanese–English cross-language information retrieval and document classification

Cited by: 9
Authors
Yaoyong Li
John Shawe-Taylor
Affiliations
[1] The University of Sheffield, Department of Computer Science
[2] University of Southampton, ISIS Group, School of Electronics and Computer Science
Keywords
Cross-language information retrieval; Machine learning; Kernel canonical correlation analysis; Unsupervised learning; Cross-language Japanese–English document retrieval and classification
DOI
Not available
Abstract
Kernel Canonical Correlation Analysis (KCCA) is a method of correlating linear relationships between two variables in a kernel-defined feature space. A machine learning algorithm based on KCCA is studied for cross-language information retrieval. We apply the algorithm to Japanese–English cross-language information retrieval. The results are quite encouraging and are significantly better than those obtained by other state-of-the-art methods. Computational complexity is an important issue when applying KCCA to large datasets, as in information retrieval. We experimentally evaluate several methods to alleviate the problem of applying KCCA to large datasets. We also investigate cross-language document classification using KCCA as well as other methods. Our results show that it is feasible to use a classifier learned in one language to classify documents in other languages.
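
To make the idea in the abstract concrete, the following is a minimal sketch of KCCA-based mate retrieval. It is not the authors' implementation. It assumes linear kernels, one common regularised KCCA formulation (constraint matrices of the form (K + kappa I)^2), and synthetic paired data standing in for aligned Japanese and English document vectors. The names `fit_kcca`, `make_paired_views`, `cosine_similarity`, and the parameter `kappa` are hypothetical and introduced only for illustration.

```python
import numpy as np

def fit_kcca(Kx, Ky, kappa=1.0, n_components=3):
    """Regularised KCCA on two centred kernel matrices (illustrative sketch).

    Maximises alpha' Kx Ky beta subject to
    alpha' (Kx + kappa I)^2 alpha = 1 and beta' (Ky + kappa I)^2 beta = 1.
    Substituting u = (Kx + kappa I) alpha and v = (Ky + kappa I) beta
    turns this into an ordinary SVD problem.
    """
    n = Kx.shape[0]
    I = np.eye(n)
    Rx = np.linalg.solve(Kx + kappa * I, I)  # (Kx + kappa I)^-1
    Ry = np.linalg.solve(Ky + kappa * I, I)  # (Ky + kappa I)^-1
    U, rho, Vt = np.linalg.svd(Rx @ Kx @ Ky @ Ry)
    alpha = Rx @ U[:, :n_components]         # dual weights, view X
    beta = Ry @ Vt[:n_components].T          # dual weights, view Y
    return alpha, beta, rho[:n_components]

def make_paired_views(n, rng, Ax, Ay, noise=0.01):
    """Synthetic 'parallel corpus': two noisy linear views of shared topics."""
    Z = rng.standard_normal((n, Ax.shape[0]))
    X = Z @ Ax + noise * rng.standard_normal((n, Ax.shape[1]))
    Y = Z @ Ay + noise * rng.standard_normal((n, Ay.shape[1]))
    return X, Y

def cosine_similarity(A, B):
    """Pairwise cosine similarity between the rows of A and the rows of B."""
    A = A / np.linalg.norm(A, axis=1, keepdims=True)
    B = B / np.linalg.norm(B, axis=1, keepdims=True)
    return A @ B.T

rng = np.random.default_rng(0)
Ax = rng.standard_normal((3, 20))  # assumed topic-to-term map, language A
Ay = rng.standard_normal((3, 15))  # assumed topic-to-term map, language B
X_train, Y_train = make_paired_views(60, rng, Ax, Ay)
X_test, Y_test = make_paired_views(20, rng, Ax, Ay)

# Centre both views with the training means, then build linear kernels.
mx, my = X_train.mean(axis=0), Y_train.mean(axis=0)
Xc, Yc = X_train - mx, Y_train - my
alpha, beta, rho = fit_kcca(Xc @ Xc.T, Yc @ Yc.T, kappa=1.0, n_components=3)

# Project held-out queries (view X) and documents (view Y) into the
# shared semantic space via their kernels against the training set.
queries = ((X_test - mx) @ Xc.T) @ alpha
docs = ((Y_test - my) @ Yc.T) @ beta

# Mate retrieval: each query should rank its own paired document first.
best = cosine_similarity(queries, docs).argmax(axis=1)
mate_accuracy = float(np.mean(best == np.arange(len(X_test))))
print("top canonical correlation:", rho[0])
print("mate retrieval accuracy:", mate_accuracy)
```

The sketch inverts full n-by-n kernel matrices, which is the computational bottleneck the abstract refers to for large datasets. A practical system would replace this step with an approximation, and the choice of approximation here is left open.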
Pages: 117-133
Number of pages: 16
Related papers
  • [1] Using KCCA for Japanese-English cross-language information retrieval and document classification
    Li, Yaoyong
    Shawe-Taylor, John
    [J]. JOURNAL OF INTELLIGENT INFORMATION SYSTEMS, 2006, 27 (02) : 117 - 133
  • [2] Cross-Language Information Retrieval using Japanese and English WordNets
    Ueno, Ryo
    Klyuev, Vitaly
    [J]. 2012 INTERNATIONAL CONFERENCE ON APPLIED INFORMATICS AND COMMUNICATION (ICAIC 2012), 2013, : 198 - 203
  • [3] Japanese-English cross-language information retrieval integrating query and document translation methods
    Graduate School of Library, Information and Media Studies, University of Tsukuba, Tsukuba, 305-8550, Japan
    (authors not available)
    [J]. Syst Comput Jpn, 2006, 2 (96-105):
  • [4] Japanese/English Cross-Language Information Retrieval: Exploration of Query Translation and Transliteration
    Atsushi Fujii
    Tetsuya Ishikawa
    [J]. Computers and the Humanities, 2001, 35 : 389 - 420
  • [5] Japanese/English cross-language information retrieval: Exploration of query translation and transliteration
    Fujii, A
    Ishikawa, T
    [J]. COMPUTERS AND THE HUMANITIES, 2001, 35 (04): : 389 - 420
  • [6] Fast document translation for cross-language information retrieval
    McCarley, JS
    Roukos, S
    [J]. MACHINE TRANSLATION AND THE INFORMATION SOUP, 1998, 1529 : 150 - 157
  • [7] A comparison of query translation methods for English-Japanese cross-language information retrieval
    Jones, G
    Sakai, T
    Collier, N
    Kumano, A
    Sumita, K
    [J]. SIGIR'99: PROCEEDINGS OF 22ND INTERNATIONAL CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 1999, : 269 - 270
  • [8] Support for interactive document selection in cross-language information retrieval
    Oard, DW
    Resnik, P
    [J]. INFORMATION PROCESSING & MANAGEMENT, 1999, 35 (03) : 363 - 379
  • [9] Support for interactive document selection in cross-language information retrieval
    Oard, Douglas W.
    Resnik, Philip
    [J]. Information Processing and Management, 1999, 35 (03): : 363 - 379
  • [10] Cross-language information retrieval
    Nie, Jian-Yun
    [J]. Synthesis Lectures on Human Language Technologies, 2010, 3 (01): : 1 - 142