An Investigation of Cross-Language Information Retrieval for User-Generated Internet Video

Cited by: 1
Authors
Khwileh, Ahmad [1]
Ganguly, Debasis [1]
Jones, Gareth J. F. [1]
Affiliations
[1] Dublin City Univ, Sch Comp, ADAPT Ctr, Dublin 9, Ireland
Keywords
Cross-Language Video Retrieval; User generated content; User generated internet video search; TRACK; NORMALIZATION
DOI
10.1007/978-3-319-24027-5_10
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Increasing amounts of user-generated video content are being uploaded to online repositories. This content is often very uneven in quality and topical coverage across languages. The lack of material in individual languages means that cross-language information retrieval (CLIR) within these collections is required to satisfy the user's information need. Search over this content depends on the available metadata, which includes user-generated annotations and often noisy transcripts of spoken audio. The effectiveness of CLIR depends on translation quality between query and content languages. We investigate CLIR effectiveness for the blip10000 archive of user-generated Internet video content. We examine retrieval effectiveness using the title and free-text metadata provided by the uploader and transcripts generated by automatic speech recognition (ASR). Retrieval is carried out using the Divergence From Randomness models, and automatic translation is performed with Google Translate. Our experimental investigation indicates that different sources of evidence have different retrieval effectiveness and, in particular, differing levels of performance in CLIR. Specifically, we find that the retrieval effectiveness of the ASR source is significantly degraded in CLIR. Our investigation also indicates that for this task the Title source provides the most robust source of evidence for CLIR, and performs best when used in combination with other sources of evidence. We suggest areas for investigation to achieve the most effective and robust CLIR performance for user-generated content.
Pages: 117-129
Page count: 13