Multi-task Ranking with User Behaviors for Text-video Search

Cited by: 3
Authors
Liu, Peidong [1, 2]
Liao, Dongliang [2]
Wang, Jinpeng [1, 2]
Wu, Yangxin [2]
Li, Gongfu [2]
Xia, Shu-Tao [1, 3]
Xu, Jin [4]
Affiliations
[1] Tsinghua Univ, Tsinghua Shenzhen Int Grad Sch, Beijing, Peoples R China
[2] Tencent Inc, Wechat Grp, Shenzhen, Peoples R China
[3] Peng Cheng Lab, Res Ctr Artificial Intelligence, Shenzhen, Peoples R China
[4] South China Univ Technol, Sch Future Technol, Guangzhou, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Text-video Search; Ranking Model; Multi-task Learning; User Behaviors; Multi-modal Fusion;
DOI
10.1145/3487553.3524207
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Text-video search has become an important demand on many industrial video-sharing platforms, e.g., YouTube, TikTok, and WeChat Channels, and is attracting increasing research attention. Traditional relevance-based ranking methods for text-video search concentrate on exploiting the semantic relevance between video and query. However, relevance is no longer the principal issue in the ranking stage, because the candidate items retrieved in the matching stage already guarantee adequate relevance. Instead, we argue that boosting user satisfaction should be the ultimate goal of ranking, and that cheap and abundant user-behavior data is a promising training signal. To achieve this goal, we propose an effective Multi-Task Ranking pipeline with User Behaviors (MTRUB) for text-video search. Specifically, to exploit the multi-modal data effectively, we put forward a Heterogeneous Multi-modal Fusion Module (HMFM) that adaptively fuses query and video features of different modalities. In addition, we design an Independent Multi-modal Input Scheme (IMIS) to alleviate the problem of competing task correlations in multi-task learning. Experiments on an offline dataset gathered from WeChat Search demonstrate that MTRUB outperforms the baseline by 12.0% in mean gAUC and 13.3% in mean nDCG@10. We also conduct live experiments on a large-scale mobile search engine, WeChat Search, where MTRUB obtains substantial improvement over the traditional relevance-based ranking model.
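The abstract reports gains in mean gAUC and mean nDCG@10 but does not define either metric. The following is a minimal sketch of how such metrics are commonly computed, not the authors' evaluation code. It assumes that gAUC means group AUC (a per-query AUC averaged with weights proportional to group size, skipping queries that contain only one class), that nDCG uses linear gains with a log2 position discount, and that "mean" refers to averaging over the behavior tasks. All function names here are hypothetical.

```python
import math
from collections import defaultdict


def dcg_at_k(gains, k):
    """Discounted cumulative gain over the first k items (linear gain, log2 discount)."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))


def ndcg_at_k(labels, scores, k=10):
    """nDCG@k for one query: rank items by score, normalize by the ideal ordering."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranked = [labels[i] for i in order]
    ideal_dcg = dcg_at_k(sorted(labels, reverse=True), k)
    return dcg_at_k(ranked, k) / ideal_dcg if ideal_dcg > 0 else 0.0


def auc(labels, scores):
    """Pairwise AUC for binary labels; ties count as half a win.

    Returns None when the list has only positives or only negatives.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        return None
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def gauc(query_ids, labels, scores):
    """Group AUC (assumed definition): per-query AUC weighted by group size."""
    groups = defaultdict(lambda: ([], []))
    for q, l, s in zip(query_ids, labels, scores):
        groups[q][0].append(l)
        groups[q][1].append(s)
    total, weight = 0.0, 0
    for group_labels, group_scores in groups.values():
        a = auc(group_labels, group_scores)
        if a is not None:  # skip queries with a single class
            total += a * len(group_labels)
            weight += len(group_labels)
    return total / weight if weight else None
```

Under these assumptions, a per-task score would be obtained by calling `gauc` (or averaging `ndcg_at_k` over queries) on each behavior label separately, and the reported mean would be the average of those per-task scores.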
Pages: 126-130
Page count: 5
Related papers
50 in total
  • [21] A Multi-task Learning Framework for Product Ranking with BERT
    Wu, Xuyang
    Magnani, Alessandro
    Chaidaroon, Suthee
    Puthenputhussery, Ajit
    Liao, Ciya
    Fang, Yi
    PROCEEDINGS OF THE ACM WEB CONFERENCE 2022 (WWW'22), 2022, : 493 - 501
  • [22] Generative Multi-Task Learning for Text Classification
    Zhao, Wei
    Gao, Hui
    Chen, Shuhui
    Wang, Nan
    IEEE ACCESS, 2020, 8 : 86380 - 86387
  • [23] Multi-Task Label Embedding for Text Classification
    Zhang, Honglun
    Xiao, Liqiang
    Chen, Wenqing
    Wang, Yongkun
    Jin, Yaohui
    2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018), 2018, : 4545 - 4553
  • [24] Tchebycheff Procedure for Multi-task Text Classification
    Mao, Yuren
    Yung, Shuang
    Liu, Weiwei
    Du, Bo
    58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020), 2020, : 4217 - 4226
  • [25] TASK AWARE MULTI-TASK LEARNING FOR SPEECH TO TEXT TASKS
    Indurthi, Sathish
    Zaidi, Mohd Abbas
    Lakumarapu, Nikhil Kumar
    Lee, Beomseok
    Han, Hyojung
    Ahn, Seokchan
    Kim, Sangha
    Kim, Chanwoo
    Hwang, Inchul
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 7723 - 7727
  • [26] Boosted Multi-Task Learning for Face Verification With Applications to Web Image and Video Search
    Wang, Xiaogang
    Zhang, Cha
    Zhang, Zhengyou
    CVPR: 2009 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, VOLS 1-4, 2009, : 142 - +
  • [27] Multi-task Video Enhancement for Dental Interventions
    Katsaros, Efklidis
    Ostrowski, Piotr K.
    Wlodarczak, Krzysztof
    Lewandowska, Emilia
    Ruminski, Jacek
    Siupka-Mroz, Damian
    Lassmann, Lukasz
    Jezierska, Anna
    Wesierski, Daniel
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2022, PT VII, 2022, 13437 : 177 - 187
  • [28] Multi-task learning for video anomaly detection*
    Chang, Xingya
    Zhang, Yuxin
    Xue, Dingyu
    Chen, Dongyue
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2022, 87
  • [29] Multi-task learning for video anomaly detection
    Chang, Xingya
    Zhang, Yuxin
    Xue, Dingyu
    Chen, Dongyue
    Journal of Visual Communication and Image Representation, 2022, 87
  • [30] Multi-task learning to rank for web search
    Chang, Yi
    Bai, Jing
    Zhou, Ke
    Xue, Gui-Rong
    Zha, Hongyuan
    Zheng, Zhaohui
    PATTERN RECOGNITION LETTERS, 2012, 33 (02) : 173 - 181