Multi-task Ranking with User Behaviors for Text-video Search

Cited by: 3
Authors
Liu, Peidong [1, 2]
Liao, Dongliang [2]
Wang, Jinpeng [1, 2]
Wu, Yangxin [2]
Li, Gongfu [2]
Xia, Shu-Tao [1, 3]
Xu, Jin [4]
Affiliations
[1] Tsinghua Univ, Tsinghua Shenzhen Int Grad Sch, Beijing, Peoples R China
[2] Tencent Inc, Wechat Grp, Shenzhen, Peoples R China
[3] Peng Cheng Lab, Res Ctr Artificial Intelligence, Shenzhen, Peoples R China
[4] South China Univ Technol, Sch Future Technol, Guangzhou, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Text-video Search; Ranking Model; Multi-task Learning; User Behaviors; Multi-modal Fusion;
DOI
10.1145/3487553.3524207
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Text-video search has become an important demand in many industrial video sharing platforms, e.g., YouTube, TikTok, and WeChat Channels, thereby attracting increasing research attention. Traditional relevance-based ranking methods for text-video search concentrate on exploiting the semantic relevance between video and query. However, relevance is no longer the principal issue in the ranking stage, because the candidate items retrieved from the matching stage naturally guarantee adequate relevance. Instead, we argue that boosting user satisfaction should be an ultimate goal for ranking and it is promising to excavate cheap and rich user behaviors for model training. To achieve this goal, we propose an effective Multi-Task Ranking pipeline with User Behaviors (MTRUB) for text-video search. Specifically, to exploit the multi-modal data effectively, we put forward a Heterogeneous Multi-modal Fusion Module (HMFM) to fuse the query and video features of different modalities in adaptive ways. Besides that, we design an Independent Multi-modal Input Scheme (IMIS) to alleviate competing task correlation problems in multi-task learning. Experiments on the offline dataset gathered from WeChat Search demonstrate that MTRUB outperforms the baseline by 12.0% in mean gAUC and 13.3% in mean nDCG@10. We also conduct live experiments on a large-scale mobile search engine, i.e., WeChat Search, and MTRUB obtains substantial improvement compared with the traditional relevance-based ranking model.
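The abstract reports results in gAUC and nDCG@10 without defining them. As a rough illustration only, the sketch below computes both under common definitions: nDCG@k with linear gain and a log2 position discount, and gAUC as per-group AUC (grouped by query here) averaged with weights equal to group size, skipping groups that lack both a positive and a negative label. These definitions, the grouping key, and all function names are assumptions made for illustration; the paper may use different variants, and its "mean" figures presumably average the metric over the behavior tasks.

```python
import math
from collections import defaultdict


def dcg_at_k(gains, k):
    """Discounted cumulative gain over the first k gains (linear gain, log2 discount)."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))


def ndcg_at_k(labels, scores, k=10):
    """nDCG@k for one query: DCG of the score-ordered list divided by the ideal DCG."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranked = [labels[i] for i in order]
    ideal_dcg = dcg_at_k(sorted(labels, reverse=True), k)
    return dcg_at_k(ranked, k) / ideal_dcg if ideal_dcg > 0 else 0.0


def auc(labels, scores):
    """Pairwise AUC for binary labels; returns None if only one class is present."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None
    # A tie between a positive and a negative counts as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def gauc(groups, labels, scores):
    """Group AUC: per-group AUC averaged with weights equal to group size (assumed)."""
    by_group = defaultdict(lambda: ([], []))
    for g, y, s in zip(groups, labels, scores):
        by_group[g][0].append(y)
        by_group[g][1].append(s)
    total, weight = 0.0, 0
    for ys, ss in by_group.values():
        group_auc = auc(ys, ss)
        if group_auc is None:  # single-class group: AUC is undefined, so skip it
            continue
        total += group_auc * len(ys)
        weight += len(ys)
    return total / weight if weight else None
```

For a single behavior task (for example, click labels), `gauc(query_ids, clicks, model_scores)` gives one number; averaging such numbers over several tasks would give a "mean gAUC" in the sense assumed here.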
Pages: 126-130
Number of pages: 5