Clustering-based Transduction for Learning a Ranking Model with Limited Human Labels

Cited by: 4
Authors
Zhang, Xin [1]
He, Ben [1]
Luo, Tiejian [1]
Li, Dongxing [1]
Xu, Jungang [1]
Affiliations
[1] Univ Chinese Acad Sci, Sch Comp & Control Engn, Beijing, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Clustering; Transductive learning; Learning to rank;
DOI
10.1145/2505515.2505647
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Transductive learning is a semi-supervised learning paradigm that can leverage unlabeled data by creating pseudo labels for learning a ranking model when only limited or no training examples are available. However, the effectiveness of transductive learning in information retrieval (IR) can be hindered by low-quality pseudo labels. To address this, we propose incorporating a two-step k-means clustering algorithm to select high-quality training queries for generating the pseudo labels. In particular, the first step selects the high-quality queries, namely those whose relevant documents are highly coherent as indicated by the clustering results. The second step then selects the initial training examples for the transductive learning process, which iteratively aggregates the pseudo examples. Finally, learning to rank (LTR) algorithms are applied to learn the ranking model from the pseudo training examples created by the transductive learning process. Our proposed approach is particularly suitable for applications where few or no human labels are available, as it does not necessarily require relevance assessment information or human effort. Experimental results on the standard TREC Tweets11 collection show that our proposed approach outperforms strong baselines, namely the conventional application of learning to rank algorithms trained on human labels, and transductive learning using all the available queries.
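The abstract does not specify the features, the coherence measure, or the clustering settings, so the following is only a minimal sketch of what the first selection step might look like, not the authors' implementation. It assumes each query comes with a list of document feature vectors ordered by an initial retrieval rank (top first), runs k-means with k=2, and keeps a query only if the cluster containing the top-ranked document is tight. The names `two_means`, `coherence`, `select_queries`, the threshold, and the coherence formula are all hypothetical.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def dist(a: Vector, b: Vector) -> float:
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(points: List[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def two_means(points: List[Vector], iters: int = 20) -> List[int]:
    """Plain k-means with k=2.

    Assumption: centroids are initialised from the top-ranked and the
    bottom-ranked document, which keeps the sketch deterministic.
    """
    centroids = [list(points[0]), list(points[-1])]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [
            0 if dist(p, centroids[0]) <= dist(p, centroids[1]) else 1
            for p in points
        ]
        for c in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = mean(members)
    return labels


def coherence(points: List[Vector], labels: List[int]) -> float:
    """Hypothetical coherence score in (0, 1].

    Computed as 1 / (1 + mean distance to the centroid) over the cluster
    that contains the top-ranked document; tighter clusters score higher.
    """
    top = [p for p, lab in zip(points, labels) if lab == labels[0]]
    centre = mean(top)
    spread = sum(dist(p, centre) for p in top) / len(top)
    return 1.0 / (1.0 + spread)


def select_queries(
    query_docs: Dict[str, List[Vector]], threshold: float
) -> Dict[str, List[int]]:
    """Keep coherent queries and emit pseudo labels for their documents.

    Pseudo label 1 means "same cluster as the top-ranked document",
    0 means the other cluster. No human relevance labels are used.
    """
    selected = {}
    for qid, docs in query_docs.items():
        labels = two_means(docs)
        if coherence(docs, labels) >= threshold:
            selected[qid] = [1 if lab == labels[0] else 0 for lab in labels]
    return selected
```

In this sketch the pseudo labels of the selected queries would then seed the transductive loop and, eventually, an LTR learner; both of those stages are left out here because the abstract gives too little detail to illustrate them faithfully.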
Pages: 1777-1782
Number of pages: 6
Related papers
50 in total
  • [1] A Clustering-based Grouping Model for Enhancing Collaborative Learning
    Pang, Yulei
    Xiao, Feiya
    Wang, Huaying
    Xue, Xiaozhen
    [J]. 2014 13TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA), 2014, : 562 - 567
  • [2] Wind Speed Forecasting with a Clustering-Based Deep Learning Model
    Kosanoglu, Fuat
    [J]. APPLIED SCIENCES-BASEL, 2022, 12 (24):
  • [3] A clustering-based discretization for supervised learning
    Gupta, Ankit
    Mehrotra, Kishan G.
    Mohan, Chilukuri
    [J]. STATISTICS & PROBABILITY LETTERS, 2010, 80 (9-10) : 816 - 824
  • [4] Metric learning with clustering-based constraints
    Xinyao Guo
    Chuangyin Dang
    Jianqing Liang
    Wei Wei
    Jiye Liang
    [J]. International Journal of Machine Learning and Cybernetics, 2021, 12 : 3597 - 3605
  • [5] Metric learning with clustering-based constraints
    Guo, Xinyao
    Dang, Chuangyin
    Liang, Jianqing
    Wei, Wei
    Liang, Jiye
    [J]. INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2021, 12 (12) : 3597 - 3605
  • [6] Fuzzy Collaborative Clustering-Based Ranking Approach for Complex Objects
    Liu, Shihu
    Chen, Xiaozhou
    Moughal, Tauqir Ahmed
    Yu, Fusheng
    [J]. MATHEMATICAL PROBLEMS IN ENGINEERING, 2015, 2015
  • [7] Fuel consumption estimation method based on clustering-based deep learning model
    Chen, Chi-Hua
    [J]. ASIA-PACIFIC JOURNAL OF CLINICAL ONCOLOGY, 2022, 18 : 129 - 130
  • [8] ClusterFL: A Clustering-based Federated Learning System for Human Activity Recognition
    Ouyang, Xiaomin
    Xie, Zhiyuan
    Zhou, Jiayu
    Xing, Guoliang
    Huang, Jianwei
    [J]. ACM TRANSACTIONS ON SENSOR NETWORKS, 2023, 19 (01)
  • [9] ClusterFL: A Clustering-based Federated Learning System for Human Activity Recognition
    Ouyang, Xiaomin
    Xie, Zhiyuan
    Zhou, Jiayu
    Xing, Guoliang
    Huang, Jianwei
    [J]. ACM Transactions on Sensor Networks, 2022, 19 (01)
  • [10] Adaptive Clustering-Based Model Aggregation for Federated Learning with Imbalanced Data
    Wang, Dong
    Zhang, Naifu
    Tao, Meixia
    [J]. SPAWC 2021: 2021 IEEE 22ND INTERNATIONAL WORKSHOP ON SIGNAL PROCESSING ADVANCES IN WIRELESS COMMUNICATIONS (IEEE SPAWC 2021), 2020, : 591 - 595