Concentric Mixtures of Mallows Models for Top-k Rankings: Sampling and Identifiability

Authors
Collas, Fabien [1]
Irurozki, Ekhine [1, 2]
Affiliations
[1] Basque Ctr Appl Math, Bilbao, Spain
[2] Telecom Paris, Inst Polytech Paris, LTCI, Paris, France
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we study mixtures of two Mallows models for top-k rankings with equal location parameters but different scale parameters (a mixture of concentric Mallows models). These models arise when a heterogeneous population of voters is formed by two populations, one of which is a subpopulation of expert voters. We show the identifiability of both components and the learnability of their respective parameters. These results rest on two steps. First, we bound the sample complexity of the Borda algorithm with top-k rankings. Second, we characterize the distances between rankings, showing that an off-the-shelf clustering algorithm separates the rankings by component with high probability, provided the scales are well separated. As a by-product, we include an efficient sampling algorithm for Mallows top-k rankings. Finally, since rank aggregation suffers from the large amount of noise introduced by the non-expert voters, we adapt the Borda algorithm to recover the ground-truth consensus ranking, which is especially consistent with the expert rankings.
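
As a rough illustration of the Borda aggregation mentioned in the abstract, the following Python sketch scores items by their summed positions across top-k rankings. It is not the authors' adapted algorithm: the function name `borda_top_k` and the convention that every unranked item receives the average of the unoccupied positions are assumptions made here for illustration only.

```python
def borda_top_k(rankings, n_items):
    """Aggregate top-k rankings with a Borda-style position score.

    Each ranking is a list of distinct item ids in range(n_items), best first.
    Assumption (not from the paper): items missing from a ranking all share
    the average of the unoccupied 0-based positions k, ..., n_items - 1.
    """
    scores = [0.0] * n_items
    for ranking in rankings:
        k = len(ranking)
        # Average 0-based position of the n_items - k unranked slots.
        tail_position = (k + n_items - 1) / 2.0
        ranked = set(ranking)
        for position, item in enumerate(ranking):
            scores[item] += position
        for item in range(n_items):
            if item not in ranked:
                scores[item] += tail_position
    # A lower total position means a better consensus rank; ties go to the lower id.
    return sorted(range(n_items), key=lambda item: (scores[item], item))


# Three hypothetical voters, each reporting a top-3 list over 5 items.
votes = [[0, 1, 2], [0, 2, 1], [1, 0, 3]]
consensus = borda_top_k(votes, n_items=5)
print(consensus)
```

With many noisy non-expert votes, a plain average like this can drift away from the experts' consensus, which is the situation the abstract says the adapted Borda algorithm is meant to handle.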
Pages: 10