Exploring Data Splitting Strategies for the Evaluation of Recommendation Models

Cited by: 48
Authors
Meng, Zaiqiao [1]
McCreadie, Richard [1]
Macdonald, Craig [1]
Ounis, Iadh [1]
Affiliation
[1] Univ Glasgow, Glasgow, Lanark, Scotland
Keywords
Recommender Systems; Splitting Strategy; Model Evaluation; Leave-one-out; Temporal Split
DOI
10.1145/3383313.3418479
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Effective methodologies for evaluating recommender systems are critical, so that different systems can be compared in a sound manner. A commonly overlooked aspect of evaluating recommender systems is the selection of the data splitting strategy. In this paper, we show both that there is no standard splitting strategy and that the selection of splitting strategy can have a strong impact on the ranking of recommender systems during evaluation. In particular, we perform experiments comparing three common data splitting strategies, examining their impact over seven state-of-the-art recommendation models on two datasets. Our results demonstrate that the splitting strategy employed is an important confounding variable that can markedly alter the ranking of recommender systems, making much of the currently published literature non-comparable, even when the same datasets and metrics are used.
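As a rough illustration of why the splitting strategy matters, the sketch below contrasts two strategies named in the keywords, leave-one-out and temporal split. It is not the authors' implementation: the abstract does not specify how each strategy is defined, so the toy interaction log, the function names, and the `test_ratio` parameter are all assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical toy log of (user, item, timestamp) interactions.
interactions = [
    ("u1", "a", 1), ("u1", "b", 5), ("u1", "c", 9),
    ("u2", "a", 2), ("u2", "d", 3),
    ("u3", "b", 4), ("u3", "c", 7), ("u3", "e", 8),
]


def leave_one_out_split(rows):
    """Hold out each user's most recent interaction as the test set."""
    by_user = defaultdict(list)
    for row in rows:
        by_user[row[0]].append(row)
    train, test = [], []
    for user_rows in by_user.values():
        ordered = sorted(user_rows, key=lambda r: r[2])
        train.extend(ordered[:-1])  # everything except the latest interaction
        test.append(ordered[-1])    # the user's latest interaction
    return train, test


def temporal_global_split(rows, test_ratio=0.25):
    """Hold out the latest fraction of all interactions (one global cutoff)."""
    ordered = sorted(rows, key=lambda r: r[2])
    n_test = max(1, int(len(ordered) * test_ratio))
    return ordered[:-n_test], ordered[-n_test:]


loo_train, loo_test = leave_one_out_split(interactions)
tmp_train, tmp_test = temporal_global_split(interactions)
```

On this toy log the two strategies produce different test sets. Leave-one-out holds out one interaction per user, including u2's interaction at timestamp 3, which is earlier than several training interactions from other users. The global temporal split keeps every test interaction later than every training interaction, but leaves u2 with no test interaction at all.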
Pages: 681-686
Page count: 6