Data pricing in machine learning pipelines

Authors
Zicun Cong
Xuan Luo
Jian Pei
Feida Zhu
Yong Zhang
Affiliations
[1] Simon Fraser University
[2] Singapore Management University
[3] Huawei Technologies Canada
Source
Knowledge and Information Systems, 2022, Vol. 64
Keywords
Data assets; Data pricing; Data products; Machine learning; AI
DOI
Not available
Abstract
Machine learning is disruptive. At the same time, machine learning can only succeed through collaboration among many parties across multiple steps, which naturally form pipelines in an ecosystem: collecting data for possible machine learning applications, collaboratively training models by multiple parties, and delivering machine learning services to end users. Data are critical and pervasive throughout machine learning pipelines. Because machine learning pipelines involve many parties and, to be successful, have to form a constructive and dynamic ecosystem, marketplaces and data pricing are fundamental in connecting and facilitating those parties. In this article, we survey the principles and the latest research developments of data pricing in machine learning pipelines. We start with a brief review of data marketplaces and pricing desiderata. Then, we focus on pricing in three important steps in machine learning pipelines. To understand pricing in the step of training data collection, we review pricing raw data sets and data labels. We also investigate pricing in the step of collaborative training of machine learning models and give an overview of pricing machine learning models for end users in the step of machine learning deployment. Finally, we discuss a series of possible future directions.
Pages: 1417-1455
Number of pages: 38