Data pricing in machine learning pipelines

Cited by: 0
Authors
Zicun Cong
Xuan Luo
Jian Pei
Feida Zhu
Yong Zhang
Affiliations
[1] Simon Fraser University
[2] Singapore Management University
[3] Huawei Technologies Canada
Source
Knowledge and Information Systems | 2022 / Vol. 64
Keywords
Data assets; Data pricing; Data products; Machine learning; AI
DOI
Not available
Abstract
Machine learning is disruptive. At the same time, machine learning can only succeed by collaboration among many parties in multiple steps naturally as pipelines in an eco-system, such as collecting data for possible machine learning applications, collaboratively training models by multiple parties and delivering machine learning services to end users. Data are critical and penetrating in the whole machine learning pipelines. As machine learning pipelines involve many parties and, in order to be successful, have to form a constructive and dynamic eco-system, marketplaces and data pricing are fundamental in connecting and facilitating those many parties. In this article, we survey the principles and the latest research development of data pricing in machine learning pipelines. We start with a brief review of data marketplaces and pricing desiderata. Then, we focus on pricing in three important steps in machine learning pipelines. To understand pricing in the step of training data collection, we review pricing raw data sets and data labels. We also investigate pricing in the step of collaborative training of machine learning models and overview pricing machine learning models for end users in the step of machine learning deployment. We also discuss a series of possible future directions.
Pages: 1417-1455
Number of pages: 38