Data pricing in machine learning pipelines

Authors
Zicun Cong
Xuan Luo
Jian Pei
Feida Zhu
Yong Zhang
Affiliations
[1] Simon Fraser University
[2] Singapore Management University
[3] Huawei Technologies Canada
Source
Knowledge and Information Systems, 2022, Vol. 64
Keywords
Data assets; Data pricing; Data products; Machine learning; AI
DOI
Not available
Abstract
Machine learning is disruptive. At the same time, machine learning can only succeed through collaboration among many parties across multiple steps, organized naturally as pipelines in an ecosystem: collecting data for possible machine learning applications, collaboratively training models among multiple parties, and delivering machine learning services to end users. Data are critical and pervasive throughout machine learning pipelines. Because these pipelines involve many parties and, to succeed, must form a constructive and dynamic ecosystem, marketplaces and data pricing are fundamental in connecting and facilitating those parties. In this article, we survey the principles and the latest research developments of data pricing in machine learning pipelines. We start with a brief review of data marketplaces and pricing desiderata. Then, we focus on pricing in three important steps of machine learning pipelines. To understand pricing in the training data collection step, we review the pricing of raw data sets and data labels. We also investigate pricing in the collaborative training of machine learning models, and give an overview of pricing machine learning models for end users in the deployment step. Finally, we discuss a series of possible future directions.
Pages: 1417 - 1455
Page count: 38