Data Management in Machine Learning Systems

Authors
Boehm, Matthias [1]
Kumar, Arun [2]
Yang, Jun [3]
Affiliations
[1] Graz University of Technology, Austria
[2] University of California, San Diego, United States
[3] Duke University, United States
Source
Synthesis Lectures on Data Management | 2019, Vol. 11, No. 1
Keywords
Information management
DOI
10.2200/S00895ED1V01Y201901DTM057
Abstract
Large-scale data analytics using machine learning (ML) underpins many modern data-driven applications. ML systems provide means of specifying and executing these ML workloads in an efficient and scalable manner. Data management is at the heart of many ML systems due to data-driven application characteristics, data-centric workload characteristics, and system architectures inspired by classical data management techniques. In this book, we follow this data-centric view of ML systems and aim to provide a comprehensive overview of data management in ML systems for the end-to-end data science or ML lifecycle. We review multiple interconnected lines of work: (1) ML support in database (DB) systems, (2) DB-inspired ML systems, and (3) ML lifecycle systems. Covered topics include: in-database analytics via query generation and user-defined functions, factorized and statistical-relational learning; optimizing compilers for ML workloads; execution strategies and hardware accelerators; data access methods such as compression, partitioning and indexing; resource elasticity and cloud markets; as well as systems for data preparation for ML, model selection, model management, model debugging, and model serving. Given the rapidly evolving field, we strive for a balance between an up-to-date survey of ML systems, an overview of the underlying concepts and techniques, as well as pointers to open research questions. Hence, this book might serve as a starting point for both systems researchers and developers. © 2019 by Morgan & Claypool.
Pages: 1 - 173
Related papers
50 in total
  • [21] Machine Learning with Distributed Data Management and Process Architecture
    Baysal, Engin
    Bayilmis, Cuneyt
    2019 4TH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND ENGINEERING (UBMK), 2019, : 53 - 57
  • [22] Big data and machine learning to tackle diabetes management
    Pina, Ana F.
    Meneses, Maria Joao
    Sousa-Lima, Ines
    Henriques, Roberto
    Raposo, Joao F.
    Macedo, Maria Paula
    EUROPEAN JOURNAL OF CLINICAL INVESTIGATION, 2023, 53 (01)
  • [23] Health Management of Systems using Telemetry & Machine Learning
    Vichare, Nikhil
    2022 IEEE INTERNATIONAL RELIABILITY PHYSICS SYMPOSIUM (IRPS), 2022,
  • [24] Machine Learning Assisted Management of Photonic Switching Systems
    Khan, Ihtesham
    Masood, M. Umar
    Tunesi, Lorenzo
    Bardella, Paolo
    Ghillino, Enrico
    Carena, Andrea
    Curri, Vittorio
    2021 CONFERENCE ON LASERS AND ELECTRO-OPTICS (CLEO), 2021,
  • [25] MACHINE LEARNING FOR TEXT CLASSIFICATION IN BUILDING MANAGEMENT SYSTEMS
    Mesa-Jimenez, Jose Joaquin
    Stokes, Lee
    Yang, QingPing
    Livina, Valerie N.
    JOURNAL OF CIVIL ENGINEERING AND MANAGEMENT, 2022, 28 (05) : 408 - 421
  • [26] Combining simulation and machine learning for the management of healthcare systems
    Ricciardi, Carlo
    Cesarelli, Giuseppe
    Ponsiglione, Alfonso Maria
    De Tommasi, Gianmaria
    Cesarelli, Mario
    Romano, Maria
    Improta, Giovanni
    Amato, Francesco
    2022 IEEE INTERNATIONAL CONFERENCE ON METROLOGY FOR EXTENDED REALITY, ARTIFICIAL INTELLIGENCE AND NEURAL ENGINEERING (METROXRAINE), 2022, : 335 - 339
  • [27] Optimal Machine Learning Enabled Performance Monitoring for Learning Management Systems
    Dutta, Ashit Kumar
    Alqahtani, Mazen Mushabab
    Albagory, Yasser
    Sait, Abdul Rahaman Wahab
    Alsanea, Majed
    COMPUTER SYSTEMS SCIENCE AND ENGINEERING, 2023, 44 (03) : 2277 - 2292
  • [28] Machine Learning Data Market Based on Multiagent Systems
    Baghcheband, Hajar
    Soares, Carlos
    Reis, Luis Paulo
    IEEE INTERNET COMPUTING, 2024, 28 (04) : 7 - 13
  • [29] Towards FAIR Data in Distributed Machine Learning Systems
    Mou, Yongli
    Guo, Fengyang
    Lu, Wei
    Li, Yongzhao
    Beyan, Oya
    Rose, Thomas
    Dustdar, Schahram
    Decker, Stefan
    IEEE CONFERENCE ON GLOBAL COMMUNICATIONS, GLOBECOM, 2023, : 6450 - 6455
  • [30] Machine Learning Systems Applied to Health Data and System
    Bonifazi, Fedele
    Volpe, Elisabetta
    Digregorio, Giuseppe
    Giannuzzi, Viviana
    Ceci, Adriana
    EUROPEAN JOURNAL OF HEALTH LAW, 2020, 27 (03) : 242 - 258