DiffML: End-to-end Differentiable ML Pipelines

Cited by: 3
Authors
Hilprecht, Benjamin [1]
Hammacher, Christian [2]
Reis, Eduardo [1]
Abdelaal, Mohamed
Binnig, Carsten [1]
Affiliations
[1] Tech Univ Darmstadt, Darmstadt, Germany
[2] Software AG, Mainz, Germany
Keywords
data engineering; differentiable ML pipelines; data cleaning
DOI
10.1145/3595360.3595857
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we present our vision of differentiable ML pipelines, called DiffML, which allow the construction of ML pipelines to be automated end to end. DiffML jointly trains not just the ML model itself but the entire pipeline, including data engineering steps such as data cleaning and data augmentation. Our core idea is to formulate all steps in a differentiable way so that the entire pipeline can be trained using backpropagation. However, this is a non-trivial problem that opens up many new research questions. To show the feasibility of this direction, we demonstrate initial ideas and a general principle for formulating typical data engineering steps as differentiable programs that are learned jointly with the ML model. Moreover, we discuss a research roadmap and the core challenges that have to be tackled systematically to enable fully differentiable ML pipelines.
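The core idea in the abstract, a data engineering step written so that gradients flow through it, can be illustrated with a small toy example. The sketch below is not the authors' method or code. It assumes a single cleaning step (imputing missing values with one learnable fill value `v`), a one-feature linear model, and hand-derived gradients so that it runs with the Python standard library only. The names `train_pipeline`, `xs`, and `ys` are hypothetical.

```python
# Toy sketch (an assumption, not the DiffML implementation): a learnable
# imputation value v is trained jointly with a linear model y = w * x + b.

def train_pipeline(xs, ys, steps=5000, lr=0.05):
    """Jointly fit the imputation value v and the model parameters w, b."""
    # mask is 1.0 where the feature is observed and 0.0 where it is missing.
    mask = [0.0 if x is None else 1.0 for x in xs]
    raw = [0.0 if x is None else x for x in xs]
    w, b, v = 0.0, 0.0, 0.0
    n = len(xs)
    losses = []
    for _ in range(steps):
        # Differentiable cleaning step: keep observed values, fill gaps with v.
        x_clean = [m * r + (1.0 - m) * v for m, r in zip(mask, raw)]
        # Model step: linear prediction on the cleaned feature.
        preds = [w * xc + b for xc in x_clean]
        errs = [p - y for p, y in zip(preds, ys)]
        losses.append(sum(e * e for e in errs) / n)  # mean squared error
        # Backpropagation by hand: gradient of the loss w.r.t. each prediction.
        g = [2.0 * e / n for e in errs]
        grad_w = sum(gi * xc for gi, xc in zip(g, x_clean))
        grad_b = sum(g)
        # The chain rule carries the gradient through the model into the
        # cleaning step, but only for rows where the value was imputed.
        grad_v = sum(gi * w * (1.0 - m) for gi, m in zip(g, mask))
        w -= lr * grad_w
        b -= lr * grad_b
        v -= lr * grad_v
    return w, b, v, losses


# Synthetic data generated from y = 2 * x + 1; both missing entries had x = 2.
xs = [0.0, 1.0, None, 3.0, 4.0, None]
ys = [1.0, 3.0, 5.0, 7.0, 9.0, 5.0]
w, b, v, losses = train_pipeline(xs, ys)
print(f"w={w:.3f} b={b:.3f} v={v:.3f} final_loss={losses[-1]:.6f}")
```

In this toy setting the imputation value is updated only through the downstream prediction loss, which is the sense in which the cleaning step and the model are trained jointly. A real pipeline would presumably rely on an automatic differentiation framework rather than hand-written gradients.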
Pages: 7