The Hopsworks Feature Store for Machine Learning

Cited by: 0
Authors
Martinez, Javier de la Rua [1, 2]
Buso, Fabio [1]
Kouzoupis, Antonios [1]
Ormenisan, Alexandru A. [1]
Niazi, Salman [1]
Bzhalava, Davit [1]
Mak, Kenneth [1]
Jouffrey, Victor [1]
Ronstrom, Mikael [1]
Cunningham, Raymond [1]
Zangis, Ralfs [1]
Mukhedkar, Dhananjay [1]
Khazanchi, Ayushman [2]
Vlassov, Vladimir [2]
Dowling, Jim [1, 2]
Affiliations
[1] Hopsworks AB, Stockholm, Sweden
[2] KTH Royal Institute of Technology, Stockholm, Sweden
Funding
EU Horizon 2020
Keywords
Feature Store; MLOps; RonDB; Arrow Flight; DuckDB
DOI
10.1145/3626246.3653389
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Data management is the most challenging aspect of building Machine Learning (ML) systems. ML systems can read large volumes of historical data when training models, but inference workloads are more varied, depending on whether the system is a batch or online ML system. The feature store for ML has recently emerged as a single data platform for managing ML data throughout the ML lifecycle, from feature engineering to model training to inference. In this paper, we present the Hopsworks feature store for machine learning as a highly available platform for managing feature data with API support for columnar, row-oriented, and similarity search query workloads. We introduce and address the challenges that feature stores solve: feature reuse, how to organize data transformations, and how to ensure correct and consistent data between feature engineering, model training, and model inference. We present the engineering challenges in building high-performance query services for a feature store and show how Hopsworks outperforms existing cloud feature stores for training and online inference query workloads.
Pages: 135-147
Page count: 13