A Review of Big Data and Machine Learning Operations in Official Statistics: MLOps and Feature Store Adoption

Authors
Ramos Nunes, Carlos Eduardo [1]
Ashofteh, Afshin [1]
Affiliations
[1] Nova Univ Lisbon, NOVA Informat Management Sch NOVA IMS, Lisbon, Portugal
Keywords
Feature store; Official statistics; Machine learning operations; Data science; Big data; Data quality; QUALITY
DOI
10.1109/COMPSAC61105.2024.00101
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Integrating machine learning (ML) into the official statisticians' toolset is gaining popularity as National Statistical Offices (NSOs) strive to improve their methodologies. This trend poses new challenges and implications for incorporating innovative techniques that ensure the reliability of the official statistical production process. A comprehensive literature review was conducted using Scopus and Web of Science databases to explore the contemporary applications of data science in official statistics. A total of 178 research articles were identified, focusing on areas such as big data, machine learning, and data quality. While the literature review revealed extensive proposals on utilizing alternative data and applying machine learning techniques to support official statistics production, it also identified research gaps in the post-training steps of the machine learning process. Areas requiring further investigation include machine learning operations in a production environment, data quality assurance, and governance.
Pages: 711-718
Number of pages: 8