Missing value imputation in food composition data with denoising autoencoders

Cited by: 3
Authors
Gjorshoska, Ivana [1]
Eftimov, Tome [2]
Trajanov, Dimitar [1, 3]
Affiliations
[1] Ss Cyril & Methodius Univ Skopje, Fac Comp Sci & Engn, ul Rudzer Boshkovikj 16, PO 393, Skopje 1000, North Macedonia
[2] Jozef Stefan Inst, Comp Syst Dept, Jamova Cesta 39, Ljubljana 1000, Slovenia
[3] Boston Univ, Metropolitan Coll, Dept Comp Sci, Boston, MA USA
Keywords
Food composition data; Food composition databases; Nutrient values; Missing data; Missing value imputation; Autoencoders; Deep learning
DOI
10.1016/j.jfca.2022.104638
Chinese Library Classification number
O69 [Applied Chemistry]
Discipline classification code
081704
Abstract
Missing data are a common problem across many fields and can arise for various reasons, such as lack of analysis, mishandled samples, or measurement error. Nutrition and food composition are no exception. Missing values in food composition databases (FCDBs) significantly limit their use. The problem is commonly addressed by filling in the mean or median of the available data in the same FCDB, or by borrowing values from other FCDBs; however, these methods produce notable errors. This paper focuses on missing value imputation using autoencoders, a deep learning technique that can approximate values by learning a higher-level representation of its input. The data come from the FCDBs collected by USDA FoodData Central. We compared autoencoder imputation with the commonly used fill-in-with-mean and fill-in-with-median approaches, and the results show that autoencoder imputation provides superior results.
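The comparison described in the abstract can be illustrated with a small sketch. The record does not give the authors' architecture, noise model, or hyperparameters, so everything below is an assumption: the function names `fill_with_stat` and `dae_impute` are hypothetical, the network is a toy one-hidden-layer denoising autoencoder written in plain NumPy, and the data are synthetic rather than taken from USDA FoodData Central.

```python
import numpy as np

def fill_with_stat(X, stat=np.nanmean):
    """Baseline: replace each NaN with its column's mean (or median)."""
    col = stat(X, axis=0)
    return np.where(np.isnan(X), col, X)

def dae_impute(X, hidden=8, epochs=2000, lr=0.05, drop=0.2, seed=0):
    """Toy denoising autoencoder imputer (assumed design, not the paper's)."""
    rng = np.random.default_rng(seed)
    observed = ~np.isnan(X)
    mu = np.nanmean(X, axis=0)
    sd = np.nanstd(X, axis=0) + 1e-8
    # Standardise; missing cells start at 0, i.e. the column mean.
    Z = np.where(observed, (X - mu) / sd, 0.0)
    d = Z.shape[1]
    W1 = rng.normal(0.0, 0.1, (d, hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.1, (hidden, d)); b2 = np.zeros(d)
    for _ in range(epochs):
        # Masking noise: randomly zero out inputs so the network must
        # reconstruct a value from the other columns of the same row.
        keep = rng.random(Z.shape) > drop
        Z_in = Z * keep
        H = np.tanh(Z_in @ W1 + b1)
        out = H @ W2 + b2
        # Squared error is computed only on cells that were truly observed.
        g_out = 2.0 * (out - Z) * observed / observed.sum()
        g_H = (g_out @ W2.T) * (1.0 - H ** 2)
        W2 -= lr * (H.T @ g_out); b2 -= lr * g_out.sum(axis=0)
        W1 -= lr * (Z_in.T @ g_H); b1 -= lr * g_H.sum(axis=0)
    recon = (np.tanh(Z @ W1 + b1) @ W2 + b2) * sd + mu
    # Keep observed values untouched; fill only the missing cells.
    return np.where(observed, X, recon)

# Synthetic stand-in for a nutrient table: 5 correlated columns, 20% missing.
rng = np.random.default_rng(42)
latent = rng.normal(size=(200, 1))
X_true = latent @ rng.normal(size=(1, 5)) + 0.1 * rng.normal(size=(200, 5))
hole = rng.random(X_true.shape) < 0.2
X_missing = np.where(hole, np.nan, X_true)

def rmse(X_filled):
    """Error on the artificially removed cells only."""
    return float(np.sqrt(np.mean((X_filled[hole] - X_true[hole]) ** 2)))

print("mean  :", rmse(fill_with_stat(X_missing, np.nanmean)))
print("median:", rmse(fill_with_stat(X_missing, np.nanmedian)))
print("DAE   :", rmse(dae_impute(X_missing)))
```

Hiding known values and scoring only those cells is one common way to evaluate imputation; whether the autoencoder wins on a given table depends on how strongly the columns are correlated and on the training settings, so the printed numbers should be read as illustrative only.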
Pages: 12