Development of a Scalable Method for Creating Food Groups Using the NHANES Dataset and MapReduce

Cited by: 0
Authors
Wyatt, Michael R., II [1]
Johnston, Travis [1]
Papas, Mia [2]
Taufer, Michela [1]
Affiliations
[1] Univ Delaware, Comp & Info Sci, Newark, DE 19716 USA
[2] Univ Delaware, Behav Hlth & Nutr, Newark, DE 19716 USA
Funding
National Science Foundation (USA)
Keywords
Clustering methods; data processing; Apache Spark; dietary data; micro- and macro-nutrients; dietary patterns
DOI
10.1145/2975167.2975179
Chinese Library Classification number
TP39 [Computer applications]
Subject classification code
081203; 0835
Abstract
In this paper, we address the need for meaningful, less subjective food group classifications in dietary datasets such as the National Health and Nutrition Examination Survey (NHANES) by defining a new objective method that identifies food groups based exclusively on each food's micro- and macro-nutrient content. We first perform extensive preprocessing of the raw NHANES data to mitigate the impact of missing nutrient values, redundancies, and differing food intake quantities and scales. We then use an unsupervised clustering algorithm to create food groups within the preprocessed NHANES data and to identify food groups with similar nutrient content. Finally, we parallelize our method to benefit from the scalable MapReduce paradigm. Our results show that our method identifies food groups with smaller diameters and larger cluster separation distances than the standard, expert-informed method of grouping food items.
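The two quality measures named in the abstract, cluster diameter and cluster separation distance, can be illustrated with a minimal Python sketch. The abstract does not define either measure precisely, so the definitions here are assumptions: diameter is taken as the largest pairwise Euclidean distance within a group, and separation as the smallest Euclidean distance between members of two different groups. The function names, group labels, and nutrient vectors are hypothetical and are not taken from the paper or from NHANES.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def diameter(points):
    """Largest pairwise Euclidean distance within one group (0.0 for a singleton).

    Assumed definition; the paper may use a different one.
    """
    return max((dist(a, b) for a, b in combinations(points, 2)), default=0.0)


def separation(group_a, group_b):
    """Smallest Euclidean distance between a member of group_a and a member of group_b.

    Assumed definition; the paper may use a different one.
    """
    return min(dist(a, b) for a in group_a for b in group_b)


# Hypothetical nutrient vectors: (protein, fat, carbohydrate) in g per 100 g.
groups = {
    "A": [(20.0, 5.0, 0.0), (23.0, 9.0, 0.0)],
    "B": [(2.0, 0.0, 20.0), (2.0, 0.0, 23.0)],
}

diameters = {label: diameter(points) for label, points in groups.items()}
gap = separation(groups["A"], groups["B"])
```

Under these assumed definitions, a grouping is tighter when its diameters are small and better separated when the gap between groups is large, which is the sense in which the abstract compares the clustered food groups against the expert-informed grouping.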
Pages: 118-127
Number of pages: 10