Evaluation of the principal-component and expectation-maximization methods for estimating missing data in morphometric studies

Authors
Strauss, RE [1]
Atanassov, MN
De Oliveira, JA
Affiliations
[1] Texas Tech Univ, Dept Sci Biol, Lubbock, TX 79409 USA
[2] Texas Tech Univ, Dept Geosci, Lubbock, TX 79409 USA
[3] Univ Fed Rio de Janeiro, Museu Nacl, Dept Vertebrados, Rio de Janeiro, Brazil
DOI
10.1671/0272-4634(2003)023[0284:EOTPAE]2.0.CO;2
Chinese Library Classification (CLC) number
Q91 [Paleontology]
Discipline classification codes
0709; 070903
Abstract
Vertebrate skeletons, particularly fossils, commonly have damaged, distorted, or missing structures. Because multivariate morphometric methods require complete data matrices, there are two possible solutions: to omit the specimens or characters having missing values, or to estimate missing values from the remainder of the data. Omission of specimens or characters reduces the data available for analysis, and thus the power to detect patterns or differences. Univariate and bivariate-regression methods are known to reduce the total variance of the data, and thus are not considered here. We compared the two most common multivariate methods: expectation-maximization (EM), which uses the covariance matrix directly, and principal-component (PC) estimation, based on regression of characters on principal components. Performance was evaluated by computer simulation of randomly introduced missing data in constructed data sets of known structure, and in several complete fossil (Pterodactylus skeleton) and recent (Alligator skeleton, Canis skull) data sets. The EM and PC methods displayed consistent and similar patterns of behavior for varying combinations of specimens and characters and across a broad range of amounts of missing data. Reliability was greatest for moderate numbers of characters (6-12) and larger sample sizes. For fewer characters the maximum amount of missing data that can be predicted increases substantially, but with a decrease in reliability. Both methods produce accurate estimates of missing values, but EM estimates are more precise. EM also outperforms the PC method in the maximum proportion of missing values that can be reliably estimated (almost 50% for small numbers of characters).
Pages: 284-296
Page count: 13
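
The abstract describes EM estimation as working directly from the covariance matrix. A minimal sketch of that idea follows, assuming multivariate-normal data with missing entries coded as NaN. It is not the authors' implementation: the function name `em_impute` and its parameters `n_iter` and `tol` are hypothetical, and the PC-based method is not shown. Each pass predicts the missing entries of a specimen from its observed entries using the current mean and covariance, then re-estimates the mean and covariance from the completed data, adding the conditional covariance of the imputed entries so that total variance is not understated.

```python
import numpy as np

def em_impute(X, n_iter=100, tol=1e-6):
    """Fill NaNs in X (specimens x characters) by EM under a normal model.

    Hypothetical helper for illustration only.
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    n, p = X.shape

    # Initial guess: column means of the observed values.
    mu = np.nanmean(X, axis=0)
    filled = np.where(missing, mu, X)
    sigma = np.cov(filled, rowvar=False, bias=True)

    for _ in range(n_iter):
        prev = filled.copy()
        extra = np.zeros((p, p))  # summed conditional covariances

        # E-step: conditional expectation of missing given observed.
        for i in range(n):
            m = missing[i]
            if not m.any():
                continue
            o = ~m
            if not o.any():
                # Nothing observed for this specimen: fall back to the mean.
                filled[i] = mu
                extra += sigma
                continue
            # Regression coefficients of missing on observed characters.
            B = sigma[np.ix_(m, o)] @ np.linalg.pinv(sigma[np.ix_(o, o)])
            filled[i, m] = mu[m] + B @ (filled[i, o] - mu[o])
            extra[np.ix_(m, m)] += sigma[np.ix_(m, m)] - B @ sigma[np.ix_(o, m)]

        # M-step: update mean and covariance from the completed data.
        mu = filled.mean(axis=0)
        d = filled - mu
        sigma = (d.T @ d + extra) / n

        if np.max(np.abs(filled - prev)) < tol:
            break

    return filled
```

Observed values are never altered; only the NaN positions are replaced. The pseudo-inverse is used so the sketch does not fail when the observed-character covariance block is singular, which can happen with few specimens.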