A hierarchical Bayesian approach for handling missing classification data

Cited by: 5
Authors
Ketz, Alison C. [1 ,2 ]
Johnson, Therese L. [3 ]
Hooten, Mevin B. [4 ,5 ,6 ]
Hobbs, N. Thompson [1 ,2 ]
Affiliations
[1] Colorado State Univ, Dept Ecosyst Sci & Sustainabil, Nat Resource Ecol Lab, Ft Collins, CO 80523 USA
[2] Colorado State Univ, Grad Degree Program Ecol, Ft Collins, CO 80523 USA
[3] Nat Pk Serv, Rocky Mt Natl Pk, Estes Pk, CO USA
[4] Colorado State Univ, US Geol Survey, Colorado Cooperat Fish & Wildlife Res Unit, Ft Collins, CO 80523 USA
[5] Colorado State Univ, Dept Fish Wildlife & Conservat Biol, Ft Collins, CO 80523 USA
[6] Colorado State Univ, Dept Stat, Ft Collins, CO 80523 USA
Source
ECOLOGY AND EVOLUTION | 2019 / Volume 9 / Issue 06
Funding
National Science Foundation (USA);
Keywords
Cervus elaphus nelsoni; classification data; demographic ratio; elk; hierarchical Bayesian statistics; missing not at random data; multinomial distribution; proportion estimation; sex ratio; Wildlife Management; MARK-RECAPTURE; LIFE-HISTORY; DISEASE PROGRESSION; DEMOGRAPHIC DRIVERS; SEXUAL SEGREGATION; AGE RATIOS; POPULATION; CAPTURE; ABUNDANCE; INFERENCE;
DOI
10.1002/ece3.4927
Chinese Library Classification (CLC) number
Q14 [Ecology (bioecology)];
Subject classification code
071012 ; 0713 ;
Abstract
Ecologists use classifications of individuals in categories to understand composition of populations and communities. These categories might be defined by demographics, functional traits, or species. Assignment of categories is often imperfect, but frequently treated as observations without error. When individuals are observed but not classified, these "partial" observations must be modified to include the missing data mechanism to avoid spurious inference. We developed two hierarchical Bayesian models to overcome the assumption of perfect assignment to mutually exclusive categories in the multinomial distribution of categorical counts, when classifications are missing. These models incorporate auxiliary information to adjust the posterior distributions of the proportions of membership in categories. In one model, we use an empirical Bayes approach, where a subset of data from one year serves as a prior for the missing data the next. In the other approach, we use a small random sample of data within a year to inform the distribution of the missing data. We performed a simulation to show the bias that occurs when partial observations were ignored and demonstrated the altered inference for the estimation of demographic ratios. We applied our models to demographic classifications of elk (Cervus elaphus nelsoni) to demonstrate improved inference for the proportions of sex and stage classes. We developed multiple modeling approaches using a generalizable nested multinomial structure to account for partially observed data that were missing not at random for classification counts. Accounting for classification uncertainty is important to accurately understand the composition of populations and communities in ecological studies.
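The bias described in the abstract can be illustrated with a small simulation. This is a minimal sketch and not the authors' hierarchical model: it assumes three hypothetical classes, made-up true proportions and class-specific classification probabilities, and an auxiliary random sample drawn from the unclassified individuals. It replaces full posterior sampling with a conjugate Dirichlet posterior mean. All function and variable names are illustrative.

```python
import random


def simulate_counts(n, true_p, classify_p, aux_size, rng):
    """Simulate classified counts, the unclassified total, and an auxiliary sample.

    classify_p[c] is the (assumed) probability that an individual of class c
    gets classified; unequal values make the data missing not at random.
    """
    k = len(true_p)
    classified = [0] * k
    hidden = []  # true classes of individuals that were seen but not classified
    for _ in range(n):
        c = rng.choices(range(k), weights=true_p)[0]
        if rng.random() < classify_p[c]:
            classified[c] += 1
        else:
            hidden.append(c)
    # Auxiliary data: a small random subset of the unclassified group
    # whose classes are assumed to be recorded in a follow-up effort.
    aux = [0] * k
    for c in rng.sample(hidden, min(aux_size, len(hidden))):
        aux[c] += 1
    return classified, len(hidden), aux


def naive_proportions(classified):
    """Proportions estimated from classified individuals only."""
    total = sum(classified)
    return [c / total for c in classified]


def adjusted_proportions(classified, n_unclassified, aux, prior=1.0):
    """Nested-multinomial style adjustment using the auxiliary sample.

    The composition of the unclassified group gets a Dirichlet(prior) prior;
    its posterior mean is used to allocate the unclassified total to classes.
    """
    k = len(classified)
    aux_total = sum(aux) + prior * k
    q = [(a + prior) / aux_total for a in aux]
    n = sum(classified) + n_unclassified
    return [(classified[i] + n_unclassified * q[i]) / n for i in range(k)]


rng = random.Random(1)
true_p = [0.2, 0.5, 0.3]      # hypothetical class proportions
classify_p = [0.5, 0.9, 0.9]  # first class is classified less often
classified, n_unc, aux = simulate_counts(20000, true_p, classify_p, 500, rng)
naive = naive_proportions(classified)
adjusted = adjusted_proportions(classified, n_unc, aux)
print("true:    ", true_p)
print("naive:   ", [round(x, 3) for x in naive])
print("adjusted:", [round(x, 3) for x in adjusted])
```

Because the first class is classified less often than the others, the naive estimate understates its share, while the adjusted estimate uses the auxiliary sample to reallocate the unclassified individuals. A full treatment along the lines of the abstract would propagate uncertainty through posterior draws rather than a plug-in mean.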
Pages: 3130 - 3140
Number of pages: 11