Analysis of data consistency identifies measurement abnormality in Howells' craniometric test data set

Cited by: 1
Authors
Pang, Jinyong [1, 2]
Dong, Yibo [1, 2, 4]
Turner, Christopher [3]
Li, Chang [1, 2]
Liu, Xiaoming [1, 2]
Affiliations
[1] Univ S Florida, USF Genom, 3720 Spectrum Blvd,Suite 304, Tampa, FL 33612 USA
[2] Univ S Florida, Coll Publ Hlth, 3720 Spectrum Blvd,Suite 304, Tampa, FL 33612 USA
[3] Univ S Florida, Coll Arts & Sci, Dept Anthropol, Tampa, FL 33612 USA
[4] Bur Publ Hlth Labs, 1217 N Pearl St, Jacksonville, FL USA
Keywords
data consistency; Howells' craniometric data; simotic chord; simotic subtense; SIS; WNB
DOI
10.1002/ajpa.24631
Chinese Library Classification number
Q98 [Anthropology]
Discipline classification code
030303
Abstract
Howells' craniometric data set is the largest publicly available craniometric data set on the internet and has been widely used in the development of craniometric methods. It consists of a main data set of 2524 human crania from 28 populations and an additional "test" data set of 524 crania, with up to 82 measurements recorded from these crania. We examined the consistency between the main and test data sets to assess whether the two can be used in combination. Uniform Manifold Approximation and Projection (UMAP) separated the two data sets clearly, suggesting some inconsistency between them. To investigate the cause, we split both data sets into six continental groups (African, Austro-Melanesian, East Asian, European, Native American, and Polynesian) and tested, within each group, whether the distributions of the measurements differ between the two data sets. We found that simotic chord (WNB) and simotic subtense (SIS) are significantly and abnormally larger in the test data set than in the main data set. After removing these two measures, the two data sets are broadly comparable. We further present evidence that missing decimal points likely caused the abnormality.
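The abstract does not state how the missing-decimal-point hypothesis was checked, so the following is only a minimal sketch of one way such a check could look, not the authors' procedure. It compares per-measurement medians between two data sets and flags measurements whose test-to-main ratio is close to 10. The function names, the tolerance, and all numeric values are hypothetical and synthetic; they are not taken from Howells' data.

```python
from statistics import median


def median_ratio(main_values, test_values):
    """Ratio of the test-set median to the main-set median for one measurement."""
    return median(test_values) / median(main_values)


def flag_decimal_shift(main, test, factor=10.0, tolerance=0.2):
    """Return {measurement: ratio} for measurements whose test/main median
    ratio lies within a relative `tolerance` of `factor`.

    A ratio near 10 is what a dropped decimal point (e.g. 8.5 recorded
    as 85) would produce. Both `factor` and `tolerance` are assumptions.
    """
    flagged = {}
    for name in main.keys() & test.keys():
        ratio = median_ratio(main[name], test[name])
        if abs(ratio - factor) <= tolerance * factor:
            flagged[name] = ratio
    return flagged


# Synthetic, made-up values for illustration only (millimetres).
main = {
    "WNB": [8.1, 9.0, 7.6, 8.8],
    "SIS": [3.2, 4.0, 3.6, 2.9],
    "GOL": [178.0, 182.0, 175.0, 190.0],
}
test = {
    "WNB": [82.0, 90.0, 77.0, 85.0],
    "SIS": [33.0, 41.0, 36.0, 30.0],
    "GOL": [180.0, 176.0, 185.0, 188.0],
}

flagged = flag_decimal_shift(main, test)
print(sorted(flagged))
```

Medians are used here rather than means so that a handful of extreme values does not dominate the ratio; a real analysis would presumably apply a formal distribution test within each continental group, as the abstract describes.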
Pages: 687-692
Number of pages: 6