Data Augmentation for Infant Cry Classification

Cited by: 1
Authors
Kachhi, Aastha [1]
Chaturvedi, Shreya [1]
Patil, Hemant A. [1]
Singh, Dipesh Kumar [1]
Affiliation
[1] DA IICT, Speech Res Lab, Gandhinagar, India
Keywords
Infant Cry Classification; Data Augmentation; Cepstral Features; Baby Chilanto; GMM;
DOI
10.1109/ISCSLP57327.2022.10037931
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Classifying normal vs. pathological infant cries is challenging because infant cries have a higher fundamental (pitch) frequency (F0); the pitch-source harmonics are therefore widely spaced and sample the vocal tract spectrum sparsely, resulting in poor spectral resolution. This paper first presents an infant cry classification system on the DA-IICT infant cry corpus using two cepstral feature sets, namely, Mel Frequency Cepstral Coefficients (MFCC) and Constant-Q Transform Cepstral Coefficients (CQCC), with a statistical classifier, namely, the Gaussian Mixture Model (GMM). Results are then presented for cross-dataset scenarios on the Baby Chilanto and DA-IICT datasets, where performance is found to degrade substantially. Classification results are also presented for the two corpora merged, where performance was found to degrade for a certain CQCC parameter setting. To mitigate this degradation, we present three data augmentation methods based on signal perturbation of tempo, volume, and speed. The results indicate that tempo perturbation improves performance by a small margin of 0.2% and 0.98% using MFCC and CQCC, respectively. However, speed perturbation degrades performance by 3.49% and 21.93% using MFCC and CQCC, respectively. This indicates that F0 and its dynamics are crucial acoustic cues for distinguishing normal from pathological infant cries, which is why F0-based cry modes have historically been used for infant cry classification in the literature. Additionally, the performance of the various feature sets is compared using J-statistics.
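The contrast between the perturbation types can be illustrated with a small sketch. It assumes the usual meaning of the terms, since the abstract does not name its tooling or perturbation factors: speed perturbation resamples the waveform, so duration and pitch change together, while volume perturbation only rescales amplitude. Tempo perturbation (duration changed, pitch preserved) needs a time-scale modification algorithm and is not shown. The function names, the 400 Hz test tone, and the factors 1.1 and 0.5 are illustrative assumptions, not values from the paper.

```python
import numpy as np

def speed_perturb(signal, factor):
    """Resample so the signal plays `factor` times faster.

    Assumed definition: both duration and pitch scale with `factor`.
    """
    n_out = int(round(len(signal) / factor))
    # Positions in the original signal that each output sample reads from.
    src_pos = np.arange(n_out) * factor
    return np.interp(src_pos, np.arange(len(signal)), signal)

def volume_perturb(signal, gain):
    """Scale the amplitude only; timing and pitch are untouched."""
    return gain * signal

def dominant_freq(signal, sr):
    """Frequency (Hz) of the strongest FFT bin, a crude stand-in for F0."""
    spectrum = np.abs(np.fft.rfft(signal * np.hanning(len(signal))))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sr)
    return freqs[np.argmax(spectrum)]

sr = 16000                                # hypothetical sampling rate
t = np.arange(sr) / sr                    # one second of audio
tone = np.sin(2 * np.pi * 400.0 * t)      # synthetic tone standing in for a cry

fast = speed_perturb(tone, 1.1)           # shorter, and pitch shifted upward
loud = volume_perturb(tone, 0.5)          # same length and pitch, half amplitude

print(len(tone), len(fast))
print(dominant_freq(tone, sr), dominant_freq(fast, sr), dominant_freq(loud, sr))
```

Under these assumptions the speed-perturbed tone peaks near 440 Hz instead of 400 Hz, while the volume-perturbed tone keeps its pitch. A shift in F0 of this kind would be consistent with the degradation the abstract reports for speed perturbation.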
Pages: 433-437
Number of pages: 5