Phonetic feature extraction for context-sensitive glottal source processing

Cited by: 9
Authors
Kane, John [1]
Aylett, Matthew [2, 3]
Yanushevskaya, Irena [1]
Gobl, Christer [1]
Affiliations
[1] Trinity Coll Dublin, Sch Linguist Speech & Commun Sci, Phonet & Speech Lab, Dublin, Ireland
[2] Univ Edinburgh, Sch Informat, Edinburgh EH8 9YL, Midlothian, Scotland
[3] CereProc Ltd, Edinburgh, Midlothian, Scotland
Funding
Science Foundation Ireland
Keywords
Voice quality; Phonation type; Glottal source; Expressive speech; Speech synthesis; DEEP NEURAL-NETWORKS; SPEECH;
DOI
10.1016/j.specom.2013.12.003
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
The effectiveness of glottal source analysis is known to depend on the phonetic properties of its concomitant supraglottal features. Phonetic classes such as nasals and fricatives are particularly problematic: their acoustic characteristics, including zeros in the vocal tract spectrum and aperiodic noise, can degrade glottal inverse filtering, a prerequisite to glottal source analysis. In this paper, we first describe and evaluate a set of binary feature extractors for phonetic classes relevant to glottal source analysis. As voice quality classification is typically achieved using feature data derived by glottal source analysis, we then investigate how removing data from certain detected phonetic regions affects classification accuracy. For the phonetic feature extraction, classification algorithms based on Artificial Neural Networks (ANNs), Gaussian Mixture Models (GMMs) and Support Vector Machines (SVMs) are compared. Experiments demonstrate that the discriminative classifiers (i.e., ANNs and SVMs) generally give better results than the generative learning algorithm (i.e., GMMs). Feature extraction accuracy generally decreases with the sparseness of the feature (e.g., accuracy is lower for nasals than for syllabic regions). We find the best classification of voice quality when using only glottal source parameter data derived within detected syllabic regions. (C) 2013 Elsevier B.V. All rights reserved.
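The region-based data removal described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 0.5 decision threshold, the per-frame mean summary and the toy values are all assumptions made for illustration. It only shows the general idea of discarding glottal source parameter frames that a binary phonetic detector does not mark as syllabic before features are passed to a voice quality classifier.

```python
from statistics import mean


def mask_frames(frames, syllabic_posteriors, threshold=0.5):
    """Keep only frames whose (hypothetical) syllabic-detector output
    reaches the threshold. The threshold value is an assumption."""
    if len(frames) != len(syllabic_posteriors):
        raise ValueError("frames and posteriors must have the same length")
    return [f for f, p in zip(frames, syllabic_posteriors) if p >= threshold]


def summarise(frames):
    """Average each glottal parameter over the retained frames.
    Returns None when no frame survives the masking."""
    if not frames:
        return None
    n_params = len(frames[0])
    return [mean(f[i] for f in frames) for i in range(n_params)]


# Toy data: four frames, two made-up glottal parameters per frame.
frames = [[0.5, 1.0], [0.75, 3.0], [1.0, 5.0], [0.25, 9.0]]
# Made-up detector outputs; frames 2 and 4 fall outside syllabic regions.
posteriors = [0.9, 0.2, 0.8, 0.4]

kept = mask_frames(frames, posteriors)
features = summarise(kept)
print(kept)
print(features)
```

In this sketch the detector output is treated as a per-frame score from any of the classifier types compared in the paper (ANN, GMM or SVM); the summary step stands in for whatever feature representation the downstream voice quality classifier expects.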
Pages: 10-21
Page count: 12