Phonetic feature extraction for context-sensitive glottal source processing

Cited by: 9

Authors:

Kane, John [1]

Aylett, Matthew [2,3]

Yanushevskaya, Irena [1]

Gobl, Christer [1]

Affiliations:

[1] Trinity Coll Dublin, Sch Linguist Speech & Commun Sci, Phonet & Speech Lab, Dublin, Ireland

[2] Univ Edinburgh, Sch Informat, Edinburgh EH8 9YL, Midlothian, Scotland

[3] CereProc Ltd, Edinburgh, Midlothian, Scotland

Source:

SPEECH COMMUNICATION | 2014 / Vol. 59

Funding:

Science Foundation Ireland;

Keywords:

Voice quality; Phonation type; Glottal source; Expressive speech; Speech synthesis; DEEP NEURAL-NETWORKS; SPEECH;

DOI:

10.1016/j.specom.2013.12.003

Chinese Library Classification:

O42 [Acoustics];

Subject classification codes:

070206 ; 082403 ;

Abstract:

The effectiveness of glottal source analysis is known to depend on the phonetic properties of its concomitant supraglottal features. Phonetic classes like nasals and fricatives are particularly problematic. Their acoustic characteristics, including zeros in the vocal tract spectrum and aperiodic noise, can have a negative effect on glottal inverse filtering, a prerequisite for glottal source analysis. In this paper, we first describe and evaluate a set of binary feature extractors for phonetic classes with relevance for glottal source analysis. As voice quality classification is typically achieved using feature data derived by glottal source analysis, we then investigate the effect of removing data from certain detected phonetic regions on the classification accuracy. For the phonetic feature extraction, classification algorithms based on Artificial Neural Networks (ANNs), Gaussian Mixture Models (GMMs) and Support Vector Machines (SVMs) are compared. Experiments demonstrate that the discriminative classifiers (i.e., ANNs and SVMs) in general give better results than the generative learning algorithm (i.e., GMMs). This accuracy generally decreases according to the sparseness of the feature (e.g., accuracy is lower for nasals than for syllabic regions). We find the best classification of voice quality when using only glottal source parameter data derived within detected syllabic regions. © 2013 Elsevier B.V. All rights reserved.


Pages: 10-21

Page count: 12

