Data-driven detection and analysis of the patterns of creaky voice

Cited by: 38
Authors
Drugman, Thomas [1]
Kane, John [2]
Gobl, Christer [2]
Affiliations
[1] Univ Mons 31, TCTS Lab, B-7000 Mons, Belgium
[2] Univ Dublin Trinity Coll, Sch Linguist Speech & Commun Sci, Phonet & Speech Lab, Dublin 2, Ireland
Source
COMPUTER SPEECH AND LANGUAGE | 2014, Vol. 28, Issue 05
Funding
Science Foundation Ireland;
Keywords
Creaky voice; Vocal fry; Irregular phonation; Glottal source; IRREGULAR PHONATION; AUTOMATIC DETECTION; QUALITY; GLOTTALIZATION; DURATION;
DOI
10.1016/j.csl.2014.03.002
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper investigates the temporal excitation patterns of creaky voice. Creaky voice is a voice quality frequently used as a phrase-boundary marker, but also as a means of portraying attitude, affective states and even social status. Consequently, the automatic detection and modelling of creaky voice may have implications for speech technology applications. The acoustic characteristics of creaky voice are, however, rather distinct from modal phonation. Further, several acoustic patterns can bring about the perception of creaky voice, thereby complicating the strategies used for its automatic detection, analysis and modelling. The present study is carried out using a variety of languages, speakers, and on both read and conversational data and involves a mutual information-based assessment of the various acoustic features proposed in the literature for detecting creaky voice. These features are then exploited in classification experiments where we achieve an appreciable improvement in detection accuracy compared to the state of the art. Both experiments clearly highlight the presence of several creaky patterns. A subsequent qualitative and quantitative analysis of the identified patterns is provided, which reveals a considerable speaker-dependent variability in the usage of these creaky patterns. We also investigate how creaky voice detection systems perform across creaky patterns. (C) 2014 Elsevier Ltd. All rights reserved.
Pages: 1233-1253
Page count: 21