An Automatic System for Detecting Prosodic Prominence in American English Continuous Speech

Authors
Tamburini, F. [1, 2]
Caini, C. [3]
Affiliations
[1] Univ Bologna, Ctr Interfacolta Linguist Teor & Appl, Bologna, Italy
[2] Dipartimento Elettr Informat & Sistemist, Bologna, Italy
[3] Univ Bologna, Dipartimento Elettr Informat & Sistemist, Bologna, Italy
Keywords
prosody; automatic feature extraction; prominence; stress accent; pitch accent
DOI
10.1007/s10772-005-4760-z
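Chinese Library Classification
TM [Electrical Engineering]; TN [Electronic and Communication Technology]
Discipline classification codes
0808; 0809
Abstract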
A precise identification of prosodic phenomena and the construction of tools able to properly manage such phenomena are essential steps to disambiguate the meaning of certain utterances. In particular, they are useful for a wide variety of tasks: automatic recognition of spontaneous speech, automatic enhancement of speech-generation systems, solving ambiguities in natural language interpretation, the construction of large annotated language resources, such as prosodically tagged speech corpora, and teaching languages to foreign students using Computer Aided Language Learning (CALL) systems. This paper presents a study on the automatic detection of prosodic prominence in continuous speech, with particular reference to American English, but with good prospects of application to other languages. Prosodic prominence involves two different prosodic features: pitch accent and stress accent. Pitch accent is acoustically connected with fundamental frequency (F0) movements and overall syllable energy, whereas stress exhibits a strong correlation with syllable nuclei duration and mid-to-high-frequency emphasis. This paper shows that a careful measurement of these acoustic parameters, as well as the identification of their connection to prosodic parameters, makes it possible to build an automatic system capable of identifying prominent syllables in utterances with performance comparable to the inter-human agreement reported in the literature. Two different prominence detectors were studied and developed: the first uses a training corpus to set up thresholds properly, while the second uses a purely unsupervised method. In both cases, it is worth stressing that only acoustic parameters derived directly from speech waveforms are exploited.
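As a rough illustration of the idea in the abstract, the sketch below combines the four acoustic cues it names (nucleus duration, mid-to-high-frequency emphasis, overall energy, F0 movement) into a per-syllable score and then flags prominent syllables in two ways. This is not the authors' formula: the `Syllable` type, the equal weighting of z-scored features, and the local-maximum rule used as a label-free stand-in are all assumptions made here for illustration, and the features are assumed to have been measured already.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class Syllable:
    """Hypothetical per-syllable measurements (names are assumptions)."""
    nucleus_duration: float  # seconds
    hf_emphasis: float       # mid-to-high-frequency energy, arbitrary units
    energy: float            # overall syllable energy, arbitrary units
    f0_movement: float       # size of the F0 excursion, e.g. in semitones


def zscores(values):
    """Normalise values within one utterance; zeros if there is no spread."""
    if not values:
        return []
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0] * len(values)
    return [(v - mu) / sigma for v in values]


def prominence_scores(syllables):
    """Sum of z-scored cues; equal weights are an arbitrary choice."""
    dur = zscores([s.nucleus_duration for s in syllables])
    emph = zscores([s.hf_emphasis for s in syllables])
    en = zscores([s.energy for s in syllables])
    f0 = zscores([s.f0_movement for s in syllables])
    # Stress cues (duration, emphasis) plus pitch-accent cues (energy, F0).
    return [d + e + n + f for d, e, n, f in zip(dur, emph, en, f0)]


def detect_with_threshold(scores, threshold):
    """Supervised-style variant: the threshold would be tuned on labelled data."""
    return [s > threshold for s in scores]


def detect_local_maxima(scores):
    """Label-free stand-in: flag syllables that score above both neighbours."""
    flags = []
    for i, s in enumerate(scores):
        left = scores[i - 1] if i > 0 else float("-inf")
        right = scores[i + 1] if i < len(scores) - 1 else float("-inf")
        flags.append(s > left and s > right)
    return flags


# Toy utterance: the second and fourth syllables are longer, louder,
# and carry a larger F0 movement than their neighbours.
weak = Syllable(0.05, 1.0, 1.0, 0.0)
strong = Syllable(0.15, 3.0, 3.0, 4.0)
utterance = [weak, strong, weak, strong, weak]
scores = prominence_scores(utterance)
```

Calling `detect_with_threshold(scores, 0.0)` or `detect_local_maxima(scores)` on the toy utterance returns one boolean per syllable. Normalising within the utterance keeps the score relative to the speaker and recording, which is one plausible reason a purely acoustic detector can work without speaker-specific tuning.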
Pages: 33-44
Page count: 12
Published in: International Journal of Speech Technology, 2005, 8 (1): 33-44