Modelling representations in speech normalization of prosodic cues

Authors
Chen Si
Caicai Zhang
Puiyin Lau
Yike Yang
Bei Li
Affiliations
[1] The Hong Kong Polytechnic University, Department of Chinese and Bilingual Studies
[2] Hong Kong Polytechnic University-Peking University Research Centre on Chinese Linguistics, Research Centre for Language, Cognition, and Neuroscience
[3] University of Hong Kong, Department of Statistics and Actuarial Science
[4] University of Hong Kong, Department of Chinese Language and Literature
[5] Hong Kong Shue Yan University
Abstract
The lack-of-invariance problem in speech perception concerns how listeners cope with variation in speech sounds produced by different speakers. The current study is the first to test the contribution of mentally stored distributional information to the normalization of prosodic cues. The study began by modelling the distributions of acoustic cues in a speech corpus. We then conducted three experiments using both naturally produced lexical tones with estimated distributions and manipulated lexical tones with f0 values generated from simulated distributions. State-of-the-art statistical techniques were used to examine the effects of distribution parameters on normalization and the identification curves with respect to each parameter. Based on the significant effects of the distribution parameters, we proposed a probabilistic parametric representation (PPR) that integrates knowledge from previously established distributions of speakers with their indexical information. The PPR is still accessed during speech perception even when contextual information is present. We also discussed how speech signals produced by unfamiliar talkers are normalized, with and without context, and how long-term stored representations are accessed.
Source
Si, Chen; Zhang, Caicai; Lau, Puiyin; Yang, Yike; Li, Bei. Modelling representations in speech normalization of prosodic cues [J]. Scientific Reports, 2022, 12 (01)