Varying acoustic-phonemic ambiguity reveals that talker normalization is obligatory in speech processing

Cited by: 29
Authors
Choi, Ja Young [1, 2]
Hu, Elly R. [1]
Perrachione, Tyler K. [1]
Affiliations
[1] Boston Univ, Dept Speech Language & Hearing Sci, 635 Commonwealth Ave, Boston, MA 02215 USA
[2] Harvard Univ, Program Speech & Hearing Biosci & Technol, Cambridge, MA 02138 USA
Funding
National Institutes of Health (USA)
Keywords
Speech perception; Categorization; SPOKEN WORD RECOGNITION; SPEAKING RATE; STIMULUS VARIABILITY; TIME-COURSE; VOWEL; INFORMATION; MODEL; IDENTIFICATION; REPRESENTATION; DEPENDENCIES;
DOI
10.3758/s13414-017-1395-5
Chinese Library Classification number
B84 [Psychology]
Discipline classification code
04; 0402
Abstract
The nondeterministic relationship between speech acoustics and abstract phonemic representations imposes a challenge for listeners to maintain perceptual constancy despite the highly variable acoustic realization of speech. Talker normalization facilitates speech processing by reducing the degrees of freedom for mapping between encountered speech and phonemic representations. While this process has been proposed to facilitate the perception of ambiguous speech sounds, it is currently unknown whether talker normalization is affected by the degree of potential ambiguity in acoustic-phonemic mapping. We explored the effects of talker normalization on speech processing in a series of speeded classification paradigms, parametrically manipulating the potential for inconsistent acoustic-phonemic relationships across talkers for both consonants and vowels. Listeners identified words with varying potential acoustic-phonemic ambiguity across talkers (e.g., beet/boat vs. boot/boat) spoken by single or mixed talkers. Auditory categorization of words was always slower when listening to mixed talkers compared to a single talker, even when there was no potential acoustic ambiguity between target sounds. Moreover, the processing cost imposed by mixed talkers was greatest when words had the most potential acoustic-phonemic overlap across talkers. Models of acoustic dissimilarity between target speech sounds did not account for the pattern of results. These results suggest (a) that talker normalization incurs the greatest processing cost when disambiguating highly confusable sounds and (b) that talker normalization appears to be an obligatory component of speech perception, taking place even when the acoustic-phonemic relationships across sounds are unambiguous.
Pages: 784-797
Page count: 14
Related papers
4 items in total
  • [1] Varying acoustic-phonemic ambiguity reveals that talker normalization is obligatory in speech processing
    Ja Young Choi
    Elly R. Hu
    Tyler K. Perrachione
    [J]. Attention, Perception, & Psychophysics, 2018, 80 : 784 - 797
  • [2] APPLYING MULTITASK LEARNING TO ACOUSTIC-PHONEMIC MODEL FOR MISPRONUNCIATION DETECTION AND DIAGNOSIS IN L2 ENGLISH SPEECH
    Mao, Shaoguang
    Wu, Zhiyong
    Li, Runnan
    Li, Xu
    Meng, Helen
    Cai, Lianhong
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018 : 6254 - 6258
  • [3] INTEGRATING ARTICULATORY FEATURES INTO ACOUSTIC-PHONEMIC MODEL FOR MISPRONUNCIATION DETECTION AND DIAGNOSIS IN L2 ENGLISH SPEECH
    Mao, Shaoguang
    Wu, Zhiyong
    Li, Xu
    Li, Runnan
    Wu, Xixin
    Meng, Helen
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2018
  • [4] Principal component decomposition of acoustic and neural representations of time-varying pitch reveals adaptive efficient coding of speech covariation patterns
    Llanos, Fernando
    Gnanateja, G. Nike
    Chandrasekaran, Bharath
    [J]. BRAIN AND LANGUAGE, 2022, 230