Speech recognition under conditions of frequency-place compression and expansion

Cited by: 80
Authors
Baskent, D
Shannon, RV
Affiliations
[1] House Ear Res Inst, Dept Auditory Implants & Percept, Los Angeles, CA 90057 USA
[2] Univ So Calif, Dept Biomed Engn, Los Angeles, CA 90089 USA
Source
Keywords
DOI
10.1121/1.1558357
Chinese Library Classification (CLC) number
O42 [Acoustics];
Subject classification codes
070206 ; 082403 ;
Abstract
In normal acoustic hearing the mapping of acoustic frequency information onto the appropriate cochlear place is a natural biological function, but in cochlear implants it is controlled by the speech processor. The cochlear tonotopic range of the implant is determined by the length and insertion depth of the electrode array. Conventional cochlear implant electrode arrays are designed for an insertion of 25 mm inside the round window and the active electrodes occupy 16 mm, which would place the electrodes in a cochlear region corresponding to an acoustic frequency range of 500-6000 Hz. However, some implant speech processors map an acoustic frequency range from 150 to 10 000 Hz onto these electrodes. While this mapping preserves the entire range of acoustic frequency information, it also results in a compression of the tonotopic pattern of speech information delivered to the brain. The present study measured the effects of such a compression of frequency-to-place mapping on speech recognition using acoustic simulations. Also measured were the effects of an expansion of the frequency-to-place mapping, which produces an expanded representation of speech in the cochlea. Such an expanded representation might improve speech recognition by improving the relative spatial (tonotopic) resolution, like an "acoustic fovea." Phoneme and sentence recognition was measured as a function of linear (in terms of cochlear distance) frequency-place compression and expansion. These conditions were presented to normal-hearing listeners using a noise-band vocoder, simulating cochlear implant electrodes with different insertion depths and different numbers of electrode channels. The cochlear tonotopic range was held constant by employing the same noise carrier bands for each condition, while the analysis frequency range was either compressed or expanded relative to the carrier frequency range.
For each condition, the result was compared to that of the perfect frequency-place match, where the carrier and the analysis bands were perfectly matched. Speech recognition in the matched conditions was generally better than any condition of frequency-place expansion and compression, even when the matched condition eliminated a considerable amount of acoustic information. This result suggests that speech recognition, at least without training, is dependent on the mapping of acoustic frequency information onto the appropriate cochlear place. © 2003 Acoustical Society of America.
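The frequency-to-place relationship the abstract relies on can be illustrated with a short sketch. It is not the authors' code: it assumes the Greenwood place-frequency function with commonly cited human constants and a 35 mm cochlea, none of which are stated in the abstract. The function names, the channel count of 4, and the 5 mm and 3 mm offsets are hypothetical values chosen only for illustration. Under these assumptions, a 16 mm array inserted 25 mm spans roughly 10 to 26 mm from the apex, which comes out at roughly 513 to 5860 Hz, in line with the 500-6000 Hz range quoted above.

```python
import math

# Greenwood place-frequency constants for the human cochlea.
# Assumed for this sketch; the abstract does not state them.
A = 165.4      # Hz
ALPHA = 0.06   # per mm, for an assumed 35 mm cochlear length
K = 0.88


def place_to_freq(x_mm):
    """Characteristic frequency (Hz) at x_mm from the cochlear apex."""
    return A * (10 ** (ALPHA * x_mm) - K)


def freq_to_place(f_hz):
    """Inverse mapping: distance from the apex (mm) for a frequency in Hz."""
    return math.log10(f_hz / A + K) / ALPHA


def band_edges(apical_mm, basal_mm, n_channels):
    """Band edges (Hz) for channels equally spaced in cochlear distance."""
    step = (basal_mm - apical_mm) / n_channels
    return [place_to_freq(apical_mm + i * step) for i in range(n_channels + 1)]


# Carrier bands: fixed cochlear region, 10 to 26 mm from the apex
# (16 mm of active electrodes, 25 mm insertion, assumed 35 mm cochlea).
carrier = band_edges(10.0, 26.0, 4)

# Compression: a wider analysis range is squeezed onto the same carriers.
# The 5 mm extension at each end is a hypothetical value.
compressed_analysis = band_edges(5.0, 31.0, 4)

# Expansion: a narrower analysis range is stretched over the same carriers.
# The 3 mm reduction at each end is a hypothetical value.
expanded_analysis = band_edges(13.0, 23.0, 4)

if __name__ == "__main__":
    for name, edges in [("carrier", carrier),
                        ("compressed analysis", compressed_analysis),
                        ("expanded analysis", expanded_analysis)]:
        print(name, [round(f) for f in edges])
```

In the matched condition the analysis band edges would equal the carrier band edges. In the compressed and expanded conditions only the analysis edges change, which is how the sketch keeps the simulated cochlear region constant.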
Pages: 2064-2076
Page count: 13
Related papers
50 in total
  • [21] Robust Speech Emotion Recognition under Different Encoding Conditions
    Oates, Christopher
    Triantafyllopoulos, Andreas
    Steiner, Ingmar
    Schuller, Bjoern
    INTERSPEECH 2019, 2019, : 3935 - 3939
  • [22] KALDI Recipes for the Czech Speech Recognition Under Various Conditions
    Mizera, Petr
    Fiala, Jiri
    Brich, Ales
    Pollak, Petr
    TEXT, SPEECH, AND DIALOGUE, 2016, 9924 : 391 - 399
  • [23] Assessment of Frequency-Place Mismatch by Flat-Panel CT and Correlation With Cochlear Implant Performance
    Zanetti, Diego
    Conte, Giorgio
    Di Berardino, Federica
    Lo Russo, Francesco
    Cavicchiolo, Sara
    Triulzi, Fabio
    OTOLOGY & NEUROTOLOGY, 2021, 42 (01) : 165 - 173
  • [24] Frequency-to-Place Mismatch Impacts Cochlear Implant Quality of Life, But Not Speech Recognition
    Sturm, Joshua J.
    Ma, Cheng
    Mcrackan, Theodore R.
    Schvartz-Leyzac, Kara C.
    LARYNGOSCOPE, 2024, 134 (06): : 2898 - 2905
  • [25] The effect of speech and audio compression on speech recognition performance
    Besacier, L
    Bergamini, C
    Vaufreydaz, D
    Castelli, E
    2001 IEEE FOURTH WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING, 2001, : 301 - 306
  • [26] Various Speech Processing Techniques For Speech Compression And Recognition
    Karam, Jalal
    PROCEEDINGS OF WORLD ACADEMY OF SCIENCE, ENGINEERING AND TECHNOLOGY, VOL 26, PARTS 1 AND 2, DECEMBER 2007, 2007, 26 : 704 - 708
  • [27] ONTOGENESIS OF TONOTOPY IN INFERIOR COLLICULUS OF A HIPPOSIDERID BAT REVEALS POSTNATAL SHIFT IN FREQUENCY-PLACE CODE
    RUBSAMEN, R
    NEUWEILER, G
    MARIMUTHU, G
    JOURNAL OF COMPARATIVE PHYSIOLOGY A-SENSORY NEURAL AND BEHAVIORAL PHYSIOLOGY, 1989, 165 (06): : 755 - 769
  • [28] AN APPARATUS FOR SPEECH COMPRESSION AND EXPANSION AND FOR REPLAYING VISIBLE SPEECH RECORDS
    VILBIG, F
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1950, 22 (06): : 754 - 761
  • [29] DEVELOPMENT OF SPEECH RECOGNITION AND COMPRESSION DEVICE
    KAMIYA, S
    KIYAMA, J
    KIMURA, Y
    HAMAGUCHI, S
    KAWAMA, S
    SUMI, K
    KITOH, A
    TANAKA, A
    SHARP TECHNICAL JOURNAL, 1992, (54): : 19 - 22
  • [30] An embedded system for speech recognition and compression
    Yang, ZZ
    Liu, J
    Eric, C
    Guan, LC
    Chin, CK
    International Symposium on Communications and Information Technologies 2005, Vols 1 and 2, Proceedings, 2005, : 287 - 290