Across-speaker Articulatory Normalization for Speaker-independent Silent Speech Recognition

Cited by: 0
Authors
Wang, Jun [1, 2]
Samal, Ashok [3]
Green, Jordan R. [4]
Affiliations
[1] Univ Texas Dallas, Dept Bioengn, Dallas, TX USA
[2] Univ Texas Dallas, Callier Ctr Commun Disorders, Dallas, TX USA
[3] Univ Nebraska, Dept Comp Sci & Engn, Lincoln, NE 68588 USA
[4] MGH Inst Hlth Profess, Dept Commun Sci & Disorders, Boston, MA USA
Funding
National Institutes of Health (USA)
Keywords
silent speech recognition; speech kinematics; Procrustes analysis; support vector machine; VOICE; ELECTROLARYNX; KNOWLEDGE; SYSTEM;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Silent speech interfaces (SSIs), which recognize speech from articulatory information (i.e., without using audio information), have the potential to enable persons with laryngectomy or a neurological disease to produce synthesized speech with a natural-sounding voice using their tongue and lips. Current approaches to SSIs have largely relied on speaker-dependent recognition models to minimize the negative effects of talker variation on recognition accuracy. Speaker-independent approaches are needed to reduce the large amount of training data required from each user; only limited articulatory samples are often available for persons with moderate to severe speech impairments, due to the logistical difficulty of data collection. This paper reports an across-speaker articulatory normalization approach based on Procrustes matching, a bidimensional regression technique for removing translational, scaling, and rotational effects of spatial data. A dataset of short functional sentences was collected from seven English talkers. A support vector machine was then trained to classify sentences based on normalized tongue and lip movements. Speaker-independent classification accuracy (tested using leave-one-subject-out cross-validation) improved significantly, from 68.63% to 95.90%, following normalization. These results support the feasibility of a speaker-independent SSI using Procrustes matching as the basis for articulatory normalization across speakers.
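As an illustration of the Procrustes matching step named in the abstract, the following is a minimal sketch of how translation, scale, and rotation could be removed from a set of 2-D articulatory points. It is not the authors' implementation: the function name `procrustes_normalize`, the `(n_points, 2)` array layout, and the use of a reference shape for the rotation are all assumptions made for this example, and the support vector machine and cross-validation stages are not shown.

```python
import numpy as np


def procrustes_normalize(shape, reference=None):
    """Remove translation, scale and (optionally) rotation from a 2-D shape.

    shape, reference: arrays of shape (n_points, 2). Hypothetical layout,
    chosen for this sketch only.
    """
    pts = np.asarray(shape, dtype=float)

    # Translation: move the centroid to the origin.
    centered = pts - pts.mean(axis=0)

    # Scale: divide by the Frobenius norm so the shape has unit size.
    size = np.linalg.norm(centered)
    if size == 0:
        raise ValueError("shape has no spatial extent")
    scaled = centered / size

    if reference is None:
        return scaled

    # Rotation: find the rotation that best aligns the shape with the
    # normalized reference (orthogonal Procrustes solution via SVD).
    ref = procrustes_normalize(reference)
    u, _, vt = np.linalg.svd(scaled.T @ ref)
    # Force a proper rotation (determinant +1), i.e. disallow reflection.
    d = np.sign(np.linalg.det(u @ vt))
    rotation = u @ np.diag([1.0, d]) @ vt
    return scaled @ rotation
```

Under these assumptions, a shape that differs from the reference only by a shift, a uniform scaling, and a rotation maps onto the normalized reference, which is the property that makes such features comparable across speakers.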
Pages: 1179-1183
Page count: 5
Related papers
50 in total
  • [31] Speaker-independent recognition of Chinese tones
    GUAN Cuntai and CHEN Yongbin(Dep. of Radio Eng.
    [J]. Chinese Journal of Acoustics, 1993, (02) : 142 - 148
  • [32] SPEAKER-INDEPENDENT DIGIT RECOGNITION SYSTEM
    SAMBUR, MR
    RABINER, LR
    [J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1974, 56 : S26 - S26
  • [33] A speaker-independent continuous speech recognition system using biomimetic pattern recognition
    Wang Shoujue
    Qin Hong
    [J]. CHINESE JOURNAL OF ELECTRONICS, 2006, 15 (03) : 460 - 462
  • [34] Computer-independent and speaker-independent real time speech recognition system
    [J]. Dianxin Kexue/Telecommunications Science, 13 (11) : 28 - 31
  • [35] HMM-based integrated method for speaker-independent speech recognition
    Tsinghua Univ, Beijing, China
    [J]. Int Conf Signal Process Proc, (613-616):
  • [36] Independent and Automatic Evaluation of Speaker-Independent Acoustic-to-Articulatory Reconstruction
    Parrot, Maud
    Millet, Juliette
    Dunbar, Ewan
    [J]. INTERSPEECH 2020, 2020 : 3740 - 3744
  • [37] REFERENCE TEMPLATE ADAPTATION IN SPEAKER-INDEPENDENT ISOLATED WORD SPEECH RECOGNITION
    MCINNES, FR
    JACK, MA
    [J]. ELECTRONICS LETTERS, 1987, 23 (24) : 1304 - 1305
  • [38] SPEAKER-INDEPENDENT SPEECH RECOGNITION UNIT DEVELOPMENT FOR TELEPHONE LINE USE
    ISHII, N
    IMAI, Y
    NAKATSU, R
    ANDO, M
    [J]. JAPAN TELECOMMUNICATIONS REVIEW, 1982, 24 (03) : 267 - 274
  • [39] NORMALIZING THE VOCAL-TRACT LENGTH FOR SPEAKER-INDEPENDENT SPEECH RECOGNITION
    LIN, QG
    CHE, CW
    [J]. IEEE SIGNAL PROCESSING LETTERS, 1995, 2 (11) : 201 - 203
  • [40] SPEAKER-INDEPENDENT SPEECH-RECOGNITION SYSTEM BASED ON LINEAR PREDICTION
    GUPTA, VN
    BRYAN, JK
    GOWDY, JN
    [J]. IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1978, 26 (01) : 27 - 33