Across-speaker Articulatory Normalization for Speaker-independent Silent Speech Recognition

Cited by: 0
Authors
Wang, Jun [1, 2]
Samal, Ashok [3]
Green, Jordan R. [4]
Affiliations
[1] Univ Texas Dallas, Dept Bioengn, Dallas, TX USA
[2] Univ Texas Dallas, Callier Ctr Commun Disorders, Dallas, TX USA
[3] Univ Nebraska, Dept Comp Sci & Engn, Lincoln, NE 68588 USA
[4] MGH Inst Hlth Profess, Dept Commun Sci & Disorders, Boston, MA USA
Funding
National Institutes of Health (USA)
Keywords
silent speech recognition; speech kinematics; Procrustes analysis; support vector machine; VOICE; ELECTROLARYNX; KNOWLEDGE; SYSTEM;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Silent speech interfaces (SSIs), which recognize speech from articulatory information (i.e., without using audio information), have the potential to enable persons with laryngectomy or a neurological disease to produce synthesized speech with a natural-sounding voice using their tongue and lips. Current approaches to SSIs have largely relied on speaker-dependent recognition models to minimize the negative effects of talker variation on recognition accuracy. Speaker-independent approaches are needed to reduce the large amount of training data required from each user; often only limited articulatory samples are available for persons with moderate to severe speech impairments, due to the logistical difficulty of data collection. This paper reports an across-speaker articulatory normalization approach based on Procrustes matching, a bidimensional regression technique for removing translational, scaling, and rotational effects of spatial data. A dataset of short functional sentences was collected from seven English talkers. A support vector machine was then trained to classify sentences based on normalized tongue and lip movements. Speaker-independent classification accuracy (tested using leave-one-subject-out cross validation) improved significantly, from 68.63% to 95.90%, following normalization. These results support the feasibility of a speaker-independent SSI using Procrustes matching as the basis for articulatory normalization across speakers.
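The abstract names Procrustes matching as the normalization step but gives no implementation detail. The following is a minimal sketch of the general technique in Python, not the authors' code. It assumes each articulatory shape is a list of 2D sample points, that shape size is measured as the root-mean-square distance from the centroid, and that a reference shape is already available; `procrustes_normalize` and the toy data are hypothetical names and values chosen for illustration. The support vector machine classification step is not shown.

```python
import math

def procrustes_normalize(points, reference=None):
    """Center a 2D shape, scale it to unit size, and optionally rotate it onto a reference.

    Assumes the points are not all identical (otherwise the size is zero).
    """
    n = len(points)
    # Remove translation: move the centroid to the origin.
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    centered = [(x - cx, y - cy) for x, y in points]
    # Remove scale: divide by the RMS distance to the centroid (an assumed size measure).
    size = math.sqrt(sum(x * x + y * y for x, y in centered) / n)
    scaled = [(x / size, y / size) for x, y in centered]
    if reference is None:
        return scaled
    # Remove rotation: closed-form least-squares angle that maps the shape onto the reference.
    num = sum(x * v - y * u for (x, y), (u, v) in zip(scaled, reference))
    den = sum(x * u + y * v for (x, y), (u, v) in zip(scaled, reference))
    theta = math.atan2(num, den)
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y, s * x + c * y) for x, y in scaled]

# Toy data (hypothetical): the same shape seen from two "speakers" that differ
# only in position, size, and orientation.
base = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 3.0)]
reference = procrustes_normalize(base)
a = math.radians(30)
other = [(2.5 * (math.cos(a) * x - math.sin(a) * y) + 10.0,
          2.5 * (math.sin(a) * x + math.cos(a) * y) - 4.0) for x, y in base]
aligned = procrustes_normalize(other, reference)

# Largest coordinate difference between the aligned shape and the reference.
max_err = max(abs(p - q) for pa, pr in zip(aligned, reference) for p, q in zip(pa, pr))
```

In this toy case the second shape differs from the first only by a translation, a uniform scaling, and a rotation, so `max_err` should be zero up to floating-point rounding. Real articulatory data from different talkers would leave residual differences after alignment, which is what a downstream classifier would then operate on.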
Pages: 1179-1183
Number of pages: 5