A probabilistic framework for feature-based speech recognition

被引:0
|
作者
Glass, J
Chang, J
McCandless, M
机构
关键词
D O I
暂无
中图分类号
O42 [声学];
学科分类号
070206 ; 082403 ;
摘要
Most current speech recognizers use an observation space which is based on a temporal sequence of ''frames'' (e.g., Mel-cepstra). There is another class of recognizer which further processes these frames to produce a segment-based network, and represents each segment by fixed-dimensional ''features,'' In such feature-based recognizers the observation space takes the form of a temporal network of feature vectors, so that a single segmentation of an utterance will use a subset of all possible feature vectors. In this work we examine a maximum a posteriori decoding strategy for feature-based recognizers and develop a normalization criterion useful for a segment-based Viterbi or A* search. We report experimental results far the task of phonetic recognition on the TIMIT corpus where we achieved context-independent and context-dependent (using diphones) results on the core test set of 64.1% and 69.5% respectively.
引用
收藏
页码:2277 / 2280
页数:4
相关论文
共 50 条