DNN Online with iVectors Acoustic Modeling and Doc2Vec Distributed Representations for Improving Automated Speech Scoring

Cited by: 1
Authors
Tao, Jidong [1 ]
Chen, Lei [1 ]
Lee, Chong Min [1 ]
Affiliations
[1] Educ Testing Serv, 660 Rosedale Rd, Princeton, NJ 08541 USA
Keywords
automated speech scoring; non-native spontaneous speech; automatic speech recognition; unsupervised language model adaptation; content vector analysis; Doc2Vec;
DOI
10.21437/Interspeech.2016-1457
Chinese Library Classification (CLC)
O42 [Acoustics];
Subject classification codes
070206; 082403;
Abstract
When applying automated speech-scoring technology to the rating of globally administered real assessments, there are several practical challenges: (a) ASR accuracy on non-native spontaneous speech is generally low; (b) due to the data mismatch between an ASR system's training stage and its final usage, the recognition accuracy obtained in practice is even lower; (c) content-relevance features have not been widely used in operational scoring models due to various technical and logistical issues. For this paper, an ASR system based on a multi-splice deep neural network (DNN) architecture with iVectors was trained, achieving a 19.1% word error rate (WER). Second, we applied language model (LM) adaptation to the prompts that were not covered in ASR training, using spoken responses acquired from previous operational tests, and reduced the WER by more than 8% relative. The improved ASR performance boosts scoring performance without any extra human annotation cost. Finally, the developed ASR system allowed us to apply content features in practice. In addition to the conventional frequency-based approach, content vector analysis (CVA), we also explored distributed representations with Doc2Vec and found an improvement in content measurement.
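As a rough illustration of the two content-measurement approaches named in the abstract, the sketch below computes a frequency-based CVA feature (TF-IDF vectors with cosine similarity) and a Doc2Vec feature for one test response against a small set of reference responses. It assumes the gensim and scikit-learn libraries; the toy responses, hyperparameters, and the use of a simple maximum-similarity feature are illustrative assumptions, not the authors' actual configuration.

```python
# Sketch of the two content-scoring approaches mentioned in the abstract:
# frequency-based content vector analysis (CVA) via TF-IDF, and distributed
# representations via Doc2Vec. All data and settings are placeholders.
import numpy as np
from gensim.models.doc2vec import Doc2Vec, TaggedDocument
from gensim.utils import simple_preprocess
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical high-scoring reference responses for one prompt (in practice,
# ASR transcriptions of responses rated at the top score level).
reference_responses = [
    "the company should invest in employee training because it builds skills",
    "investing in training improves productivity and keeps employees motivated",
]
test_response = "training employees helps the company keep skilled workers"

# CVA baseline: TF-IDF vectors, cosine similarity to the references.
tfidf = TfidfVectorizer()
ref_matrix = tfidf.fit_transform(reference_responses)
test_vec = tfidf.transform([test_response])
cva_feature = cosine_similarity(test_vec, ref_matrix).max()

# Doc2Vec: train paragraph vectors on the references, infer one for the test.
tagged = [TaggedDocument(simple_preprocess(r), [i])
          for i, r in enumerate(reference_responses)]
d2v = Doc2Vec(vector_size=50, min_count=1, epochs=40, seed=1)
d2v.build_vocab(tagged)
d2v.train(tagged, total_examples=d2v.corpus_count, epochs=d2v.epochs)

test_d2v = d2v.infer_vector(simple_preprocess(test_response))
# gensim >= 4.0: trained document vectors are accessed through model.dv
ref_d2v = np.vstack([d2v.dv[i] for i in range(len(reference_responses))])
d2v_feature = cosine_similarity(test_d2v.reshape(1, -1), ref_d2v).max()

print(f"CVA feature: {cva_feature:.3f}, Doc2Vec feature: {d2v_feature:.3f}")
```

In an operational setting, reference vectors would typically be built per prompt from responses at each score level, and the resulting similarity values would enter the scoring model as content features alongside delivery and language-use features.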
Pages: 3117 - 3121
Number of pages: 5
Related papers
1 record
  • [1] Automated Scoring of Interview Videos using Doc2Vec Multimodal Feature Extraction Paradigm
    Chen, Lei; Feng, Gary; Leong, Chee Wee; Lehman, Blair; Martin-Raugh, Michelle; Kell, Harrison; Lee, Chong Min; Yoon, Su-Youn
    ICMI'16: Proceedings of the 18th ACM International Conference on Multimodal Interaction, 2016: 161 - 168