Advances in Mandarin broadcast speech transcription at IBM under the DARPA GALE program

Authors
Qin, Yong [1]
Shi, Qin [1]
Liu, Yi Y. [1]
Aronowitz, Hagai [2]
Chu, Stephen M. [2]
Kuo, Hong-Kwang [2]
Zweig, Geoffrey [2]
Affiliations
[1] IBM China Res Lab, Beijing 100094, Peoples R China
[2] IBM TJ Watson Res Ctr, Yorktown Hts, NY 10598 USA
Keywords
discriminative training; topic-adaptive language model; Mandarin; broadcast news; broadcast conversation
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper describes the technical and system building advances in the automatic transcription of Mandarin broadcast speech made at IBM in the first year of the DARPA GALE program. In particular, we discuss the application of minimum phone error (MPE) discriminative training and a new topic-adaptive language modeling technique. We present results on both the RT04 evaluation data and two larger community-defined test sets designed to cover both the broadcast news and the broadcast conversation domains. It is shown that with the described advances, the new transcription system achieves a 26.3% relative reduction in character error rate over our previous best-performing system, and is competitive with published numbers on these datasets. The results are further analyzed to give a comprehensive account of the relationship between the errors and the properties of the test data.
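The abstract reports its headline result as a relative reduction in character error rate (CER). As a minimal sketch of how such a figure is commonly computed, assuming the usual definition of CER as character-level edit distance divided by reference length, the Python below may help. The function names, the sample strings, and the example rates are hypothetical illustrations and are not taken from the paper, which may apply additional text normalization before scoring.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance: minimum substitutions, insertions, and deletions
    needed to turn ref into hyp."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete a reference character
                            curr[j - 1] + 1,      # insert a hypothesis character
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_error_rate(ref, hyp):
    """CER = character-level edit distance / number of reference characters."""
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def relative_reduction(baseline, improved):
    """Relative error reduction: (baseline - improved) / baseline."""
    if baseline <= 0:
        raise ValueError("baseline error rate must be positive")
    return (baseline - improved) / baseline


# Hypothetical example: one substituted and one deleted character
# against a six-character reference.
cer = character_error_rate("今天天气很好", "今天天汽很")
print(f"CER: {cer:.3f}")

# Hypothetical rates (not the paper's): a drop from 20% to 15% CER.
print(f"Relative reduction: {relative_reduction(0.20, 0.15):.1%}")
```

Note that a relative reduction is measured against the baseline error rate, so a 26.3% relative reduction does not mean the absolute CER fell by 26.3 percentage points.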
Pages: 410 / +
Number of pages: 3