Identification of soundbite and its speaker name using transcripts of broadcast news speech

Authors
Liu F. [1]
Liu Y. [1]
Affiliation
[1] Computer Science Department, University of Texas at Dallas, Richardson, TX 75080
Source
ACM Transactions on Asian Language Information Processing | 2010 / Vol. 9 / No. 1
Keywords
Automatic speech recognition; Sentence segmentation; Soundbite detection; Speaker name recognition
DOI
10.1145/1731035.1731037
Abstract
This article presents a pipeline framework for identifying soundbites and their speaker names in Mandarin broadcast news transcripts. Both modules, soundbite segment detection and soundbite speaker name recognition, use a supervised classification approach with multiple linguistic features. We systematically evaluated the performance of each module and of the entire system, and investigated the effect of using automatic speech recognition (ASR) output and automatic sentence segmentation. We found that both components affect the pipeline system, with automatic speaker name recognition errors degrading overall system performance more than soundbite segment detection errors. In addition, our experimental results show that using ASR output degrades system performance significantly, and that automatic sentence segmentation greatly impacts soundbite detection but has much less effect on speaker name recognition. © 2010 ACM.
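The two-stage structure described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' implementation: the `Sentence` type, the toy English sentences, the quote-mark rule standing in for the supervised soundbite detector, and the regular-expression rule standing in for the supervised speaker name recognizer. The point is only to show how a miss in the first stage propagates to the end-to-end score.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Sentence:
    text: str
    is_soundbite: bool        # gold label (toy annotation, hypothetical)
    speaker: Optional[str]    # gold speaker name, None if not a soundbite


def detect_soundbite(sents: List[Sentence], i: int) -> bool:
    # Stand-in for a trained classifier: flag sentences that open with a quote.
    return sents[i].text.startswith('"')


def name_speaker(sents: List[Sentence], i: int) -> Optional[str]:
    # Stand-in for a trained classifier: look for "<Name Name> said"
    # in the sentence immediately before the soundbite.
    if i == 0:
        return None
    match = re.search(r"([A-Z]\w+ [A-Z]\w+) said", sents[i - 1].text)
    return match.group(1) if match else None


def run_pipeline(
    sents: List[Sentence],
    detect: Callable[[List[Sentence], int], bool],
    recognize: Callable[[List[Sentence], int], Optional[str]],
) -> List[Optional[str]]:
    # Stage 2 runs only on sentences that stage 1 flagged.
    return [recognize(sents, i) if detect(sents, i) else None
            for i in range(len(sents))]


def end_to_end_accuracy(sents: List[Sentence],
                        outputs: List[Optional[str]]) -> float:
    # A gold soundbite counts as correct only if it was detected
    # and its speaker was named correctly.
    gold = [(s, o) for s, o in zip(sents, outputs) if s.is_soundbite]
    if not gold:
        return 0.0
    return sum(1 for s, o in gold if o == s.speaker) / len(gold)


sentences = [
    Sentence("Reporter Wang Li said the talks went well.", False, None),
    Sentence('"We reached an agreement."', True, "Wang Li"),
    Sentence("The weather was fine.", False, None),
    # Unquoted soundbite: the toy stage-1 rule misses it.
    Sentence("We will continue tomorrow.", True, "Zhang Wei"),
]
outputs = run_pipeline(sentences, detect_soundbite, name_speaker)
accuracy = end_to_end_accuracy(sentences, outputs)
```

In this toy run the second gold soundbite is never passed to the name recognizer because detection missed it, so the end-to-end score drops even though the name rule made no mistake of its own.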