Identification of soundbite and its speaker name using transcripts of broadcast news speech

Cited by: 0

Authors:

Liu F. [1]

Liu Y. [1]

Affiliation:

[1] Computer Science Department, University of Texas at Dallas, Richardson, TX 75080

Source:

ACM Transactions on Asian Language Information Processing | 2010, Vol. 9, No. 1

Keywords:

Automatic speech recognition; Sentence segmentation; Soundbite detection; Speaker name recognition

DOI:

10.1145/1731035.1731037

Abstract:

This article presents a pipeline framework for identifying soundbites and their speaker names in Mandarin broadcast news transcripts. Both modules, soundbite segment detection and soundbite speaker name recognition, use a supervised classification approach with multiple linguistic features. We systematically evaluated each module and the entire system, and investigated the effect of using automatic speech recognition (ASR) output and automatic sentence segmentation. Both components affect the pipeline system, with errors in automatic speaker name recognition degrading overall system performance more than errors in soundbite segment detection. In addition, our experimental results show that using ASR output degrades system performance significantly, and that automatic sentence segmentation greatly affects soundbite detection but has much less effect on speaker name recognition. © 2010 ACM.
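The two-stage structure described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Sentence` type, the toy English data, and the two placeholder rules (`detect_soundbite`, `recognize_speaker`) stand in for the supervised classifiers and linguistic features of the article, which are not specified here. The sketch only shows how a miss in the first stage carries through to the end-to-end result.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Sentence:
    text: str
    is_soundbite: bool             # gold label (hypothetical toy data)
    speaker: Optional[str] = None  # gold speaker name, soundbites only


def detect_soundbite(sent: Sentence) -> bool:
    """Stage 1 placeholder: a trained classifier would go here.

    Assumption: a sentence that opens with a quotation mark is a soundbite.
    """
    return sent.text.startswith('"')


def recognize_speaker(sents: List[Sentence], idx: int,
                      names: List[str]) -> Optional[str]:
    """Stage 2 placeholder: pick the first candidate name that appears
    in the sentence before or after the detected soundbite."""
    for j in (idx - 1, idx + 1):
        if 0 <= j < len(sents):
            for name in names:
                if name in sents[j].text:
                    return name
    return None


def run_pipeline(sents: List[Sentence],
                 names: List[str]) -> List[Tuple[int, Optional[str]]]:
    """Run stage 2 only on sentences that stage 1 accepted."""
    return [(i, recognize_speaker(sents, i, names))
            for i, s in enumerate(sents) if detect_soundbite(s)]


def end_to_end(sents: List[Sentence],
               results: List[Tuple[int, Optional[str]]]) -> Tuple[int, int]:
    """Count gold soundbites whose segment AND speaker were both found."""
    gold = {(i, s.speaker) for i, s in enumerate(sents) if s.is_soundbite}
    return len(gold & set(results)), len(gold)


news = [
    Sentence("Minister Wang spoke to reporters today.", False),
    Sentence('"We will finish the project this year."', True, "Wang"),
    Sentence("Markets closed higher.", False),
    Sentence("We expect more rain, said forecaster Li.", True, "Li"),
]
results = run_pipeline(news, ["Wang", "Li"])
correct, total = end_to_end(news, results)
```

In this toy run the second soundbite is never passed to the speaker stage because the detection rule misses it, so the end-to-end count is lower than what the speaker stage alone could achieve on correctly detected segments.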
