Automatic utterance segmentation tool for speech corpus

Cited by: 0

Authors:

Ozawa, Mitsuhiro ^{[1]}

Tsuge, Satoru ^{[2]}

Shishibori, Masami ^{[2]}

Kita, Kenji ^{[3]}

Fukumi, Minoru ^{[2]}

Ren, Fuji ^{[4]}

Kuroiwa, Shingo ^{[2]}

Affiliations:

[1] Univ Tokushima, Grad Sch Adv Technol & Sci, Tokushima, Japan

[2] Univ Tokushima, Inst Technol & Sci, Tokushima, Japan

[3] Univ Tokushima, Ctr Adv Informat Technol, Tokushima, Japan

[4] Univ Tokushima, Beijing Univ Posts & Telecommun, Inst Technol & Sci, Tokushima, Japan

Source:

PROCEEDINGS OF THE 2007 IEEE INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND KNOWLEDGE ENGINEERING (NLP-KE'07) | 2007

Keywords:

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Recently, we have been collecting speech data to investigate intra-speaker speech variability over short and long time periods. In general, to reduce the burden on speakers, the speech data are recorded as a single file from the start of collection to the end. Hence, this file contains noise, non-speech sections, and sections containing mistakes. Consequently, we must segment the file into individual utterances and select the useful ones. This process requires a lot of time and effort. In this paper, we propose an automatic utterance segmentation tool for dividing the collected speech data. The proposed tool is composed of four processes: voice activity detection, speech recognition, DP matching, and correction of speech sections. To evaluate the proposed tool, we conduct experiments using a female speaker's speech data from our corpus. Experimental results show that the proposed method can reduce the filing time by 90% compared to manual filing. In this paper, we first introduced the large speech corpus, which contains speech data collected from specific speakers over long and short time periods. We then explained the automatic utterance segmentation tool that we developed for building the corpus, and examined its validity. As a result, it was demonstrated that the automatic utterance segmentation tool performs well. Furthermore, it was demonstrated that using the tool simplifies speech corpus construction.
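
The abstract names DP matching as one of the four processes but gives no details. The following is a minimal sketch of one plausible reading, assuming the recognizer output for a detected speech section is a token sequence that is compared against a list of expected prompt sentences by edit distance. The function names `dp_distance` and `best_prompt`, the token representation, and the length normalization are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def dp_distance(hyp: Sequence[str], ref: Sequence[str]) -> int:
    """Edit distance between two token sequences, computed by dynamic programming."""
    # prev[j] holds the distance between the hyp tokens seen so far and ref[:j].
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # drop the hyp token
                            curr[j - 1] + 1,      # skip the ref token
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def best_prompt(hyp: Sequence[str],
                prompts: List[Sequence[str]]) -> Tuple[int, float]:
    """Return (index, normalized distance) of the prompt closest to hyp.

    Assumption: distances are normalized by prompt length so that
    long and short prompts are comparable.
    """
    scored = [(dp_distance(hyp, ref) / max(len(ref), 1), idx)
              for idx, ref in enumerate(prompts)]
    norm, idx = min(scored)
    return idx, norm


if __name__ == "__main__":
    # Hypothetical prompt list and recognizer output.
    prompts = [["good", "evening"], ["good", "morning"]]
    print(best_prompt(["good", "morning"], prompts))
```

In a sketch like this, a section whose best normalized distance exceeds some threshold could be flagged as noise or a misreading and set aside for manual checking; the threshold itself would have to be tuned on real data.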

Citation

Pages: 401 / +

Page count: 2

