Automatic utterance segmentation tool for speech corpus

Cited by: 0

Authors:

Ozawa, Mitsuhiro [1]

Tsuge, Satoru [2]

Shishibori, Masami [2]

Kita, Kenji [3]

Fukumi, Minoru [2]

Ren, Fuji [4]

Kuroiwa, Shingo [2]

Affiliations:

[1] Univ Tokushima, Grad Sch Adv Technol & Sci, Tokushima, Japan

[2] Univ Tokushima, Inst Technol & Sci, Tokushima, Japan

[3] Univ Tokushima, Ctr Adv Informat Technol, Tokushima, Japan

[4] Univ Tokushima, Beijing Univ Posts & Telecommun, Inst Technol & Sci, Tokushima, Japan

Source:

PROCEEDINGS OF THE 2007 IEEE INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND KNOWLEDGE ENGINEERING (NLP-KE'07) | 2007

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We have recently been collecting speech data to investigate intra-speaker speech variability over short and long time periods. To reduce the burden on speakers, each recording session is generally saved as a single file from the start of collection to the end. As a result, the file contains noise, non-speech sections, and erroneous sections, so it must be segmented into individual utterances and the useful utterances selected. This process requires considerable time and effort. In this paper, we propose an automatic utterance segmentation tool for dividing the collected speech data. The proposed tool consists of four processes: voice activity detection, speech recognition, DP (dynamic programming) matching, and correction of speech sections. To evaluate the tool, we conducted experiments using one female speaker's speech data from our corpus. The experimental results show that the proposed method reduces filing time by 90% compared with manual filing. In this paper, we first introduced the large speech corpus, which contains speech data collected from specific speakers over long and short time periods. We then described the automatic utterance segmentation tool that we built for constructing the corpus and examined its validity. The results demonstrated that the tool performs well and that it simplifies speech corpus construction.
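
The abstract names the four processes but gives no implementation details, so the following is only a minimal sketch of two of them under stated assumptions. It assumes that voice activity detection can be approximated by thresholding per-frame energy, and that DP matching compares the recognizer's word sequence against a list of prompt texts using an edit-distance recurrence. The function names (`detect_speech_sections`, `dp_distance`, `match_prompt`), the threshold, and the prompt list are hypothetical and do not come from the paper.

```python
from typing import List, Sequence, Tuple


def detect_speech_sections(
    frame_energies: Sequence[float], threshold: float, min_frames: int = 2
) -> List[Tuple[int, int]]:
    """Return (start, end) frame ranges, end exclusive, whose energy
    stays at or above `threshold` for at least `min_frames` frames.
    Assumption: a plain energy threshold stands in for the paper's VAD."""
    sections = []
    start = None
    for i, energy in enumerate(frame_energies):
        if energy >= threshold:
            if start is None:
                start = i  # a candidate speech section begins here
        elif start is not None:
            if i - start >= min_frames:
                sections.append((start, i))
            start = None
    # Close a section that runs to the end of the recording.
    if start is not None and len(frame_energies) - start >= min_frames:
        sections.append((start, len(frame_energies)))
    return sections


def dp_distance(a: Sequence, b: Sequence) -> int:
    """Edit distance between two token sequences, computed by dynamic
    programming with unit costs for insertion, deletion, substitution."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute or match
        prev = cur
    return prev[-1]


def match_prompt(
    recognized: Sequence[str], prompts: Sequence[Sequence[str]]
) -> Tuple[int, float]:
    """Return the index of the closest prompt and its distance
    normalized by the prompt length (hypothetical scoring rule)."""
    distances = [dp_distance(recognized, p) for p in prompts]
    best = min(range(len(prompts)), key=lambda k: distances[k])
    return best, distances[best] / max(len(prompts[best]), 1)


if __name__ == "__main__":
    energies = [0, 0, 5, 6, 7, 0, 0, 8, 0, 9, 9]
    print(detect_speech_sections(energies, threshold=1.0))
    prompts = [["good", "night"], ["good", "morning"], ["hello"]]
    print(match_prompt(["good", "morning", "everyone"], prompts))
```

In a pipeline of this kind, a detected section whose normalized distance to every prompt is large could be flagged as noise or a misreading rather than filed as an utterance; where to set that cutoff is a design choice that the abstract does not specify.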

Pages: 401 / +

Page count: 2
