Automatic segmentation and labelling of multi-lingual speech data

Cited by: 18
Authors
Vorstermans, A [1]
Martens, JP [1]
VanCoile, B [1]
Affiliations
[1] STATE UNIV GHENT, ELIS, B-9000 GHENT, BELGIUM
Keywords
automatic segmentation and labelling; multi-lingual; neural networks
DOI
10.1016/S0167-6393(96)00037-4
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
A new system for the automatic segmentation and labelling of speech is presented. The system is capable of labelling speech originating from different languages without requiring extensive linguistic knowledge or large (manually segmented and labelled) training databases for each language. The system comprises small neural networks for the segmentation and the broad phonetic classification of the speech. These networks were originally trained on one task (Flemish continuous speech) and are automatically adapted to a new task. Because the neural networks are small, the segmentation and labelling strategy requires only a limited amount of computation, and the adaptation to a new task can be accomplished very quickly. The system was first evaluated on five isolated word corpora designed for the development of Dutch, French, American English, Spanish and Korean text-to-speech systems. The results show that the accuracy of the obtained automatic segmentation and labelling is comparable to that of human experts. To provide segmentation and labelling results that can be compared with data reported in the literature, additional tests were run on TIMIT and on the English, Danish and Italian portions of the EUROM0 continuous speech utterances. The performance of the system appears to compare favourably with that of other systems.
Pages: 271-293
Number of pages: 23