Phoneme alignment of Filipino Speech Corpus

Citations: 0
Authors
Sagum, RG [1]
Ensomo, RA [1]
Tan, EM [1]
Guevara, RCL [1]
Affiliation
[1] Univ Philippines, Dept Elect & Elect Engn, Quezon City, Metro Manila, Philippines
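Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract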
Segmentation and transcription of a speech corpus is a prerequisite in the development of an Automatic Speech Recognition (ASR) system. In this paper, we develop a method for automatically segmenting and transcribing the Filipino Speech Corpus that is being developed at the DSP Laboratory. A Multi-Layer Perceptron (MLP) takes speech feature inputs and multiplies them by weights computed from a training set of labeled speech. The system is based on a Multi-Layer Perceptron and a Start Synchronous decoder. The corpus was divided into three sub-corpora: the paragraphs and sentences sub-corpus (par+sen), the words sub-corpus, and the syllables sub-corpus. For the par+sen sub-corpus, we obtained a 62.64% phoneme recognition rate with 75.68% of labels within 20 ms of hand-labeled transcriptions; for the words sub-corpus, a 63.93% phoneme recognition rate with 72.38% within 20 ms of hand-labeled transcriptions; and for the syllables sub-corpus, a 72.60% phoneme recognition rate with 75.69% within 20 ms of hand-labeled transcriptions.
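The abstract reports the share of automatic labels that fall within 20 ms of the hand-labeled transcriptions but does not say how that share is computed. The following is a minimal sketch of one plausible reading, assuming the automatic and hand-labeled boundaries are already paired one-to-one and that the 20 ms limit is inclusive. The function name `boundary_agreement` and the sample times are hypothetical and are not taken from the paper.

```python
def boundary_agreement(auto_ms, hand_ms, tolerance_ms=20.0):
    """Return the fraction of automatic boundaries that lie within
    tolerance_ms of their paired hand-labeled boundary.

    Assumption: both lists hold boundary times in milliseconds and are
    aligned one-to-one (same phoneme sequence, same order).
    """
    if len(auto_ms) != len(hand_ms):
        raise ValueError("boundary lists must be paired one-to-one")
    if not auto_ms:
        return 0.0
    # Count boundaries whose absolute deviation is within the tolerance.
    hits = sum(1 for a, h in zip(auto_ms, hand_ms) if abs(a - h) <= tolerance_ms)
    return hits / len(auto_ms)


# Hypothetical boundary times in milliseconds (illustrative only).
hand = [0.0, 85.0, 190.0, 310.0]
auto = [5.0, 100.0, 215.0, 300.0]
score = boundary_agreement(auto, hand)
print(f"{score:.2%} of boundaries within 20 ms")
```

In this toy case the deviations are 5, 15, 25, and 10 ms, so three of the four boundaries count as agreeing. A real evaluation would also need a rule for pairing boundaries when the recognized and reference phoneme sequences differ, which this sketch does not address.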
Pages: 964-968
Page count: 5