Improved Syllable-Based Text to Speech Synthesis for Tone Language Systems

Cited by: 2
Authors
Ekpenyong, Moses [1]
Udoh, EmemObong [2]
Udosen, Escor [3]
Urua, Eno-Abasi [2]
Affiliations
[1] Univ Uyo, Dept Comp Sci, Uyo, Nigeria
[2] Univ Uyo, Dept Linguist & Nigerian Languages, Uyo, Nigeria
[3] Univ Calabar, Dept Linguist & Commun Studies, Calabar, Nigeria
Keywords
FST; HMM; NLP; Speech synthesis; Tone modelling;
DOI
10.1007/978-3-319-08958-4_1
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this contribution, we document the progress made towards a generic and replicable system that is applicable not only to Nigerian languages but also to other African languages. The current system implements a state-of-the-art approach, the Hidden Markov Model (HMM) approach, and aims at a hybridised version whose front-end components would serve other NLP tasks as well as future research and development. We continue to tackle the language-specific problems and the 'unity of purpose' phenomenon for tone language systems, and improve on the speech quality as an extension of our LTC 2011 paper. Specifically, we address issues bordering on tone modelling using syllables as basic synthesis units, with an 'eyeball' assessment of the synthesised speech quality. The results of this research offer hope for further improvements, and we envisage an unsupervised system to minimise the labour-intensive aspects of the current design. Also, with the active collaboration network established in the course of this research, we are certain that a more robust system that would serve a wide variety of applications will evolve.
Pages: 3-15
Page count: 13
Related papers
50 in total
  • [41] A Syllable-Based Framework for Unit Selection Synthesis in 13 Indian Languages
    Patil, Hemant A.
    Patel, Tanvina B.
    Shah, Nirmesh J.
    Sailor, Hardik B.
    Krishnan, Raghava
    Kasthuri, G. R.
    Nagarajan, T.
    Christina, Lilly
    Kumar, Naresh
    Raghavendra, Veera
    Kishore, S. P.
    Prasanna, S. R. M.
    Adiga, Nagaraj
    Singh, Sanasam Ranbir
    Anand, Konjengbam
    Kumar, Pranaw
    Singh, Bira Chandra
    Kumar, S. L. Binil
    Bhadran, T. G.
    Sajini, T.
    Saha, Arup
    Basu, Tulika
    Rao, K. Sreenivasa
    Narendra, N. P.
    Sao, Anil Kumar
    Kumar, Rakesh
    Talukdar, Pranhari
    Acharyaa, Purnendu
    Chandra, Somnath
    Lata, Swaran
    Murthy, Hema A.
    2013 INTERNATIONAL CONFERENCE ORIENTAL COCOSDA HELD JOINTLY WITH 2013 CONFERENCE ON ASIAN SPOKEN LANGUAGE RESEARCH AND EVALUATION (O-COCOSDA/CASLRE), 2013,
  • [42] Deep learning approach for dysphagia detection by syllable-based speech analysis with daily conversations
    Heo, Seokhyeon
    Uhm, Kyeong Eun
    Yuk, Doyoung
    Kwon, Bo Mi
    Yoo, Byounghyun
    Kim, Jisoo
    Lee, Jongmin
    SCIENTIFIC REPORTS, 2024, 14 (01):
  • [43] Syllable-Based Acoustic Modeling with CTC for Multi-Scenarios Mandarin speech recognition
    Zhao, Yuanyuan
    Dong, Linhao
    Xu, Shuang
    Xu, Bo
    2018 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2018,
  • [44] Syllable duration prediction for Farsi text-to-speech systems
    Nazari, B.
    Nayebi, K.
    Sheikhzadeh, H.
    Scientia Iranica, 2004, 11 (03) : 225 - 233
  • [45] A Rule-Based Concatenative Approach to Speech Synthesis in Indian Language Text-to-Speech Systems
    Panda, Soumya Priyadarsini
    Nayak, Ajit Kumar
    INTELLIGENT COMPUTING, COMMUNICATION AND DEVICES, 2015, 309 : 523 - 531
  • [46] Subjective Assessment of Text to Speech Synthesis Systems for the Serbian Language
    Pakoci, Edvin
    Mak, Robert
    Ostrogonac, Stevan
    2012 20TH TELECOMMUNICATIONS FORUM (TELFOR), 2012, : 732 - 735
  • [47] The Training of the Tone of Mandarin Two-syllable Words Based on Pitch Projection Synthesis Speech
    Xie, Yanlu
    Zhang, Bei
    Zhang, Jinsong
    2014 9TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2014, : 435 - 435
  • [48] An Efficient Syllable-Based Speech Segmentation Model Using Fuzzy and Threshold-Based Boundary Detection
    Kumari, Ruchika
    Dev, Amita
    Kumar, Ashwani
    INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE AND APPLICATIONS, 2022, 21 (02)
  • [49] A Syllable-Based Turkish Speech Recognition System by Using Time Delay Neural Networks (TDNNs)
    Can, Burcu
    Artuner, Harun
    2013 INTERNATIONAL CONFERENCE OF SOFT COMPUTING AND PATTERN RECOGNITION (SOCPAR), 2013, : 219 - 224
  • [50] Syllable-based relevance feedback techniques for Mandarin voice record retrieval using speech queries
    Lee, LS
    Bai, BR
    Chien, LF
    1997 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I - V: VOL I: PLENARY, EXPERT SUMMARIES, SPECIAL, AUDIO, UNDERWATER ACOUSTICS, VLSI; VOL II: SPEECH PROCESSING; VOL III: SPEECH PROCESSING, DIGITAL SIGNAL PROCESSING; VOL IV: MULTIDIMENSIONAL SIGNAL PROCESSING, NEURAL NETWORKS - VOL V: STATISTICAL SIGNAL AND ARRAY PROCESSING, APPLICATIONS, 1997, : 1459 - 1462