ICA-based hierarchical text classification for multi-domain text-to-speech synthesis

Cited by: 0

Authors:

Sevillano, X [1]

Alías, F [1]

Socoró, JC [1]

Affiliation:

[1] Univ Ramon Llull, Dept Commun & Signal Theory, Barcelona 08022, Spain

Source:

2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL V, PROCEEDINGS: DESIGN AND IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS INDUSTRY TECHNOLOGY TRACKS MACHINE LEARNING FOR SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING SIGNAL PROCESSING FOR EDUCATION | 2004

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

In the framework of multi-domain Text-to-Speech synthesis it is essential to (i) design a hierarchically structured database for allowing several domains in the same speech corpus and (ii) include a text classification module that, at run time, assigns the input sentences to a domain or set of domains from the database. In this paper, we present a hierarchical text classifier based on Independent Component Analysis (ICA), which is capable of (i) organizing the contents of the corpus in a hierarchical manner and (ii) classifying the texts to be synthesized according to the learned structure. The document organization and classification performance of our ICA-based hierarchical classifier are evaluated in several encouraging experiments conducted on a journalistic-style text corpus for speech synthesis in Catalan.

Pages: 697-700

Page count: 4
