Thinking about the present and future of the complex speech recognition

Cited by: 0
Authors
Vicsi, Klara [1]
Affiliations
[1] Budapest Univ Technol & Econ, Dept Telecommun & Mediainformat, Lab Speech Acoust, Budapest, Hungary
Keywords
component; speech recognition; speech to text transformation system; multi-modal speech processing; multi-stream modelling; FEATURES;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A critical point of most cognitive info-communication systems is the state of development of speech recognition technology. The paper gives a short introduction to the principles of today's speech recognition technology. It highlights the fact that the systems on the market are only speech-to-text transformers that output only a word chain, in which speech prosody, speech emotion, speech style and much other information are not involved. Many uncertainties exist in such operational systems. Some up-to-date research tendencies, mostly parallel processing, are introduced that aim to increase the efficiency of recognition. Finally, the research agenda of META NET for Multilingual Europe in 2020 is briefly introduced.
Pages: 371-376
Number of pages: 6