SPEED: Open-source framework to accelerate speech recognition on embedded GPUs

Cited by: 0
Authors
Intesa, Leonardo [1]
Jafri, Syed M. A. H. [1]
Hemani, Ahmed [1]
Affiliations
[1] Royal Inst Technol KTH, Stockholm, Sweden
Keywords
DESIGN;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Owing to their high accuracy, inherent redundancy, and embarrassingly parallel nature, neural networks are fast becoming mainstream machine learning algorithms. However, these advantages come at the cost of high memory and processing requirements, which can be met by GPUs, FPGAs, or ASICs. For embedded systems, the requirements are particularly challenging because of tight power and timing budgets. Due to the availability of efficient mapping tools, GPUs are an appealing platform for implementing neural networks. While there is significant work on implementing image recognition (in particular, Convolutional Neural Networks) on GPUs, only a few works deal with the efficient implementation of speech recognition on GPUs. The work that does focus on implementing speech recognition does not address embedded systems. To tackle this issue, this paper presents SPEED (Open-source framework to accelerate speech recognition on embedded GPUs). We use the Eesen speech recognition framework because it is considered the most accurate speech recognition technique. Experimental results reveal that the proposed techniques offer a 2.6x speedup compared to the state of the art.
Pages: 94-101
Number of pages: 8