Korean speech recognition using deep learning

Cited by: 1
Authors
Lee, Suji [1]
Han, Seokjin [1]
Park, Sewon [1]
Lee, Kyeongwon [1]
Lee, Jaeyong [1]
Affiliations
[1] Seoul Natl Univ, Dept Stat, 1 Gwanak Ro, Seoul 08826, South Korea
Funding
National Research Foundation of Singapore;
Keywords
Korean speech recognition; end to end deep learning; Connectionist temporal classification; Attention; Bayesian deep learning;
DOI
10.5351/KJAS.2019.32.2.213
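Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Subject classification codes
020208; 070103; 0714;
Abstract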
In this paper, we propose an end-to-end deep learning model for Korean speech recognition that incorporates a Bayesian neural network. In the past, Korean speech recognition was a complicated task because its many intermediate steps involved an excessive number of parameters and required expert knowledge of Korean. Recent breakthroughs in end-to-end models have made the task more manageable. An end-to-end model decodes mel-frequency cepstral coefficients directly into text without any intermediate processes. In particular, models trained with the Connectionist Temporal Classification loss and attention-based models are two kinds of end-to-end model. In addition, we combine a Bayesian neural network with the end-to-end model and obtain Monte Carlo estimates. Finally, we carry out experiments on the "WorimalSam" online dictionary dataset. We obtain a Word Error Rate of 4.58%, an improvement over the Google and Naver APIs.
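Word Error Rate is conventionally the word-level edit distance between a reference transcript and the recognizer's hypothesis, divided by the number of reference words. The following is a minimal sketch of that conventional definition, assuming whitespace tokenization. This record does not say how the authors tokenized Korean text or computed their figure, and the function name `word_error_rate` is hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are separated by whitespace. The paper's own
    tokenization and scoring procedure are not given in this record.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words to match an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1] / len(ref)
```

For example, scoring the hypothesis `"a x c"` against the reference `"a b c d"` counts one substitution and one deletion over four reference words. Because insertions are counted, the ratio can exceed 1 when the hypothesis is much longer than the reference.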
Pages: 213-227
Number of pages: 15
Related papers
50 in total
  • [1] Speech Recognition using Deep Learning
    Lakkhanawannakun, Phoemporn
    Noyunsan, Chaluemwut
    [J]. 2019 34TH INTERNATIONAL TECHNICAL CONFERENCE ON CIRCUITS/SYSTEMS, COMPUTERS AND COMMUNICATIONS (ITC-CSCC 2019), 2019, : 514 - 517
  • [2] Persian speech recognition using deep learning
    Veisi, Hadi
    Haji Mani, Armita
    [J]. INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2020, 23 (04) : 893 - 905
  • [3] Speech Emotion Recognition Using Deep Learning
    Alagusundari, N.
    Anuradha, R.
    [J]. ARTIFICIAL INTELLIGENCE: THEORY AND APPLICATIONS, VOL 1, AITA 2023, 2024, 843 : 313 - 325
  • [4] Speech Command Recognition Using Deep Learning
    Ayache, Mohammad
    Kanaan, Hussien
    Kassir, Kawthar
    Kassir, Yasser
    [J]. 2021 SIXTH INTERNATIONAL CONFERENCE ON ADVANCES IN BIOMEDICAL ENGINEERING (ICABME), 2021, : 24 - 29
  • [5] Speech Emotion Recognition Using Deep Learning
    Ahmed, Waqar
    Riaz, Sana
    Iftikhar, Khunsa
    Konur, Savas
    [J]. ARTIFICIAL INTELLIGENCE XL, AI 2023, 2023, 14381 : 191 - 197
  • [6] Fake Speech Recognition Using Deep Learning
    Camacho, Steven
    Maria Ballesteros, Dora
    Renza, Diego
    [J]. APPLIED COMPUTER SCIENCES IN ENGINEERING, WEA 2021, 2021, 1431 : 38 - 48
  • [7] Persian speech recognition using deep learning
    Hadi Veisi
    Armita Haji Mani
    [J]. International Journal of Speech Technology, 2020, 23 : 893 - 905
  • [8] Recognition of English speech - using a deep learning algorithm
    Wang, Shuyan
    [J]. JOURNAL OF INTELLIGENT SYSTEMS, 2023, 32 (01)
  • [9] Deep learning-based speech recognition for Korean elderly speech data including dementia patients
    Mun, Jeonghyeon
    Kang, Joonseo
    Kim, Kiwoong
    Bae, Jongbin
    Lee, Hyeonjun
    Lim, Changwon
    [J]. KOREAN JOURNAL OF APPLIED STATISTICS, 2023, 36 (01) : 33 - 48
  • [10] Speech Emotion Recognition Based on Two-Stream Deep Learning Model Using Korean Audio Information
    Jo, A-Hyeon
    Kwak, Keun-Chang
    [J]. APPLIED SCIENCES-BASEL, 2023, 13 (04):