Offline to online speaker adaptation for real-time deep neural network based LVCSR systems

Authors
Yanhua Long
Yijie Li
Bo Zhang
Affiliations
[1] Shanghai Normal University, Department of Electronic and Information Engineering
[2] Beijing Unisound Information Technology Co., Ltd.
Source
Multimedia Tools and Applications, 2018, 77 (21)
Keywords
Speaker adaptation; Offline-to-online iVector; Deep neural network; Real-time speech recognition
DOI
Not available
Abstract
In this study, we investigate an offline-to-online strategy for speaker adaptation of automatic speech recognition systems. These systems are trained using traditional feed-forward deep neural networks and the recently proposed lattice-free maximum mutual information (MMI) time-delay deep neural networks. In this strategy, the identity of the test speaker is modeled as an iVector that is estimated offline and then used online during speech decoding. To ensure the quality of the iVectors, we introduce a speaker enrollment stage that provides sufficient reliable speech for estimating an accurate and stable offline iVector. Furthermore, different iVector estimation techniques are reviewed and investigated for speaker adaptation in large vocabulary continuous speech recognition (LVCSR) tasks. Experimental results on several real-time speech recognition tasks demonstrate that the proposed strategy not only provides fast decoding but also yields significant reductions in word error rate (WER) compared with traditional iVector-based speaker adaptation frameworks.
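The offline-to-online flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the per-dimension mean used here is only a stand-in for real iVector estimation, all names (`estimate_speaker_vector`, `SpeakerVectorStore`, `adapt_frame`, `min_frames`) are hypothetical, and appending the speaker vector to each acoustic frame is a common iVector adaptation setup that is assumed here rather than stated in the abstract.

```python
from typing import Dict, List, Optional

Frame = List[float]


def estimate_speaker_vector(frames: List[Frame]) -> Frame:
    """Stand-in for offline iVector estimation (assumption: per-dimension mean)."""
    if not frames:
        raise ValueError("enrollment needs at least one frame")
    dim = len(frames[0])
    return [sum(f[d] for f in frames) / len(frames) for d in range(dim)]


class SpeakerVectorStore:
    """Caches one offline speaker vector per enrolled speaker."""

    def __init__(self, min_frames: int) -> None:
        # Hypothetical threshold modelling "sufficient reliable speech".
        self.min_frames = min_frames
        self._vectors: Dict[str, Frame] = {}

    def enroll(self, speaker_id: str, frames: List[Frame]) -> bool:
        """Offline stage: estimate and cache the vector if enough speech exists."""
        if len(frames) < self.min_frames:
            return False
        self._vectors[speaker_id] = estimate_speaker_vector(frames)
        return True

    def lookup(self, speaker_id: str) -> Optional[Frame]:
        """Online stage: fetch the cached vector without re-estimating it."""
        return self._vectors.get(speaker_id)


def adapt_frame(frame: Frame, speaker_vector: Frame) -> Frame:
    """Append the fixed speaker vector to one acoustic frame for the DNN input."""
    return frame + speaker_vector


if __name__ == "__main__":
    store = SpeakerVectorStore(min_frames=2)
    store.enroll("spk1", [[1.0, 3.0], [3.0, 5.0]])
    print(adapt_frame([0.5, 0.5], store.lookup("spk1")))
```

Because the vector is computed once at enrollment, the decoding loop only performs a dictionary lookup and a concatenation per frame, which is the source of the speed advantage the abstract attributes to the offline-to-online design.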
Pages: 28101-28119
Page count: 18
Related papers
50 in total
  • [1] Offline to online speaker adaptation for real-time deep neural network based LVCSR systems
    Long, Yanhua
    Li, Yijie
    Zhang, Bo
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2018, 77 (21) : 28101 - 28119