Gated recurrent unit predictor model-based adaptive differential pulse code modulation speech decoder

Cited by: 2
Authors
Sheferaw, Gebremichael Kibret [1]
Mwangi, Waweru [1]
Kimwele, Michael [1]
Mamuye, Adane [2]
Affiliations
[1] Jomo Kenyatta Univ Agr & Technol, Sch Comp & Informat Technol, Nairobi, Kenya
[2] Addis Ababa Univ, Sch Informat Technol & Engn, Inst Technol, Addis Ababa, Ethiopia
Keywords
Speech coding; Gated recurrent unit; Nonlinear prediction; Waveform coding; Audio coding; Adaptive differential pulse code modulation; Speech compression; Neural networks
DOI
10.1186/s13636-023-00325-3
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Speech coding is a method of reducing the amount of data needed to represent speech signals by exploiting the statistical properties of the speech signal. Recently, neural network prediction models have gained attention in speech coding as a means of reconstructing nonlinear and nonstationary speech signals. This study proposes a novel approach to improving speech coding performance by using a gated recurrent unit (GRU)-based adaptive differential pulse code modulation (ADPCM) system. The GRU predictor model is trained on a data set consisting of actual speech samples from the DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus and the output speech samples of the ADPCM fixed predictor. Our contribution lies in the development of an algorithm for training the GRU predictive model that can improve its performance in speech coding prediction, and in a new offline-trained predictive model for the speech decoder. The results indicate that the proposed system significantly improves the accuracy of speech prediction, demonstrating its potential for speech prediction applications. Overall, this work presents a unique application of the GRU predictive model with ADPCM decoding in speech signal compression, providing a promising approach for future research in this field.
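For orientation, the predict-quantize-reconstruct loop that ADPCM relies on can be sketched as follows. This is a minimal illustration and not the authors' system: `predict` is a hypothetical fixed first-order predictor standing in for the place where the paper uses a trained GRU, and the coefficient 0.9, the 4-bit code range, the initial step size, and the step-adaptation rule are all assumptions chosen for readability.

```python
import math

LEVEL_MIN, LEVEL_MAX = -8, 7  # assumed 4-bit signed code range


def predict(prev_recon, coeff=0.9):
    # Hypothetical fixed first-order predictor (assumption);
    # a learned predictor such as a GRU would replace this function.
    return coeff * prev_recon


def adapt_step(step, code):
    # Illustrative adaptation rule (assumption): widen the step after
    # large codes, narrow it after small ones, within fixed bounds.
    factor = 1.5 if abs(code) >= 4 else 0.9
    return min(0.5, max(0.005, step * factor))


def encode(samples, step=0.02):
    # Quantize the prediction residual; track the encoder's own reconstruction
    # so that prediction uses only information the decoder will also have.
    codes, recon_track, recon = [], [], 0.0
    for x in samples:
        pred = predict(recon)
        code = max(LEVEL_MIN, min(LEVEL_MAX, round((x - pred) / step)))
        recon = pred + code * step
        codes.append(code)
        recon_track.append(recon)
        step = adapt_step(step, code)
    return codes, recon_track


def decode(codes, step=0.02):
    # Mirror the encoder: same predictor, same step adaptation, codes only.
    out, recon = [], 0.0
    for code in codes:
        recon = predict(recon) + code * step
        out.append(recon)
        step = adapt_step(step, code)
    return out


# A 200 Hz test tone sampled at 8 kHz (synthetic input, not TIMIT data).
signal = [0.5 * math.sin(2 * math.pi * 200 * n / 8000) for n in range(400)]
codes, encoder_recon = encode(signal)
decoded = decode(codes)
```

Because the decoder repeats the encoder's prediction and step updates from the transmitted codes alone, `decoded` matches `encoder_recon` sample for sample; the quality of the predictor determines how small the residual, and hence the quantization error, can be.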
Pages: 17