Enhancement of Assamese Speech Signals Using Learning Based Techniques

Cited by: 0
Authors
Sharma, Mridusmita [1 ]
Sarma, Kandarpa Kumar [1 ]
Affiliations
[1] Gauhati Univ, Dept Elect & Commun Engn, Gauhati, Assam, India
Source
Keywords
AUTO-ENCODER; ASSAMESE; ARTIFICIAL NEURAL NETWORK; NOISE; SPEECH;
DOI
10.21786/bbrc/14.5/20
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Subject classification codes
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
For efficient recognition and proper interpretation of a speech signal in automatic speech recognition (ASR) models, de-noising of corrupted signals is very important. Many traditional noise-removal techniques have proved reliable, but researchers have found learning-based approaches to be more fruitful than those traditional methods. Deep learning techniques are widely used for speech de-noising and recognition and have become an integral part of such ASR systems. In this work, auto-encoders and other artificial neural network (ANN) models are implemented to remove noise from Assamese speech signals irrespective of the speaker, gender, and dialectal identity of the samples. The sample set consists of clean Assamese sentences of two, three, four, five, and six words, together with versions of those signals corrupted by additive white Gaussian noise (AWGN) at levels varying from 3 dB to -3 dB, which increases the sample size and also makes the system more robust. Despite certain limitations, the satisfactory experimental results justify the proposed de-noising method for Assamese speech.
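The abstract describes corrupting clean sentences with AWGN at levels from 3 dB to -3 dB. The record does not say how the noise was generated, so the following is only a minimal sketch of one common way to do it, assuming the dB values are signal-to-noise ratios. The function names `add_awgn` and `measured_snr_db`, the 440 Hz test tone, and the 16 kHz sample rate are all hypothetical stand-ins, not taken from the paper.

```python
import math
import random


def add_awgn(clean, snr_db, seed=None):
    """Return `clean` plus white Gaussian noise scaled to a target SNR in dB.

    Assumption: the dB figures in the abstract are signal-to-noise ratios.
    """
    rng = random.Random(seed)
    # Average power of the clean signal.
    signal_power = sum(x * x for x in clean) / len(clean)
    # SNR_dB = 10 * log10(P_signal / P_noise), solved for P_noise.
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in clean]


def measured_snr_db(clean, noisy):
    """Estimate the SNR actually achieved, given the clean reference."""
    signal_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10.0 * math.log10(signal_power / noise_power)


# Hypothetical stand-in for a clean utterance: one second of a 440 Hz tone
# sampled at 16 kHz (real data would be recorded Assamese sentences).
SAMPLE_RATE = 16000
tone = [math.sin(2 * math.pi * 440 * t / SAMPLE_RATE) for t in range(SAMPLE_RATE)]

# One corrupted copy per noise level; each copy enlarges the training set.
noisy_sets = {snr: add_awgn(tone, snr, seed=0) for snr in (3, 0, -3)}
```

Each clean sentence yields one noisy copy per noise level, which is how sweeping the noise level enlarges the sample set. A de-noising model would then be trained on pairs of noisy input and clean target.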
Pages: 100+
Page count: 5