Speech Recognition Using Principal Components Analysis and Neural Networks

Authors
Shabani, Shaham [1]
Norouzi, Yaser [2]
Affiliations
[1] Univ Bologna, DEI Dept, Bologna, Italy
[2] Amirkabir Univ Technol, Dept Elect Engn, Tehran, Iran
Keywords
component; speech recognition; feature extraction; principal components analysis (PCA); Mel frequency cepstral coefficient (MFCC); neural network;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we introduce a new approach to recognizing discrete speech, specifically words from a predefined vocabulary. The approach is based mainly on Principal Components Analysis (PCA) and Neural Networks (NN). We first build a database from 20 speakers, each of whom uttered each of 10 predefined Persian words 5 times. We then apply Voice Activity Detection (VAD) to eliminate the useless portions of each frame, compute Mel Frequency Cepstral Coefficients (MFCCs), which serve as the features for recognition, and apply PCA to reduce the size of the data set; the result forms the input to the NN block. Using PCA lets us feed smaller inputs to the recognition system, which is an important feature of our approach: it speeds up training while keeping accuracy as high as possible. In other words, PCA reduces the amount of computation that most recognition systems usually require. We use 90% of the data set to train the algorithm and the remaining 10% to test it and measure recognition accuracy.
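The PCA reduction and the 90% / 10% split described in the abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation. Only the speaker, word, and repetition counts come from the abstract. The feature vectors are synthetic stand-ins for per-utterance MFCC features, and the feature length (`N_FEATURES`), the number of retained components (`N_COMPONENTS`), the random shuffle used for the split, and the choice to fit PCA on the training portion only are all assumptions. PCA is computed here by power iteration with deflation purely to keep the sketch dependency-free; the abstract does not say how PCA is computed. VAD, MFCC extraction, and the neural network are not shown.

```python
import random

N_SPEAKERS, N_WORDS, N_REPEATS = 20, 10, 5  # counts stated in the abstract
N_FEATURES = 13    # assumed length of one per-utterance feature vector
N_COMPONENTS = 4   # assumed number of principal components to keep


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pca_fit(rows, k, iters=200, seed=0):
    """Return (mean, components) for the top-k principal directions."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d) of the centered data.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    rng = random.Random(seed)
    components = []
    for _ in range(k):
        # Power iteration converges to the dominant eigenvector of cov.
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        for _ in range(iters):
            w = [dot(row, v) for row in cov]
            norm = dot(w, w) ** 0.5
            v = [x / norm for x in w]
        eigenvalue = dot(v, [dot(row, v) for row in cov])
        components.append(v)
        # Deflation: remove the direction just found before the next pass.
        cov = [[cov[i][j] - eigenvalue * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    return mean, components


def pca_transform(rows, mean, components):
    """Project rows onto the components after subtracting the fitted mean."""
    return [[dot([x - m for x, m in zip(r, mean)], c) for c in components]
            for r in rows]


# Synthetic stand-in data: one feature vector per (speaker, word, repetition).
rng = random.Random(42)
labels = [w for _s in range(N_SPEAKERS)
          for w in range(N_WORDS)
          for _r in range(N_REPEATS)]
features = [[rng.gauss(0.0, 1.0 / (j + 1)) for j in range(N_FEATURES)]
            for _ in labels]

# 90% / 10% split by random shuffle (the split strategy is an assumption).
indices = list(range(len(features)))
rng.shuffle(indices)
n_train = len(indices) * 9 // 10
train_x = [features[i] for i in indices[:n_train]]
test_x = [features[i] for i in indices[n_train:]]
train_y = [labels[i] for i in indices[:n_train]]
test_y = [labels[i] for i in indices[n_train:]]

# Fit PCA on the training portion only, then reduce both portions.
mean, components = pca_fit(train_x, N_COMPONENTS)
train_z = pca_transform(train_x, mean, components)
test_z = pca_transform(test_x, mean, components)

print(len(train_z), len(test_z), len(train_z[0]))  # 900 100 4
```

With the counts from the abstract, the data set holds 20 × 10 × 5 = 1000 utterances, so the split yields 900 training and 100 test vectors. In this sketch `train_z` and `train_y` would be the inputs and targets of the classifier, with each input shrunk from `N_FEATURES` to `N_COMPONENTS` values.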
Pages: 90 - 95
Number of pages: 6