Audio Recognition Using Deep Learning for Edge Devices

Cited by: 0

Authors:

Kulkarni, Aditya [1]

Jabade, Vaishali [1]

Patil, Aniket [1,2]

Affiliations:

[1] Vishwakarma Inst Technol, Elect & Telecommun, Pune, India

[2] Ifm Engn Private Ltd, Artificial Intelligence Team, Pune, India

Source:

ADVANCES IN COMPUTING AND DATA SCIENCES (ICACDS 2022), PT II | 2022 / Vol. 1614

Keywords:

Convolutional Neural Networks; Deep Learning; Short Time Fourier Transform; Spectrogram; Speech Recognition

DOI:

10.1007/978-3-031-12641-3_16

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

This paper proposes a methodology for building an automatic speech recognition system whose task is to recognize keywords. Deep learning is used to classify the spoken words. The audio data set created for this work consists of short audio clips, each of which is converted to a spectrogram by computing the Short Time Fourier Transform (STFT) of the audio sample. A spectrogram is a picture of the spectrum of frequencies of a signal. A convolutional neural network (CNN) is a deep learning algorithm prominently used for classifying image data. In our case the spectrogram, as the representation of the audio, is used to train the CNN models, and the model that achieves the higher recognition rate is deployed on the hardware. The research is motivated by the need for an audio classification system that can be deployed on hardware, where the hardware is assigned a task to complete based on the classification result.
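The clip-to-spectrogram step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the record does not give the sampling rate, frame length, hop size, or window, so the values below (16 kHz, 256-sample Hann frames, 128-sample hop) and the function name `stft_spectrogram` are assumptions chosen only for illustration.

```python
import numpy as np

def stft_spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude spectrogram, shape (n_freq_bins, n_frames).

    frame_len and hop are illustrative assumptions, not values from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping frames and apply the window to each.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # One real-input FFT per frame gives the short-time spectrum.
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    # Convert to decibels; the small offset avoids log(0).
    return 20.0 * np.log10(magnitude + 1e-10).T

# Synthetic stand-in for a short audio clip: a 1 kHz tone, 1 s at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)

spec = stft_spectrogram(tone)
peak_bin = int(np.argmax(spec.mean(axis=1)))
peak_hz = peak_bin * sample_rate / 256  # bin spacing is sample_rate / frame_len
```

The resulting 2-D array can be treated like a single-channel image, which is what makes it a suitable input for a CNN classifier.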

Pages: 186-198

Number of pages: 13

Related records: 50 in total

[1] Deep Learning for Activity Recognition Using Audio and Video
Reinolds, Francisco
Neto, Cristiana
Machado, Jose
[J]. ELECTRONICS, 2022, 11 (05)
[2] Crowd Counting Using Deep Learning in Edge Devices
Huang, Zuo
Sinnott, Richard O.
Ke, Qiuhong
[J]. 8TH IEEE/ACM INTERNATIONAL CONFERENCE ON BIG DATA COMPUTING, APPLICATIONS AND TECHNOLOGIES, BDCAT 2021, 2021: 28 - 37
[3] A Lightweight Deep Learning Model for Human Activity Recognition on Edge Devices
Agarwal, Preeti
Alam, Mansaf
[J]. INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND DATA SCIENCE, 2020, 167 : 2364 - 2373
[4] Deep learning for edge devices
Zaniolo, Luiz
Garbin, Christian
Marques, Oge
[J]. IEEE Potentials, 2023, 42 (04): 39 - 45
[5] Speech Emotion Recognition Using Deep Learning on audio recordings
Suganya, S.
Charles, E. Y. A.
[J]. 2019 19TH INTERNATIONAL CONFERENCE ON ADVANCES IN ICT FOR EMERGING REGIONS (ICTER - 2019), 2019
[6] Audio-visual speech recognition using deep learning
Noda, Kuniaki
Yamaguchi, Yuki
Nakadai, Kazuhiro
Okuno, Hiroshi G.
Ogata, Tetsuya
[J]. APPLIED INTELLIGENCE, 2015, 42 (04) : 722 - 737
[7] Audio-visual speech recognition using deep learning
Kuniaki Noda
Yuki Yamaguchi
Kazuhiro Nakadai
Hiroshi G. Okuno
Tetsuya Ogata
[J]. Applied Intelligence, 2015, 42 : 722 - 737
[8] Emotion recognition of audio/speech data using deep learning approaches
Gupta, Vedika
Juyal, Stuti
Singh, Gurvinder Pal
Killa, Chirag
Gupta, Nishant
[J]. JOURNAL OF INFORMATION & OPTIMIZATION SCIENCES, 2020, 41 (06): 1309 - 1317
[9] Fish Recognition in Underwater Environments using Deep Learning and Audio Data
Laplante, Jean-Francois
Akhloufi, Moulay A.
Gervaise, Cedric
[J]. OCEAN SENSING AND MONITORING XIII, 2021, 11752
[10] Digit Recognition Applied to Reconstructed Audio Signals Using Deep Learning
Toufa, Anastasia-Sotiria
Kotropoulos, Constantine
[J]. 2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021: 3050 - 3057
