Attention-Based Bimodal Neural Network Speech Recognition System on FPGA

Cited by: 0
Authors
Chen, Aiwu [1]
Affiliations
[1] College of Intelligent Manufacturing (CIM), Hunan University of Science and Engineering (HUSE), Yongzhou 425199, China
Source
Informatica (Slovenia) | 2025, Vol. 49, No. 13
Keywords
Audiovisual; Gates (transistor); Neural networks; Speech enhancement; Speech recognition
DOI
10.31449/inf.v49i13.7154
Abstract
To further improve the accuracy of speech recognition technology, a neural network speech recognition system based on a field-programmable gate array (FPGA) is designed. First, a neural network audiovisual bimodal speech recognition algorithm based on an attention mechanism is designed. Then, an FPGA-based speech recognition platform is built. The results showed that the word error rate and character error rate of the proposed algorithm were 3.17% and 1.56%, respectively, significantly lower than the 26.24% and 12.56% of the traditional Lip-Reading Network algorithm. The algorithm converged quickly within the first 10 training epochs and stabilized by epoch 20. The proposed platform made heavy use of DSP units, with a utilization rate of 83.2%, and achieved the lowest power consumption (2.21 W), the highest energy-efficiency ratio (26.15), and the shortest processing time among the compared implementations. In summary, by introducing an attention mechanism, the proposed algorithm allocates learning weights appropriately, improves training speed, and demonstrates feasibility and effectiveness. It performs well in speech recognition, helping to improve the accuracy of speech recognition algorithms and facilitate human-machine communication. © 2025 Slovene Society Informatika. All rights reserved.
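The abstract does not describe the network architecture in detail. As a rough illustration only, the following minimal PyTorch sketch shows one common form of attention-based audiovisual fusion, where audio frames attend to visual (lip-region) features so that the attention weights allocate per-frame importance across modalities. All layer sizes, module names, and the GRU/classifier head here are assumptions for illustration, not taken from the paper.

```python
import torch
import torch.nn as nn

class AttentionBimodalFusion(nn.Module):
    """Illustrative audio-visual fusion with cross-modal attention.

    Hypothetical dimensions; the paper's actual feature extractors,
    network layout, and decoder are not specified in the abstract.
    """
    def __init__(self, audio_dim=80, visual_dim=512, model_dim=256,
                 num_heads=4, vocab_size=40):
        super().__init__()
        # Project each modality into a shared embedding space.
        self.audio_proj = nn.Linear(audio_dim, model_dim)
        self.visual_proj = nn.Linear(visual_dim, model_dim)
        # Audio frames attend to visual frames (cross-modal attention).
        self.cross_attn = nn.MultiheadAttention(model_dim, num_heads,
                                                batch_first=True)
        # Temporal modelling of the fused sequence.
        self.encoder = nn.GRU(model_dim, model_dim, batch_first=True)
        # Per-frame output distribution (e.g. for a CTC-style decoder).
        self.classifier = nn.Linear(model_dim, vocab_size)

    def forward(self, audio_feats, visual_feats):
        # audio_feats:  (batch, T_a, audio_dim)   e.g. filterbank frames
        # visual_feats: (batch, T_v, visual_dim)  e.g. lip-region embeddings
        a = self.audio_proj(audio_feats)
        v = self.visual_proj(visual_feats)
        # The attention weights decide how much each audio frame borrows
        # from each visual frame, i.e. they allocate per-frame weights.
        fused, attn_weights = self.cross_attn(query=a, key=v, value=v)
        fused = fused + a                      # residual connection
        encoded, _ = self.encoder(fused)
        return self.classifier(encoded), attn_weights


if __name__ == "__main__":
    model = AttentionBimodalFusion()
    audio = torch.randn(2, 100, 80)     # 100 audio frames per utterance
    video = torch.randn(2, 25, 512)     # 25 video frames per utterance
    logits, weights = model(audio, video)
    print(logits.shape, weights.shape)  # (2, 100, 40), (2, 100, 25)
```

A fusion layer of this kind would typically sit between the per-modality feature extractors and the decoder; for FPGA deployment, the matrix multiplications inside the projection and attention layers are the natural candidates for mapping onto the DSP units mentioned in the abstract.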
Pages: 1-12
Related Papers
50 entries in total
  • [21] EEG emotion recognition using attention-based convolutional transformer neural network
    Gong, Linlin
    Li, Mingyang
    Zhang, Tao
    Chen, Wanzhong
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2023, 84
  • [22] Attention-based recurrent neural network for automatic behavior laying hen recognition
    Laleye, Frejus A. A.
    Mousse, Mikael A.
    MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (22) : 62443 - 62458
  • [23] Attention-Based Deep Neural Network and Its Application to Scene Text Recognition
    He, Haizhen
    Li, Jiehan
    2019 IEEE 11TH INTERNATIONAL CONFERENCE ON COMMUNICATION SOFTWARE AND NETWORKS (ICCSN 2019), 2019, : 672 - 677
  • [25] Underwater acoustic target recognition using attention-based deep neural network
    Xiao, Xu
    Wang, Wenbo
    Ren, Qunyan
    Gerstoft, Peter
    Ma, Li
JASA EXPRESS LETTERS, 2021, 1 (10)
  • [26] 4D attention-based neural network for EEG emotion recognition
    Xiao, Guowen
    Shi, Meng
    Ye, Mengwen
    Xu, Bowen
    Chen, Zhendi
    Ren, Quansheng
    COGNITIVE NEURODYNAMICS, 2022, 16 (04) : 805 - 818
  • [27] Attention-Based Multi-Filter Convolutional Neural Network for Inappropriate Speech Detection
    Lin, Shu-Yu
    Chen, Yi-Ling
    2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021,
  • [28] Effective Exploitation of Posterior Information for Attention-Based Speech Recognition
    Tang, Jian
    Hou, Junfeng
    Song, Yan
    Dai, Li-Rong
    McLoughlin, Ian
IEEE ACCESS, 2020, 8 (08): 108988 - 108999
  • [29] Attention-based Contextual Language Model Adaptation for Speech Recognition
    Martinez, Richard Diehl
    Novotney, Scott
    Bulyko, Ivan
    Rastrow, Ariya
    Stolcke, Andreas
    Gandhe, Ankur
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021, : 1994 - 2003
  • [30] An attention-based network for serial number recognition on banknotes
    Lin, Zhijie
    He, Zhaoshui
    Tan, Beihai
    Shen, Yijiang
    Wang, Peitao
    Liu, Taiheng
    SIGNAL PROCESSING-IMAGE COMMUNICATION, 2022, 106