Adversarial Attacks on Automatic Speech Recognition (ASR): A Survey

Cited by: 1

Authors:

Bhanushali, Amisha Rajnikant [1]

Mun, Hyunjun [1]

Yun, Joobeom [1]

Affiliations:

[1] Sejong Univ, Dept Comp & Informat Secur & Convergence Engn Inte, Seoul 05006, South Korea

Source:

IEEE ACCESS | 2024 / Vol. 12

Keywords:

Surveys; Taxonomy; Automatic speech recognition; Task analysis; Internet; Text recognition; Adversarial machine learning; Artificial neural networks; Adversarial attacks; adversarial samples; automatic speech recognition (ASR); deep neural network (DNN); COEFFICIENTS; EXAMPLES;

DOI:

10.1109/ACCESS.2024.3416965

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Automatic Speech Recognition (ASR) systems have improved and eased how humans interact with devices. An ASR system converts an acoustic waveform into the corresponding text. Modern ASR systems incorporate deep neural networks (DNNs) to provide faster and better results. As the use of DNNs continues to expand, there is a need to examine them against various adversarial attacks. Adversarial attacks use synthetic samples crafted carefully by adding particular noise to legitimate examples. The added noise is imperceptible, yet the samples prove catastrophic to DNNs. Recently, adversarial attacks on ASR systems have increased, but previous surveys do not generalize across the different methods used for attacking ASR, and their scope is narrowed to a particular application, making it difficult to determine the relationships and trade-offs between the attack techniques. Therefore, this survey provides a taxonomy that classifies adversarial attacks on ASR based on their characteristics and behavior. Additionally, we analyze the existing methods for generating adversarial attacks and present a comparative analysis. We outline the efficiency of the adversarial techniques and, based on the gaps found in existing studies, state the future scope.


Pages: 88279 - 88302

Page count: 24

