Machine learning methods for speech emotion recognition on telecommunication systems
A. Osipov et al.

Cited: 0
Authors
Alexey Osipov [1 ]
Ekaterina Pleshakova [1 ]
Yang Liu [2 ]
Sergey Gataullin [3 ]
Affiliations
[1] MIREA - Russian Technological University
[2] Xidian University
[3] Moscow Technical University of Communications and Informatics
Keywords
Artificial intelligence; Neural networks; Engineering; CapsNet; Smart bracelet; Photoplethysmogram; Speech emotion recognition
DOI
10.1007/s11416-023-00500-2
Abstract
The manuscript studies human behavior in stressful situations using machine learning methods; this behavior depends on psychotype, socialization, and a host of other factors. Global mobile subscribers lost approximately $53 billion in 2022 to phone fraud and unwanted calls, and almost half (43%) of subscribers have spam-blocking or caller-ID apps installed. Phone scammers build their conversation around the behavior of a certain category of people: the victim is first driven into a state of acute stress, in which his further behavior can be manipulated to one degree or another. Research by Juniper Research allowed us to single out the target audience: men under the age of 44, who run the highest risk of being deceived by scammers. This significantly narrows the scope of the research and allows us to restrict attention to the behavioral features of this particular category of subscribers. In addition, this category of people uses modern gadgets, so outdated devices need not be considered; has stable health indicators, so additional studies of people with cardiac diseases are unnecessary, since their share in this sample is minimal; and most often undergoes polygraph interviews, for example when applying for a job, which provides a sample sufficient for training the neural network. To train the method, we used polygrams of healthy young people who underwent a scheduled polygraph test for company loyalty, marked up by a polygraph examiner and a psychologist. For testing, readings of the PPG sensor built into a smart bracelet were collected and analyzed over a month from young people who had undergone a polygraph test.
We have developed a modification of a wavelet capsule neural network, 2D-CapsNet, that identifies the state of panic stupor, in which a person cannot make logically sound decisions, from the photoplethysmogram (PPG) graph, with classification quality indicators of Accuracy 86.0%, Precision 84.0%, Recall 87.5%, and F-score 85.7%. When a smart bracelet is synchronized with a smartphone, the method tracks such states in real time, making it possible to respond to a call from a telephone scammer during the conversation with the subscriber. The proposed method can be widely used in cyber-physical systems to detect illegal actions.
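The reported metrics are internally consistent: the F-score is the harmonic mean of precision and recall, and plugging in the abstract's values reproduces the stated 85.7%. A minimal check (the function name `f_score` is ours, not from the paper):

```python
def f_score(precision, recall):
    """F1-score: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Values reported in the abstract: Precision 84.0%, Recall 87.5%
print(round(f_score(0.840, 0.875) * 100, 1))  # → 85.7, matching the reported F-score
```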
Pages: 415 - 428
Page count: 13
Related papers
(50 in total)
  • [21] Emotion Recognition in Speech with Deep Learning Architectures
    Erdal, Mehmet
    Kaechele, Markus
    Schwenker, Friedhelm
    ARTIFICIAL NEURAL NETWORKS IN PATTERN RECOGNITION, 2016, 9896 : 298 - 311
  • [22] Learning Spontaneity to Improve Emotion Recognition in Speech
    Mangalam, Karttikeya
    Guha, Tanaya
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 946 - 950
  • [23] Speech emotion recognition with unsupervised feature learning
    Huang, Zheng-wei
    Xue, Wen-tao
    Mao, Qi-rong
    FRONTIERS OF INFORMATION TECHNOLOGY & ELECTRONIC ENGINEERING, 2015, 16 (05) : 358 - 366
  • [24] LEARNING WITH SYNTHESIZED SPEECH FOR AUTOMATIC EMOTION RECOGNITION
    Schuller, Bjoern
    Burkhardt, Felix
    2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 5150 - 5153
  • [25] Federated Learning for Speech Emotion Recognition Applications
    Latif, Siddique
    Khalifa, Sara
    Rana, Rajib
    Jurdak, Raja
    2020 19TH ACM/IEEE INTERNATIONAL CONFERENCE ON INFORMATION PROCESSING IN SENSOR NETWORKS (IPSN 2020), 2020, : 341 - 342
  • [26] Speech Emotion Recognition with Discriminative Feature Learning
    Zhou, Huan
    Liu, Kai
    INTERSPEECH 2020, 2020, : 4094 - 4097
  • [27] Speech Emotion Recognition Based on Learning Automata in
    Motamed, Sara
    Setayeshi, Saeed
    Farhoudi, Zeinab
    Ahmadi, Ali
    JOURNAL OF MATHEMATICS AND COMPUTER SCIENCE-JMCS, 2014, 12 (03): : 173 - 185
  • [28] CONTRASTIVE UNSUPERVISED LEARNING FOR SPEECH EMOTION RECOGNITION
    Li, Mao
    Yang, Bo
    Levy, Joshua
    Stolcke, Andreas
    Rozgic, Viktor
    Matsoukas, Spyros
    Papayiannis, Constantinos
    Bone, Daniel
    Wang, Chao
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 6329 - 6333
  • [29] Learning Transferable Features for Speech Emotion Recognition
    Marczewski, Alison
    Veloso, Adriano
    Ziviani, Nivio
    PROCEEDINGS OF THE THEMATIC WORKSHOPS OF ACM MULTIMEDIA 2017 (THEMATIC WORKSHOPS'17), 2017, : 529 - 536
  • [30] Speech emotion recognition with unsupervised feature learning
    Huang, Zheng-wei
    Xue, Wen-tao
    Mao, Qi-rong
    FRONTIERS OF INFORMATION TECHNOLOGY & ELECTRONIC ENGINEERING, 2015, 16 (05) : 358 - 366