A Near Real-Time Automatic Speaker Recognition Architecture for Voice-Based User Interface

Cited by: 32
Authors
Dhakal, Parashar [1]
Damacharla, Praveen [2]
Javaid, Ahmad Y. [1]
Devabhaktuni, Vijay [2]
Affiliations
[1] Univ Toledo, Elect Engn & Comp Sci Dept, Toledo, OH 43606 USA
[2] Purdue Univ Northwest, ECE Dept, Hammond, IN 46323 USA
Source
Keywords
classifiers; convolution neural network; architecture; feature extraction; machine learning; random forest; speaker recognition; voice interface
DOI
10.3390/make1010031
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we present a novel pipelined, near real-time speaker recognition architecture that improves recognition performance through hybrid feature extraction, combining features from a Gabor Filter (GF), a Convolutional Neural Network (CNN), and statistical parameters into a single matrix set. The architecture was developed to enable secure access to a voice-based user interface (UI) through speaker-based authentication and integration with an existing Natural Language Processing (NLP) system; gaining secure access to existing NLP systems also served as motivation. We first identify challenges related to real-time speaker recognition and highlight recent research in the field. We then analyze the functional requirements of a speaker recognition system and introduce mechanisms in our architecture that address these requirements. Subsequently, the paper discusses the effect of CNN, GF, and statistical parameters on feature extraction. For classification, standard classifiers such as Support Vector Machine (SVM), Random Forest (RF), and Deep Neural Network (DNN) are investigated. To verify the validity and effectiveness of the proposed architecture, we compare performance measures including accuracy, sensitivity, and specificity against the standard AlexNet architecture.
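The abstract names accuracy, sensitivity, and specificity as the comparison measures but does not say how they are computed. The following is a minimal sketch under the usual confusion-matrix definitions, assuming a one-vs-rest setup in which a single target speaker is the positive class. The function name `binary_metrics` and the sample labels are hypothetical and are not taken from the paper.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, sensitivity, specificity) for one target class.

    Assumption: labels equal to `positive` mark the target speaker;
    every other label is treated as "not the target speaker".
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    total = tp + tn + fp + fn
    # Guard each ratio against an empty denominator.
    accuracy = (tp + tn) / total if total else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    return accuracy, sensitivity, specificity


# Hypothetical labels: 1 = target speaker, 0 = any other speaker.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
acc, sens, spec = binary_metrics(y_true, y_pred)
print(f"accuracy={acc:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

For a multi-speaker classifier, one option is to call the function once per speaker and average the results; whether the paper does this is not stated in the abstract.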
Pages: 504-520
Page count: 17