Speaker Recognition using Convolutional Neural Network with Minimal Training Data for Smart Home Solutions

Citations: 0
Authors
Wang, Mingshan [1]
Sirlapu, Tejaswini [1]
Kwasniewska, Alicja [2]
Szankin, Maciej [1]
Bartscherer, Marko [1]
Nicolas, Rey [1]
Affiliations
[1] Intel Corp, San Diego, CA 92131 USA
[2] Gdansk Univ Technol, Fac Elect Telecommun & Informat, Gdansk, Poland
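Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract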
With technology advancing in the smart home sector, voice control and automation are key components that can make a real difference in people's lives. The voice recognition technology market continues to evolve rapidly, as almost all smart home devices now provide speaker recognition capability. However, most of them rely on cloud-based solutions or use very deep neural networks for the speaker recognition task, and such models are not suitable for running on smart home devices. In this paper, we compare relatively small Convolutional Neural Networks (CNNs) and evaluate the effectiveness of speaker recognition using these models on edge devices. In addition, we apply a transfer learning technique to deal with the problem of limited training data. By developing a solution suitable for running inference locally on edge devices, we eliminate well-known cloud computing issues such as data privacy and network latency. Preliminary results show that the chosen model carries the benefits of computer vision techniques over to this task, using a CNN and spectrograms to perform speaker classification with precision and recall close to 84% in less than 60 ms on a mobile device with an Atom Cherry Trail processor.
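The abstract describes feeding spectrograms to a CNN so that speaker classification can be treated like an image task. The record does not give the authors' preprocessing, so the following is only a minimal sketch of how a log-magnitude spectrogram might be computed from raw samples. The function name `log_spectrogram`, the frame length, the hop size, the Hann window, and the test tone are all illustrative assumptions, not details from the paper. It uses only the standard library and a naive DFT for clarity; a practical pipeline would use an FFT routine.

```python
import cmath
import math


def log_spectrogram(samples, frame_len=64, hop=32):
    """Return a list of frames, each a list of log-magnitudes per frequency bin.

    Hypothetical helper: frame_len and hop are illustrative defaults,
    not values taken from the paper.
    """
    # Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1  # keep non-negative frequencies only
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_len)]
        row = []
        for k in range(n_bins):
            # Naive DFT of the windowed frame at frequency bin k.
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            # Small offset avoids log(0) for silent bins.
            row.append(math.log(abs(acc) + 1e-10))
        frames.append(row)
    return frames


# Assumed input for illustration: a 1 kHz tone sampled at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(512)]
spec = log_spectrogram(tone)

# Each bin spans sample_rate / frame_len = 125 Hz, so the tone peaks in bin 8.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
```

The resulting two-dimensional array (time frames by frequency bins) is the kind of image-like input a small CNN could classify; how the authors sized, scaled, or augmented their spectrograms is not stated here.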
Pages: 139-145
Page count: 7
Related papers
50 items in total
  • [31] Recurrent Neural Network for Human Activity Recognition in Smart Home
    Fang, Hongqing
    Si, Hao
    Chen, Long
    PROCEEDINGS OF 2013 CHINESE INTELLIGENT AUTOMATION CONFERENCE: INTELLIGENT AUTOMATION, 2013, 254 : 341 - 348
  • [32] Automated generation of convolutional neural network training data using video sources
    Kalukin, Andrew R.
    Leonard, Wade
    Green, Joan
    Burgwardt, Lester
    2017 IEEE APPLIED IMAGERY PATTERN RECOGNITION WORKSHOP (AIPR), 2017,
  • [33] Convolutional neural network for human behavior recognition based on smart bracelet
    Qu, Junsuo
    Qiao, Ning
    Shi, Haonan
    Su, Chang
    Razi, Abolfazl
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2020, 38 (05) : 5615 - 5626
  • [34] Distributed Deep Convolutional Neural Network For Smart Camera Image Recognition
    Castillo, Emmanuel Ayuyao
    Ahmadinia, Ali
    11TH INTERNATIONAL CONFERENCE ON DISTRIBUTED SMART CAMERAS (ICDSC 2017), 2017, : 169 - 173
  • [35] Home Security System with Face Recognition based on Convolutional Neural Network
    Irjanto, Nourman S.
    Surantha, Nico
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2020, 11 (11) : 408 - 412
  • [36] Ensemble Speaker Modeling using Speaker Adaptive Training Deep Neural Network for Speaker Adaptation
    Li, Sheng
    Lu, Xugang
    Akita, Yuya
    Kawahara, Tatsuya
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 2892 - 2896
  • [37] Speaker Diarization Using Convolutional Neural Network for Statistics Accumulation Refinement
    Zajic, Zbynek
    Hruz, Marek
    Mueller, Ladek
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 3562 - 3566
  • [38] Rapid and Effective Speaker Adaptation of Convolutional Neural Network Based Models for Speech Recognition
    Abdel-Hamid, Ossama
    Jiang, Hui
    14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 1247 - 1251
  • [39] Deep Convolutional Neural Network Learning for Activity Recognition using real-life sensor's data in smart devices
    Fekri, Maryam
    Shafiq, M. Omair
    2018 IEEE 20TH INTERNATIONAL CONFERENCE ON E-HEALTH NETWORKING, APPLICATIONS AND SERVICES (HEALTHCOM), 2018,
  • [40] Training Convolutional Neural Networks with Limited Training Data for Ear Recognition in the Wild
    Emersic, Ziga
    Stepec, Dejan
    Struc, Vitomir
    Peer, Peter
    2017 12TH IEEE INTERNATIONAL CONFERENCE ON AUTOMATIC FACE AND GESTURE RECOGNITION (FG 2017), 2017, : 987 - 994