Speaker Recognition using Convolutional Neural Network with Minimal Training Data for Smart Home Solutions

Authors:

Wang, Mingshan [1]

Sirlapu, Tejaswini [1]

Kwasniewska, Alicja [2]

Szankin, Maciej [1]

Bartscherer, Marko [1]

Nicolas, Rey [1]

Affiliations:

[1] Intel Corp, San Diego, CA 92131 USA

[2] Gdansk Univ Technol, Fac Elect Telecommun & Informat, Gdansk, Poland

Source:

2018 11TH INTERNATIONAL CONFERENCE ON HUMAN SYSTEM INTERACTION (HSI) | 2018

DOI:

Not available

Chinese Library Classification (CLC) number:

TP3 [Computing Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

With technological advances in the smart home sector, voice control and automation are key components that can make a real difference in people's lives. The voice recognition technology market continues to evolve rapidly, as almost all smart home devices now provide speaker recognition capability. However, most of them rely on cloud-based solutions or use very deep neural networks for the speaker recognition task, and such models are not suitable for running on smart home devices. In this paper, we compare relatively small Convolutional Neural Networks (CNNs) and evaluate the effectiveness of speaker recognition using these models on edge devices. In addition, we apply transfer learning to deal with the problem of limited training data. By developing a solution suitable for running inference locally on edge devices, we avoid well-known cloud computing issues such as data privacy and network latency. Preliminary results show that the chosen model carries over the benefits of computer vision techniques, using a CNN on spectrograms to classify speakers with precision and recall of about 84% in less than 60 ms on a mobile device with an Atom Cherry Trail processor.
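The abstract describes feeding spectrograms to a CNN so that speaker classification can be treated like an image task. As a minimal sketch of that front-end step only, the Python below computes a log-magnitude spectrogram with NumPy. The function name `log_spectrogram`, the 25 ms frame and 10 ms hop at 16 kHz, and the synthetic 440 Hz test tone are all assumptions chosen for illustration; this record does not state the authors' actual preprocessing parameters or network architecture.

```python
import numpy as np


def log_spectrogram(signal, frame_len=400, hop=160):
    """Return a log-magnitude spectrogram of shape (n_frames, frame_len // 2 + 1).

    frame_len and hop are hypothetical defaults (25 ms and 10 ms at 16 kHz),
    not values taken from the paper.
    """
    window = np.hanning(frame_len)
    # Number of full frames that fit in the signal.
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping frames and apply the window to each.
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    # Magnitude of the one-sided FFT of every frame.
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    # log1p compresses the dynamic range and is safe for zero magnitudes.
    return np.log1p(magnitude)


# Synthetic stand-in for one second of speech: a 440 Hz tone sampled at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440.0 * t)

spec = log_spectrogram(tone)
# Each frequency bin spans sample_rate / frame_len = 40 Hz.
peak_bin = int(spec.mean(axis=0).argmax())
```

The resulting 2-D array (time frames by frequency bins) is the kind of image-like input a small CNN could consume; a transfer learning setup would typically resize or tile it to match the input shape of a pretrained vision model.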

Pages: 139-145

Page count: 7
