A Combined Feature Approach for Speaker Segmentation Using Convolution Neural Network

Cited by: 1
Authors
Zhong, Jiang [1 ,2 ]
Zhang, Pan [2 ]
Li, Xue [1 ,3 ]
Affiliations
[1] Chongqing Univ, Key Lab Dependable Serv Comp Cyber Phys Soc, Minist Educ, Chongqing 400030, Peoples R China
[2] Chongqing Univ, Coll Comp Sci, Chongqing 400030, Peoples R China
[3] Univ Queensland, Sch Informat Technol & Elect Engn, Brisbane, Qld, Australia
Keywords
Combined feature; Speaker segmentation; SPECTROGRAM; MFCC; CNN; DIARIZATION; RECOGNITION;
DOI
10.1007/978-3-319-77383-4_54
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
This paper proposes a speaker segmentation algorithm based on a combined-feature approach using a Convolution Neural Network (CNN). It targets the speaker segmentation problem for dialogue speech with partial prior knowledge in a call-center environment. For the first time, the Mel-Frequency Cepstral Coefficients (MFCC) feature and the spectrogram feature are combined as the input of the CNN to train the speakers' voice feature model and to estimate the change point. In the experiments, a real database of dialogue speech from the insurance sales and real estate sales industries is used to compare the proposed approach with the Bayesian Information Criterion (BIC) approach using different acoustic feature sets. The results show that overall performance improves and that the proposed algorithm yields better segmentation.
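As a point of reference for the BIC baseline named in the abstract, the following is a minimal sketch of the standard ΔBIC change-point test, not the authors' implementation. It assumes a one-dimensional feature sequence modelled by single Gaussians; practical systems apply the same test to multi-dimensional MFCC frames with full covariance matrices. The function names (`variance`, `delta_bic`, `best_change_point`) and the parameters `lam` and `margin` are hypothetical and chosen for illustration only.

```python
import math


def variance(xs):
    """Maximum-likelihood variance, floored to avoid log(0) on constant data."""
    mean = sum(xs) / len(xs)
    return max(sum((x - mean) ** 2 for x in xs) / len(xs), 1e-12)


def delta_bic(xs, i, lam=1.0):
    """Delta-BIC for splitting xs at index i (1-D Gaussian assumption).

    A positive value favours two separate Gaussians (a speaker change)
    over one Gaussian for the whole window.
    """
    n = len(xs)
    gain = 0.5 * (
        n * math.log(variance(xs))
        - i * math.log(variance(xs[:i]))
        - (n - i) * math.log(variance(xs[i:]))
    )
    # Penalty 0.5 * lam * (d + d(d+1)/2) * log(n) reduces to lam * log(n) for d = 1.
    penalty = lam * math.log(n)
    return gain - penalty


def best_change_point(xs, margin=5, lam=1.0):
    """Return (index, score) of the best split, or (None, score) if no split is favoured."""
    scores = {i: delta_bic(xs, i, lam) for i in range(margin, len(xs) - margin + 1)}
    best = max(scores, key=scores.get)
    return (best, scores[best]) if scores[best] > 0 else (None, scores[best])


if __name__ == "__main__":
    # Synthetic 1-D "feature" with a level shift half-way through (illustrative data only).
    signal = [0.1 * (-1) ** k for k in range(50)] + [5 + 0.1 * (-1) ** k for k in range(50)]
    print(best_change_point(signal))
```

The `margin` argument keeps candidate splits away from the window edges, where variance estimates from only a few samples are unreliable; `lam` is the usual penalty weight and would need tuning on real data.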
Pages: 550-559
Page count: 10