Vowel classification with combining pitch detection and one-dimensional convolutional neural network based classifier for gender identification

Cited by: 2
Authors
Lin, Chia-Hung [1, 4]
Lai, Hsiang-Yueh [1, 4]
Huang, Ping-Tzan [2]
Chen, Pi-Yun [1 ]
Li, Chien-Ming [3 ]
Affiliations
[1] Natl Chin Yi Univ Technol, Dept Elect Engn, Taichung, Taiwan
[2] Natl Kaohsiung Univ Sci & Technol, Dept Maritime Informat & Technol, Kaohsiung, Taiwan
[3] Chi Mei Med Ctr, Div Infect Dis, Dept Med, Tainan, Taiwan
[4] Natl Chin Yi Univ Technol, Dept Elect Engn, Taichung 41170, Taiwan
Keywords
learning (artificial intelligence); speech processing; speech recognition; FUNDAMENTAL-FREQUENCY; SPEECH; RECOGNITION; ESTIMATOR; PLAQUE; HEART;
DOI
10.1049/sil2.12216
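Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification codes
0808; 0809
Abstract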
Human speech signals carry information about a speaker's characteristics, which is useful in applications involving interactive voice response (IVR) and automatic speech recognition (ASR). In such applications, classifying speakers into age and gender groups can support human-machine or computer-based interaction systems for customised advertisement, translation (text generation), machine dialog systems, or self-service applications. An IVR-based system therefore requires ASR to work from users' voices (specific voice-frequency bands) to identify customers' age and gender and to interact with a host system. In the present study, we combine a pitch detection (PD)-based extractor with a voice classifier for gender identification. The PD method, based on Yet Another Algorithm for Pitch Tracking (YAAPT), extracts the voice fundamental frequency (F0) from non-stationary voice signals. Differences in F0 are used to distinguish adult females from adult males and to separate adult voices from children's voices. For vowel voice signal classification, a one-dimensional (1D) convolutional neural network (CNN) is used, consisting of a multi-round 1D kernel convolutional layer, a 1D pooling process, and a vowel classifier that preliminarily divides feature patterns into three F0 level ranges, including adult and children groups. A classifier in the classification layer then identifies the speaker's gender. The proposed PD-based extractor and voice classifier could reduce complexity and improve classification efficiency. Acoustic data were selected from the Hillenbrand database for experimental tests on the classification of 12 vowels, and K-fold cross-validation was performed. The experimental results indicate that the approach is promising, with classifier performance quantified in terms of recall (%), precision (%), accuracy (%), and F1 score.
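The idea of extracting F0 and sorting voices into three F0 level ranges can be illustrated with a minimal Python sketch. It is not the authors' method: a plain autocorrelation peak picker stands in for YAAPT, and the function names (`estimate_f0`, `f0_range`, `synth_tone`), the 60 to 500 Hz search band, and the 160 Hz and 250 Hz range boundaries are all assumptions chosen only for illustration.

```python
import numpy as np


def estimate_f0(signal, sample_rate, f0_min=60.0, f0_max=500.0):
    """Estimate F0 (Hz) of one voiced frame from its autocorrelation peak.

    A simple stand-in for a full pitch tracker such as YAAPT; the search
    band (f0_min, f0_max) is an assumed value, not taken from the paper.
    """
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()  # remove DC offset
    # Full autocorrelation, keeping only non-negative lags.
    acf = np.correlate(x, x, mode="full")[len(x) - 1:]
    lag_min = int(sample_rate / f0_max)
    lag_max = min(int(sample_rate / f0_min), len(acf) - 1)
    # The strongest peak inside the search band marks the pitch period.
    best_lag = lag_min + int(np.argmax(acf[lag_min:lag_max + 1]))
    return sample_rate / best_lag


def f0_range(f0, boundaries=(160.0, 250.0)):
    """Map F0 to one of three level ranges (hypothetical boundaries)."""
    low, high = boundaries
    if f0 < low:
        return "low"
    if f0 < high:
        return "middle"
    return "high"


def synth_tone(f0, sample_rate, duration=0.1):
    """Synthetic vowel-like test tone: a fundamental plus two harmonics."""
    t = np.arange(int(sample_rate * duration)) / sample_rate
    return (np.sin(2 * np.pi * f0 * t)
            + 0.5 * np.sin(2 * np.pi * 2 * f0 * t)
            + 0.25 * np.sin(2 * np.pi * 3 * f0 * t))


if __name__ == "__main__":
    sr = 16000
    for true_f0 in (120.0, 220.0, 300.0):
        f0 = estimate_f0(synth_tone(true_f0, sr), sr)
        print(f"true {true_f0:.0f} Hz -> estimated {f0:.1f} Hz, range {f0_range(f0)}")
```

In a pipeline like the one described, the coarse F0 range would only be a preliminary split; the final gender decision is left to the classifier in the classification layer.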
Pages: 14