An Improved Ranking-Based Feature Enhancement Approach for Robust Speaker Recognition

Cited by: 8
Authors
Yan, Furong [1 ]
Men, Aidong [1 ]
Yang, Bo [1 ]
Jiang, Zhuqing [1 ]
Affiliations
[1] Beijing Univ Posts & Telecommun, Sch Informat & Commun Engn, Beijing 100000, Peoples R China
Source
IEEE ACCESS | 2016 / Vol. 4
Keywords
Robustness; feature warping; missing data method; ranking feature; autocorrelation; rank correlation; open-set speaker recognition; TRANSFORMATIONS;
DOI
10.1109/ACCESS.2016.2607778
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Although automatic speaker and speech recognition have been studied extensively over the past decades, lack of robustness remains a major challenge. Feature warping is a promising approach, but its effectiveness depends heavily on the relative positions of the features within a sliding window, and these relative positions are altered by the non-linear effect of noise. To address this problem, this paper proposes a method built on the ranking feature, which is obtained directly by sorting a feature sequence in descending order. The method first labels the central frame in a sliding window as speech dominant ("reliable") or noise dominant ("unreliable"). In the unreliable case, the ranking of the central frame is estimated. The estimated ranking is then mapped to a warped feature using a desired target distribution for recognition experiments. Theoretical analysis and experimental results show that the autocorrelation of a ranking sequence is larger than that of the corresponding feature sequence. Moreover, rank correlation is not easily influenced by abnormal or highly variable data. The paper therefore operates on a ranking sequence rather than a feature sequence. The proposed feature enhancement approach is evaluated in an open-set speaker recognition system. The experimental results show that it outperforms both the missing data method based on linear interpolation and feature warping in recognition performance under all noise conditions. Furthermore, because the proposed method is feature-based, it may be combined with other techniques, such as model-based and score-based methods, to enhance the robustness of speaker or speech recognition systems.
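The ranking-to-warped-feature step mentioned in the abstract can be illustrated with a minimal sketch of conventional feature warping on a single feature dimension. This is not the paper's improved method: the reliable/unreliable labelling and the ranking estimation are omitted. The function names, the window length of 5, the standard-normal target distribution, and the quantile formula (N + 0.5 - R) / N are all assumptions chosen for illustration.

```python
from statistics import NormalDist


def window_rank(window, center):
    """Rank of window[center] when the window is sorted in descending order.

    Rank 1 is the largest value; tied values share the best rank.
    """
    value = window[center]
    return 1 + sum(1 for v in window if v > value)


def rank_to_warped(rank, n):
    """Map a rank in 1..n to a quantile of a standard normal target (assumed)."""
    # (n + 0.5 - rank) / n lies strictly inside (0, 1), so inv_cdf is defined.
    return NormalDist().inv_cdf((n + 0.5 - rank) / n)


def warp_sequence(features, window_len=5):
    """Warp one feature dimension frame by frame with a sliding window.

    Windows are truncated at the sequence edges (an illustrative choice).
    """
    half = window_len // 2
    warped = []
    for t in range(len(features)):
        lo = max(0, t - half)
        hi = min(len(features), t + half + 1)
        window = features[lo:hi]
        rank = window_rank(window, t - lo)
        warped.append(rank_to_warped(rank, len(window)))
    return warped


if __name__ == "__main__":
    x = [0.1, 0.9, 0.4, 0.7, 0.2, 0.8, 0.3]
    print(warp_sequence(x))
```

Because the warped value depends only on the rank of the central frame, any distortion that preserves the ordering inside the window leaves the output unchanged, while a distortion that reorders the frames changes it. The latter is the sensitivity to noise that the abstract describes.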
Pages: 5258 - 5267
Page count: 10
Related papers
50 in total
  • [31] A Novel Hierarchical Approach to Ranking-Based Collaborative Filtering
    Nikolakopoulos, Athanasios N.
    Kouneli, Marianna
    Garofalakis, John
    ENGINEERING APPLICATIONS OF NEURAL NETWORKS, PT II, 2013, 384 : 50 - 59
  • [32] An Auditory Feature Extraction Method for Robust Speaker Recognition
    Hu, Fengsong
    Cao, Xiaoyu
    PROCEEDINGS OF 2012 IEEE 14TH INTERNATIONAL CONFERENCE ON COMMUNICATION TECHNOLOGY, 2012, : 1067 - 1071
  • [33] Robust Feature Extraction for Speaker Recognition Based on Constrained Nonnegative Tensor Factorization
    Qiang Wu
    Li-Qing Zhang
    Guang-Chuan Shi
    Journal of Computer Science and Technology, 2010, 25 : 783 - 792
  • [34] Robust Feature Extraction for Speaker Recognition Based on Constrained Nonnegative Tensor Factorization
    吴强
    张丽清
    石光川
    Journal of Computer Science & Technology, 2010, 25 (04) : 783 - 792
  • [35] Robust Feature Extraction for Speaker Recognition Based on Constrained Nonnegative Tensor Factorization
    Wu, Qiang
    Zhang, Li-Qing
    Shi, Guang-Chuan
    JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2010, 25 (04) : 783 - 792
  • [36] Improved Multitaper PNCC Feature for Robust Speaker Verification
    Liu, Yi
    He, Liang
    Liu, Jia
    2014 9TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2014, : 168 - 172
  • [37] Robust Speaker Verification With A Two Classifier Format and Feature Enhancement
    Edwards, Joshua S.
    Ramachandran, Ravi P.
    Thayasivam, Umashanger
    2017 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2017,
  • [38] DNN-based Amplitude and Phase Feature Enhancement for Noise Robust Speaker Identification
    Oo, Zeyan
    Kawakami, Yuta
    Wang, Longbiao
    Nakagawa, Seiichi
    Xiao, Xiong
    Iwahashi, Masahiro
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 2204 - 2208
  • [39] Improved Deep Speaker Feature Learning for Text-Dependent Speaker Recognition
    Li, Lantian
    Lin, Yiye
    Zhang, Zhiyong
    Wang, Dong
    2015 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2015, : 426 - 429
  • [40] A Novel Ranking-Based Clustering Approach for Hyperspectral Band Selection
    Jia, Sen
    Tang, Guihua
    Zhu, Jiasong
    Li, Qingquan
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2016, 54 (01): : 88 - 102