Speaker Clustering Based on Non-Negative Matrix Factorization Using Gaussian Mixture Model in Complementary Subspace

Times cited: 0
Authors
Nishida, Masafumi [1]
Yamamoto, Seiichi [2]
Affiliations
[1] Shizuoka Univ, Dept Informat, Shizuoka, Japan
[2] Doshisha Univ, Dept Informat & Comp Sci, Kyoto, Japan
Keywords
ACM proceedings; text tagging; DIARIZATION;
DOI
10.1145/3095713.3095721
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Speech feature variations are mainly attributed to variations in the phonetic and speaker information contained in speech data. If these two types of information are separated from each other, more robust speaker clustering can be achieved. A principal component analysis transformation can separate speaker information from phonetic information, under the assumption that a space with large within-speaker variance is a "phonetic subspace" and a space with small within-speaker variance is a "speaker subspace". We propose a speaker clustering method based on non-negative matrix factorization using a Gaussian mixture model trained in the speaker subspace. We compared the proposed method with conventional methods based on the Bayesian information criterion and on Gaussian mixture models in the observation space. The experimental results showed that the proposed method achieves higher clustering accuracy than the conventional methods.
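The clustering step named in the abstract can be illustrated with a small sketch. The abstract does not say how the input matrix for non-negative matrix factorization is built, so the code below assumes a hypothetical non-negative segment-by-segment similarity matrix V (for example, one derived from Gaussian mixture model likelihoods) and uses the standard Lee-Seung multiplicative updates. The helper names nmf and cluster_labels, the toy values in V, and the rule of assigning each segment to its strongest basis are illustrative assumptions, not the authors' implementation.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def nmf(v, rank, iters=500, seed=0, eps=1e-9):
    """Factor non-negative v (n x m) as w (n x rank) times h (rank x m)
    using multiplicative updates for the squared-error objective."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # h <- h * (w^T v) / (w^T w h), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # w <- w * (v h^T) / (w h h^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h


def cluster_labels(w):
    """Assign each row (segment) to the basis with the largest activation,
    after scaling every column of w to unit Euclidean norm."""
    norms = [sum(x * x for x in col) ** 0.5 or 1.0 for col in zip(*w)]
    scaled = [[x / s for x, s in zip(row, norms)] for row in w]
    return [max(range(len(row)), key=row.__getitem__) for row in scaled]


# Hypothetical toy similarities: segments 0-2 resemble each other, as do 3-4.
V = [
    [1.0, 0.9, 0.8, 0.1, 0.1],
    [0.9, 1.0, 0.9, 0.1, 0.2],
    [0.8, 0.9, 1.0, 0.2, 0.1],
    [0.1, 0.1, 0.2, 1.0, 0.9],
    [0.1, 0.2, 0.1, 0.9, 1.0],
]
W, H = nmf(V, rank=2)
labels = cluster_labels(W)
print(labels)
```

The columns of W are normalized before the assignment because a factorization W H is only determined up to a rescaling between each column of W and the matching row of H; without that step the comparison across bases would depend on an arbitrary scale. The rank (the number of speakers) is fixed by hand here, which a real system would have to estimate.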
Pages: 5