Discovering speech phones using convolutive non-negative matrix factorisation with a sparseness constraint

Cited by: 44
Authors
O'Grady, Paul D. [1]
Pearlmutter, Barak A. [2]
Affiliations
[1] Univ Coll Dublin, Complex & Adapt Syst Lab, Dublin 4, Ireland
[2] Natl Univ Ireland Maynooth, Hamilton Inst, Kildare, Ireland
Funding
Science Foundation Ireland;
Keywords
Non-negative matrix factorisation; Sparse representations; Convolutive dictionaries; Speech phone analysis;
DOI
10.1016/j.neucom.2008.01.033
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Discovering a representation that allows auditory data to be parsimoniously represented is useful for many machine learning and signal processing tasks. Such a representation can be constructed by non-negative matrix factorisation (NMF), a method for finding parts-based representations of non-negative data. Here, we present an extension to convolutive NMF that includes a sparseness constraint, where the resultant algorithm has multiplicative updates and utilises the beta divergence as its reconstruction objective. In combination with a spectral magnitude transform of speech, this method discovers auditory objects that resemble speech phones along with their associated sparse activation patterns. We use these in a supervised separation scheme for monophonic mixtures, finding improved separation performance in comparison to standard convolutive NMF. (C) 2008 Elsevier B.V. All rights reserved.
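The method summarised in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes the usual convolutive model V ≈ Σ_t W_t · shift(H, t), heuristic multiplicative updates derived from the beta divergence, an L1 penalty `lam` on the activations H as the sparseness constraint, and a simple unit-norm rescaling of each basis. The function names, parameter names and default values are hypothetical.

```python
import numpy as np

def shift(X, t):
    """Shift the columns of X right by t (left if t < 0), filling with zeros."""
    Y = np.zeros_like(X)
    if t == 0:
        Y[:] = X
    elif t > 0:
        Y[:, t:] = X[:, :-t]
    else:
        Y[:, :t] = X[:, -t:]
    return Y

def reconstruct(W, H):
    """Convolutive model: sum over t of W[t] @ shift(H, t). W is (T, M, R), H is (R, N)."""
    return sum(W[t] @ shift(H, t) for t in range(W.shape[0]))

def sparse_conv_nmf(V, R=4, T=3, beta=1.0, lam=0.1, n_iter=50, seed=0, eps=1e-9):
    """Illustrative sparse convolutive NMF with a beta-divergence objective.

    Assumptions (not taken from the paper): L1 penalty `lam` on H,
    heuristic multiplicative updates, unit L2 norm for each basis.
    """
    rng = np.random.default_rng(seed)
    M, N = V.shape
    W = rng.random((T, M, R)) + eps
    H = rng.random((R, N)) + eps
    for _ in range(n_iter):
        # Update activations; the sparsity weight enters the denominator.
        L = reconstruct(W, H) + eps
        num = sum(W[t].T @ shift(V * L ** (beta - 2), -t) for t in range(T))
        den = sum(W[t].T @ shift(L ** (beta - 1), -t) for t in range(T)) + lam
        H *= num / (den + eps)
        # Update each time slice of the bases with the refreshed reconstruction.
        L = reconstruct(W, H) + eps
        A = V * L ** (beta - 2)
        B = L ** (beta - 1)
        for t in range(T):
            Ht = shift(H, t)
            W[t] *= (A @ Ht.T) / (B @ Ht.T + eps)
        # Rescale every basis (over all time slices) to unit L2 norm so the
        # penalty on H cannot be evaded by inflating W.
        W /= np.sqrt((W ** 2).sum(axis=(0, 1), keepdims=True)) + eps
    return W, H
```

Here `V` would be a non-negative magnitude spectrogram (frequency bins by frames); setting `beta=1.0` corresponds to the Kullback-Leibler divergence and `beta=2.0` to the squared Euclidean distance. Each `W[:, :, r]` is then a short spectro-temporal patch, and row `r` of `H` is its activation pattern over time.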
Pages: 88-101
Page count: 14
Related papers
50 in total
  • [41] Speech Dereverberation Using Non-Negative Convolutive Transfer Function and Spectro-Temporal Modeling
    Mohammadiha, Nasser
    Doclo, Simon
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2016, 24 (02) : 276 - 289
  • [42] Alternating Direction Method of Multipliers for Convolutive Non-Negative Matrix Factorization
    Li, Yinan
    Wang, Ruili
    Fang, Yuqiang
    Sun, Meng
    Luo, Zhangkai
    IEEE TRANSACTIONS ON CYBERNETICS, 2023, 53 (12) : 7735 - 7748
  • [43] Bayesian extensions to non-negative matrix factorisation for audio signal modelling
    Virtanen, Tuomas
    Cemgil, A. Taylan
    Godsill, Simon
    2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 1825 - 1828
  • [44] Transcribing Bach chorales: Limitations and potentials of non-negative matrix factorisation
    Phon-Amnuaisuk, Somnuk
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2012,
  • [45] Non-Negative Matrix Factorization With Sparseness Constraints For Credit Risk Assessment
    Liu, Yulong
    Du, Jianlei
    Wang, Feng
    PROCEEDINGS OF 2013 IEEE INTERNATIONAL CONFERENCE ON GREY SYSTEMS AND INTELLIGENT SERVICES (GSIS), 2013, : 211 - 214
  • [46] Non-negative matrix factorisation improves Centiloid robustness in longitudinal studies
    Bourgeat, Pierrick
    Dore, Vincent
    Doecke, James
    Ames, David
    Masters, Colin L.
    Rowe, Christopher C.
    Fripp, Jurgen
    Villemagne, Victor L.
    NEUROIMAGE, 2021, 226
  • [47] Generalised non-negative matrix factorisation for air pollution source apportionment
    Lekinwala, Nirav L.
    Bhushan, Mani
    SCIENCE OF THE TOTAL ENVIRONMENT, 2022, 839
  • [48] Automatic image annotation by a loosely joint non-negative matrix factorisation
    Rad, Roya
    Jamzad, Mansour
    IET COMPUTER VISION, 2015, 9 (06) : 806 - 813
  • [49] SPEECH OVERLAP DETECTION USING CONVOLUTIVE NON-NEGATIVE SPARSE CODING: NEW IMPROVEMENTS AND INSIGHTS
    Geiger, Juergen T.
    Vipperla, Ravichander
    Evans, Nicholas
    Schuller, Bjoern
    Rigoll, Gerhard
    2012 PROCEEDINGS OF THE 20TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2012, : 340 - 344
  • [50] Automatic Model Order Selection for Convolutive Non-Negative Matrix Factorization
    Li, Yinan
    Zhang, Xiongwei
    Sun, Meng
    Jia, Chong
    Zou, Xia
    IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2016, E99A (10) : 1867 - 1870