Exploiting known sound source signals to improve ICA-based robot audition in speech separation and recognition

Cited by: 0
Authors
Takeda, Ryu [1]
Nakadai, Kazuhiro [2]
Komatani, Kazunori [1]
Ogata, Tetsuya [1]
Okuno, Hiroshi G. [1]
Affiliations
[1] Kyoto Univ, Grad Sch Informat, Kyoto 6068501, Japan
[2] Honda Res Inst, Wako, Saitama 3510114, Japan
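Keywords
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract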
This paper describes a new semi-blind source separation (semi-BSS) technique based on independent component analysis (ICA) that enhances a target source of interest while suppressing other, known interference sources. Semi-BSS is necessary for double-talk-free robot audition systems, which can exploit known sound source signals such as the robot's own speech, music, or TV sound obtained through a line-in or a ubiquitous network. Unlike conventional semi-BSS with ICA, we use a time-frequency domain convolution model to describe sound reflections, which yields a new mixing process for ICA. In other words, we treat reflected sounds arriving after some delay as distinct from the original, and ICA then separates these reflections as additional interference sources. The model removes the frame-size limitation of frequency-domain ICA, so ICA can separate the known sources even in a highly reverberant environment. Experimental results show that our method outperformed conventional semi-BSS using ICA in both simulated normal and highly reverberant environments.
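The idea of treating delayed reflections of a known source as separate interference terms can be illustrated with a small sketch. It is not the authors' ICA algorithm: it swaps in ordinary least squares for ICA, works on a single simulated frequency bin, and uses hypothetical names (`delayed_frames`, `cancel_known_source`) and made-up filter taps. Under those assumptions, it shows why modelling several delayed frames of the known signal cancels more of it than a single-frame model does.

```python
import numpy as np

def delayed_frames(s, num_taps):
    """Stack s[t], s[t-1], ..., s[t-num_taps+1] as columns (zero-padded at the start)."""
    T = len(s)
    cols = np.zeros((T, num_taps), dtype=complex)
    for m in range(num_taps):
        cols[m:, m] = s[:T - m]
    return cols

def cancel_known_source(x, s, num_taps):
    """Least-squares stand-in for ICA: fit the taps applied to the known
    source s in one frequency bin and subtract the fitted component from x."""
    S = delayed_frames(s, num_taps)
    w, *_ = np.linalg.lstsq(S, x, rcond=None)
    return x - S @ w, w

# Simulated single-bin data (all values are illustrative assumptions).
rng = np.random.default_rng(0)
T, M = 2000, 4
s = rng.standard_normal(T) + 1j * rng.standard_normal(T)   # known source (e.g. line-in)
target = rng.laplace(size=T) + 1j * rng.laplace(size=T)    # unknown target speech
a = np.array([1.0, 0.6j, -0.3, 0.1 + 0.1j])                # direct path + reflections

# Observation: convolution across frames of the known source, plus the target.
x = delayed_frames(s, M) @ a + target

# Multi-frame model (reflections modelled) versus single-frame model.
residual_multi, w = cancel_known_source(x, s, M)
residual_single, _ = cancel_known_source(x, s, 1)

err_multi = np.mean(np.abs(residual_multi - target) ** 2)
err_single = np.mean(np.abs(residual_single - target) ** 2)
print(f"error with {M} taps: {err_multi:.4f}, with 1 tap: {err_single:.4f}")
```

With the single-tap model, the energy of the reflections stays in the residual; with the multi-tap model, the residual is close to the target alone. The abstract's method reaches a comparable effect through ICA rather than least squares.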
Pages: 1763 / +
Page count: 2
Related papers
50 in total
  • [41] Speech Recognition Using Blind Source Separation and Dereverberation Method for Mixed Sound of Speech and Music
    Wang, Longbiao
    Odani, Kyohei
    Kai, Atsuhiko
    Li, Weifeng
    [J]. 2013 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2013,
  • [42] On-line Sound Event Detection and Recognition Based on Adaptive Background Model for Robot Audition
    Li, Xinguo
    Wang, Yi
    Fan, Ting
    Zhang, Dongmang
    Liu, Hong
    [J]. 2013 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND BIOMIMETICS (ROBIO), 2013, : 1089 - 1094
  • [43] Sound Source Separation for Plural Passenger Speech Recognition in Smart Mobility System
    Fukui, Masahiro
    Watanabe, Toshihiko
    Kanazawa, Minato
    [J]. IEEE TRANSACTIONS ON CONSUMER ELECTRONICS, 2018, 64 (03) : 399 - 405
  • [44] Sound Source Separation for Plural Passenger Speech Recognition in Smart Mobility System
    Fukui, Masahiro
    Wakisaka, Youhei
    Watanabe, Toshihiko
    Kanazawa, Minato
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS (ICCE), 2018,
  • [45] A single-chip FPGA design for real-time ICA-based blind source separation algorithm
    Charoensak, C
    Sattar, F
    [J]. 2005 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), VOLS 1-6, CONFERENCE PROCEEDINGS, 2005, : 5822 - 5825
  • [46] Blind Source Separation Based on Convolution Mixture Speech Signals
    Yan, Li
    Zhen, Yang
    [J]. 2010 6TH INTERNATIONAL CONFERENCE ON WIRELESS COMMUNICATIONS NETWORKING AND MOBILE COMPUTING (WICOM), 2010,
  • [47] Controlling Robot using Thai Speech Recognition Based on Eigen Sound
    Phanprasit, Tanasak
    [J]. 2014 6TH INTERNATIONAL CONFERENCE ON KNOWLEDGE AND SMART TECHNOLOGY (KST), 2014, : 57 - 62
  • [48] Design of low-cost FPGA hardware for real-time ICA-based blind source separation algorithm
    Charoensak, C
    Sattar, F
    [J]. EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING, 2005, 2005 (18) : 3076 - 3086
  • [49] Design of Low-Cost FPGA Hardware for Real-time ICA-Based Blind Source Separation Algorithm
    Charayaphan Charoensak
    Farook Sattar
    [J]. EURASIP Journal on Advances in Signal Processing, 2005
  • [50] OPTIMUM BLOCK ADAPTIVE ICA FOR SEPARATION OF REAL AND COMPLEX SIGNALS WITH KNOWN SOURCE DISTRIBUTIONS IN DYNAMIC FLAT FADING ENVIRONMENTS
    Ranganathan, Raghuram
    Yang, Thomas
    Mikhael, Wasfy B.
    [J]. JOURNAL OF CIRCUITS SYSTEMS AND COMPUTERS, 2010, 19 (02) : 367 - 379