Model-based sequential organization in cochannel speech

Cited by: 50
Authors
Shao, Y [1]
Wang, DL
Affiliations
[1] Ohio State Univ, Dept Comp Sci & Engn, Columbus, OH 43210 USA
[2] Ohio State Univ, Ctr Cognit Sci, Columbus, OH 43210 USA
Keywords
auditory scene analysis; cochannel speech; model-based approach; sequential organization; speaker identification (SID); usable speech
DOI
10.1109/TSA.2005.854106
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
A human listener can follow one speaker's voice while others are speaking simultaneously; in particular, the listener can organize the time-frequency energy of the same speaker across time into a single stream. In this paper, we focus on sequential organization in cochannel speech, that is, mixtures of two voices. We extract minimally corrupted segments, or usable speech, from cochannel speech using a robust multipitch tracking algorithm. The extracted usable speech is shown to capture speaker characteristics and to improve speaker identification (SID) performance across various target-to-interferer ratios. To utilize speaker characteristics for sequential organization, we extend the traditional SID framework to cochannel speech and derive a joint objective for sequential grouping and SID, which leads to a search for the optimal hypothesis. We then propose a hypothesis pruning algorithm based on speaker models to make the search computationally efficient. Evaluation results show that the proposed system approaches the ceiling SID performance obtained with prior pitch information and yields significant improvement over alternative approaches to sequential organization.
Pages: 289-298
Page count: 10
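
The joint search over speaker pairs and segment groupings mentioned in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: it assumes each usable segment already has a log-likelihood under every speaker model, scores a hypothesis as the sum of per-segment log-likelihoods, and prunes by total score per speaker. The function name `best_pair_and_grouping` and the `keep` parameter are hypothetical.

```python
from itertools import combinations

def best_pair_and_grouping(loglik, keep=None):
    """Pick the speaker pair and segment labels with the highest total score.

    loglik[i][s] is the (assumed precomputed) log-likelihood of usable
    segment i under the model of speaker s.
    """
    speakers = list(range(len(loglik[0])))
    if keep is not None:
        # Hypothetical pruning rule: keep only the `keep` speakers whose
        # models score best when summed over all segments.
        totals = [sum(seg[s] for seg in loglik) for s in speakers]
        speakers = sorted(speakers, key=lambda s: totals[s], reverse=True)[:keep]
    best = None
    for a, b in combinations(sorted(speakers), 2):
        # For a fixed pair, each segment goes to the better-scoring speaker.
        labels = [a if seg[a] >= seg[b] else b for seg in loglik]
        score = sum(seg[label] for seg, label in zip(loglik, labels))
        if best is None or score > best[0]:
            best = (score, (a, b), labels)
    return best

# Three segments, three enrolled speakers (made-up scores).
scores = [[-1.0, -5.0, -9.0], [-6.0, -2.0, -8.0], [-1.5, -4.0, -7.0]]
result = best_pair_and_grouping(scores, keep=2)
```

Without pruning, the loop visits every speaker pair, so its cost grows quadratically with the number of enrolled speakers; restricting the candidate set first is what keeps the search small in this sketch.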