Advances in multi-speaker conversational speech recognition and understanding

被引:0
|
作者
Hori, Takaaki [1 ]
Araki, Shoko [1 ]
Nakatani, Tomohiro O. [1 ]
Nakamura, Atsushi [1 ]
机构
[1] Media Tnformanon Laboratory, NTT Coinmnnication Science Laboratories, United States
来源
NTT Technical Review | 2013年 / 11卷 / 12期
关键词
D O I
暂无
中图分类号
学科分类号
摘要
引用
收藏
相关论文
共 50 条
  • [31] Research on ASIC for multi-speaker isolated word recognition
    Xiong, B
    Sun, YH
    [J]. 1996 2ND INTERNATIONAL CONFERENCE ON ASIC, PROCEEDINGS, 1996, : 135 - 137
  • [32] INTEGRATION OF SPEECH SEPARATION, DIARIZATION, AND RECOGNITION FOR MULTI-SPEAKER MEETINGS: SYSTEM DESCRIPTION, COMPARISON, AND ANALYSIS
    Raj, Desh
    Denisov, Pavel
    Chen, Zhuo
    Erdogan, Hakan
    Huang, Zili
    He, Maokui
    Watanabe, Shinji
    Du, Jun
    Yoshioka, Takuya
    Luo, Yi
    Kanda, Naoyuki
    Li, Jinyu
    Wisdom, Scott
    Hershey, John R.
    [J]. 2021 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP (SLT), 2021, : 897 - 904
  • [33] Single-speaker/multi-speaker co-channel speech classification
    Rossignol, Stephane
    Pietquini, Olivier
    [J]. 11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4, 2010, : 2322 - 2325
  • [34] Unsupervised Discovery of Phoneme Boundaries in Multi-Speaker Continuous Speech
    Armstrong, Tom
    Antetomaso, Stephanie
    [J]. 2011 IEEE INTERNATIONAL CONFERENCE ON DEVELOPMENT AND LEARNING (ICDL), 2011,
  • [35] LCMV BEAMFORMING WITH SUBSPACE PROJECTION FOR MULTI-SPEAKER SPEECH ENHANCEMENT
    Hassani, Amin
    Bertrand, Alexander
    Moonen, Marc
    [J]. 2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS, 2016, : 91 - 95
  • [36] Neural Speech Tracking Highlights the Importance of Visual Speech in Multi-speaker Situations
    Haider, Chandra L.
    Park, Hyojin
    Hauswald, Anne
    Weisz, Nathan
    [J]. JOURNAL OF COGNITIVE NEUROSCIENCE, 2024, 36 (01) : 128 - 142
  • [37] Integration of audio-visual information for multi-speaker multimedia speaker recognition
    Yang, Jichen
    Chen, Fangfan
    Cheng, Yu
    Lin, Pei
    [J]. DIGITAL SIGNAL PROCESSING, 2024, 145
  • [38] PHONEME DEPENDENT SPEAKER EMBEDDING AND MODEL FACTORIZATION FOR MULTI-SPEAKER SPEECH SYNTHESIS AND ADAPTATION
    Fu, Ruibo
    Tao, Jianhua
    Wen, Zhengqi
    Zheng, Yibin
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 6930 - 6934
  • [39] Lightweight, Multi-Speaker, Multi-Lingual Indic Text-to-Speech
    Singh, Abhayjeet
    Nagireddi, Amala
    Jayakumar, Anjali
    Deekshitha, G.
    Bandekar, Jesuraja
    Roopa, R.
    Badiger, Sandhya
    Udupa, Sathvik
    Kumar, Saurabh
    Ghosh, Prasanta Kumar
    Murthy, Hema A.
    Zen, Heiga
    Kumar, Pranaw
    Kant, Kamal
    Bole, Amol
    Singh, Bira Chandra
    Tokuda, Keiichi
    Hasegawa-Johnson, Mark
    Olbrich, Philipp
    [J]. IEEE OPEN JOURNAL OF SIGNAL PROCESSING, 2024, 5 : 790 - 798
  • [40] A Multi-channel/Multi-speaker Articulatory Database in Mandarin for Speech Visualization
    Zhang, Dan
    Liu, Xianqian
    Yan, Nan
    Wang, Lan
    Zhu, Yun
    Chen, Hui
    [J]. 2014 9TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2014, : 299 - +