Speaker indexing using neural network clustering of vowel spectra

Cited by: 3
Author
Roy D.K. [1]
Affiliation
[1] MIT Media Lab., Cambridge, MA 02139
Keywords
Audio retrieval; Audio skimming; Speaker indexing
DOI
10.1007/BF02277195
Abstract
Speaker indexing refers to the process of separating speakers within a recording and assigning indices to each unique speaker. This paper describes a new speaker indexing algorithm which dynamically generates and trains a neural network to model each postulated speaker found within a recording. Each neural network is trained to differentiate the vowel spectra of one specific speaker from all other speakers. A method for combining speaker indexing and other annotations of a recording in a general framework is also presented. The speaker indexing system is currently being incorporated into several application systems in the Speech Group at the MIT Media Lab. © 1997 Kluwer Academic Publishers.
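The abstract's central idea, one network per postulated speaker trained to separate that speaker's vowel spectra from everyone else's, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: a single sigmoid unit stands in for the neural network, the "vowel spectra" are synthetic 8-dimensional vectors, the initial speaker labels are assumed to come from some earlier segmentation step, and the relabel-and-retrain loop and all function names (`train_one_vs_rest`, `index_speakers`, `demo`) are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_one_vs_rest(X, y, epochs=300, lr=0.5):
    """Fit one sigmoid unit: y is 1 for the target speaker, 0 for all others."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = sigmoid(X @ w + b)
        grad = p - y                      # gradient of cross-entropy w.r.t. the logit
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b


def index_speakers(X, initial_labels, rounds=3):
    """Train one model per postulated speaker, then relabel each frame
    with the speaker whose model responds most strongly (assumed scheme)."""
    labels = np.asarray(initial_labels).copy()
    for _ in range(rounds):
        speakers = np.unique(labels)
        models = [train_one_vs_rest(X, (labels == s).astype(float)) for s in speakers]
        scores = np.stack([sigmoid(X @ w + b) for w, b in models], axis=1)
        labels = speakers[scores.argmax(axis=1)]
    return labels


def demo(seed=0):
    """Two synthetic speakers with distinct mean spectra and noisy initial labels."""
    rng = np.random.default_rng(seed)
    mean_a = np.linspace(0.0, 1.0, 8)     # hypothetical spectrum of speaker 0
    mean_b = np.linspace(1.0, 0.0, 8)     # hypothetical spectrum of speaker 1
    X = np.vstack([
        mean_a + 0.1 * rng.standard_normal((40, 8)),
        mean_b + 0.1 * rng.standard_normal((40, 8)),
    ])
    truth = np.repeat([0, 1], 40)
    flip = rng.random(80) < 0.1           # corrupt roughly 10% of the initial labels
    noisy = np.where(flip, 1 - truth, truth)
    found = index_speakers(X, noisy)
    return float((found == truth).mean())


if __name__ == "__main__":
    print(f"frame agreement with ground truth: {demo():.2f}")
```

The one-vs-rest framing mirrors the abstract's description of each network discriminating one speaker from all others; how the paper postulates speakers and decides when to create a new network is not stated in the abstract and is not modeled here.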
Pages: 143-149
Number of pages: 6