Speaker Detection in Audio Stream via Probabilistic Prediction Using Generalized GEBI

Cited by: 1
Authors
Sakata, Koki [1]
Sakashita, Shota [1]
Matsuo, Kazuya [1]
Kurogi, Shuichi [1]
Affiliation
[1] Kyushu Inst Technol, Kitakyushu, Fukuoka 8048550, Japan
Keywords
Probabilistic prediction; Speaker detection; Generalized Gibbs-distribution-based extended Bayesian inference
DOI
10.1007/978-3-319-46681-1_37
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents a method of speaker detection that uses probabilistic prediction to avoid tuning thresholds when detecting a speaker in an audio stream. We introduce g-GEBI (generalized GEBI) as a generalization of BI (Bayesian Inference) and GEBI (Gibbs-distribution-based Extended BI) to execute iterative detection of a speaker in an audio stream uttered by more than one speaker. Then, we show a method of probabilistic prediction in multiclass classification to classify the results of speaker detection. By means of numerical experiments using recorded real speech data, we examine the properties and the effectiveness of the present method. In particular, we show that g-GEBI and g-BI (generalized BI) are more effective than the conventional BI and GEBI in the incremental speaker detection task.
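The abstract names conventional Bayesian inference (BI) as the baseline that g-GEBI generalizes. As a rough illustration of that baseline only, the following minimal sketch folds per-segment likelihoods into a posterior over candidate speakers and picks the most probable one without any threshold. It does not implement GEBI or g-GEBI, whose equations are not given in this record; the function names, the uniform prior, and the likelihood values are all assumptions made for the example.

```python
def bayes_update(prior, likelihood):
    """One Bayesian inference step: posterior is proportional to prior * likelihood."""
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnormalized)
    if total == 0:
        raise ValueError("all likelihoods are zero; posterior is undefined")
    return [u / total for u in unnormalized]


def detect_speaker(segment_likelihoods, n_speakers):
    """Iterate the update over audio segments, starting from a uniform prior (assumed)."""
    posterior = [1.0 / n_speakers] * n_speakers
    for likelihood in segment_likelihoods:
        posterior = bayes_update(posterior, likelihood)
    # Choose the speaker with the highest posterior; no threshold is involved.
    best = max(range(n_speakers), key=lambda i: posterior[i])
    return best, posterior


# Hypothetical likelihoods p(segment | speaker) for three speakers and three segments.
segments = [
    [0.5, 0.3, 0.2],
    [0.6, 0.2, 0.2],
    [0.4, 0.4, 0.2],
]
speaker, posterior = detect_speaker(segments, n_speakers=3)
```

Here the posterior concentrates on the first speaker as segments accumulate, which is the incremental behavior the abstract refers to; how g-GEBI modifies this update is left to the paper itself.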
Pages: 302-311
Page count: 10
Related papers
50 in total
  • [41] Using multi-stream hierarchical deep neural network to extract deep audio feature for acoustic event detection
    Yanxiong Li
    Xue Zhang
    Hai Jin
    Xianku Li
    Qin Wang
    Qianhua He
    Qian Huang
    Multimedia Tools and Applications, 2018, 77 : 897 - 916
  • [42] Horizontally scalable probabilistic generalized suffix tree (PGST) based route prediction using map data and GPS traces
    Tiwari V.S.
    Arya A.
    Tiwari, Vishnu Shankar (vishnustiwari@gmail.com), 1600, SpringerOpen (04)
  • [43] Process prediction and detection of faults using probabilistic bidirectional recurrent neural networks on real plant data
    Yerimah L.E.
    Ghosh S.
    Wang Y.
    Cao Y.
    Flores-Cerrillo J.
    Bequette B.W.
    Journal of Advanced Manufacturing and Processing, 2022, 4 (04)
  • [44] Speaker verification from mixture of speech and non-speech audio signals via using pole distribution of piecewise linear predictive coding coefficients
    Tagomori T.
    Tsuruda R.
    Matsuo K.
    Kurogi S.
    Journal of Ambient Intelligence and Humanized Computing, 2023, 14 (12) : 15585 - 15595
  • [45] Prediction of Sleep Stages Via Deep Learning Using Smartphone Audio Recordings in Home Environments: Model Development and Validation
    Tran, Hai Hong
    Hong, Jung Kyung
    Jang, Hyeryung
    Jung, Jinhwan
    Kim, Jongmok
    Hong, Joonki
    Lee, Minji
    Kim, Jeong-Whun
    Kushida, Clete A.
    Lee, Dongheon
    Kim, Daewoo
    Yoon, In-Young
    JOURNAL OF MEDICAL INTERNET RESEARCH, 2023, 25
  • [46] Optimized technique for speaker changes detection in multispeaker audio recording using pyknogram and efficient distance metric (vol 19, e0314073, 2024)
    Kaur, S.
    Prabha, C.
    Singh, R. P.
    Gupta, D.
    Juneja, S.
    Gupta, P.
    PLOS ONE, 2025, 20 (03):
  • [47] Robust environmental change detection using PTZ camera via spatial-temporal probabilistic modeling
    Hu, Jwu-Sheng
    Su, Tzung-Min
    IEEE-ASME TRANSACTIONS ON MECHATRONICS, 2007, 12 (03) : 339 - 344
  • [48] Robust environmental change detection using PTZ camera via spatial-temporal probabilistic modeling
    Hu, JS
    Su, TM
    2005 IEEE INTERNATIONAL CONFERENCE ON MECHATRONICS, 2005, : 50 - 55
  • [49] Automatic detection and prediction of COVID-19 in cough audio signals using coronavirus herd immunity optimizer algorithm
    Ayappan, G.
    Anila, S.
    SCIENTIFIC REPORTS, 2025, 15 (01):
  • [50] A Generalized Model for Wind Turbine Faulty Condition Detection Using Combination Prediction Approach and Information Entropy
    Chen, J. S.
    Chen, W. G.
    Li, J.
    Sun, P.
    JOURNAL OF ENVIRONMENTAL INFORMATICS, 2018, 32 (01) : 14 - 24