Speaker Detection in Audio Stream via Probabilistic Prediction Using Generalized GEBI

Cited by: 1

Authors:

Sakata, Koki [1]

Sakashita, Shota [1]

Matsuo, Kazuya [1]

Kurogi, Shuichi [1]

Affiliation:

[1] Kyushu Inst Technol, Kitakyushu, Fukuoka 8048550, Japan

Source:

NEURAL INFORMATION PROCESSING, ICONIP 2016, PT IV | 2016 / Vol. 9950

Keywords:

Probabilistic prediction; Speaker detection; Generalized Gibbs-distribution-based extended Bayesian inference

DOI:

10.1007/978-3-319-46681-1_37

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

This paper presents a method of speaker detection that uses probabilistic prediction to avoid tuning thresholds when detecting a speaker in an audio stream. We introduce g-GEBI (generalized GEBI) as a generalization of BI (Bayesian Inference) and GEBI (Gibbs-distribution-based Extended BI) to execute iterative detection of a speaker in an audio stream uttered by more than one speaker. Then, we show a method of probabilistic prediction in multiclass classification to classify the results of speaker detection. By means of numerical experiments using recorded real speech data, we examine the properties and the effectiveness of the present method. In particular, we show that g-GEBI and g-BI (generalized BI) are more effective than the conventional BI and GEBI in the incremental speaker detection task.

Pages: 302-311

Page count: 10

