Generalized Sound Recognition in Reverberant Environments

Cited by: 12
Authors
Ntalampiras, Stavros [1 ]
Affiliations
[1] Univ Milan, Dept Comp Sci, Via Celoria 18, I-20133 Milan, Italy
Source
Keywords
DIRECTED ACYCLIC GRAPHS; ACOUSTIC SCENES; CLASSIFICATION; RETRIEVAL
DOI
10.17743/jaes.2019.0030
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Computational Auditory Scene Analysis (CASA) is typically achieved by statistical models trained offline on available data. Their performance relies heavily on the assumption that both the process generating the data and the recording conditions are stationary over time. The focus of CASA is now moving from structured, well-defined scenarios to unrestricted scenes with realistic characteristics, where the stationarity assumption may not hold. There is therefore a high demand for methodologies and tools that address the problems tightly coupled with such non-stationary conditions, such as changes in the recording conditions and reverberant effects. This paper formulates these obstacles under the concept drift framework and explores the two fundamental adaptation approaches: active and passive. The overall aim is to learn online the statistical properties of the evolving data distribution and incorporate them into the recognition mechanism to boost its performance. The proposed CASA system encompasses (a) a concept drift detector and (b) an online adaptation module. The proposed framework was evaluated in the auditory analysis of three environments (office, meeting room, and lecture hall) with diverse characteristics (dimensions, reverberation times, etc.). The corpus is based on a combination of professional sound event collections. We report encouraging experimental results in terms of classification rate, false positive rate, false negative rate, and detection delay.
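The distinction between active and passive adaptation can be illustrated with a small sketch. The abstract does not say which drift detector or adaptation rule the paper uses, so everything below is an assumption made for illustration: a Page-Hinkley test stands in for the drift detector, a scalar synthetic feature stream stands in for audio features, and the names `PageHinkley`, `run_active`, `run_passive`, and `make_stream` are hypothetical.

```python
import random


class PageHinkley:
    """Page-Hinkley test for an upward shift in the mean of a scalar stream.

    Assumption: used here only as a generic stand-in for a drift detector.
    """

    def __init__(self, delta=0.05, threshold=5.0):
        self.delta = delta          # tolerated magnitude of change
        self.threshold = threshold  # alarm threshold on the test statistic
        self.reset()

    def reset(self):
        self.n = 0
        self.mean = 0.0
        self.cum = 0.0
        self.min_cum = 0.0

    def update(self, x):
        """Consume one sample; return True if drift is flagged."""
        self.n += 1
        self.mean += (x - self.mean) / self.n   # running mean
        self.cum += x - self.mean - self.delta  # cumulative deviation
        self.min_cum = min(self.min_cum, self.cum)
        return self.cum - self.min_cum > self.threshold


def run_active(stream, detector):
    """Active approach: adapt (here, reset the detector state) only on an alarm."""
    alarms = []
    for t, x in enumerate(stream):
        if detector.update(x):
            alarms.append(t)
            detector.reset()  # start learning the new concept from scratch
    return alarms


def run_passive(stream, alpha=0.1):
    """Passive approach: update continuously with exponential forgetting."""
    estimate = stream[0]
    for x in stream[1:]:
        estimate = (1 - alpha) * estimate + alpha * x
    return estimate


def make_stream(seed=0, n_before=200, n_after=200, shift=2.0, sigma=0.1):
    """Synthetic feature stream whose mean jumps by `shift` at index n_before."""
    rng = random.Random(seed)
    before = [rng.gauss(0.0, sigma) for _ in range(n_before)]
    after = [rng.gauss(shift, sigma) for _ in range(n_after)]
    return before + after


if __name__ == "__main__":
    stream = make_stream()
    alarms = run_active(stream, PageHinkley())
    print("alarm indices:", alarms)
    if alarms:
        print("detection delay (samples):", alarms[0] - 200)
    print("passive estimate of current mean:", run_passive(stream))
```

In this toy setting, the detection delay is the number of samples between the true change point (index 200) and the first alarm, which mirrors one of the evaluation measures listed in the abstract. The passive variant never raises an alarm; it simply tracks the distribution as it evolves.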
Pages: 772-781
Number of pages: 10