Estimating Ensemble Location and Width in Binaural Recordings of Music with Convolutional Neural Networks

Cited by: 0
Authors
Antoniuk, Pawel [1]
Zielinski, Slawomir K. [1]
Affiliations
[1] Bialystok Tech Univ, Fac Comp Sci, Bialystok, Poland
Keywords
ensemble width; ensemble location; binaural; spatial audio; localization; convolutional neural network; head-related transfer function; angle of arrival; SPATIAL AUDIO; SOUND SOURCE; ROBUST LOCALIZATION; HEAD MOVEMENTS; MODEL; SPEAKERS; DATABASE; FRONT
DOI
10.24425/aoa.2025.153648
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Binaural audio technology has been in existence for many years. However, its popularity has significantly increased over the past decade as a consequence of advancements in virtual reality and streaming techniques. Along with its growing popularity, the quantity of publicly accessible binaural audio recordings has also expanded. Consequently, there is now a need for automated and objective retrieval of spatial content information, with ensemble location and width being the most prominent. This study presents a novel method for estimating these ensemble parameters in binaural recordings of music. For this purpose, a dataset of 23 040 binaural recordings was synthesized from 192 publicly available music recordings using 30 head-related transfer functions. The synthesized excerpts were then used to train a multi-task spectrogram-based convolutional neural network model, aiming to estimate the ensemble location and width for unseen recordings. The results indicate that a model for estimating ensemble parameters can be successfully constructed with low prediction errors: 4.76° (±0.10°) for ensemble location and 8.57° (±0.19°) for ensemble width. The method developed in this study outperforms previous spatiogram-based techniques recently published in the literature and shows promise for future development as part of a novel tool for the analysis of binaural audio recordings.
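The synthesis step mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it assumes each source is a mono signal rendered by convolving it with a left/right head-related impulse response pair, and it assumes the ensemble location is the midpoint of the outermost source azimuths and the ensemble width is the angle between them. The function names `render_binaural` and `ensemble_location_and_width`, and the toy impulse responses in the usage lines, are hypothetical.

```python
import numpy as np


def render_binaural(sources, hrirs):
    """Sum mono sources convolved with their left/right impulse responses.

    sources: list of 1-D arrays (mono signals, all the same length)
    hrirs:   list of (left_ir, right_ir) pairs, one per source,
             all impulse responses the same length (assumption)
    Returns an array of shape (2, n_samples + ir_len - 1).
    """
    n_out = len(sources[0]) + len(hrirs[0][0]) - 1
    out = np.zeros((2, n_out))
    for signal, (left_ir, right_ir) in zip(sources, hrirs):
        out[0] += np.convolve(signal, left_ir)   # left-ear channel
        out[1] += np.convolve(signal, right_ir)  # right-ear channel
    return out


def ensemble_location_and_width(azimuths_deg):
    """Assumed definitions: location = midpoint of the outermost
    source azimuths, width = angular span between them (degrees)."""
    lo, hi = min(azimuths_deg), max(azimuths_deg)
    return (lo + hi) / 2.0, hi - lo


# Toy usage: one unit impulse, with the right ear delayed and attenuated.
source = np.array([1.0, 0.0, 0.0, 0.0])
toy_hrir = (np.array([1.0, 0.0, 0.0]), np.array([0.0, 0.0, 0.5]))
binaural = render_binaural([source], [toy_hrir])
location, width = ensemble_location_and_width([-10.0, 0.0, 30.0])
```

In this sketch the location and width pair would serve as the two regression targets of a multi-task model, while the two-channel output would be converted to spectrograms before training.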
Pages: 81-93
Number of pages: 13