Ontology-aware Learning and Evaluation for Audio Tagging

Authors
Liu, Haohe [1]
Kong, Qiuqiang [2]
Liu, Xubo [1]
Mei, Xinhao [1]
Wang, Wenwu [1]
Plumbley, Mark D. [1]
Affiliations
[1] Univ Surrey, CVSSP, Guildford, Surrey, England
[2] ByteDance, Speech Audio & Mus Intelligence SAMI Grp, Beijing, Peoples R China
Source
Keywords
machine learning; audio tagging; ontology; evaluation metric; classification
DOI
10.21437/Interspeech.2023-979
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
This study defines a new evaluation metric for audio tagging tasks to alleviate the limitations of the mean average precision (mAP) metric. The mAP metric treats different kinds of sound as independent classes without considering their relations. The proposed metric, ontology-aware mean average precision (OmAP), addresses the weaknesses of mAP by utilizing additional ontology information during evaluation. Specifically, we reweight the false positive events in the model prediction based on their AudioSet ontology graph distance to the target classes. OmAP also provides insights into model performance by evaluating at different coarse-grained levels of the ontology graph. We conduct a human assessment and show that OmAP is more consistent with human perception than mAP. We also propose an ontology-based loss function (OBCE) that reweights the binary cross entropy (BCE) loss based on the ontology distance. Our experiments show that OBCE can improve both the mAP and OmAP metrics on the AudioSet tagging task.
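The abstract does not give the exact weighting formulas, so the following is only a minimal sketch of the general idea of reweighting by ontology distance. Everything in it is an assumption made for illustration: the toy ontology and its labels (not the real AudioSet graph), the helper names `graph_distance`, `negative_weights`, and `weighted_bce`, and the choice to weight each non-target class by its normalized distance to the nearest target class. It should not be read as the authors' OmAP or OBCE definition.

```python
import math
from collections import deque

# Toy ontology (hypothetical labels, NOT the real AudioSet graph),
# stored as an undirected adjacency list.
ONTOLOGY = {
    "Sound": ["Music", "Animal"],
    "Music": ["Sound", "Guitar", "Piano"],
    "Animal": ["Sound", "Dog", "Cat"],
    "Guitar": ["Music"],
    "Piano": ["Music"],
    "Dog": ["Animal"],
    "Cat": ["Animal"],
}


def graph_distance(graph, start, goal):
    """Number of edges on the shortest path between two nodes (BFS)."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph[node]:
            if neighbor == goal:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    raise ValueError(f"{goal!r} is not reachable from {start!r}")


def negative_weights(graph, classes, targets):
    """Assumed scheme: weight 1.0 for target classes; for every other class,
    its distance to the nearest target divided by the largest such distance."""
    nearest = {
        c: min(graph_distance(graph, c, t) for t in targets)
        for c in classes
    }
    longest = max(nearest.values()) or 1  # avoid dividing by zero
    return {c: 1.0 if c in targets else nearest[c] / longest for c in classes}


def weighted_bce(probs, targets, weights, eps=1e-7):
    """Mean binary cross entropy with a per-class weight on each term."""
    total = 0.0
    for c, p in probs.items():
        p = min(max(p, eps), 1.0 - eps)  # clamp to keep log() finite
        y = 1.0 if c in targets else 0.0
        total -= weights[c] * (y * math.log(p) + (1.0 - y) * math.log(1.0 - p))
    return total / len(probs)


CLASSES = ["Guitar", "Piano", "Dog", "Cat"]
TARGETS = {"Guitar"}
WEIGHTS = negative_weights(ONTOLOGY, CLASSES, TARGETS)

# Same confidence on the wrong class, but a sibling class vs. a distant one.
near_miss = {"Guitar": 0.9, "Piano": 0.8, "Dog": 0.1, "Cat": 0.1}
far_miss = {"Guitar": 0.9, "Piano": 0.1, "Dog": 0.8, "Cat": 0.1}
loss_near = weighted_bce(near_miss, TARGETS, WEIGHTS)
loss_far = weighted_bce(far_miss, TARGETS, WEIGHTS)
```

Under this assumed scheme, a false positive on "Piano" (two edges from the target "Guitar") is penalized less than an equally confident false positive on "Dog" (four edges away), so `loss_far` exceeds `loss_near`. An unweighted BCE would score the two mistakes identically.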
Pages: 3799 - 3803
Page count: 5
Related papers
50 in total
  • [41] Exploiting Tagging in Ontology-based e-Learning
    Capuano, Nicola
    Gaeta, Angelo
    Orciuoli, Francesco
    Paolozzi, Stefano
    ONTOLOGY FOR E-TECHNOLOGIES, PROCEEDINGS, 2009, : 3 - +
  • [42] Audiovisual transfer learning for audio tagging and sound event detection
    Boes, Wim
    Van Hamme, Hugo
    INTERSPEECH 2021, 2021, : 2401 - 2405
  • [43] An approach for combining ontology learning and semantic tagging in the ontology development process: eGovernment use case
    Stojanovic, Ljiljana
    Stojanovic, Nenad
    Ma, Jun
    WEB INFORMATION SYSTEMS ENGINEERING - WISE 2007, PROCEEDINGS, 2007, 4831 : 249 - 260
  • [44] Streaming Audio Transformers for Online Audio Tagging
    Dinkel, Heinrich
    Yan, Zhiyong
    Wang, Yongqing
    Zhang, Junbo
    Wang, Yujun
    Wang, Bin
    INTERSPEECH 2024, 2024, : 1145 - 1149
  • [45] Enhancing Personal Learning Environments by Context-Aware Tagging
    Cao, Yiwei
    Kovachev, Dejan
    Klamma, Ralf
    Lau, Rynson W. H.
    ADVANCES IN WEB-BASED LEARNING-ICWL 2010, 2010, 6483 : 11 - +
  • [46] Unsupervised Feature Learning Based on Deep Models for Environmental Audio Tagging
    Xu, Yong
    Huang, Qiang
    Wang, Wenwu
    Foster, Peter
    Sigtia, Siddharth
    Jackson, Philip J. B.
    Plumbley, Mark D.
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2017, 25 (06) : 1230 - 1241
  • [47] Audio Embedding-Aware Dialogue Policy Learning
    Lopez Zorrilla, Asier
    Ines Torres, Maria
    Cuayahuitl, Heriberto
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2023, 31 : 525 - 538
  • [48] Ontology mapping through tagging
    Conroy, Colm
    O'Sullivan, Declan
    Lewis, David
    CISIS 2008: THE SECOND INTERNATIONAL CONFERENCE ON COMPLEX, INTELLIGENT AND SOFTWARE INTENSIVE SYSTEMS, PROCEEDINGS, 2008, : 886 - 891
  • [49] Deep Learning for Audio Event Detection and Tagging on Low-Resource Datasets
    Morfi, Veronica
    Stowell, Dan
    APPLIED SCIENCES-BASEL, 2018, 8 (08):
  • [50] Ontology based Context Aware e-Learning System
    Guermah, Hatim
    Fissaa, Tarik
    Hafiddi, Hatim
    Nassar, Mahmoud
    Kriouile, Abdelaziz
    2013 3RD INTERNATIONAL SYMPOSIUM ISKO-MAGHREB, 2013,