Graph-Based Audio Classification Using Pre-Trained Models and Graph Neural Networks

Cited by: 2
Authors
Castro-Ospina, Andres Eduardo [1]
Solarte-Sanchez, Miguel Angel [1]
Vega-Escobar, Laura Stella [1]
Isaza, Claudia [2]
Martinez-Vargas, Juan David [3]
Affiliations
[1] Inst Tecnol Metropolitano, Grp Invest Maquinas Inteligentes & Reconocimiento, Medellin 050013, Colombia
[2] Univ Antioquia UdeA, Elect Engn Dept, SISTEMIC, Medellin 050010, Colombia
[3] Univ EAFIT, GIDITIC, Medellin 050022, Colombia
Keywords
ecoacoustics; environmental sound classification; graph neural networks; graph representation learning; node classification; pre-trained models; MUSIC
DOI
10.3390/s24072106
Chinese Library Classification number
O65 [Analytical Chemistry]
Discipline classification codes
070302; 081704
Abstract
Sound classification plays a crucial role in enhancing the interpretation, analysis, and use of acoustic data, leading to a wide range of practical applications, of which environmental sound analysis is one of the most important. In this paper, we explore the representation of audio data as graphs in the context of sound classification. We propose a methodology that leverages pre-trained audio models to extract deep features from audio files, which are then employed as node information to build graphs. Subsequently, we train various graph neural networks (GNNs), specifically graph convolutional networks (GCNs), GraphSAGE, and graph attention networks (GATs), to solve multi-class audio classification problems. Our findings underscore the effectiveness of employing graphs to represent audio data. Moreover, they highlight the competitive performance of GNNs in sound classification endeavors, with the GAT model emerging as the top performer, achieving a mean accuracy of 83% in classifying environmental sounds and 91% in identifying the land cover of a site based on its audio recording. In conclusion, this study provides novel insights into the potential of graph representation learning techniques for analyzing audio data.
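The pipeline in the abstract (deep features as node information, a graph built over them, then a GNN) can be illustrated with a minimal sketch. The abstract does not say how edges are constructed, so the cosine-similarity k-nearest-neighbour rule below is an assumption, as are the helper names `cosine`, `knn_graph`, and `gcn_propagate` and the toy two-dimensional vectors standing in for pre-trained audio embeddings. The propagation step is the standard symmetric-normalized GCN aggregation with the learned weights and nonlinearity left out; it is not the authors' implementation.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def knn_graph(features, k):
    """Symmetric k-NN graph as a list of neighbour sets (assumed edge rule)."""
    n = len(features)
    neighbours = [set() for _ in range(n)]
    for i in range(n):
        ranked = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: cosine(features[i], features[j]),
            reverse=True,
        )
        for j in ranked[:k]:
            neighbours[i].add(j)
            neighbours[j].add(i)  # keep the graph undirected
    return neighbours

def gcn_propagate(features, neighbours):
    """One GCN-style step: D^-1/2 (A + I) D^-1/2 X, without learned weights."""
    n, dim = len(features), len(features[0])
    deg = [len(neighbours[i]) + 1 for i in range(n)]  # +1 for the self-loop
    out = []
    for i in range(n):
        row = [0.0] * dim
        for j in neighbours[i] | {i}:
            w = 1.0 / math.sqrt(deg[i] * deg[j])
            for d in range(dim):
                row[d] += w * features[j][d]
        out.append(row)
    return out

# Hypothetical stand-ins for embeddings of four audio clips (two sound classes).
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
graph = knn_graph(embeddings, k=1)
smoothed = gcn_propagate(embeddings, graph)
```

In this toy case each clip is linked only to its most similar neighbour, so the propagation step averages features within each pair. A trained GCN, GraphSAGE, or GAT layer would additionally apply learned weights (and, for GAT, learned attention over neighbours) before node classification.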
Pages: 12