Recognition and location of marine animal sounds using two-stream ConvNet with attention

Cited by: 3
Authors
Hu, Shaoxiang [1]
Hou, Rong [2]
Liao, Zhiwu [3]
Chen, Peng [2]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Automat Engn, Chengdu, Peoples R China
[2] Chengdu Res Base Giant Panda Breeding, Sichuan Key Lab Conservat Biol Endangered Wildlife, Chengdu, Peoples R China
[3] Sichuan Normal Univ, Acad Global Governance & Area Studies, Chengdu, Peoples R China
Keywords
voice recognition; location; two-stream ConvNet; YOLO; attention; CMFCC; SOURCE LOCALIZATION;
DOI
10.3389/fmars.2023.1059622
Chinese Library Classification
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
The ocean holds abundant resources and many endangered marine animals. Using sound to identify and locate these animals and to estimate their distribution area plays an important role in studying the diversity of marine animals (Hanny et al., 2013). We design a Two-Stream ConvNet with Attention (TSCA) model, a two-stream model combined with attention in which one branch processes the time-domain signal and the other processes the frequency-domain signal. The model exploits the high time resolution of the time-domain signal and the high recognition rate of frequency-domain sound features to localize and recognize the sounds of marine species rapidly. The basic network architecture of the model is YOLO (You Only Look Once) (Joseph et al., 2016). A new loss function, focal loss, is constructed to strengthen the influence of tail-class samples, overcome data imbalance, and avoid overfitting. In addition, an attention module is constructed to focus on more detailed sound features, which improves the noise resistance of the model and yields high-precision identification and location of marine species. On the Watkins Marine Mammal Sound Database, the algorithm reached a recognition rate of 92.04% and a positioning accuracy of 78.4%. The experimental results show that the algorithm is robust and achieves high recognition and positioning accuracy.
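The abstract does not give the form of its focal loss, so the following is only a minimal sketch of the widely used multi-class focal loss, FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t), to illustrate how such a loss reduces the weight of easy, well-classified samples relative to hard ones. The function names and the default values of `gamma` and `alpha` are assumptions made for illustration; the loss the authors construct may differ.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=2.0, alpha=1.0):
    """Standard focal loss for one sample (illustrative, not the paper's code).

    p_t is the predicted probability of the true class. The factor
    (1 - p_t) ** gamma shrinks the loss of confidently correct samples,
    so hard samples contribute relatively more to the total.
    """
    p_t = softmax(logits)[target]
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


def cross_entropy(logits, target):
    """Plain cross-entropy, which is the focal loss with gamma = 0."""
    return focal_loss(logits, target, gamma=0.0)


# Hypothetical scores for a three-class problem, true class is index 0.
easy_sample = [4.0, 0.0, 0.0]   # true class scored confidently
hard_sample = [0.0, 1.0, 0.0]   # true class scored below a wrong class

for name, logits in (("easy", easy_sample), ("hard", hard_sample)):
    ce = cross_entropy(logits, 0)
    fl = focal_loss(logits, 0)
    print(f"{name}: cross-entropy={ce:.4f}, focal={fl:.4f}, ratio={fl / ce:.4f}")
```

The printed ratio equals (1 - p_t)^gamma, so it is much smaller for the easy sample than for the hard one. In a class-imbalanced data set this kind of down-weighting keeps the many easy samples from dominating the gradient.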
Pages: 10
Related papers
50 in total
  • [1] Human action recognition using two-stream attention based LSTM networks
    Dai, Cheng
    Liu, Xingang
    Lai, Jinfeng
    APPLIED SOFT COMPUTING, 2020, 86
  • [2] RGB-D Human Action Recognition of Deep Feature Enhancement and Fusion Using Two-Stream ConvNet
    Liu, Yun
    Ma, Ruidi
    Li, Hui
    Wang, Chuanxu
    Tao, Ye
    JOURNAL OF SENSORS, 2021, 2021
  • [3] Two-stream Graph Attention Convolutional for Video Action Recognition
    Zhang, Deyuan
    Gao, Hongwei
    Dai, Hailong
    Shi, Xiangbin
    2021 IEEE 15TH INTERNATIONAL CONFERENCE ON BIG DATA SCIENCE AND ENGINEERING (BIGDATASE 2021), 2021 : 23 - 27
  • [4] Two-Stream Adaptive Attention Graph Convolutional Networks for Action Recognition
    Du Q.
    Xiang Z.
    Tian L.
    Yu L.
    Huanan Ligong Daxue Xuebao/Journal of South China University of Technology (Natural Science), 2022, 50 (12) : 20 - 29
  • [5] Two-Stream Attention Network for Pain Recognition from Video Sequences
    Thiam, Patrick
    Kestler, Hans A.
    Schwenker, Friedhelm
    SENSORS, 2020, 20 (03)
  • [6] Two-Stream 3-D convNet Fusion for Action Recognition in Videos With Arbitrary Size and Length
    Wang, Xuanhan
    Gao, Lianli
    Wang, Peng
    Sun, Xiaoshuai
    Liu, Xianglong
    IEEE TRANSACTIONS ON MULTIMEDIA, 2018, 20 (03) : 634 - 644
  • [7] First Person Action Recognition via Two-stream ConvNet with Long-term Fusion Pooling
    Kwon, Heeseung
    Kim, Yeonho
    Lee, Jin S.
    Cho, Minsu
    PATTERN RECOGNITION LETTERS, 2018, 112 : 161 - 167
  • [8] A Two-Stream Context-Aware ConvNet for Pavement Distress Detection
    Louk, Roland
    Tepljakov, Aleksei
    Riid, Andri
    2020 43RD INTERNATIONAL CONFERENCE ON TELECOMMUNICATIONS AND SIGNAL PROCESSING (TSP), 2020 : 270 - 273
  • [9] Gaze Estimation by Attention Using a Two-Stream Regression Network
    Karazor, Ahmet
    Bayar, Alperen Enes
    Topal, Cihan
    Cevikalp, Hakan
    2023 31ST SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE, SIU, 2023
  • [10] Two-stream Global-Guided Attention Network for Facial Expression Recognition
    Wen, Yaoli
    Xu, Xiangmin
    Liu, Fang
    Xing, Xiaofen
    Wang, Lin
    2021 16TH IEEE INTERNATIONAL CONFERENCE ON AUTOMATIC FACE AND GESTURE RECOGNITION (FG 2021), 2021