Recognition and location of marine animal sounds using two-stream ConvNet with attention

Cited by: 3
Authors
Hu, Shaoxiang [1]
Hou, Rong [2]
Liao, Zhiwu [3]
Chen, Peng [2]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Automat Engn, Chengdu, Peoples R China
[2] Chengdu Res Base Giant Panda Breeding, Sichuan Key Lab Conservat Biol Endangered Wildlife, Chengdu, Peoples R China
[3] Sichuan Normal Univ, Acad Global Governance & Area Studies, Chengdu, Peoples R China
Keywords
voice recognition; location; two-stream ConvNet; YOLO; attention; CMFCC; SOURCE LOCALIZATION;
DOI
10.3389/fmars.2023.1059622
Chinese Library Classification number
X [Environmental science, safety science];
Discipline classification code
08 ; 0830 ;
Abstract
The ocean holds abundant resources and many endangered marine animals. Using sound to identify and locate these animals effectively, and to estimate their distribution areas, plays a very important role in studying the complex diversity of marine animals (Hanny et al., 2013). We design a Two-Stream ConvNet with Attention (TSCA) model, a two-stream model combined with attention in which one branch processes the time-domain signal and the other processes the frequency-domain signal. It exploits the high temporal resolution of the time-domain signal and the high recognition rate of frequency-domain sound features to achieve rapid localization and recognition of the sounds of marine species. The basic network architecture of the model is YOLO (You Only Look Once) (Joseph et al., 2016). A new loss function, focal loss, is constructed to strengthen the influence of the tail classes of the samples, overcome data imbalance, and avoid overfitting. In addition, an attention module is constructed to focus on more detailed sound features, improving the noise resistance of the model and achieving high-precision identification and location of marine species. On the Watkins Marine Mammal Sound Database, the recognition rate of the algorithm reached 92.04% and the positioning accuracy reached 78.4%. The experimental results show that the algorithm has good robustness and high recognition and positioning accuracy.
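The abstract does not give the form of its focal loss, so the following is only a minimal sketch of the commonly used multi-class focal loss, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t), to illustrate how such a loss down-weights well-classified samples so that tail classes contribute more. The function name `focal_loss`, the defaults `gamma=2.0` and `alpha=None`, and the NumPy implementation are assumptions made for illustration, not the authors' code.

```python
import numpy as np


def focal_loss(probs, labels, gamma=2.0, alpha=None, eps=1e-12):
    """Mean multi-class focal loss (hypothetical helper, standard formulation).

    probs  : (N, C) array of predicted class probabilities (rows sum to 1).
    labels : (N,) array of integer class indices.
    gamma  : focusing parameter; gamma = 0 reduces to cross-entropy.
    alpha  : optional (C,) array of per-class weights, e.g. larger for rare classes.
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=int)

    # Probability assigned to the true class of each sample.
    p_t = probs[np.arange(len(labels)), labels]
    p_t = np.clip(p_t, eps, 1.0)  # guard against log(0)

    # Modulating factor: close to 0 for easy samples, close to 1 for hard ones.
    weights = (1.0 - p_t) ** gamma
    if alpha is not None:
        weights = weights * np.asarray(alpha, dtype=float)[labels]

    return float(np.mean(-weights * np.log(p_t)))


if __name__ == "__main__":
    probs = np.array([[0.9, 0.1], [0.2, 0.8]])
    labels = np.array([0, 1])
    print("cross-entropy (gamma=0):", focal_loss(probs, labels, gamma=0.0))
    print("focal loss (gamma=2):  ", focal_loss(probs, labels, gamma=2.0))
```

With `gamma=0` and no `alpha`, the function reduces to ordinary cross-entropy; raising `gamma` shrinks the contribution of samples that are already classified confidently, which is the mechanism usually credited with easing class imbalance.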
Pages: 10
Related papers
50 in total
  • [31] Workflow recognition with structured two-stream convolutional networks
    Hu, Haiyang
    Cheng, Kaiming
    Li, Zhongjin
    Chen, Jie
    Hu, Hua
    PATTERN RECOGNITION LETTERS, 2020, 130 : 267 - 274
  • [32] Automated Video Behavior Recognition of Pigs Using Two-Stream Convolutional Networks
    Zhang, Kaifeng
    Li, Dan
    Huang, Jiayun
    Chen, Yifei
    SENSORS, 2020, 20 (04)
  • [33] Automatic Modulation Recognition of Underwater Acoustic Signals Using a Two-Stream Transformer
    Li, Juan
    Jia, Qingning
    Cui, Xuerong
    Gulliver, T. Aaron
    Jiang, Bin
    Li, Shibao
    Yang, Jungang
    IEEE INTERNET OF THINGS JOURNAL, 2024, 11 (10) : 18839 - 18851
  • [34] A heterogeneous two-stream network for human action recognition
    Liao, Shengbin
    Wang, Xiaofeng
    Yang, ZongKai
    AI COMMUNICATIONS, 2023, 36 (03) : 219 - 233
  • [35] GaitSTR: Gait Recognition With Sequential Two-Stream Refinement
    Zheng, Wanrong
    Zhu, Haidong
    Zheng, Zhaoheng
    Nevatia, Ram
    IEEE TRANSACTIONS ON BIOMETRICS, BEHAVIOR, AND IDENTITY SCIENCE, 2024, 6 (04) : 528 - 538
  • [36] Two-Stream Dictionary Learning Architecture for Action Recognition
    Xu, Ke
    Jiang, Xinghao
    Sun, Tanfeng
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2017, 27 (03) : 567 - 576
  • [37] Two-Stream Gated Fusion ConvNets for Action Recognition
    Zhu, Jiagang
    Zou, Wei
    Zhu, Zheng
    2018 24TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2018 : 597 - 602
  • [38] Two-Stream Convolutional Networks for Action Recognition in Videos
    Simonyan, Karen
    Zisserman, Andrew
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 27 (NIPS 2014), 2014, 27
  • [39] A Spatiotemporal Heterogeneous Two-Stream Network for Action Recognition
    Chen, Enqing
    Bai, Xue
    Gao, Lei
    Tinega, Haron Chweya
    Ding, Yingqiang
    IEEE ACCESS, 2019, 7 : 57267 - 57275
  • [40] Skeleton action recognition using Two-Stream Adaptive Graph Convolutional Networks
    Lee, James
    Kang, Suk-ju
    2021 36TH INTERNATIONAL TECHNICAL CONFERENCE ON CIRCUITS/SYSTEMS, COMPUTERS AND COMMUNICATIONS (ITC-CSCC), 2021