Recognition and location of marine animal sounds using two-stream ConvNet with attention

Cited by: 3
Authors
Hu, Shaoxiang [1]
Hou, Rong [2]
Liao, Zhiwu [3]
Chen, Peng [2]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Automat Engn, Chengdu, Peoples R China
[2] Chengdu Res Base Giant Panda Breeding, Sichuan Key Lab Conservat Biol Endangered Wildlife, Chengdu, Peoples R China
[3] Sichuan Normal Univ, Acad Global Governance & Area Studies, Chengdu, Peoples R China
Keywords
voice recognition; location; two-stream ConvNet; YOLO; attention; CMFCC; source localization
DOI
10.3389/fmars.2023.1059622
Chinese Library Classification
X [Environmental science, safety science]
Discipline classification code
08; 0830
Abstract
The ocean holds abundant resources and many endangered marine animals. Using sound to identify and locate these animals and to estimate their distribution areas plays an important role in studying the diversity of marine life (Hanny et al., 2013). We design a Two-Stream ConvNet with Attention (TSCA) model, a two-stream network combined with attention in which one branch processes the time-domain signal and the other processes the frequency-domain signal. The model exploits the high time resolution of the time-domain signal and the high recognition rate of frequency-domain sound features to rapidly localize and recognize the sounds of marine species. The basic network architecture of the model is YOLO (You Only Look Once) (Joseph et al., 2016). A new focal loss function is constructed to strengthen the influence of tail-class samples, mitigate data imbalance, and avoid overfitting. In addition, an attention module is constructed to focus on finer sound features, improving the model's noise resistance and enabling high-precision identification and localization of marine species. On the Watkins Marine Mammal Sound Database, the algorithm reached a recognition rate of 92.04% and a positioning accuracy of 78.4%. The experimental results show that the algorithm is robust and achieves high recognition and positioning accuracy.
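The abstract does not give the formula for its focal loss, so the following is only a minimal sketch of the commonly used focal loss form, FL = -alpha * (1 - p_t)^gamma * log(p_t), to illustrate how such a loss shifts weight toward poorly classified (often tail-class) samples. The function names, the values gamma = 2.0 and alpha = 1.0, and the example probabilities are assumptions for illustration and are not taken from the paper.

```python
import math


def focal_loss(probs, target, gamma=2.0, alpha=1.0):
    """Focal loss for one sample (assumed standard form, not the paper's).

    probs  : predicted class probabilities, summing to 1
    target : index of the true class
    gamma  : focusing parameter; gamma = 0 reduces to cross-entropy
    alpha  : scalar weighting factor
    """
    # Clamp the true-class probability to avoid log(0).
    p_t = min(max(probs[target], 1e-12), 1.0)
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


def cross_entropy(probs, target):
    """Plain cross-entropy, i.e. focal loss with gamma = 0 and alpha = 1."""
    return focal_loss(probs, target, gamma=0.0, alpha=1.0)


# Hypothetical predictions for a three-class problem, true class = 0.
easy = [0.9, 0.05, 0.05]  # confidently correct sample
hard = [0.2, 0.5, 0.3]    # misclassified sample

# Fraction of the cross-entropy loss that survives the focal down-weighting.
easy_ratio = focal_loss(easy, 0) / cross_entropy(easy, 0)
hard_ratio = focal_loss(hard, 0) / cross_entropy(hard, 0)
```

With these assumed values the easy sample keeps roughly 1% of its cross-entropy loss while the hard sample keeps roughly 64%, so training gradients are dominated by the samples the model still gets wrong.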
Pages: 10