Classification of animal sounds in a hyperdiverse rainforest using convolutional neural networks with data augmentation

Authors
Sun, Yuren [1]
Maeda, Tatiana Midori [2,3]
Solís-Lemus, Claudia [4,5]
Pimentel-Alarcón, Daniel [4,6]
Buřivalová, Zuzana [2,3]
Affiliations
[1] Univ Wisconsin, Dept Comp Sci, Madison, WI USA
[2] Univ Wisconsin, Nelson Inst Environm Studies, Madison, WI 53706 USA
[3] Univ Wisconsin, Dept Forest & Wildlife Ecol, Madison, WI 53706 USA
[4] Univ Wisconsin, Wisconsin Inst Discovery, Madison, WI USA
[5] Univ Wisconsin, Dept Plant Pathol, Madison, WI USA
[6] Univ Wisconsin, Dept Biostat & Med Informat, Madison, WI USA
Keywords
Bioacoustics; Convolutional neural network; Conservation; Data augmentation; Passive acoustic monitoring; Sound classification; Tropical forest; Transfer learning; BIODIVERSITY; CONSERVATION
DOI
10.1016/j.ecolind.2022.109621
Chinese Library Classification number
X176 [Biodiversity conservation]
Discipline classification code
090705
Abstract
To protect tropical forest biodiversity, we need to be able to detect it reliably, cheaply, and at scale. Automated detection of sound-producing animals from passively recorded soundscapes via machine-learning approaches is a promising technique towards this goal, but it is constrained by the need for large training data sets. Using soundscapes from a tropical forest in Borneo and a convolutional neural network (CNN) model, we investigate i) the minimum viable training data set size for accurate prediction of call types ('sonotypes'), and ii) the extent to which data augmentation and transfer learning can overcome the issue of small and imbalanced training data sets. We found that even relatively high sample sizes (>80 per sonotype) lead to mediocre accuracy, which, however, improved significantly with data augmentation and transfer learning, including at extremely small sample sizes (3 per sonotype), regardless of taxonomic group or call characteristics. Neither transfer learning nor data augmentation alone achieved high accuracy. Our results suggest that transfer learning and data augmentation could make the use of CNNs to classify species' vocalizations feasible even for small soundscape-based projects with many rare species. Retraining our open-source model requires only basic programming skills, which makes it possible for individual conservation initiatives to match their local context, in order to enable more evidence-informed management of biodiversity.
Pages: 10
Journal
Ecological Indicators, 2022, 145