Polyphonic pitch tracking with deep layered learning

Cited by: 6
Author
Elowsson, Anders [1, 2]
Affiliations
[1] KTH Royal Inst Technol, Sch Elect Engn & Comp Sci, Stockholm, Sweden
[2] Univ Oslo, RITMO Ctr Interdisciplinary Studies Rhythm Time &, Oslo, Norway
Funding
Swedish Research Council;
Keywords
FUNDAMENTAL-FREQUENCY ESTIMATION; MULTIPITCH ESTIMATION; MUSIC TRANSCRIPTION;
DOI
10.1121/10.0001468
Chinese Library Classification
O42 [Acoustics];
Discipline classification codes
070206; 082403;
Abstract
This article presents a polyphonic pitch tracking system that is able to extract both framewise and note-based estimates from audio. The system uses several artificial neural networks trained individually in a deep layered learning setup. First, cascading networks are applied to a spectrogram for framewise fundamental frequency (f0) estimation. A sparse receptive field is learned by the first network and then used as a filter kernel for parameter sharing throughout the system. The f0 activations are connected across time to extract pitch contours. These contours define a framework within which subsequent networks perform onset and offset detection, operating across both time and smaller pitch fluctuations at the same time. As input, the networks use, e.g., variations of latent representations from the f0 estimation network. Finally, erroneous tentative notes are removed one by one in an iterative procedure that allows a network to classify notes within a correct context. The system was evaluated on four public test sets (MAPS, Bach10, TRIOS, and the MIREX Woodwind quintet) and achieved state-of-the-art results for all four datasets. It performs well across all subtasks: f0, pitched onset, and pitched offset tracking.
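The abstract's step of connecting framewise f0 activations across time into pitch contours can be illustrated with a minimal sketch. This is not the article's method: the greedy nearest-bin linking rule, the function name `link_contours`, and the parameters `threshold` and `max_jump` are all assumptions chosen for illustration, and the activations are toy values rather than network outputs.

```python
def link_contours(activations, threshold=0.5, max_jump=1):
    """Greedily link thresholded framewise activations into contours.

    activations: sequence of frames, each a sequence of per-pitch-bin scores.
    threshold, max_jump: hypothetical parameters (not from the article).
    Returns a list of contours, each a list of (frame_index, bin_index) pairs.
    """
    finished = []  # contours that ended before the last frame
    active = []    # contours that reached the previous frame
    for t, frame in enumerate(activations):
        # Bins whose activation passes the threshold in this frame.
        bins = [b for b, a in enumerate(frame) if a >= threshold]
        used = set()
        still_active = []
        for contour in active:
            last_bin = contour[-1][1]
            # Unclaimed bins within max_jump bins of the contour's last pitch.
            candidates = [b for b in bins
                          if b not in used and abs(b - last_bin) <= max_jump]
            if candidates:
                best = min(candidates, key=lambda b: abs(b - last_bin))
                contour.append((t, best))
                used.add(best)
                still_active.append(contour)
            else:
                finished.append(contour)  # no continuation: contour ends
        # Every unclaimed active bin starts a new contour.
        for b in bins:
            if b not in used:
                still_active.append([(t, b)])
        active = still_active
    return finished + active


# Toy example: one slowly rising contour plus one isolated activation.
toy = [
    [0.0, 0.0, 0.9, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.8, 0.0, 0.0, 0.6],
    [0.0, 0.0, 0.0, 0.7, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.9, 0.0, 0.0],
]
contours = link_contours(toy)
```

In this toy input, the short one-frame contour is the kind of tentative candidate that a later stage would need to accept or reject; the article does this with trained networks rather than with a fixed rule.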
Pages: 446-468
Page count: 23