Multiple-instrument polyphonic music transcription using a temporally constrained shift-invariant model

Cited by: 39
Authors
Benetos, Emmanouil [1]
Dixon, Simon [1]
Affiliations
[1] Queen Mary Univ London, Sch Elect Engn & Comp Sci, Ctr Digital Mus, London E1 4NS, England
Source
Keywords
FUNDAMENTAL-FREQUENCY ESTIMATION; TUTORIAL;
DOI
10.1121/1.4790351
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
A method for automatic transcription of polyphonic music is proposed in this work that models the temporal evolution of musical tones. The model extends the shift-invariant probabilistic latent component analysis method by supporting the use of spectral templates that correspond to sound states such as attack, sustain, and decay. The order of these templates is controlled using hidden Markov model-based temporal constraints. In addition, the model can exploit multiple templates per pitch and instrument source. The shift-invariant aspect of the model makes it suitable for music signals that exhibit frequency modulations or tuning changes. Pitch-wise hidden Markov models are also utilized in a postprocessing step for note tracking. For training, sound state templates were extracted for various orchestral instruments using isolated note samples. The proposed transcription system was tested on multiple-instrument recordings from various datasets. Experimental results show that the proposed model is superior to a non-temporally constrained model and also outperforms various state-of-the-art transcription systems for the same experiment. (C) 2013 Acoustical Society of America. [http://dx.doi.org/10.1121/1.4790351]
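The abstract mentions pitch-wise hidden Markov models as a postprocessing step for note tracking but gives no details. As an illustration only, the sketch below decodes an on/off sequence for a single pitch from per-frame activations using Viterbi decoding. The two-state topology, the self-transition probability `p_stay`, the uniform prior, and the use of the activation itself as the likelihood of the "on" state are all assumptions made for this example, not the authors' published parameters. The function name `viterbi_note_tracking` is hypothetical.

```python
import math


def viterbi_note_tracking(activations, p_stay=0.9, eps=1e-6):
    """Decode an on/off path for one pitch from activations in [0, 1].

    Assumptions (not taken from the paper): two states (0 = off, 1 = on),
    a symmetric self-transition probability `p_stay`, a uniform prior,
    and the activation value used directly as the 'on' likelihood.
    """
    if not activations:
        return []

    # log_trans[r][s]: log-probability of moving from state r to state s
    log_trans = [
        [math.log(p_stay), math.log(1.0 - p_stay)],
        [math.log(1.0 - p_stay), math.log(p_stay)],
    ]

    def log_obs(a):
        # Clip to avoid log(0), then return [log P(a | off), log P(a | on)]
        a = min(max(a, eps), 1.0 - eps)
        return [math.log(1.0 - a), math.log(a)]

    # delta[s]: best log-probability of any path ending in state s
    delta = [math.log(0.5) + lo for lo in log_obs(activations[0])]
    backptr = []
    for a in activations[1:]:
        obs = log_obs(a)
        new_delta, ptr = [], []
        for s in (0, 1):
            scores = [delta[r] + log_trans[r][s] for r in (0, 1)]
            best = 0 if scores[0] >= scores[1] else 1
            ptr.append(best)
            new_delta.append(scores[best] + obs[s])
        delta = new_delta
        backptr.append(ptr)

    # Backtrack from the most probable final state
    state = 0 if delta[0] >= delta[1] else 1
    path = [state]
    for ptr in reversed(backptr):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    # A short dip inside a note is bridged; the note stays "on" through it.
    print(viterbi_note_tracking([0.1, 0.1, 0.9, 0.9, 0.3, 0.9, 0.9, 0.1, 0.1]))
```

Under these assumed settings the transition penalty smooths the output: a brief dip in activation inside a sustained note does not split the note, and an isolated single-frame spike is not reported as a note. A larger `p_stay` smooths more aggressively.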
Pages: 1727-1741
Page count: 15
Related papers
11 in total
  • [1] Unsupervised learning of sparse and shift-invariant decompositions of polyphonic music
    Blumensath, T
    Davies, M
    2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL V, PROCEEDINGS: DESIGN AND IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS INDUSTRY TECHNOLOGY TRACKS MACHINE LEARNING FOR SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING SIGNAL PROCESSING FOR EDUCATION, 2004, : 497 - 500
  • [2] A Shift-Invariant Latent Variable Model for Automatic Music Transcription
    Benetos, Emmanouil
    Dixon, Simon
    COMPUTER MUSIC JOURNAL, 2012, 36 (04) : 81 - 94
  • [3] Transcribing Frequency Modulated Musical Expressions from Polyphonic Music using HMM Constrained Shift Invariant PLCA
    Sung, Dooyong
    Lee, Kyogu
    2014 TENTH INTERNATIONAL CONFERENCE ON INTELLIGENT INFORMATION HIDING AND MULTIMEDIA SIGNAL PROCESSING (IIH-MSP 2014), 2014, : 562 - 565
  • [4] Impulse detection using a shift-invariant dictionary and multiple compressions
    Lin, Huibin
    Tang, Jianmeng
    Mechefske, Chris
    JOURNAL OF SOUND AND VIBRATION, 2019, 449 : 1 - 17
  • [5] Music transcription using an instrument model
    Yin, J
    Sim, T
    Wang, Y
    Shenoy, A
    2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5: SPEECH PROCESSING, 2005, : 217 - 220
  • [6] Efficient eigenspace-based array signal processing using multiple shift-invariant subarrays
    Yu, SJ
    Lee, JH
    IEEE TRANSACTIONS ON ANTENNAS AND PROPAGATION, 1999, 47 (01) : 186 - 194
  • [7] MODEL-REDUCTION OF MULTIDIMENSIONAL LINEAR SHIFT-INVARIANT RECURSIVE SYSTEMS USING PADE TECHNIQUES
    CUYT, A
    OGAWA, S
    VERDONK, B
    MULTIDIMENSIONAL SYSTEMS AND SIGNAL PROCESSING, 1992, 3 (04) : 309 - 322
  • [8] Detection and diagnosis of bearing faults using shift-invariant dictionary learning and hidden Markov model
    Zhou, Haitao
    Chen, Jin
    Dong, Guangming
    Wang, Ran
    MECHANICAL SYSTEMS AND SIGNAL PROCESSING, 2016, 72-73 : 65 - 79
  • [9] Constrained non-negative sparse coding using learnt instrument templates for realtime music transcription
    Carabias-Orti, J. J.
    Rodriguez-Serrano, F. J.
    Vera-Candeas, P.
    Canadas-Quesada, F. J.
    Ruiz-Reyes, N.
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2013, 26 (07) : 1671 - 1680
  • [10] Monophonic constrained non-negative sparse coding using instrument models for audio separation and transcription of monophonic source-based polyphonic mixtures
    Francisco José Rodríguez-Serrano
    Julio José Carabias-Orti
    Pedro Vera-Candeas
    Francisco Jesús Canadas-Quesada
    Nicolás Ruiz-Reyes
    Multimedia Tools and Applications, 2014, 72 : 925 - 949