Exploring the two-dimensional nature of music notation for score recognition with end-to-end approaches

Cited by: 9
Authors
Rios-Vila, Antonio [1]
Calvo-Zaragoza, Jorge [1]
Inesta, Jose M. [1]
Affiliations
[1] Univ Alicante, UI Comp Res, Alicante, Spain
DOI
10.1109/ICFHR2020.2020.00044
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Optical Music Recognition workflows perform several steps to retrieve the content of music score images, with symbol recognition being one of the key stages. State-of-the-art approaches for this stage currently encode the output symbols as if they were plain text characters. However, music symbols have a two-dimensional nature that these approaches ignore. In this paper, we explore alternative output representations for music symbol recognition with state-of-the-art end-to-end neural technologies. We propose and describe new output representations that take this two-dimensional nature into account, and we ask whether they make it possible to obtain better recognition results on both printed and handwritten music scores. In this analysis, we compare the results obtained with three output encodings and two neural approaches. We found that one of the proposed encodings outperforms the standard one. This leads us to conclude that further research on this topic is worthwhile to improve end-to-end music score recognition.
Pages: 193-198
Number of pages: 6
Related papers
50 in total
  • [21] End-to-end distribution function of two-dimensional stiff polymers for all persistence lengths
    Hamprecht, B
    Janke, W
    Kleinert, H
    [J]. PHYSICS LETTERS A, 2004, 330 (3-4) : 254 - 259
  • [22] Exploring End-to-End Techniques for Low-Resource Speech Recognition
    Bataev, Vladimir
    Korenevsky, Maxim
    Medennikov, Ivan
    Zatvornitskiy, Alexander
    [J]. SPEECH AND COMPUTER (SPECOM 2018), 2018, 11096 : 32 - 41
  • [24] Exploring end-to-end framework towards Khasi speech recognition system
    Syiem, Bronson
    Singh, L. Joyprakash
    [J]. INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2021, 24 (02) : 419 - 424
  • [25] EXPLORING MODEL UNITS AND TRAINING STRATEGIES FOR END-TO-END SPEECH RECOGNITION
    Huang, Mingkun
    Lu, Yizhou
    Wang, Lan
    Qian, Yanmin
    Yu, Kai
    [J]. 2019 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU 2019), 2019, : 524 - 531
  • [26] A comprehensive comparison of end-to-end approaches for handwritten digit string recognition
    Hochuli, Andre G.
    Britto Jr, Alceu S.
    Saji, David A.
    Saavedra, Jose M.
    Sabourin, Robert
    Oliveira, Luiz S.
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2021, 165 (165)
  • [27] Curriculum Learning-Based Approaches for End-to-End Gas Recognition
    Zhang, Chao
    Wang, Wen
    Pan, Yong
    Zhai, Shoupei
    [J]. IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2022, 71
  • [28] Breast ultrasound lesions recognition: end-to-end deep learning approaches
    Yap, Moi Hoon
    Goyal, Manu
    Osman, Fatima M.
    Marti, Robert
    Denton, Erika
    Juette, Arne
    Zwiggelaar, Reyer
    [J]. JOURNAL OF MEDICAL IMAGING, 2019, 6 (01)
  • [29] Transfer Learning Approaches for Streaming End-to-End Speech Recognition System
    Joshi, Vikas
    Zhao, Rui
    Mehta, Rupesh R.
    Kumar, Kshitiz
    Li, Jinyu
    [J]. INTERSPEECH 2020, 2020, : 2152 - 2156
  • [30] End-to-end Two-dimensional Sound Source Localization With Ad-hoc Microphone Arrays
    Gong, Yijun
    Liu, Shupei
    Zhang, Xiao-Lei
    [J]. PROCEEDINGS OF 2022 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2022, : 1944 - 1949