Multi-Modal Deep Learning-Based Violin Bowing Action Recognition

Cited by: 1
Authors
Liu, Bao-Yun [1 ]
Jen, Yi-Hsin [2 ,3 ]
Sun, Shih-Wei [4 ]
Su, Li [2 ]
Chang, Pao-Chi [1 ]
Affiliations
[1] Natl Cent Univ, Dept Commun Engn, Taoyuan, Taiwan
[2] Acad Sinica, Inst Informat Sci, Taipei, Taiwan
[3] Natl Tsing Hua Univ, Dept Comp Sci, Hsinchu, Taiwan
[4] Taipei Natl Univ Arts, Dept New Media Art, Taipei, Taiwan
DOI
10.1109/icce-taiwan49838.2020.9257995
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper proposes a deep learning-based system for recognizing violin bowing actions. The actions performed by a violinist are captured by a depth camera and recorded by wearable inertial sensors on the violinist's forearm. A 3D convolutional neural network (3D-CNN) and a long short-term memory (LSTM) network are adopted to build action models from the depth camera modality and the inertial sensor modalities. The features and models obtained from these modalities are used to classify different violin bowing actions, and fusing the modalities achieves satisfactory recognition accuracy. A violin bowing action dataset is also generated for this preliminary study and for evaluating system performance.
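The abstract does not say how the modalities are fused. As a rough illustration only, the sketch below shows one common option, decision-level (late) fusion, in which per-class scores from two modality models are converted to probabilities and blended with a weight. The pairing of a 3D-CNN with depth frames and an LSTM with inertial data, the function names, the `depth_weight` parameter, and the class labels are all assumptions made for this example and are not taken from the paper.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def late_fusion(depth_scores, inertial_scores, depth_weight=0.5):
    """Blend per-class probabilities from two modality models.

    depth_scores: raw class scores, assumed to come from a 3D-CNN on depth frames.
    inertial_scores: raw class scores, assumed to come from an LSTM on inertial data.
    depth_weight: hypothetical mixing weight in [0, 1]; not a value from the paper.
    """
    if len(depth_scores) != len(inertial_scores):
        raise ValueError("both modalities must score the same set of classes")
    p_depth = softmax(depth_scores)
    p_inertial = softmax(inertial_scores)
    return [
        depth_weight * d + (1.0 - depth_weight) * i
        for d, i in zip(p_depth, p_inertial)
    ]


def predict(fused_probs, labels):
    """Return the label with the highest fused probability."""
    best = max(range(len(fused_probs)), key=fused_probs.__getitem__)
    return labels[best]


# Placeholder class names; the paper's actual bowing classes are not listed here.
LABELS = ["action_a", "action_b", "action_c"]

# Made-up scores standing in for the outputs of the two modality models.
fused = late_fusion([2.0, 0.5, 0.1], [0.3, 1.8, 0.2])
print(predict(fused, LABELS), fused)
```

Blending probabilities rather than raw scores keeps the two models on a comparable scale, and setting `depth_weight` to 0 or 1 falls back to a single modality, which makes it easy to compare fused and single-modality predictions.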
Pages: 2
Related papers
50 in total
  • [41] Deep Learning-Based CNN Multi-Modal Camera Model Identification for Video Source Identification
    Singh S.
    Sehgal V.K.
    Informatica (Slovenia), 2023, 47 (03): 417 - 430
  • [42] Cross-modal learning with multi-modal model for video action recognition based on adaptive weight training
    Zhou, Qingguo
    Hou, Yufeng
    Zhou, Rui
    Li, Yan
    Wang, Jinqiang
    Wu, Zhen
    Li, Hung-Wei
    Weng, Tien-Hsiung
    CONNECTION SCIENCE, 2024, 36 (01)
  • [43] DFN: A deep fusion network for flexible single and multi-modal action recognition
    Li, Chuankun
    Hou, Yonghong
    Li, Wanqing
    Ding, Zewei
    Wang, Pichao
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 245
  • [44] Multi-modal Video Action Recognition Method Based on Language-visual Contrastive Learning
    Zhang Y.
    Zhang B.-B.
    Dong W.
    An F.-M.
    Zhang J.-X.
    Zhang Q.
    Zidonghua Xuebao/Acta Automatica Sinica, 2024, 50 (02): 417 - 430
  • [45] A Multi-Modal Emotion Recognition System Based on CNN-Transformer Deep Learning Technique
    Karatay, Busra
    Bestepe, Deniz
    Sailunaz, Kashfia
    Ozyer, Tansel
    Alhajj, Reda
    2022 7TH INTERNATIONAL CONFERENCE ON DATA SCIENCE AND MACHINE LEARNING APPLICATIONS (CDMA 2022), 2022: 145 - 150
  • [46] Deep Learning Based Multi-modal Registration for Retinal Imaging
    Arikan, Mustafa
    Sadeghipour, Amir
    Gerendas, Bianca
    Told, Reinhard
    Schmidt-Erfurth, Ursula
    INTERPRETABILITY OF MACHINE INTELLIGENCE IN MEDICAL IMAGE COMPUTING AND MULTIMODAL LEARNING FOR CLINICAL DECISION SUPPORT, 2020, 11797 : 75 - 82
  • [47] Multi-Modal Pedestrian Detection Algorithm Based on Deep Learning
    Li X.
    Fu H.
    Niu W.
    Wang P.
    Lü Z.
    Wang W.
    Hsi-An Chiao Tung Ta Hsueh/Journal of Xi'an Jiaotong University, 2022, 56 (10): 61 - 70
  • [48] A Novel Cross Modal Hashing Algorithm Based on Multi-modal Deep Learning
    Qu, Wen
    Wang, Daling
    Feng, Shi
    Zhang, Yifei
    Yu, Ge
    SOCIAL MEDIA PROCESSING, SMP 2015, 2015, 568 : 156 - 167
  • [49] Multi-modal fusion method for human action recognition based on IALC
    Zhang, Yinhuan
    Xiao, Qinkun
    Liu, Xing
    Wei, Yongquan
    Chu, Chaoqin
    Xue, Jingyun
    IET IMAGE PROCESSING, 2023, 17 (02) : 388 - 400
  • [50] Interactive Learning of a Dual Convolution Neural Network for Multi-Modal Action Recognition
    Li, Qingxia
    Gao, Dali
    Zhang, Qieshi
    Wei, Wenhong
    Ren, Ziliang
    MATHEMATICS, 2022, 10 (21)