An ensemble model of CNN with Bi-LSTM for automatic singer identification

Cited by: 0
Authors
Mukkamala S. N. V. Jitendra
Y. Radhika
Affiliations
[1] GITAM School of Technology, Department of Computer Science and Engineering
[2] GITAM (Deemed-to-be University)
Keywords
Bidirectional long short-term memory; CNN; Gender identification; LSTM-RNN; Music information retrieval; Singer identification; Spectrogram
DOI
Not available
Abstract
Gender detection has become significant in content-based multimedia systems, and an automated mechanism for gender identification is in demand for processing massive volumes of data. Singer identification is a popular topic in music information recommender systems: the task is to identify the singer of a song from the singer’s voice and from other key features such as timbre and pitch. Models such as GMM, SVM, and MLP are widely used for classification and singer identification. However, most current models are limited in that vocals and instrumental music are separated manually and only the vocals are used to build and train the model. Deep learning techniques are well suited to unstructured data such as music and have performed strongly in similar studies. In acoustic modeling, deep neural network (DNN) models such as convolutional neural networks (CNNs) have shown promise in classifying unstructured and poorly labeled data. The current study considers an ensemble model, a combination of a CNN with a bidirectional LSTM (Bi-LSTM), for singer identification from spectrogram images generated from the audio clip. CNN models have been shown to handle variable-length input data better by identifying features, and the Bi-LSTM is expected to improve accuracy by retaining essential features over time and capturing temporal context. Experiments are performed on Indian songs and the MIR-1K data set, and the proposed model is observed to outperform the existing models it is compared against, with a prediction accuracy of 97.4%.
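The abstract states that the model takes spectrogram images generated from the audio clip as input, but gives no parameters for that step. As a rough illustration only, the sketch below computes a magnitude spectrogram with a naive short-time Fourier transform. The frame length, hop size, Hann window, and the synthetic test tone are all assumptions made for this example and are not the authors' settings.

```python
import cmath
import math


def hann(n):
    """Hann window of length n (window choice is an assumption)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def spectrogram(signal, frame_len=64, hop=32):
    """Magnitude spectrogram via a naive short-time DFT.

    Returns a list of frames; each frame holds frame_len // 2 + 1 magnitudes.
    frame_len and hop are illustrative defaults, not values from the paper.
    """
    window = hann(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        # Window one frame of the signal.
        chunk = [signal[start + i] * window[i] for i in range(frame_len)]
        mags = []
        # Evaluate the DFT at the non-negative frequency bins only.
        for k in range(frame_len // 2 + 1):
            acc = sum(
                chunk[t] * cmath.exp(-2j * math.pi * k * t / frame_len)
                for t in range(frame_len)
            )
            mags.append(abs(acc))
        frames.append(mags)
    return frames


# Synthetic stand-in for an audio clip: a 1 kHz tone sampled at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * n / sample_rate) for n in range(512)]
spec = spectrogram(tone)

# Index of the strongest frequency bin in the first frame.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
```

Each row of `spec` is one time step and each column one frequency bin, which is the time-by-frequency layout that a CNN followed by a recurrent layer would typically consume. A practical pipeline would use an FFT routine from a numerical library rather than this quadratic-time DFT.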
Pages: 38853-38874
Page count: 21