Singer Diarization for Polyphonic Music With Unison Singing

Cited by: 1
Authors
Suda, Hitoshi [1]
Saito, Daisuke [1]
Fukayama, Satoru [2]
Nakano, Tomoyasu [2]
Goto, Masataka [2]
Affiliations
[1] Univ Tokyo, Dept Engn, Bunkyo Ku, Tokyo 1138656, Japan
[2] Natl Inst Adv Ind Sci & Technol, Tsukuba, Ibaraki 3058568, Japan
Keywords
Feature extraction; Data mining; Synchronization; Information processing; Voice activity detection; Timbre; Speech analysis; Music information processing; Music information retrieval; Singer diarization; Unison singing; Speaker diarization; Database
DOI
10.1109/TASLP.2022.3166262
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper introduces a new framework for singer diarization, a technique that reveals who sings when in songs with multiple singers. Although various techniques have been developed to analyze and extract features of singing voices in musical audio signals, most of them assume that a song is sung by a single singer, and singer diarization for multiple singers has not been well studied in the field of singing information processing. In speech analysis, speaker diarization has been explored as a way to deal with multiple speakers, including overlapped speech, but it cannot handle singing voices well because of the acoustic differences between singing and speech. This paper therefore proposes a new diarization framework specialized for singing voices. To achieve high accuracy in overlap detection, it proposes a novel acoustic feature named the Cosacorr score, which helps estimate whether a song is sung by more than one singer. After extracting singing voices from polyphonic music with a singing voice separation technique, the framework adopts the existing ArcFace technique to extract discriminative singer representations from short segments of the separated singing voices. The framework is evaluated on a new private dataset of unison singing voices constructed from commercially available compact discs (CDs). The experimental results show that the proposed framework outperforms the baseline speaker diarization method in terms of diarization error rate (DER).
Pages: 1531-1545
Number of pages: 15