Musical Speech: A Transformer-based Composition Tool

Cited by: 0
Authors
d'Eon, Jason [1]
Dumpala, Harsha [1]
Sastry, Chandramouli Shama [1]
Oore, Dani [2]
Oore, Sageev [1]
Affiliations
[1] Dalhousie Univ, Vector Inst, Halifax, NS, Canada
[2] Mem Univ Newfoundland, IICSI, St John, NF, Canada
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
Speech processing; musical notes; transformer networks; denoising autoencoder; syllable nuclei; vocal tract; resonances
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we propose a new compositional tool that generates a musical outline of speech recorded or provided by the user, for use as a musical building block in their compositions. The tool allows any user to generate musical material from their own speech while still hearing the direct connection between the recorded speech and the resulting music. The tool is built on our proposed pipeline, which begins with speech-based signal processing, then applies some simple musical heuristics, and finally passes the pre-processed signals through Transformer models trained on new musical tasks. We illustrate the effectiveness of our pipeline, which does not require a paired dataset for training, through examples of music created by musicians using our tool.
Pages: 253-274
Page count: 22