Musical Speech: A Transformer-based Composition Tool

Cited by: 0

Authors:

d'Eon, Jason [1]

Dumpala, Harsha [1]

Sastry, Chandramouli Shama [1]

Oore, Dani [2]

Oore, Sageev [1]

Affiliations:

[1] Dalhousie Univ, Vector Inst, Halifax, NS, Canada

[2] Mem Univ Newfoundland, IICSI, St John, NF, Canada

Source:

NEURIPS 2020 COMPETITION AND DEMONSTRATION TRACK, VOL 133 | 2020 / Vol. 133

Funding:

Natural Sciences and Engineering Research Council of Canada (NSERC);

Keywords:

Speech processing; musical notes; transformer networks; denoising autoencoder; SYLLABLE NUCLEI; VOCAL-TRACT; RESONANCES;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper, we propose a new compositional tool that will generate a musical outline of speech recorded/provided by the user for use as a musical building block in their compositions. The tool allows any user to use their own speech to generate musical material, while still being able to hear the direct connection between their recorded speech and the resulting music. The tool is built on our proposed pipeline. This pipeline begins with speech-based signal processing, after which some simple musical heuristics are applied, and finally these pre-processed signals are passed through Transformer models trained on new musical tasks. We illustrate the effectiveness of our pipeline - which does not require a paired dataset for training - through examples of music created by musicians making use of our tool.

Pages: 253-274

Page count: 22
