Laughter Synthesis using Pseudo Phonetic Tokens with a Large-scale In-the-wild Laughter Corpus

Cited by: 1
Authors
Xin, Detai [1]
Takamichi, Shinnosuke [1]
Morimatsu, Ai [1]
Saruwatari, Hiroshi [1]
Affiliations
[1] University of Tokyo, Graduate School of Information Science and Technology, Tokyo, Japan
Source
Keywords
laughter synthesis; laughter corpus; nonverbal expression; RECOGNITION; EMOTIONS;
DOI
10.21437/Interspeech.2023-806
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification codes
070206; 082403;
Abstract
We present a large-scale in-the-wild Japanese laughter corpus and a laughter synthesis method. Previous work on laughter synthesis lacks not only data but also proper ways to represent laughter. To solve these problems, we first propose an in-the-wild corpus comprising 3.5 hours of laughter, which is, to the best of our knowledge, the largest laughter corpus designed for laughter synthesis. We then propose pseudo phonetic tokens (PPTs) to represent laughter as a sequence of discrete tokens, which are obtained by training a clustering model on features extracted from laughter by a pretrained self-supervised model. Laughter can then be synthesized by feeding PPTs into a text-to-speech system. We further show that PPTs can be used to train a language model for unconditional laughter generation. Results of comprehensive subjective and objective evaluations demonstrate that the proposed method significantly outperforms a baseline method and can generate natural laughter unconditionally.
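The token extraction step described in the abstract (cluster frame-level features, then label each frame with its cluster index) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the self-supervised model, the clustering algorithm, or the number of clusters, so the sketch assumes plain k-means, 8 clusters, and 16-dimensional random vectors standing in for real self-supervised features. The function names `kmeans` and `to_tokens` are hypothetical.

```python
import numpy as np


def kmeans(features, k, n_iter=20, seed=0):
    """Fit k centroids to frame-level features with plain k-means (assumed algorithm)."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct training frames.
    centroids = features[rng.choice(len(features), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every frame to every centroid.
        dists = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = features[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return centroids


def to_tokens(features, centroids):
    """Map each frame to the index of its nearest centroid (one discrete token per frame)."""
    dists = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    return dists.argmin(axis=1)


# Random stand-ins for self-supervised features (assumption: 16 dimensions).
rng = np.random.default_rng(0)
train_features = rng.normal(size=(500, 16))  # frames pooled over a training set
centroids = kmeans(train_features, k=8)      # k=8 is an arbitrary choice

utterance = rng.normal(size=(40, 16))        # frames of one laughter clip
tokens = to_tokens(utterance, centroids)     # discrete token sequence, length 40
```

In a pipeline of the kind the abstract outlines, a token sequence like `tokens` would take the place of the phoneme sequence given to a text-to-speech model, and a corpus of such sequences could be used to train a language model over tokens.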
Pages: 17-21
Number of pages: 5