Unsupervised Methods for Evaluating Speech Representations

Authors
Gump, Michael [1]
Hsu, Wei-Ning [1]
Glass, James [1]
Affiliations
[1] MIT, Comp Sci & Artificial Intelligence Lab, 77 Massachusetts Ave, Cambridge, MA 02139 USA
Source
Keywords
speech representation learning; unsupervised learning
DOI
10.21437/Interspeech.2020-2990
Chinese Library Classification (CLC) number
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification code
100104; 100213
Abstract
Disentanglement is a desired property in representation learning, and a significant body of research has tried to show that it is a useful representational prior. Evaluating disentanglement is challenging, particularly for real-world data like speech, where ground-truth generative factors are typically not available. Previous work on disentangled representation learning in speech has used categorical supervision, such as phoneme or speaker identity, to disentangle grouped feature spaces. However, that approach differs from the typical dimension-wise view of disentanglement in other domains. This paper proposes to use low-level acoustic features to provide the structure required to evaluate dimension-wise disentanglement. By choosing well-studied acoustic features, grounded and descriptive evaluation is made possible for unsupervised representation learning. This work produces a toolkit for evaluating disentanglement in unsupervised representations of speech and evaluates its efficacy on previous research.
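As a rough illustration of what a dimension-wise evaluation against acoustic features might look like, the sketch below correlates each latent dimension with each acoustic feature and reports how clearly one feature dominates. This is not the paper's toolkit or metric: the function names (`abs_correlation_matrix`, `top_two_gap`), the synthetic "pitch" and "energy" stand-ins, and the gap score are all assumptions made for illustration.

```python
import numpy as np

def abs_correlation_matrix(latents, features):
    """|Pearson r| between every latent dimension and every acoustic feature."""
    z = (latents - latents.mean(axis=0)) / latents.std(axis=0)
    f = (features - features.mean(axis=0)) / features.std(axis=0)
    return np.abs(z.T @ f) / len(latents)

def top_two_gap(corr):
    """Per latent dimension: strongest |r| minus second strongest |r| (hypothetical score)."""
    ordered = np.sort(corr, axis=1)[:, ::-1]
    return ordered[:, 0] - ordered[:, 1]

# Synthetic data only: two random columns stand in for frame-level pitch and energy.
rng = np.random.default_rng(0)
n_frames = 5000
features = rng.normal(size=(n_frames, 2))

# Toy "representation": dim 0 tracks pitch, dim 1 tracks energy, dim 2 is pure noise.
latents = np.column_stack([
    features[:, 0] + 0.1 * rng.normal(size=n_frames),
    features[:, 1] + 0.1 * rng.normal(size=n_frames),
    rng.normal(size=n_frames),
])

corr = abs_correlation_matrix(latents, features)  # shape: (latent dims, features)
gap = top_two_gap(corr)
print(np.round(corr, 2))
print(np.round(gap, 2))
```

In this toy setup, a large gap for a dimension suggests it tracks a single feature, while a dimension with uniformly low correlations tracks none of the measured features. Real acoustic features and learned representations would replace the synthetic arrays.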
Pages: 170-174
Number of pages: 5