Conditional Independence for Pretext Task Selection in Self-Supervised Speech Representation Learning

Cited: 2
Authors:
Zaiem, Salah [1 ,2 ]
Parcollet, Titouan [2 ]
Essid, Slim [1 ]
Affiliations:
[1] Inst Polytech Paris, Telecom Paris, LTCI, Palaiseau, France
[2] Avignon Univ, LIA, Avignon, France
Source: Interspeech 2021
Keywords:
Self-Supervised Learning; Speech Representation Learning
DOI:
10.21437/Interspeech.2021-1027
Chinese Library Classification:
R36 [Pathology]; R76 [Otorhinolaryngology]
Subject Classification Codes:
100104; 100213
Abstract:
Through solving pretext tasks, self-supervised learning (SSL) leverages unlabeled data to extract useful latent representations that replace traditional input features in the downstream task. A common pretext task consists of pretraining an SSL model on pseudo-labels derived from the original signal. This technique is particularly relevant for speech data, where various meaningful signal-processing features may serve as pseudo-labels. However, the process of selecting pseudo-labels, for speech or other types of data, remains mostly unexplored and currently relies on observing the results on the final downstream task. This methodology is not sustainable at scale, however, due to substantial computational (hence carbon) costs. Thus, this paper introduces a practical and theoretical framework for selecting relevant pseudo-labels with respect to a given downstream task. More precisely, we propose a functional estimator of pseudo-label utility grounded in conditional independence theory, which does not require any training. Experiments conducted on speaker recognition and automatic speech recognition validate our estimator, showing a significant correlation between the performance observed on the downstream task and the utility estimates obtained with our approach, facilitating the search for relevant pseudo-labels for self-supervised speech representation learning.
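
The abstract does not spell out the estimator's formula. As an illustrative sketch only, the Python snippet below scores candidate pseudo-labels against downstream labels with the Hilbert-Schmidt Independence Criterion (HSIC), a standard training-free kernel dependence measure of the kind used in conditional-independence analysis. It is not the paper's exact utility estimator; the function names, the synthetic pitch/noise pseudo-labels, and the median-bandwidth heuristic are all assumptions made for this example.

# Illustrative stand-in, not the paper's estimator: rank candidate
# pseudo-labels by a training-free kernel dependence score (biased
# empirical HSIC, Gretton et al. 2005).
import numpy as np

def rbf_kernel(x: np.ndarray) -> np.ndarray:
    """RBF kernel matrix using the median-distance bandwidth heuristic."""
    sq_dists = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    sigma2 = np.median(sq_dists[sq_dists > 0])  # median heuristic (assumption)
    return np.exp(-sq_dists / (2.0 * sigma2))

def hsic(z: np.ndarray, y: np.ndarray) -> float:
    """Biased empirical HSIC between pseudo-labels z and downstream labels y.

    Higher values indicate stronger statistical dependence, so candidates
    can be ranked before any pretraining run.
    """
    n = z.shape[0]
    k, l = rbf_kernel(z), rbf_kernel(y)
    h = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return float(np.trace(k @ h @ l @ h)) / (n - 1) ** 2

# Hypothetical usage on synthetic data: an informative pseudo-label
# (pitch-like, correlated with the labels) versus an uninformative one.
rng = np.random.default_rng(0)
y = rng.integers(0, 10, size=(200, 1)).astype(float)  # e.g. speaker identities
pitch = y + rng.normal(scale=0.5, size=(200, 1))      # informative pseudo-label
noise = rng.normal(size=(200, 1))                     # uninformative pseudo-label
print(f"HSIC(pitch, y) = {hsic(pitch, y):.4f}")
print(f"HSIC(noise, y) = {hsic(noise, y):.4f}")

Because the score needs only the candidate pseudo-label values and the downstream labels, no SSL model is trained to obtain it, which is the property the paper's utility estimator is designed to exploit.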
Pages: 2851-2855
Number of pages: 5