On the Use of Absolute Threshold of Hearing-based Loss for Full-band Speech Enhancement

Authors
Mars, Rohith [1]
Das, Rohan Kumar [1]
Affiliations
[1] Fortemedia Singapore, Singapore, Singapore
Keywords
speech enhancement; deep neural networks; absolute threshold of hearing; NOISE; SUPPRESSION;
DOI
10.1109/ISCSLP57327.2022.10038050
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we investigate the use of a perceptually motivated loss function for training single-channel full-band speech enhancement models. Specifically, we modify the conventional squared error loss function by incorporating a frequency-importance weighting scheme based on the absolute threshold of hearing (ATH). By applying larger weights to the perceptually relevant frequency bins of the speech spectrogram, we place more emphasis on those bins during training, with the aim of achieving higher perceptual quality. We compare models trained with the conventional loss and with the proposed ATH-weighted loss on the VCTK and 4th DNS Challenge datasets. The results show that the proposed ATH-weighted loss outperforms the conventional loss on multiple objective speech quality metrics.
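The abstract does not give the exact weighting formula, so the following is only a minimal sketch of how an ATH-weighted squared error could look, not the authors' implementation. It assumes Terhardt's widely used approximation of the ATH curve, a 48 kHz sample rate, and a simple "lower threshold, larger weight" mapping; the names `ath_db`, `ath_weights`, `ath_weighted_mse` and the parameters `clip_db` and `floor` are hypothetical choices made for illustration.

```python
import numpy as np

def ath_db(freq_hz):
    """Terhardt's approximation of the absolute threshold of hearing (dB SPL)."""
    # Clamp to 20 Hz so the f**-0.8 term stays finite at the DC bin.
    f_khz = np.maximum(np.asarray(freq_hz, dtype=float), 20.0) / 1000.0
    return (3.64 * f_khz ** -0.8
            - 6.5 * np.exp(-0.6 * (f_khz - 3.3) ** 2)
            + 1e-3 * f_khz ** 4)

def ath_weights(n_bins, sample_rate, clip_db=96.0, floor=0.05):
    """Per-bin weights: bins with a lower hearing threshold get a larger weight.

    Assumption (not from the paper): thresholds are clipped at `clip_db`,
    inverted, rescaled to [floor, 1], then normalised to unit mean.
    """
    freqs = np.linspace(0.0, sample_rate / 2.0, n_bins)
    ath = np.minimum(ath_db(freqs), clip_db)
    sensitivity = (clip_db - ath) / (clip_db - ath.min())  # in [0, 1]
    weights = floor + (1.0 - floor) * sensitivity
    return weights / weights.mean()

def ath_weighted_mse(est_mag, ref_mag, weights):
    """Squared error between spectrograms of shape (frames, bins), weighted per bin."""
    err = (est_mag - ref_mag) ** 2
    return float(np.mean(err * weights))  # weights broadcast over the frame axis

if __name__ == "__main__":
    w = ath_weights(n_bins=481, sample_rate=48000)
    ref = np.zeros((4, 481))
    est = ref.copy()
    est[:, 66] = 1.0  # unit error near 3.3 kHz, where hearing is most sensitive
    print(ath_weighted_mse(est, ref, w))
```

With this mapping, the same magnitude error costs more in the 2 to 5 kHz region than near the top of the full-band range, which is the qualitative behaviour the abstract describes. Setting all weights to 1 recovers the conventional squared error loss.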
Pages: 458-462
Number of pages: 5