A Differentiable Perceptual Audio Metric Learned from Just Noticeable Differences

Cited by: 31
Authors
Manocha, Pranay [1]
Finkelstein, Adam [1]
Zhang, Richard [2]
Bryan, Nicholas J. [2]
Mysore, Gautham J. [2]
Jin, Zeyu [2]
Affiliations
[1] Princeton Univ, Princeton, NJ 08544 USA
[2] Adobe Res, San Jose, CA USA
Source
Keywords
QUALITY ASSESSMENT;
DOI
10.21437/Interspeech.2020-1191
Chinese Library Classification (CLC) number
R36 [Pathology]; R76 [Otorhinolaryngology];
Discipline classification code
100104; 100213;
Abstract
Many audio processing tasks require perceptual assessment. The "gold standard" of obtaining human judgments is time-consuming, expensive, and cannot be used as an optimization criterion. On the other hand, automated metrics are efficient to compute but often correlate poorly with human judgment, particularly for audio differences at the threshold of human detection. In this work, we construct a metric by fitting a deep neural network to a new large dataset of crowdsourced human judgments. Subjects are prompted to answer a straightforward, objective question: are two recordings identical or not? These pairs are algorithmically generated under a variety of perturbations, including noise, reverb, and compression artifacts; the perturbation space is probed with the goal of efficiently identifying the just-noticeable difference (JND) level of the subject. We show that the resulting learned metric is well-calibrated with human judgments, outperforming baseline methods. Since it is a deep network, the metric is differentiable, making it suitable as a loss function for other tasks. Thus, simply replacing an existing loss (e.g., deep feature loss) with our metric yields significant improvement in a denoising network, as measured by subjective pairwise comparison.
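The abstract's idea of probing the perturbation space to locate a subject's just-noticeable difference can be illustrated with a minimal sketch. This is not the authors' procedure, which the abstract does not specify: it swaps in a plain bisection search and assumes a single scalar perturbation strength and a listener whose answers are monotone (inaudible below the threshold, audible at or above it). The names estimate_jnd, hears_difference, and simulated_listener are hypothetical.

```python
from typing import Callable


def estimate_jnd(
    hears_difference: Callable[[float], bool],
    low: float = 0.0,
    high: float = 1.0,
    steps: int = 12,
) -> float:
    """Bisect a perturbation strength in [low, high] toward a listener's threshold.

    Assumption (not from the abstract): the perturbation at `low` is inaudible,
    the one at `high` is audible, and the answers are monotone in between.
    """
    for _ in range(steps):
        mid = (low + high) / 2.0
        if hears_difference(mid):
            high = mid  # audible: the threshold is at or below mid
        else:
            low = mid  # inaudible: the threshold is above mid
    return (low + high) / 2.0


# Hypothetical simulated listener whose true threshold is 0.3.
def simulated_listener(strength: float) -> bool:
    return strength >= 0.3


jnd = estimate_jnd(simulated_listener)
```

Each query halves the interval that must contain the threshold, so a handful of "identical or not?" answers narrows the estimate quickly. Real listeners answer noisily, so a practical protocol would need repeated trials or a more robust search than this sketch.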
Pages: 2852-2856
Page count: 5