Nonlinear waveform distortion: Assessment and detection of clipping on speech data and systems

Cited by: 5
Authors
Hansen, John H. L. [1]
Stauffer, Allen [1]
Xia, Wei [1]
Affiliations
[1] Univ Texas Dallas, Erik Jonsson Sch Engn, Ctr Robust Speech Syst CRSS, Richardson, TX 75083 USA
Keywords
Audio clipping; Speech quality assessment; Non-linear distortion; Speaker recognition
DOI
10.1016/j.specom.2021.07.007
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Speech, speaker, and language systems have traditionally relied on carefully collected speech material for training acoustic models. There is an enormous amount of freely accessible audio content. A major challenge, however, is that such data is not professionally recorded, and therefore may contain a wide diversity of background noise, nonlinear distortions, or other unknown environmental or technology-based contamination or mismatch. There is a crucial need for automatic analysis to screen such unknown datasets before acoustic model development training, or to perform input audio purity screening prior to classification. In this study, we propose a waveform based clipping detection algorithm for naturalistic audio streams and examine the impact of clipping at different severities on speech quality measurements and automatic speaker recognition systems. We use the TIMIT and NIST SRE08 corpora as case studies. The results show, as expected, that clipping introduces a nonlinear distortion into clean speech data, which reduces speech quality and performance for speaker recognition. We also investigate what degree of clipping can be present to sustain effective speech system performance. The proposed detection system, which will be released, could contribute to massive new audio collections for speech and language technology development (e.g. Google Audioset (Gemmeke et al., 2017), CRSS-UTDallas Apollo Fearless-Steps (Yu et al., 2014) (19,000 h naturalistic audio from NASA Apollo missions)).
Pages: 20-31
Page count: 12