On the Assessment of High-Quality Voice Recordings including Voice Postprocessing

Cited by: 2
Authors
Beerends, John G. [1]
Beerends, Imre [2]
Affiliations
[1] TNO, NL-2509 JE The Hague, Netherlands
[2] Mantis Audio, Wateringen, Netherlands
Source
AES: Journal of the Audio Engineering Society, 2015, 63 (03)
Keywords
ITU-T STANDARD; ASSESSMENT POLQA;
DOI
10.17743/jaes.2015.0013
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
When we assess the quality of a voice recording, two different aspects play a role: the voice characteristics (voice quality) and the audio chain characteristics (audio quality). Subjective experiments in which no clear ideal reference is provided, so-called absolute category rating experiments, assess the speech quality, i.e., the combined effect of voice and audio quality. This paper investigates whether voice postprocessing such as timbre optimization, loudness optimization, de-essing, room reverberation optimization, and (background) noise suppression can improve the quality of a high-quality voice recording. It turned out that none of these processing steps provides a significant improvement in perceived quality. The best postprocessing is noise reduction to absolute silence, which delivers only a non-significant improvement when the voice recording is of high quality. The subjective quality evaluations show a significant preference for male over female voices and a significant effect of speaker/sentence dependency on the perceived quality of certain types of degradation. The subjective results are compared with predictions made with POLQA, the ITU-T standard for the objective assessment of speech quality (ITU-T Recommendation P.863 versions 1.1 and 2.4); the comparison shows that many speech quality effects are predicted correctly, at the condition level as well as at the individual sentence level.
Pages: 174-183
Page count: 10
Related papers
50 in total
  • [1] On the assessment of high-quality voice recordings including voice postprocessing
    Beerends, John G.
    Beerends, Imre
    [J]. AES: Journal of the Audio Engineering Society, 2015, 63 (03): 174-183
  • [2] VoiceAssist: Guiding Users to High-Quality Voice Recordings
    Seetharaman, Prem
    Mysore, Gautham
    Pardo, Bryan
    Smaragdis, Paris
    Gomes, Celso
    [J]. CHI 2019: PROCEEDINGS OF THE 2019 CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS, 2019
  • [3] Voice Agents Supporting High-Quality Social Play
    Pantoja, Luiza Superti
    Diederich, Kyle
    Crawford, Liam
    Hourcade, Juan Pablo
    [J]. PROCEEDINGS OF ACM INTERACTION DESIGN AND CHILDREN (IDC 2019), 2019: 314-325
  • [4] On the relevance of F0, Jitter, Shimmer and HNR acoustic parameters in forensic voice comparisons using GSM, VOIP and contemporaneous high-quality voice recordings
    Fernandes, Vania
    Ferreira, Anibal
    [J]. 2017 AES INTERNATIONAL CONFERENCE ON AUDIO FORENSICS, 2017
  • [5] Frequency-domain techniques for high-quality voice modification
    Laroche, J
    [J]. DAFX-03: 6TH INTERNATIONAL CONFERENCE ON DIGITAL AUDIO EFFECTS, PROCEEDINGS, 2003: 328-332
  • [6] On the horizon: Mobile telephony - Making high-quality voice a reality
    Turner, Brough
    [J]. Communications Solutions, 2002, 7 (01)
  • [7] Evaluating iPhone Recordings for Acoustic Voice Assessment
    Lin, Emily
    Hornibrook, Jeremy
    Ormond, Tika
    [J]. FOLIA PHONIATRICA ET LOGOPAEDICA, 2012, 64 (03): 122-130
  • [8] XiaoiceSing: A High-Quality and Integrated Singing Voice Synthesis System
    Lu, Peiling
    Wu, Jie
    Luan, Jian
    Tan, Xu
    Zhou, Li
    [J]. INTERSPEECH 2020, 2020: 1306-1310
  • [9] More efforts after feeling rejected: the effects of poor voice quality on employee's motivation to make high-quality voice
    Liu, Pan
    [J]. BALTIC JOURNAL OF MANAGEMENT, 2022, 17 (04): 533-545
  • [10] A NOTE ON VOICE RECORDINGS
    Henrikson, Ernest H.
    [J]. JOURNAL OF SPEECH DISORDERS, 1943, 8 (02): 133-135