Audio-Visual Multimedia Quality Assessment: A Comprehensive Survey

Cited by: 54
Authors
Akhtar, Zahid [1]
Falk, Tiago H. [1]
Affiliations
[1] Univ Quebec, INRS EMT, Montreal, PQ H5A 1K6, Canada
Source
IEEE ACCESS | 2017 / Vol. 5
Keywords
Subjective quality assessment; objective quality metric; multimedia quality; signal-driven model; audiovisual perception; quality of service; data-driven analysis; PACKET-LOSS VISIBILITY; VIDEO QUALITY; SPEECH-QUALITY; AUDIO QUALITY; STRUCTURAL SIMILARITY; MODEL; PREDICTION; PERCEPTION; PSNR; INTELLIGIBILITY;
DOI
10.1109/ACCESS.2017.2750918
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Measuring perceived quality of audio-visual signals at the end user has become an important parameter in many multimedia networks and applications. It plays a crucial role in shaping audio-visual processing, compression, transmission, and systems, along with their implementation, optimization, and testing. Service providers are enacting different quality of service (QoS) solutions to deliver the best quality of experience (QoE) to their customers. Thus, devising precise perception-based quality metrics will greatly help improve multimedia services over wired and wireless networks. In this paper, we provide a comprehensive survey of the works that have been carried out over recent decades in perceptual audio, video, and joint audio-visual quality assessment, describing existing methodologies in terms of requirement of a reference signal, feature extraction, feature mapping, and classification schemes. In this context, an overview of quality formation and perception, QoS, and QoE, as well as quality of perception, is also presented. Finally, open issues and challenges in audio-visual quality assessment are highlighted and potential future research directions are discussed.
Pages: 21090 - 21117
Page count: 28
Related papers
50 in total
  • [1] Audio-visual interaction in multimedia
    Chen, Tsuhan
    Rao, Ram
    [J]. IEEE Circuits and Devices Magazine, 1995, 11 (06) : 21 - 25
  • [2] Audio-visual interaction in multimedia communication
    Chen, TH
    Rao, RR
    [J]. 1997 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I - V: VOL I: PLENARY, EXPERT SUMMARIES, SPECIAL, AUDIO, UNDERWATER ACOUSTICS, VLSI; VOL II: SPEECH PROCESSING; VOL III: SPEECH PROCESSING, DIGITAL SIGNAL PROCESSING; VOL IV: MULTIDIMENSIONAL SIGNAL PROCESSING, NEURAL NETWORKS - VOL V: STATISTICAL SIGNAL AND ARRAY PROCESSING, APPLICATIONS, 1997, : 179 - 182
  • [3] Audio-visual interaction: Multimedia applications
    Zonja, S.
    Livun, N.
    Jambrosic, K.
    [J]. PROCEEDINGS ELMAR-2006, 2006, : 143 - +
  • [4] UnB-AV: An Audio-Visual Database for Multimedia Quality Research
    Martinez, Helard B.
    Hines, Andrew
    Farias, Mylene C. Q.
    [J]. IEEE ACCESS, 2020, 8 : 56641 - 56649
  • [5] Perceptual Quality Assessment of Omnidirectional Audio-Visual Signals
    Zhu, Xilei
    Duan, Huiyu
    Cao, Yuqin
    Zhu, Yuxin
    Zhu, Yucheng
    Liu, Jing
    Chen, Li
    Min, Xiongkuo
    Zhai, Guangtao
    [J]. ARTIFICIAL INTELLIGENCE, CICAI 2023, PT II, 2024, 14474 : 512 - 525
  • [6] Audio-Visual Biometric Recognition and Presentation Attack Detection: A Comprehensive Survey
    Mandalapu, Hareesh
    Reddy, Aravinda P. N.
    Ramachandra, Raghavendra
    Rao, Krothapalli Sreenivasa
    Mitra, Pabitra
    Prasanna, S. R. Mahadeva
    Busch, Christoph
    [J]. IEEE ACCESS, 2021, 9 : 37431 - 37455
  • [7] A multimedia chipset for consumer audio-visual applications
    Baum, AJ
    Clarke, K
    Taunton, M
    [J]. IEEE TRANSACTIONS ON CONSUMER ELECTRONICS, 1997, 43 (03) : 646 - 648
  • [8] Multimedia chipset for consumer audio-visual applications
    Digital Equipment Corp
    [J]. IEEE Trans Consum Electron, 3 (646-648):
  • [9] A multimedia chipset for consumer audio-visual applications
    Baum, AJ
    Clarke, K
    Taunton, M
    [J]. INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS, 1997 DIGEST OF TECHNICAL PAPERS, 1997, : 270 - 271
  • [10] AUDIO-VISUAL SYNCHRONIZATION RECOVERY IN MULTIMEDIA CONTENT
    Lee, Jong-Seok
    Ebrahimi, Touradj
    [J]. 2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 2280 - 2283