Frame-level global context modeling for detection and localization of abnormality

Authors
Sharma, Manoj Kumar [1]
Kumar, Vikas [2]
Sheet, Debdoot [1]
Biswas, Prabir Kumar [1]
Affiliations
[1] Indian Inst Technol Kharagpur, Kharagpur 721302, West Bengal, India
[2] Homi Bhabha Natl Inst, Indira Gandhi Ctr Atom Res, Chennai, India
Keywords
Abnormality detection; Localization; Contextual abnormality; Deep learning; Image and video processing; Anomaly detection
DOI
10.1007/s11042-023-14575-y
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Abnormality detection helps people by reducing the amount of data that must be processed manually. However, detecting and localizing contextual abnormality in images and video sequences poses many challenges: an object that is normal in one scenario may be considered abnormal in another. The usual solution is to divide the frame into regions or patches and then detect abnormality in each. The performance of this patch-based approach is limited by the size of the context window and suffers from a limited field of view, because it does not consider the information available in the entire frame at once. Increasing the patch size requires more nodes in the network and hence more memory, as well as a large number of trainable parameters. Reducing the number of nodes lowers both the performance and the size of the context the network can capture. The proposed method overcomes these issues. The framework combines a convolutional neural network (CNN) and an adversarial autoencoder to localize contextual abnormality. The spatial arrangement of objects across the different channels of the CNN feature map is learned jointly from normal data. The framework is further extended to reduce the number of trainable parameters, which would otherwise become a computational challenge. In experiments, the proposed method outperforms the baseline approach in localizing contextual abnormality.
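The parameter-growth argument in the abstract can be made concrete with a small count. The sketch below is illustrative only and is not the authors' architecture: the fully connected patch encoder, the `hidden_ratio` of 0.25, and the 3x3 convolution with 64 output channels are all assumptions chosen to show how a dense layer scales with patch size while a convolution layer does not depend on input size.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out


def conv_params(c_in, c_out, k):
    """Weights plus biases of one k x k convolution layer.

    The count does not depend on the spatial size of the input.
    """
    return c_in * c_out * k * k + c_out


def patch_encoder_params(patch_side, channels=1, hidden_ratio=0.25):
    """First dense layer of a hypothetical patch autoencoder.

    Assumption: the hidden width is a fixed fraction of the flattened
    patch size, so the layer grows quadratically with the pixel count.
    """
    n_in = patch_side * patch_side * channels
    n_hidden = max(1, int(n_in * hidden_ratio))
    return dense_params(n_in, n_hidden)


if __name__ == "__main__":
    for side in (16, 32, 64):
        print(f"{side}x{side} patch, dense encoder layer:",
              patch_encoder_params(side))
    print("3x3 conv, 1 -> 64 channels, any frame size:",
          conv_params(1, 64, 3))
```

Under these assumptions, doubling the patch side multiplies the dense-layer parameter count by roughly 16, which is the kind of memory pressure the abstract attributes to larger patches.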
Pages: 38345-38370
Page count: 26
Source
Multimedia Tools and Applications, 2023, 82: 38345-38370