Unsupervised Multi-Latent Space RL Framework for Video Summarization in Ultrasound Imaging

Cited by: 2
Authors
Mathews, Roshan P. [1, 2]
Panicker, Mahesh Raveendranatha [1]
Hareendranathan, Abhilash R. [3]
Chen, Yale Tung [4]
Jaremko, Jacob L.
Buchanan, Brian [5]
Narayan, Kiran Vishnu [6]
Kesavadas, C. [7]
Mathews, Greeta [8]
Affiliations
[1] Indian Inst Technol Palakkad, Ctr Computat Imaging, Dept Elect Engn, Kozhippara, India
[2] Univ Calif Los Angeles, Dept Elect & Comp Engn, Los Angeles, CA 90095 USA
[3] Univ Alberta, Radiol & Diagnost Imaging Dept, Edmonton, AB, Canada
[4] Hosp Univ Puerta Hierro, Majadahonda, Spain
[5] Univ Alberta, Crit Care Med Dept, Edmonton, AB, Canada
[6] Govt Med Coll, Thiruvananthapuram, India
[7] Sree Chitra Tirunal Inst Med Sci & Technol, Thiruvananthapuram, India
[8] Bhagwan Mahaveer Jain Hosp, Radiol Dept, Bangalore, India
Keywords
Ultrasound; video summarization; unsupervised reinforcement learning; attention ensemble encoders; classification
DOI
10.1109/JBHI.2022.3208779
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
The COVID-19 pandemic has highlighted the need for a tool to speed up triage in ultrasound scans and provide clinicians with fast access to relevant information. To this end, we propose a new unsupervised reinforcement learning (RL) framework for summarizing ultrasound videos, with novel rewards that enable unsupervised learning and avoid tedious and impractical manual labelling. The proposed framework delivers video summaries with classification labels and segmentations of key landmarks, which enhances its utility as a triage tool in the emergency department (ED) and in telemedicine. Using an attention ensemble of encoders, each high-dimensional image is projected into a low-dimensional latent space in terms of: a) reduced distance to a normal or abnormal class (classifier encoder), b) adherence to a topology of landmarks (segmentation encoder), and c) a distance- or topology-agnostic latent representation (autoencoders). The summarization network is implemented as a bi-directional long short-term memory (Bi-LSTM) network that uses the latent space representation from the encoders. Validation is performed on lung ultrasound (LUS) videos, which represent typical use cases in telemedicine and ED triage, acquired from different medical centers across geographies (India and Spain). Trained and tested on 126 LUS videos, the proposed approach showed high agreement with the ground truth, with an average precision of over 80% and an average F-1 score of well over 44 +/- 1.7%. The approach resulted in an average reduction in storage space of 77%, which can ease bandwidth and storage requirements in telemedicine.
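The pipeline described in the abstract (several encoder latents combined by attention, then frames chosen for a summary) can be pictured with a small sketch. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the attention weights are computed or how frames are selected, so the softmax weighting, the fixed 0.5 threshold, and every function name below are hypothetical. The Bi-LSTM that would produce the per-frame importance scores is left out, and the scores are taken as given. The frame-count ratio is used as a rough proxy for the storage reduction the abstract reports.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_latents(latents, attention_logits):
    """Attention-weighted sum of equally sized latent vectors.

    Assumption: one latent per encoder (classifier, segmentation,
    autoencoder), all with the same dimension, and one scalar logit each.
    """
    weights = softmax(attention_logits)
    dim = len(latents[0])
    return [sum(w * z[i] for w, z in zip(weights, latents)) for i in range(dim)]


def select_frames(importance, threshold=0.5):
    """Keep the indices of frames whose importance reaches the threshold.

    Assumption: importance scores lie in [0, 1]; the threshold is hypothetical.
    """
    return [i for i, score in enumerate(importance) if score >= threshold]


def storage_reduction(n_total, n_kept):
    """Fraction of frames dropped, a rough proxy for saved storage."""
    return 1.0 - n_kept / n_total


if __name__ == "__main__":
    # Three hypothetical 2-D latents for a single frame, equal attention.
    fused = fuse_latents([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [0.0, 0.0, 0.0])
    print(fused)

    # Hypothetical per-frame importance scores for a four-frame clip.
    print(select_frames([0.9, 0.1, 0.6, 0.4]))

    # Keeping 23 of 100 frames drops 77% of them.
    print(round(storage_reduction(100, 23), 2))
```

With equal logits, each encoder contributes one third of the fused vector; raising one logit shifts the fused representation toward that encoder's latent.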
Pages: 227-238
Page count: 12
Related papers
23 in total
  • [1] RL Based Unsupervised Video Summarization Framework for Ultrasound Imaging
    Mathews, Roshan P.
    Panicker, Mahesh Raveendranatha
    Hareendranathan, Abhilash R.
    Chen, Yale Tung
    Jaremko, Jacob L.
    Buchanan, Brian
    Narayan, Kiran Vishnu
    Chandrasekharan, Kesavadas
    Mathews, Greeta
    [J]. SIMPLIFYING MEDICAL ULTRASOUND, ASMUS 2022, 2022, 13565 : 23 - 33
  • [2] vid-SAMGRAH: A PyTorch framework for multi-latent space reinforcement learning driven video summarization in ultrasound imaging
    Mathews, Roshan P.
    Panicker, Mahesh Raveendranatha
    Hareendranathan, Abhilash R.
    [J]. SOFTWARE IMPACTS, 2021, 10
  • [3] MULTI-LATENT GAN INVERSION FOR UNSUPERVISED 3D SHAPE COMPLETION
    Ghosh, Krishnendu
    Kar, Aupendu
    Bhattacharya, Saumik
    Sen, Debashis
    Biswas, Prabir Kumar
    [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2022, : 3460 - 3464
  • [4] A Generalized Hierarchical Multi-Latent Space Model for Heterogeneous Learning
    Yang, Pei
    Davulcu, Hasan
    Zhu, Yada
    He, Jingrui
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2016, 28 (12) : 3154 - 3168
  • [5] Model Multiple Heterogeneity via Hierarchical Multi-Latent Space Learning
    Yang, Pei
    He, Jingrui
    [J]. KDD'15: PROCEEDINGS OF THE 21ST ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2015, : 1375 - 1384
  • [6] Unsupervised Video Summarization via Multi-source Features
    Kanafani, Hussain
    Ghauri, Junaid Ahmed
    Hakimov, Sherzod
    Ewerth, Ralph
    [J]. PROCEEDINGS OF THE 2021 INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL (ICMR '21), 2021, : 466 - 470
  • [7] Diversity Learning Based on Multi-Latent Space for Medical Image Visual Question Generation
    Zhu, He
    Togo, Ren
    Ogawa, Takahiro
    Haseyama, Miki
    [J]. SENSORS, 2023, 23 (03)
  • [8] Unsupervised Framework for Comment-based Multi-document Extractive Summarization
    Roha, Vishal Singh
    Saini, Naveen
    Saha, Sriparna
    Moreno, Jose G.
    [J]. PROCEEDINGS OF THE 2022 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE (GECCO'22), 2022, : 574 - 582
  • [9] Object of Interest and Unsupervised Learning-based Framework for an Effective Video Summarization Using Deep Learning
    Negi, Alok
    Kumar, Krishan
    Saini, Parul
    [J]. IETE JOURNAL OF RESEARCH, 2024, 70 (05) : 5019 - 5030
  • [10] Personalized Video Summarization Based on Multi-layered Probabilistic Latent Semantic Analysis with shared Topics
    Chung, Cheng-Tao
    Hsiung, Hsin-Kuan
    Wei, Cheng-Kuang
    Lee, Lin-Shan
    [J]. 2014 9TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2014, : 173 - +