A Survey on Multi-modal Summarization

Cited by: 22
Authors
Jangra, Anubhav [1]
Mukherjee, Sourajit [2]
Jatowt, Adam [3, 4]
Saha, Sriparna [1]
Hasanuzzaman, Mohammad [5]
Affiliations
[1] Indian Inst Technol Patna, Dept Comp Sci, Patna 801106, Bihar, India
[2] Indian Inst Technol Patna, Dept Math, Patna, Bihar, India
[3] Univ Innsbruck, Dept Informat, Innsbruck, Austria
[4] Univ Innsbruck, DiSC, Innsbruck, Austria
[5] Cork Inst Technol, Dept Comp Sci, Cork, Ireland
Keywords
Summarization; multi-modal content processing; neural networks; FUSION; VIDEO; LANGUAGE; SALIENCY; REVIEWS;
DOI
10.1145/3584700
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
The new era of technology has brought us to the point where it is convenient for people to share their opinions over an abundance of platforms. These platforms have a provision for the users to express themselves in multiple forms of representations, including text, images, videos, and audio. This, however, makes it difficult for users to obtain all the key information about a topic, making the task of automatic multi-modal summarization (MMS) essential. In this article, we present a comprehensive survey of the existing research in the area of MMS, covering various modalities such as text, image, audio, and video. Apart from highlighting the different evaluation metrics and datasets used for the MMS task, our work also discusses the current challenges and future directions in this field.
Pages: 36