A Survey on Multi-modal Summarization

Cited by: 22
Authors
Jangra, Anubhav [1]
Mukherjee, Sourajit [2]
Jatowt, Adam [3, 4]
Saha, Sriparna [1]
Hasanuzzaman, Mohammad [5]
Affiliations
[1] Indian Inst Technol Patna, Dept Comp Sci, Patna 801106, Bihar, India
[2] Indian Inst Technol Patna, Dept Math, Patna, Bihar, India
[3] Univ Innsbruck, Dept Informat, Innsbruck, Austria
[4] Univ Innsbruck, DiSC, Innsbruck, Austria
[5] Cork Inst Technol, Dept Comp Sci, Cork, Ireland
Keywords
Summarization; multi-modal content processing; neural networks; FUSION; VIDEO; LANGUAGE; SALIENCY; REVIEWS;
DOI
10.1145/3584700
Chinese Library Classification number
TP301 [Theory and methods]
Discipline classification code
081202
Abstract
The new era of technology has brought us to the point where it is convenient for people to share their opinions over an abundance of platforms. These platforms have a provision for the users to express themselves in multiple forms of representations, including text, images, videos, and audio. This, however, makes it difficult for users to obtain all the key information about a topic, making the task of automatic multi-modal summarization (MMS) essential. In this article, we present a comprehensive survey of the existing research in the area of MMS, covering various modalities such as text, image, audio, and video. Apart from highlighting the different evaluation metrics and datasets used for the MMS task, our work also discusses the current challenges and future directions in this field.
Pages: 36
Related papers
50 in total
  • [21] A survey of multi-modal learning theory (in English)
    HUANG Yu
    HUANG Longbo
    中山大学学报(自然科学版)(中英文), 2023, 62 (05) : 38 - 49
  • [22] A survey on multi-modal social event detection
    Zhou, Han
    Yin, Hongpeng
    Zheng, Hengyi
    Li, Yanxia
    KNOWLEDGE-BASED SYSTEMS, 2020, 195
  • [23] Multi-Modal Supplementary-Complementary Summarization using Multi-Objective Optimization
    Jangra, Anubhav
    Saha, Sriparna
    Jatowt, Adam
    Hasanuzzaman, Mohammed
    SIGIR '21 - PROCEEDINGS OF THE 44TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2021, : 818 - 828
  • [24] MM-AVS: A Full-Scale Dataset for Multi-modal Summarization
    Fu, Xiyan
    Wang, Jun
    Yang, Zhenglu
    2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021), 2021, : 5922 - 5926
  • [25] A Multi-Modal Transformer-based Code Summarization Approach for Smart Contracts
    Yang, Zhen
    Keung, Jacky
    Yu, Xiao
    Gu, Xiaodong
    Wei, Zhengyuan
    Ma, Xiaoxue
    Zhang, Miao
    2021 IEEE/ACM 29TH INTERNATIONAL CONFERENCE ON PROGRAM COMPREHENSION (ICPC 2021), 2021, : 1 - 12
  • [26] A Survey of Multi-modal Knowledge Graphs: Technologies and Trends
    Liang, Wanying
    De Meo, Pasquale
    Tang, Yong
    Zhu, Jia
    ACM COMPUTING SURVEYS, 2024, 56 (11)
  • [27] Multi-Modal Hashing for Efficient Multimedia Retrieval: A Survey
    Zhu, Lei
    Zheng, Chaoqun
    Guan, Weili
    Li, Jingjing
    Yang, Yang
    Shen, Heng Tao
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2024, 36 (01) : 239 - 260
  • [28] Modeling Complexity in Multi-modal Adaptive Survey Systems
    Highland, Fred
    COMPLEX ADAPTIVE SYSTEMS, 2014, 36 : 198 - 203
  • [29] EnCoSum: enhanced semantic features for multi-scale multi-modal source code summarization
    Yuexiu Gao
    Hongyu Zhang
    Chen Lyu
    Empirical Software Engineering, 2023, 28
  • [30] Multization: Multi-Modal Summarization Enhanced by Multi-Contextually Relevant and Irrelevant Attention Alignment
    Rong, Huan
    Chen, Zhongfeng
    Lu, Zhenyu
    Xu, Fan
    Sheng, Victor S.
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2024, 23 (05)