Multimodal Fusion Strategies: Human vs. Machine

Cited by: 0
Author
Ko, Hanseok [1, 2]
Affiliations
[1] Korea Univ, Elect & Comp Engn, Seoul, South Korea
[2] Korea Univ, Machine Learning & Big Data Inst, Seoul, South Korea
Funding
National Research Foundation, Singapore;
Keywords
Attention; multimodality; audio-visual information fusion;
DOI
10.1145/3264869.3264870
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
A two-hour movie, or a short clip taken from it, is intended to capture and present a meaningful (or significant) story in video, to be recognized and understood by a human audience. What if we replace the human audience with an intelligent machine or robot capable of capturing and processing the semantic information carried by the audio and video cues in the video? Using both auditory and visual means, the human brain processes the audio (sound, speech) and video (background scene, moving objects, written characters) modalities to extract spatial and temporal semantic information that is contextually complementary and robust. Smart machines equipped with audiovisual multisensors (e.g., CCTV systems with cameras and microphones) should be capable of achieving the same task. An appropriate fusion strategy for combining audio and visual information would be a key component in developing such artificial general intelligence (AGI) systems. This talk reviews the challenges of current video analytics schemes and explores various sensor fusion techniques [1, 2, 3, 4] for combining audio-visual information cues in video content analytics tasks.
Pages: 1 / 1
Number of pages: 1
Related papers
50 items in total
  • [1] Human vs. Machine - or with Machine?
    Giannakopoulos, Triantafillos G.
    Kyriazanos, Ioannis
    [J]. EUROPEAN JOURNAL OF VASCULAR AND ENDOVASCULAR SURGERY, 2021, 62 (06) : 878 - 878
  • [2] Fusion vs. Two-Stage for Multimodal Retrieval
    Arampatzis, Avi
    Zagoris, Konstantinos
    Chatzichristofis, Savvas A.
    [J]. ADVANCES IN INFORMATION RETRIEVAL, 2011, 6611 : 759 - 762
  • [3] A Deep Fusion Model for Human vs. Machine-Generated Essay Classification
    Corizzo, Roberto
    Leal-Arenas, Sebastian
    [J]. 2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN, 2023,
  • [4] Intermediality and Human vs. Machine Translation
    Huang, Harry J.
    [J]. CLCWEB-COMPARATIVE LITERATURE AND CULTURE, 2011, 13 (03):
  • [5] Measurement of vitiligo: human vs. machine
    Edwards, C.
    [J]. BRITISH JOURNAL OF DERMATOLOGY, 2019, 180 (05) : 991 - 991
  • [6] Sound Localization: Human Vs. Machine
    Jayaweera, W. G. Nuwan
    Buddhika, A. G.
    Jayasekara, P.
    Abeykoon, A. M. Harsha S.
    [J]. 2014 7TH INTERNATIONAL CONFERENCE ON INFORMATION AND AUTOMATION FOR SUSTAINABILITY (ICIAFS), 2014,
  • [7] Human vs. Machine: A Comparison of Strategies for Case Finding and Eliciting Referrals in Palliative Care
    Holdsworth, Laura M.
    Mui, Heather Z.
    Winget, Marcy
    Singh, Nainwant
    Lorenz, Karl
    [J]. JOURNAL OF PAIN AND SYMPTOM MANAGEMENT, 2023, 65 (05) : E658 - E659
  • [8] Early vs. Late Multimodal Fusion for Recognizing Confusion in Collaborative Tasks
    Ashwath, Anisha
    Peechatt, Michael
    Alm, Cecilia
    Bailey, Reynold
    [J]. 2023 11TH INTERNATIONAL CONFERENCE ON AFFECTIVE COMPUTING AND INTELLIGENT INTERACTION WORKSHOPS AND DEMOS, ACIIW, 2023,
  • [9] ChatGPT and exercise prescription: Human vs. machine or human plus machine?
    Cavazzotto, Timothy Gustavo
    Dantas, Diego Bessa
    Queiroga, Marcos Roberto
    [J]. JOURNAL OF SPORT AND HEALTH SCIENCE, 2024, 13 (05) : 661 - 662
  • [10] Human vs. machine: evaluation of fluorescence micrographs
    Nattkemper, TW
    Twellmann, T
    Ritter, H
    Schubert, W
    [J]. COMPUTERS IN BIOLOGY AND MEDICINE, 2003, 33 (01) : 31 - 43