Possibilities of the Latest AI Models in Production – Multi-Modal Foundation Models in Production

Authors
Behnen, H. [1]
Woltersmann, J.-H. [2, 3]
Wolfschläger, D. [2, 3]
Schmitt, R.H. [2, 3]
Affiliations
[1] RWTH Aachen University, Germany
[2] WZL | RWTH Aachen University, Germany
[3] Intelligence in Quality Sensing (IQS), Lehrstuhl für Informations-, Qualitäts- und Sensorsysteme in der Produktion, Campus-Boulevard 30, Aachen, 52074, Germany
Source
WT Werkstattstechnik | 2024 / Vol. 114 / No. 11-12
DOI
10.37544/1436-4980-2024-11-12-43
Abstract
Current challenges in production, such as the shortage of skilled workers, increase the need to automate processes and raise productivity. Multi-modal foundation models address this demand for automation across a variety of applications by deriving decisions from heterogeneous information sources. However, applications built on this technology are currently rare. This article therefore provides an overview of the potential and challenges of these models in production. © 2024, VDI Fachmedien GmbH & Co. KG. All rights reserved.
Pages: 747-754