Multi-modal fusion in ergonomic health: bridging visual and pressure for sitting posture detection

Cited by: 1
Authors
Quan, Qinxiao [1 ]
Gao, Yang [2 ]
Bai, Yang [1 ]
Jin, Zhanpeng [1 ]
Affiliations
[1] South China Univ Technol, Sch Future Technol, Guangzhou, Peoples R China
[2] East China Normal Univ, Sch Comp Sci, Shanghai, Peoples R China
Keywords
Pressure sensing; Computer vision; Sitting posture recognition; Feature fusion; Multi-label classification
DOI
10.1007/s42486-024-00164-x
CLC number
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
As the tension between the pursuit of health and ever-longer sedentary office hours intensifies, maintaining a correct sitting posture while working has drawn growing attention in recent years. Scientific studies have shown that correcting sitting posture helps alleviate physical pain. With the rapid development of artificial intelligence, much research has turned to sitting posture detection and recognition systems built on machine learning. In this paper, we introduce a sitting posture recognition system that integrates the visual and pressure modalities. The system trains the two modality-specific models with a differentiated pre-training strategy and combines their outputs through a feature fusion module built on feed-forward networks. In office scenarios, it collects visual data with the built-in cameras commonly available in laptops and pressure data with thin-film pressure sensor mats. On a dataset of complex composite actions, the system achieved an F1-Macro score of 95.43%, an improvement of 7.13% and 10.79% over systems relying solely on the pressure or visual modality, respectively, and of 7.07% over systems using a uniform pre-training strategy.
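To make the pipeline described in the abstract concrete, below is a minimal PyTorch sketch of the two-stream design: per-modality features pass through a feed-forward fusion block into a multi-label head. The dimensions (vis_dim, prs_dim, hidden), the label count, and the layer layout are illustrative assumptions; the abstract does not specify the exact configuration, and this is not the authors' released code.

    import torch
    import torch.nn as nn
    from sklearn.metrics import f1_score

    class FeedForwardFusion(nn.Module):
        # Hypothetical fusion module: concatenate the visual and pressure
        # embeddings, mix them with a small feed-forward network, and emit
        # one logit per posture label (multi-label, hence no softmax).
        def __init__(self, vis_dim=512, prs_dim=128, hidden=256, n_labels=7):
            super().__init__()
            self.fuse = nn.Sequential(
                nn.Linear(vis_dim + prs_dim, hidden), nn.ReLU(), nn.Dropout(0.1),
                nn.Linear(hidden, hidden), nn.ReLU(),
            )
            self.head = nn.Linear(hidden, n_labels)

        def forward(self, vis_feat, prs_feat):
            z = torch.cat([vis_feat, prs_feat], dim=-1)
            return self.head(self.fuse(z))  # raw logits

    # Toy batch standing in for features from the two pre-trained encoders.
    model = FeedForwardFusion()
    vis, prs = torch.randn(4, 512), torch.randn(4, 128)
    targets = torch.randint(0, 2, (4, 7)).float()
    logits = model(vis, prs)

    # Multi-label training pairs per-label sigmoids with BCE loss.
    loss = nn.BCEWithLogitsLoss()(logits, targets)

    # Evaluation with the paper's headline metric: F1 macro-averaged over labels.
    preds = (torch.sigmoid(logits) > 0.5).int().detach().numpy()
    print(f1_score(targets.numpy(), preds, average="macro", zero_division=0))

The multi-label formulation explains the choice of F1-Macro: per-label F1 scores are averaged with equal weight, so infrequent composite actions count as much as common ones.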
Pages: 380-393
Page count: 14
Related papers
50 records in total
  • [21] Text-Guided Multi-Modal Fusion for Underwater Visual Tracking
    Michael, Yonathan
    Alansari, Mohamad
    Javed, Sajid
    2024 IEEE INTERNATIONAL CONFERENCE ON ADVANCED VIDEO AND SIGNAL BASED SURVEILLANCE, AVSS 2024, 2024
  • [22] Multi-Modal Fusion Transformer for Visual Question Answering in Remote Sensing
    Siebert, Tim
    Clasen, Kai Norman
    Ravanbakhsh, Mahdyar
    Demir, Begüm
    IMAGE AND SIGNAL PROCESSING FOR REMOTE SENSING XXVIII, 2022, 12267
  • [23] The multi-modal fusion in visual question answering: a review of attention mechanisms
    Lu, Siyu
    Liu, Mingzhe
    Yin, Lirong
    Yin, Zhengtong
    Liu, Xuan
    Zheng, Wenfeng
    PEERJ COMPUTER SCIENCE, 2023, 9
  • [24] Learning Visual Emotion Distributions via Multi-Modal Features Fusion
    Zhao, Sicheng
    Ding, Guiguang
    Gao, Yue
    Han, Jungong
    PROCEEDINGS OF THE 2017 ACM MULTIMEDIA CONFERENCE (MM'17), 2017: 369-377
  • [25] Multi-Modal Anomaly Detection by Using Audio and Visual Cues
    Rehman, Ata-Ur
    Ullah, Hafiz Sami
    Farooq, Haroon
    Khan, Muhammad Salman
    Mahmood, Tayyeb
    Khan, Hafiz Owais Ahmed
    IEEE ACCESS, 2021, 9: 30587-30603
  • [26] Multi-Modal Fusion for Multi-Task Fuzzy Detection of Rail Anomalies
    Liyuan, Yang
    Osman, Ghazali
    Abdul Rahman, Safawi
    Mustapha, Muhammad Firdaus
    IEEE ACCESS, 2024, 12: 73925-73935
  • [27] Multi-level and Multi-modal Target Detection Based on Feature Fusion
    Cheng T.
    Sun L.
    Hou D.
    Shi Q.
    Zhang J.
    Chen J.
    Huang H.
    Qiche Gongcheng/Automotive Engineering, 2021, 43(11): 1602-1610
  • [28] Soft multi-modal data fusion
    Coppock, S
    Mazack, L
    PROCEEDINGS OF THE 12TH IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS, VOLS 1 AND 2, 2003: 636-641
  • [29] Multi-modal fusion for video understanding
    Hoogs, A
    Mundy, J
    Cross, G
    30TH APPLIED IMAGERY PATTERN RECOGNITION WORKSHOP, PROCEEDINGS: ANALYSIS AND UNDERSTANDING OF TIME VARYING IMAGERY, 2001: 103-108
  • [30] Multi-modal data fusion: A description
    Coppock, S
    Mazlack, LJ
    KNOWLEDGE-BASED INTELLIGENT INFORMATION AND ENGINEERING SYSTEMS, PT 2, PROCEEDINGS, 2004, 3214: 1136-1142