WiVi: WiFi-Video Cross-Modal Fusion based Multi-Path Gait Recognition System

Cited by: 3

Authors
Fan, Jinmeng [1 ,2 ,3 ]
Zhou, Hao [1 ,2 ,3 ]
Zhou, Fengyu [1 ,2 ,3 ]
Wang, Xiaoyan [4 ]
Liu, Zhi [5 ]
Li, Xiang-Yang [1 ,2 ,3 ]
Affiliations
[1] Univ Sci & Technol China, Sch Comp Sci & Technol, LINKE Lab, Hefei, Peoples R China
[2] Univ Sci & Technol China, CAS Key Lab Wireless Opt Commun, Hefei, Peoples R China
[3] Deqing Alpha Innovat Inst, Huzhou, Zhejiang, Peoples R China
[4] Ibaraki Univ, Grad Sch Sci & Engn, Ibaraki, Japan
[5] Univ Electrocommun, Dept Comp & Network Engn, Tokyo, Japan
Funding
National Key Research and Development Program of China;
Keywords
DOI
10.1109/IWQoS54832.2022.9812893
CLC Number
TP301 [Theory and Methods];
Discipline Code
081202;
Abstract
WiFi-based gait recognition is an attractive method for device-free user identification, but path-sensitive Channel State Information (CSI) hinders its application in multi-path environments, which inflates sampling and deployment costs (i.e., a large number of samples and multiple specially placed devices). On the other hand, although video-based ideal CSI generation is promising for dramatically reducing the number of required samples, the missing environment-related information in the ideal CSI makes it unsuitable for general indoor scenarios with multiple walking paths. In this paper, we propose WiVi, a WiFi-video cross-modal fusion based multi-path gait recognition system that needs fewer samples and fewer devices at the same time. When the subject walks naturally in the room, we determine whether he/she is walking on one of the predefined judgment paths with a K-Nearest Neighbors (KNN) classifier applied to WiFi-based human localization results. For each judgment path, we generate the ideal CSI through video-based simulation to decrease the number of needed samples, and adopt two separate neural networks (NNs) to perform environment-aware comparison between the ideal and measured CSIs. The first network is supervised by measured CSI samples and learns to produce semi-ideal CSI features that contain the room-specific 'accent', i.e., the long-term environmental influence typically caused by the room layout. The second network is trained for similarity evaluation between the semi-ideal and measured features in the presence of short-term environmental influences such as channel variation and noise. We implement a prototype system and conduct extensive experiments to evaluate its performance. Experimental results show that WiVi's recognition accuracy ranges from 85.4% for a 6-person group to 98.0% for a 3-person group. Compared with single-path gait recognition systems, WiVi achieves an average performance improvement of 113.8%. Compared with other multi-path gait recognition systems, it achieves similar or even better performance while reducing the number of required samples by 57.1-93.7%.
Pages: 10
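The abstract describes a three-stage pipeline: a KNN classifier on WiFi localization results that decides whether the subject is on a predefined judgment path, an "accent" network that turns video-simulated ideal CSI into semi-ideal features carrying the long-term room-layout influence, and a second network that scores similarity against measured CSI despite short-term channel variation. The sketch below is not the authors' implementation; it is a minimal illustration of that structure under assumed tensor shapes and layer sizes, with hypothetical names (fit_path_judge, AccentNet, SimilarityNet, identify).

```python
import torch
import torch.nn as nn
from sklearn.neighbors import KNeighborsClassifier

# --- Step 1: judgment-path detection from WiFi localization results ---------
# traj_train: (N, T, 2) localization traces; path_labels: which judgment path
# (or "off-path") each trace belongs to. Shapes are assumptions.
def fit_path_judge(traj_train, path_labels, k=5):
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(traj_train.reshape(len(traj_train), -1), path_labels)
    return knn

# --- Steps 2-3: environment-aware comparison of ideal vs. measured CSI ------
class AccentNet(nn.Module):
    """Maps video-simulated ideal CSI to semi-ideal features that absorb the
    long-term, room-specific 'accent'; supervised with measured CSI samples."""
    def __init__(self, csi_dim, feat_dim=128):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(csi_dim, 256), nn.ReLU(),
            nn.Linear(256, feat_dim))

    def forward(self, ideal_csi):           # (B, csi_dim)
        return self.encoder(ideal_csi)      # (B, feat_dim) semi-ideal features

class SimilarityNet(nn.Module):
    """Scores how well a measured CSI segment matches a semi-ideal feature,
    tolerating short-term effects such as channel variation and noise."""
    def __init__(self, csi_dim, feat_dim=128):
        super().__init__()
        self.measure_enc = nn.Sequential(
            nn.Linear(csi_dim, 256), nn.ReLU(),
            nn.Linear(256, feat_dim))
        self.head = nn.Sequential(
            nn.Linear(2 * feat_dim, 64), nn.ReLU(),
            nn.Linear(64, 1))

    def forward(self, semi_ideal_feat, measured_csi):
        m = self.measure_enc(measured_csi)
        return torch.sigmoid(self.head(torch.cat([semi_ideal_feat, m], dim=-1)))

# Identification: the enrolled subject whose ideal CSI on the detected
# judgment path best matches the measured CSI is reported as the walker.
def identify(accent_net, sim_net, ideal_csi_per_subject, measured_csi):
    scores = []
    for ideal in ideal_csi_per_subject:          # one (csi_dim,) tensor per user
        feat = accent_net(ideal.unsqueeze(0))
        scores.append(sim_net(feat, measured_csi.unsqueeze(0)).item())
    return int(torch.tensor(scores).argmax())
```

The split into two networks mirrors the abstract's distinction between long-term environmental influence (absorbed once per room by the accent network) and short-term influence (handled by the similarity network at comparison time); the actual WiVi architectures, losses, and feature dimensions are not specified here and would differ.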