SelfPAB: large-scale pre-training on accelerometer data for human activity recognition

Cited by: 0
Authors
Aleksej Logacjov
Sverre Herland
Astrid Ustad
Kerstin Bach
Affiliations
[1] Norwegian University of Science and Technology, Department of Computer Science
[2] Norwegian University of Science and Technology, Department of Neuromedicine and Movement Science
Source
Applied Intelligence | 2024, Vol. 54
Keywords
Accelerometer; Human activity recognition; Machine learning; Physical activity behavior; Self-supervised learning; Transformer
DOI: not available
Abstract
Annotating accelerometer-based physical activity data remains a challenging task, limiting the creation of robust supervised machine learning models due to the scarcity of large, labeled, free-living human activity recognition (HAR) datasets. Researchers are exploring self-supervised learning (SSL) as an alternative to approaches that rely solely on labeled data. However, there has been limited exploration of the impact of large-scale, unlabeled datasets for SSL pre-training on downstream HAR performance, particularly when more than one accelerometer is used. To address this gap, a transformer encoder network is pre-trained on varying amounts of unlabeled, dual-accelerometer data from the HUNT4 dataset: 10, 100, 1k, 10k, and 100k hours. The objective is to reconstruct masked segments of signal spectrograms. This pre-trained model, termed SelfPAB, serves as a feature extractor for downstream supervised HAR training across five datasets (HARTH, HAR70+, PAMAP2, Opportunity, and RealWorld). SelfPAB outperforms purely supervised baselines and other SSL methods, with notable gains especially for activities with limited training data. Results show that more pre-training data improves downstream HAR performance, with the 100k-hour model performing best. It surpasses purely supervised baselines by absolute F1-score improvements of 7.1% (HARTH), 14% (HAR70+), and an average of 11.26% across the PAMAP2, Opportunity, and RealWorld datasets. Compared to related SSL methods, SelfPAB achieves absolute F1-score improvements of 10.4% (HARTH), 18.8% (HAR70+), and 16% (average across PAMAP2, Opportunity, RealWorld).
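The pre-training objective sketched in the abstract (mask segments of an accelerometer spectrogram, then reconstruct them) can be illustrated in a few lines of numpy. This is a rough sketch only: the window/hop lengths, mask span, synthetic signal, and the random linear map standing in for the paper's transformer encoder are all illustrative assumptions, not SelfPAB's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def spectrogram(x, win=50, hop=25):
    """Magnitude spectrogram via a simple STFT with a Hann window.
    Frame and hop lengths are illustrative, not the paper's values."""
    w = np.hanning(win)
    frames = [np.abs(np.fft.rfft(x[s:s + win] * w))
              for s in range(0, len(x) - win + 1, hop)]
    return np.stack(frames)                      # (time_frames, freq_bins)

# Synthetic dual-accelerometer recording: 2 sensors x 3 axes, 10 s at 50 Hz.
signal = rng.standard_normal((6, 500))
spec = np.stack([spectrogram(ch) for ch in signal])   # (channels, T, F)

# Mask a contiguous block of time frames in every channel.
T, F = spec.shape[1], spec.shape[2]
mask = np.zeros(T, dtype=bool)
mask[T // 4: T // 2] = True
masked_spec = spec.copy()
masked_spec[:, mask, :] = 0.0                    # masked segments zeroed out

# Stand-in "encoder": SelfPAB uses a transformer encoder; a random linear
# map per frame is enough to illustrate the reconstruction objective.
W = rng.standard_normal((F, F)) * 0.1
recon = masked_spec @ W

# Pre-training loss: mean squared error on the masked segments only.
loss = float(np.mean((recon[:, mask, :] - spec[:, mask, :]) ** 2))
print(round(loss, 4))
```

During real pre-training the encoder's weights would be updated by gradient descent to minimize this masked-reconstruction loss; afterwards the encoder is frozen (or fine-tuned) and used as a feature extractor for the downstream supervised HAR task.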
Pages: 4545-4563 (18 pages)
Related papers (50 total)
  • [31] Wukong: A 100 Million Large-scale Chinese Cross-modal Pre-training Benchmark
    Gu, Jiaxi
    Meng, Xiaojun
    Lu, Guansong
    Hou, Lu
    Niu, Minzhe
    Liang, Xiaodan
    Yao, Lewei
    Huang, Runhui
    Zhang, Wei
    Jiang, Xin
    Xu, Chunjing
    Xu, Hang
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022,
  • [32] A Comparison between Pre-training and Large-scale Back-translation for Neural Machine Translation
    Huang, Dandan
    Wang, Kun
    Zhang, Yue
    [J]. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021, : 1718 - 1732
  • [33] Large-Scale Self-Supervised Human Activity Recognition
    Zadeh, Mohammad Zaki
    Jaiswal, Ashish
    Pavel, Hamza Reza
    Hebri, Aref
    Kapoor, Rithik
    Makedon, Fillia
    [J]. PROCEEDINGS OF THE 15TH INTERNATIONAL CONFERENCE ON PERVASIVE TECHNOLOGIES RELATED TO ASSISTIVE ENVIRONMENTS, PETRA 2022, 2022, : 298 - 299
  • [34] Score Images as a Modality: Enhancing Symbolic Music Understanding through Large-Scale Multimodal Pre-Training
    Qin, Yang
    Xie, Huiming
    Ding, Shuxue
    Li, Yujie
    Tan, Benying
    Ye, Mingchuan
    [J]. SENSORS, 2024, 24 (15)
  • [35] EVA2.0: Investigating Open-domain Chinese Dialogue Systems with Large-scale Pre-training
    Gu, Yuxian
    Wen, Jiaxin
    Sun, Hao
    Song, Yi
    Ke, Pei
    Zheng, Chujie
    Zhang, Zheng
    Yao, Jianzhu
    Liu, Lei
    Zhu, Xiaoyan
    Huang, Minlie
    [J]. MACHINE INTELLIGENCE RESEARCH, 2023, 20 (02) : 207 - 219
  • [36] OmDet: Large-scale vision-language multi-dataset pre-training with multimodal detection network
    Zhao, Tiancheng
    Liu, Peng
    Lee, Kyusong
    [J]. IET COMPUTER VISION, 2024, 18 (05) : 626 - 639
  • [37] Too Large; Data Reduction for Vision-Language Pre-Training
    Wang, Alex Jinpeng
    Lin, Kevin Qinghong
    Zhang, David Junhao
    Lei, Stan Weixian
    Shou, Mike Zheng
    [J]. 2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION, ICCV, 2023, : 3124 - 3134
  • [38] Dialogue Response Ranking Training with Large-Scale Human Feedback Data
    Gao, Xiang
    Zhang, Yizhe
    Galley, Michel
    Brockett, Chris
    Dolan, Bill
    [J]. PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020, : 386 - 395
  • [39] In Defense of Image Pre-Training for Spatiotemporal Recognition
    Li, Xianhang
    Wang, Huiyu
    Wei, Chen
    Mei, Jieru
    Yuille, Alan
    Zhou, Yuyin
    Xie, Cihang
    [J]. COMPUTER VISION, ECCV 2022, PT XXV, 2022, 13685 : 675 - 691
  • [40] A Framework for Human Activity Recognition Based on Accelerometer Data
    Mandal, Itishree
    Happy, S. L.
    Behera, Dipti Prakash
    Routray, Aurobinda
    [J]. 2014 5TH INTERNATIONAL CONFERENCE CONFLUENCE THE NEXT GENERATION INFORMATION TECHNOLOGY SUMMIT (CONFLUENCE), 2014, : 600 - 603