POLARDB: FORMULA-DRIVEN DATASET FOR PRE-TRAINING TRAJECTORY ENCODERS

Authors
Miyamoto, Sota [1]
Yagi, Takuma [3]
Makimoto, Yuto [1]
Ukai, Mahiro [1]
Ushiku, Yoshitaka [2]
Hashimoto, Atsushi [2]
Inoue, Nakamasa [1]
Affiliations
[1] Tokyo Inst Technol, Tokyo, Japan
[2] OMRON SINIC X Corp, Bunkyo, Japan
[3] Natl Inst Adv Ind Sci & Technol, Ibaraki, Japan
Source
2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, ICASSP 2024 | 2024
Keywords
Formula-driven supervised learning; Polar equations; Fine-grained action recognition; Cutting method recognition
DOI
10.1109/ICASSP48485.2024.10448448
Abstract
Formula-driven supervised learning (FDSL) is a growing research area that seeks simple mathematical formulas for generating synthetic data and labels to pre-train neural networks. Its main advantage is that, because it does not rely on real data, there is no risk of generating data with ethical implications such as gender or racial bias, as discussed in previous studies that used fractals and polygons to pre-train image encoders. While FDSL has been proposed for pre-training image encoders, it has not yet been considered for temporal trajectory data. In this paper, we introduce PolarDB, the first formula-driven dataset for pre-training trajectory encoders, with an application to fine-grained cutting-method recognition from hand trajectories. Specifically, we generate 270k trajectories for 432 categories on the basis of polar equations and use them to pre-train a Transformer-based trajectory encoder in an FDSL manner. In experiments, we show that pre-training on PolarDB improves the accuracy of fine-grained cutting-method recognition on cooking videos from the EPIC-KITCHENS and Ego4D datasets, where the pre-trained trajectory encoder serves as a plug-in module for a video recognition network.
Pages: 5465-5469
Page count: 5
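The paper's exact generation procedure and category design are not reproduced here. As a minimal illustrative sketch, assuming rose curves r = a·cos(kθ) as one family of polar equations (the parameters `k`, `a`, the point count, and the jitter scheme below are assumptions for illustration, not PolarDB's actual settings), synthetic labelled trajectories could be produced like this:

```python
import numpy as np

def polar_trajectory(k: int, a: float = 1.0, n_points: int = 64,
                     noise_std: float = 0.0, seed: int = 0) -> np.ndarray:
    """Sample a 2-D trajectory from the rose curve r = a * cos(k * theta).

    Each (k, a) pair can serve as a synthetic category label for
    formula-driven pre-training; noise_std adds per-point jitter so that
    many distinct samples exist per category.
    """
    rng = np.random.default_rng(seed)
    theta = np.linspace(0.0, 2.0 * np.pi, n_points)
    r = a * np.cos(k * theta)
    # Convert polar coordinates (r, theta) to Cartesian (x, y) points.
    xy = np.stack([r * np.cos(theta), r * np.sin(theta)], axis=1)
    xy += rng.normal(0.0, noise_std, size=xy.shape)
    return xy  # shape: (n_points, 2)

# Build a tiny labelled set: the curve parameter k acts as the category.
trajectories = [polar_trajectory(k, noise_std=0.01, seed=s)
                for k in range(1, 5) for s in range(3)]
labels = [k for k in range(1, 5) for _ in range(3)]
```

Trajectory/label pairs of this kind would then feed a standard supervised pre-training loop for a Transformer-based trajectory encoder, after which the encoder's classification head is discarded and the encoder is fine-tuned on real hand trajectories.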