PolarDB: Formula-Driven Dataset for Pre-Training Trajectory Encoders

Cited by: 0
Authors
Miyamoto, Sota [1]
Yagi, Takuma [3]
Makimoto, Yuto [1]
Ukai, Mahiro [1]
Ushiku, Yoshitaka [2]
Hashimoto, Atsushi [2]
Inoue, Nakamasa [1]
Affiliations
[1] Tokyo Institute of Technology, Tokyo, Japan
[2] OMRON SINIC X Corporation, Bunkyo, Japan
[3] National Institute of Advanced Industrial Science and Technology, Ibaraki, Japan
Source
2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2024), 2024
Keywords
Formula-driven supervised learning; Polar equations; Fine-grained action recognition; Cutting method recognition
DOI
10.1109/ICASSP48485.2024.10448448
Abstract
Formula-driven supervised learning (FDSL) is a growing research topic for finding simple mathematical formulas that generate synthetic data and labels for pre-training neural networks. Because it does not rely on real data, FDSL carries no risk of generating data with ethical implications such as gender or racial bias, as discussed in previous studies that used fractals and polygons to pre-train image encoders. While FDSL has been proposed for pre-training image encoders, it has not yet been considered for temporal trajectory data. In this paper, we introduce PolarDB, the first formula-driven dataset for pre-training trajectory encoders, with an application to fine-grained cutting-method recognition from hand trajectories. More specifically, we generate 270k trajectories across 432 categories on the basis of polar equations and use them to pre-train a Transformer-based trajectory encoder in an FDSL manner. In the experiments, we show that pre-training on PolarDB improves the accuracy of fine-grained cutting-method recognition on cooking videos from the EPIC-KITCHENS and Ego4D datasets, where the pre-trained trajectory encoder serves as a plug-in module for a video recognition network.
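As an illustration of the formula-driven generation described in the abstract, the minimal Python sketch below samples 2D trajectories from a family of polar (rose) curves and treats each formula parameter setting as its own synthetic category label. The curve family, parameter ranges, noise model, and function names (polar_trajectory, make_dataset) are illustrative assumptions, not the published PolarDB generation procedure.

```python
import numpy as np

def polar_trajectory(k, a, n_points=64, noise=0.01, rng=None):
    """Sample a 2D trajectory from a rose curve r = a * cos(k * theta).

    Assumption: the curve family and noise model are illustrative only,
    not the actual PolarDB settings.
    """
    rng = np.random.default_rng() if rng is None else rng
    theta = np.linspace(0.0, 2.0 * np.pi, n_points)
    r = a * np.cos(k * theta)
    xy = np.stack([r * np.cos(theta), r * np.sin(theta)], axis=-1)  # (n_points, 2)
    return xy + rng.normal(scale=noise, size=xy.shape)

def make_dataset(n_classes=8, samples_per_class=100, seed=0):
    """Each formula parameter k defines one synthetic category (the FDSL label)."""
    rng = np.random.default_rng(seed)
    trajectories, labels = [], []
    for label in range(n_classes):
        k = label + 1                  # petal count acts as the class-defining parameter
        for _ in range(samples_per_class):
            a = rng.uniform(0.5, 1.5)  # per-sample scale variation within a class
            trajectories.append(polar_trajectory(k, a, rng=rng))
            labels.append(label)
    return np.stack(trajectories), np.array(labels)

if __name__ == "__main__":
    X, y = make_dataset()
    print(X.shape, y.shape)  # (800, 64, 2) (800,)
```

In the FDSL setting sketched here, a trajectory encoder would be pre-trained to classify these formula-defined categories before being plugged into a downstream video recognition network, as the paper does for cutting-method recognition.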
Pages: 5465-5469
Number of pages: 5