CvFormer: Cross-view transFormers with pre-training for fMRI analysis of human brain

Cited by: 0
Authors
Meng, Xiangzhu [1 ,3 ]
Wei, Wei [4 ]
Liu, Qiang [1 ,2 ]
Wang, Yu [3 ]
Li, Min [3 ]
Wang, Liang [1 ,2 ]
Affiliations
[1] Chinese Acad Sci, Inst Automat, Ctr Res Intelligent Percept & Comp, 95 Zhongguancun East Rd, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci, Sch Artificial Intelligence, 1 Yanqihu East Rd, Beijing 101408, Peoples R China
[3] Jing Dong Retail, Dept User Growth & Operat, 18,Kechuang 11th St, Beijing 100176, Peoples R China
[4] Zhengzhou Univ, Sch Management, 100 Sci Ave, Zhengzhou 450001, Henan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Functional MRI; Human brain; Cross-view modeling; Transformers; Self-supervised learning; CONVOLUTIONAL NEURAL-NETWORKS;
DOI
10.1016/j.patrec.2024.09.010
CLC classification
TP18 [Theory of artificial intelligence];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In recent years, functional magnetic resonance imaging (fMRI) has been widely used to diagnose neurological diseases by exploiting region-of-interest (RoI) nodes in the human brain as well as the connectivities between them. However, most existing works rely on either RoIs or connectivities alone, neglecting the complementary information between them. To address this issue, we study how to discover the rich cross-view information in fMRI data of the human brain. This paper presents a novel method for cross-view analysis of fMRI data, called Cross-view transFormers (CvFormer). CvFormer employs RoI and connectivity encoder modules to generate two separate views of the human brain, represented as RoI and sub-connectivity tokens. Basic transformer modules then process the RoI and sub-connectivity tokens, and cross-view modules integrate the complementary information across the two views. Furthermore, CvFormer uses a global token for each branch as a query to exchange information with the other branch in the cross-view modules, which requires only linear rather than quadratic computational and memory complexity. To enhance the robustness of CvFormer, we propose a two-stage training strategy: the RoI and connectivity views are first used as self-supervised signals to pre-train CvFormer via contrastive learning, and the two views are then fused to fine-tune CvFormer with label information. Experimental results on the public ABIDE and ADNI datasets show clear improvements by the proposed CvFormer, validating its effectiveness and superiority.
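The abstract's key efficiency claim is that each branch exchanges information through a single global token used as the query, so cross-view attention costs grow linearly with the number of tokens in the other view. A minimal numpy sketch of that exchange, with hypothetical names (`cross_view_exchange`, `roi_tokens`, `conn_tokens`) and dimensions chosen only for illustration, not taken from the paper:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_view_exchange(tokens_a, tokens_b, d):
    """One cross-view step: the global token (index 0) of view A attends
    to all tokens of view B. Since there is a single query, the attention
    map has shape (1, m), i.e. linear in the size of view B."""
    q = tokens_a[0:1]                         # (1, d) global token of view A
    k, v = tokens_b, tokens_b                 # (m, d) keys/values from view B
    attn = softmax(q @ k.T / np.sqrt(d))      # (1, m) attention weights
    fused = attn @ v                          # (1, d) aggregated view-B info
    out = tokens_a.copy()
    out[0] = tokens_a[0] + fused[0]           # residual update of global token
    return out

rng = np.random.default_rng(0)
d = 16
roi_tokens = rng.normal(size=(8, d))    # RoI-view tokens, global token at index 0
conn_tokens = rng.normal(size=(12, d))  # sub-connectivity-view tokens

roi_updated = cross_view_exchange(roi_tokens, conn_tokens, d)
conn_updated = cross_view_exchange(conn_tokens, roi_tokens, d)
print(roi_updated.shape, conn_updated.shape)  # (8, 16) (12, 16)
```

Only the global token of each branch is updated here; in the full model the updated global token would be re-inserted into its branch so subsequent transformer layers can propagate the other view's information to every token.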
Pages: 85-90 (6 pages)