Cross-Modal Contrastive Learning for Code Search

Cited by: 2
|
Authors
Shi, Zejian [1 ]
Xiong, Yun [1 ,2 ]
Zhang, Xiaolong [1 ]
Zhang, Yao [1 ]
Li, Shanshan [3 ]
Zhu, Yangyong [1 ]
Affiliations
[1] Fudan Univ, Sch Comp Sci, Shanghai Key Lab Data Sci, Shanghai, Peoples R China
[2] Peng Cheng Lab, Shenzhen, Peoples R China
[3] Natl Univ Def Technol, Sch Comp, Changsha, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
code search; code representation; data augmentation; contrastive learning;
DOI
10.1109/ICSME55016.2022.00017
CLC Number
TP31 [Computer Software];
Subject Classification Code
081202; 0835;
Abstract
Code search aims to retrieve code snippets from natural language queries and serves as a core technology for improving development efficiency. Previous approaches have achieved promising results by learning code and query representations with BERT-based pre-trained models; however, these models suffer from a semantic collapse problem, i.e., the native representations of code and queries cluster within a high-similarity interval. In this paper, we propose CrossCS, a cross-modal contrastive learning method for code search, to improve the representations of code and queries through explicit fine-grained contrastive objectives. Specifically, we design a novel and effective contrastive objective that considers not only the similarity between modalities but also the similarity within modalities. To maintain the semantic consistency of code snippets under different function and variable names, we use data augmentation that renames functions and variables to meaningless tokens, which enables us to add comparisons between code and augmented code within the code modality. Moreover, to further improve the effectiveness of pre-trained models, we rank candidate code snippets using similarity scores weighted by retrieval scores and classification scores. Comprehensive experiments demonstrate that our method significantly improves the effectiveness of pre-trained models for code search.
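The two ingredients described in the abstract (identifier-renaming augmentation and a contrastive objective that mixes a cross-modal query-code term with an intra-modal code-augmented-code term) can be sketched roughly as below. This is a minimal illustration assuming an InfoNCE-style formulation; the `rename_identifiers` helper, the equal weighting of the two terms, and the temperature value are assumptions, not the exact CrossCS implementation.

```python
# Minimal sketch (assumed formulation), not the paper's exact implementation.
import re
import torch
import torch.nn.functional as F


def rename_identifiers(code: str, names: list[str]) -> str:
    """Replace given function/variable names with meaningless tokens (assumed augmentation scheme)."""
    for i, name in enumerate(names):
        code = re.sub(rf"\b{re.escape(name)}\b", f"var{i}", code)
    return code


def contrastive_loss(query_emb, code_emb, aug_code_emb, temperature=0.07):
    """Cross-modal (query vs. code) plus intra-modal (code vs. augmented code) InfoNCE terms."""
    q = F.normalize(query_emb, dim=-1)
    c = F.normalize(code_emb, dim=-1)
    a = F.normalize(aug_code_emb, dim=-1)
    labels = torch.arange(q.size(0))

    # Between modalities: each query should be closest to its paired code snippet.
    cross = F.cross_entropy(q @ c.t() / temperature, labels)
    # Within the code modality: each snippet should be closest to its renamed variant.
    intra = F.cross_entropy(c @ a.t() / temperature, labels)
    return cross + intra


if __name__ == "__main__":
    # Toy usage with random embeddings standing in for encoder outputs.
    torch.manual_seed(0)
    q, c, a = torch.randn(4, 128), torch.randn(4, 128), torch.randn(4, 128)
    print(rename_identifiers("def add(x, y): return x + y", ["add", "x", "y"]))
    print(contrastive_loss(q, c, a).item())
```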
Pages: 94 - 105
Page count: 12
Related Papers
50 records in total
  • [1] Cross-modal Contrastive Learning for Speech Translation
    Ye, Rong
    Wang, Mingxuan
    Li, Lei
    [J]. NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES, 2022, : 5099 - 5113
  • [2] Cross-modal contrastive learning for multimodal sentiment recognition
    Yang, Shanliang
    Cui, Lichao
    Wang, Lei
    Wang, Tao
    [J]. APPLIED INTELLIGENCE, 2024, 54 (05) : 4260 - 4276
  • [3] Cross-Modal Graph Contrastive Learning with Cellular Images
    Zheng, Shuangjia
    Rao, Jiahua
    Zhang, Jixian
    Zhou, Lianyu
    Xie, Jiancong
    Cohen, Ethan
    Lu, Wei
    Li, Chengtao
    Yang, Yuedong
    [J]. ADVANCED SCIENCE, 2024, 11 (32)
  • [4] TRAJCROSS: Trajectory Cross-Modal Retrieval with Contrastive Learning
    Jing, Quanliang
    Yao, Di
    Gong, Chang
    Fan, Xinxin
    Wang, Baoli
    Tan, Haining
    Bi, Jingping
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2021, : 344 - 349
  • [6] Cross-modal contrastive learning for aspect-based recommendation
    Won, Heesoo
    Oh, Byungkook
    Yang, Hyeongjun
    Lee, Kyong-Ho
    [J]. INFORMATION FUSION, 2023, 99
  • [7] Cross-Modal Contrastive Learning for Text-to-Image Generation
    Zhang, Han
    Koh, Jing Yu
    Baldridge, Jason
    Lee, Honglak
    Yang, Yinfei
    [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 833 - 842
  • [8] Cross-modal Contrastive Learning for Multimodal Fake News Detection
    Wang, Longzheng
    Zhang, Chuang
    Xu, Hongbo
    Xu, Yongxiu
    Xu, Xiaohan
    Wang, Siqi
    [J]. PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 5696 - 5704
  • [9] Improving Spoken Language Understanding with Cross-Modal Contrastive Learning
    Dong, Jingjing
    Fu, Jiayi
    Zhou, Peng
    Li, Hao
    Wang, Xiaorui
    [J]. INTERSPEECH 2022, 2022, : 2693 - 2697
  • [10] Enriched Music Representations With Multiple Cross-Modal Contrastive Learning
    Ferraro, Andres
    Favory, Xavier
    Drossos, Konstantinos
    Kim, Yuntae
    Bogdanov, Dmitry
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2021, 28 : 733 - 737