NATURALCC: An Open-Source Toolkit for Code Intelligence

Authors
Wan, Yao [1]
He, Yang [2]
Bi, Zhangqian [1]
Zhang, Jianguo [3]
Sui, Yulei [2]
Zhang, Hongyu [4]
Hashimoto, Kazuma [5]
Jin, Hai [1]
Xu, Guandong [2]
Xiong, Caiming [6]
Yu, Philip S. [3]
Affiliations
[1] Huazhong Univ Sci & Technol, Natl Engn Res Ctr Big Data Technol & Syst, Cluster & Grid Comp Lab, Serv Comp Technol & Syst Lab, Sch Comp Sci & Techn, Wuhan, Peoples R China
[2] Univ Technol Sydney, Sydney, NSW, Australia
[3] Univ Illinois, Chicago, IL 60680 USA
[4] Univ Newcastle, Callaghan, NSW, Australia
[5] Google Res, Mountain View, CA USA
[6] Salesforce Res, Palo Alto, CA USA
Funding
National Natural Science Foundation of China
Keywords
Code intelligence; deep learning; code representation; code embedding; open source; toolkit; benchmark
DOI
10.1145/3510454.3516863
Chinese Library Classification number
TP31 [Computer Software]
Subject classification codes
081202; 0835
Abstract
We present NATURALCC, an efficient and extensible open-source toolkit for machine-learning-based source code analysis (i.e., code intelligence). Using NATURALCC, researchers can conduct rapid prototyping, reproduce state-of-the-art models, and/or exercise their own algorithms. NATURALCC is built upon Fairseq and PyTorch, providing (1) a collection of code corpora with preprocessing scripts, (2) a modular and extensible framework that makes it easy to reproduce and implement a code intelligence model, and (3) a benchmark of state-of-the-art models. Furthermore, we demonstrate the usability of our toolkit over a variety of tasks (e.g., code summarization, code retrieval, and code completion) through a graphical user interface. The website of this project is http://xcodemind.github.io, where the source code and demonstration video can be found.
Pages: 149-153
Page count: 5