Named Entity Recognition for Code Review Comments

Authors
Kachanov, V. V. [1, 2]
Khitrova, A. S. [1, 3]
Markov, S. I. [1 ]
Affiliations
[1] Russian Acad Sci, Inst Syst Programming, Moscow 109004, Russia
[2] Moscow Inst Phys & Technol, Dolgoprudnyi 141701, Moscow Oblast, Russia
[3] Lomonosov Moscow State Univ, Moscow 119991, Russia
Keywords
machine learning; named entity recognition; dataset
DOI
10.1134/S0361768824700233
Chinese Library Classification
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
This paper addresses named entity recognition in source code review comments. It provides a comparative analysis of existing approaches and proposes methods to improve recognition quality. The proposed and implemented improvements include methods for handling data imbalance, improved tokenization of input data, the use of large amounts of unlabeled data, and additional binary classifiers. To assess quality, a new dataset of 3000 user code reviews was collected and manually labeled. The proposed improvements are shown to significantly increase quality metrics computed both at the token level (+22%) and at the entity level (+13%).
Pages: 511-523
Number of pages: 13