Named Entity Recognition for Code Review Comments

Authors
Kachanov, V. V. [1, 2]
Khitrova, A. S. [1, 3]
Markov, S. I. [1]
Affiliations
[1] Russian Acad Sci, Inst Syst Programming, Moscow 109004, Russia
[2] Moscow Inst Phys & Technol, Dolgoprudnyi 141701, Moscow Oblast, Russia
[3] Lomonosov Moscow State Univ, Moscow 119991, Russia
Keywords
machine learning; named entity recognition; dataset
DOI
10.1134/S0361768824700233
Chinese Library Classification (CLC) number
TP31 [Computer software]
Discipline classification codes
081202; 0835
Abstract
This paper addresses named entity recognition in source code review comments. It presents a comparative analysis of existing approaches and proposes methods to improve recognition quality. The proposed and implemented improvements include methods for handling data imbalance, improved tokenization of the input data, the use of large volumes of unlabeled data, and additional binary classifiers. To assess quality, a new dataset of 3000 user code reviews was collected and manually labeled. The proposed improvements are shown to significantly increase quality metrics computed both at the token level (+22%) and at the entity level (+13%).
Pages: 511-523
Page count: 13