Semi-automatic rule-based domain terminology and software feature-relevant information extraction from natural language user manuals: An approach and evaluation at Roche Diagnostics GmbH

Cited by: 0
Authors
Thomas Quirchmayr
Barbara Paech
Roland Kohl
Hannes Karey
Gunar Kasdepke
Affiliations
[1] University of Heidelberg, Institute for Computer Science
[2] Roche Diagnostics GmbH
Source
Empirical Software Engineering, 2018, 23(6)
Keywords
Software feature; Terminology extraction; Atomic information extraction; NLP
DOI
Not available
Abstract
Mature software systems comprise a vast number of heterogeneous system capabilities, which are usually requested by different groups of stakeholders and which evolve over time. Software features describe and logically bundle low-level capabilities on an abstract level and thus provide a structured and comprehensive overview of the entire capabilities of a software system. Software features are often not explicitly managed. On the contrary, feature-relevant information is often spread across several software engineering artifacts (e.g., user manuals, issue tracking systems). Identifying and extracting feature-relevant information from these artifacts in order to make feature knowledge explicit requires huge manual effort. In this paper, we present a two-step approach to extract feature-relevant information from a user manual: First, we semi-automatically extract a domain terminology from a natural language user manual based on linguistic patterns. Then, we apply natural language processing techniques based on the extracted domain terminology and structural sentence information. Our approach is able to extract atomic feature-relevant information with an F1-score of at least 92.00%. We describe the implementation of the approach as well as evaluations based on example sections of a user manual taken from industry.
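For readers unfamiliar with the metric, the F1-score mentioned in the abstract is the harmonic mean of precision and recall. The following is a minimal sketch of that calculation; the function name `f1_score` and the example counts are illustrative assumptions and are not taken from the paper or its evaluation data.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return the harmonic mean of precision and recall, computed from raw counts."""
    if true_positives == 0:
        # No correct extractions: precision and recall are both zero (or undefined).
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts chosen only for illustration (not results from the paper):
# 46 correctly extracted items, 4 spurious extractions, 4 missed items.
example = f1_score(true_positives=46, false_positives=4, false_negatives=4)
```

With these hypothetical counts, precision and recall are both 46/50, so the harmonic mean equals the same value.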
Pages: 3630-3683
Page count: 53
Related papers
2 in total
  • [1] Semi-automatic rule-based domain terminology and software feature-relevant information extraction from natural language user manuals: An approach and evaluation at Roche Diagnostics GmbH
    Quirchmayr, Thomas
    Paech, Barbara
    Kohl, Roland
    Karey, Hannes
    Kasdepke, Gunar
    [J]. Empirical Software Engineering, 2018, 23(6): 3630-3683
  • [2] Semi-automatic Software Feature-Relevant Information Extraction from Natural Language User Manuals: An Approach and Practical Experience at Roche Diagnostics GmbH
    Quirchmayr, Thomas
    Paech, Barbara
    Kohl, Roland
    Karey, Hannes
    [J]. Requirements Engineering: Foundation for Software Quality, REFSQ 2017, 2017, 10153: 255-272