Hierarchical Rhetorical Sentence Categorization for Scientific Papers

Cited by: 0
Authors
Rachman, G. H. [1]
Khodra, M. L. [1]
Widyantoro, D. H. [1]
Affiliation
[1] Inst Teknol Bandung, Sch Elect Engn & Informat, Bandung, Indonesia
Keywords
CLASSIFICATION;
DOI
10.1088/1742-6596/978/1/012055
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Important information in a scientific paper can be captured as rhetorical sentences organized into certain categories. Extracting this information requires text categorization. Previous work on this task has employed word frequency, semantically similar words, hierarchical classification, and other techniques. This paper therefore presents rhetorical sentence categorization for scientific papers that uses TF-IDF and Word2Vec to capture word frequency and semantic similarity between words, combined with hierarchical classification. Every experiment is run with two classifiers, Naive Bayes and linear SVM. The results show that the hierarchical classifier outperforms the flat classifier with either TF-IDF or Word2Vec, although the gain is small, just under 2 percentage points: from 27.82% with the flat classifier to 29.61% with the hierarchical classifier. They also show that, with a hierarchical classifier, different learning models can be built for the child categories.
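The contrast between flat and hierarchical classification described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two-level category tree (`PARENT_OF`), the toy sentences, and the class names `TfidfNB` and `HierarchicalNB` are all hypothetical, the Naive Bayes model is a small hand-written multinomial variant over TF-IDF weights, and Word2Vec and the linear SVM are left out. The reported scores are not reproduced here.

```python
import math
from collections import Counter, defaultdict

class TfidfNB:
    """Multinomial Naive Bayes over TF-IDF weights with add-one smoothing."""
    def fit(self, docs, labels):
        self.n = len(docs)
        df = Counter(t for d in docs for t in set(d))
        # Smoothed inverse document frequency for every training term.
        self.idf = {t: math.log((1 + self.n) / (1 + c)) + 1.0 for t, c in df.items()}
        self.prior = Counter(labels)
        self.weights = defaultdict(Counter)
        for d, y in zip(docs, labels):
            for t, tf in Counter(d).items():
                self.weights[y][t] += tf * self.idf[t]
        self.total = {y: sum(w.values()) for y, w in self.weights.items()}
        return self

    def predict(self, doc):
        vocab = len(self.idf)
        def score(y):
            s = math.log(self.prior[y] / self.n)
            for t, tf in Counter(doc).items():
                if t in self.idf:  # terms unseen in training are ignored
                    p = (self.weights[y][t] + 1.0) / (self.total[y] + vocab)
                    s += tf * self.idf[t] * math.log(p)
            return s
        return max(sorted(self.prior), key=score)

class HierarchicalNB:
    """Root model picks a parent category; a per-parent model picks the child."""
    def __init__(self, parent_of):
        self.parent_of = parent_of  # child label -> parent label

    def fit(self, docs, labels):
        parents = [self.parent_of[y] for y in labels]
        self.root = TfidfNB().fit(docs, parents)
        self.children = {}
        for p in set(parents):
            idx = [i for i, q in enumerate(parents) if q == p]
            # Each parent gets its own model, trained only on its children.
            self.children[p] = TfidfNB().fit([docs[i] for i in idx],
                                             [labels[i] for i in idx])
        return self

    def predict(self, doc):
        return self.children[self.root.predict(doc)].predict(doc)

# Hypothetical category tree and toy training sentences (illustration only).
PARENT_OF = {"problem": "background", "related_work": "background",
             "method": "contribution", "result": "contribution"}
TRAIN = [
    ("existing systems fail to handle long documents", "problem"),
    ("the main problem is that systems fail on noisy text", "problem"),
    ("previous work by others used rule based systems", "related_work"),
    ("earlier work proposed statistical models", "related_work"),
    ("we propose a hierarchical classifier trained on sentences", "method"),
    ("we train the classifier with tf idf features", "method"),
    ("our classifier achieves higher accuracy in experiments", "result"),
    ("experiments show accuracy improves over the baseline", "result"),
]
docs = [text.split() for text, _ in TRAIN]
labels = [label for _, label in TRAIN]

flat = TfidfNB().fit(docs, labels)                  # one model over all leaves
hier = HierarchicalNB(PARENT_OF).fit(docs, labels)  # root model + child models

query = "we propose".split()
flat_pred = flat.predict(query)
hier_pred = hier.predict(query)
print("flat:", flat_pred, "| hierarchical:", hier_pred)
```

In this sketch each child model recomputes its own TF-IDF statistics from only the sentences under its parent, which is one way a hierarchical classifier can end up with a different learning model per child category.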
Pages: 9
Related papers
50 in total
  • [1] A Multiclass-based Classification Strategy for Rhetorical Sentence Categorization from Scientific Papers
    Widyantoro, Dwi H.
    Khodra, Masayu L.
    Riyanto, Bambang
    Aziz, E. Aminudin
    [J]. JOURNAL OF ICT RESEARCH AND APPLICATIONS, 2013, 7 (03) : 235 - 249
  • [2] Word Embedding for Rhetorical Sentence Categorization on Scientific Articles
    Rachman, Ghoziyah Haitan
    Khodra, Masayu Leylia
    Widyantoro, Dwi Hendratmo
    [J]. JOURNAL OF ICT RESEARCH AND APPLICATIONS, 2018, 12 (02) : 168 - 184
  • [3] Rhetorical Sentence Categorization for Scientific Paper Using Word2Vec Semantic Representation
    Rachman, G. H.
    Khodra, M. L.
    Widyantoro, D. H.
    [J]. 1ST INTERNATIONAL CONFERENCE ON COMPUTING AND APPLIED INFORMATICS 2016 : APPLIED INFORMATICS TOWARD SMART ENVIRONMENT, PEOPLE, AND SOCIETY, 2017, 801
  • [4] Automatic Rhetorical Sentence Categorization on Indonesian Meeting Minutes
    Rachman, Ghoziyah Haitan
    Khodra, Masayu Leylia
    [J]. PROCEEDINGS OF 2016 INTERNATIONAL CONFERENCE ON DATA AND SOFTWARE ENGINEERING (ICODSE), 2016,
  • [5] Tentativeness in term formation A study of neology as a rhetorical device in scientific papers
    Pecman, Mojca
    [J]. TERMINOLOGY, 2012, 18 (01) : 27 - 58
  • [6] THE AGENTLESS SENTENCE AS RHETORICAL DEVICE
    COETZEE, JM
    [J]. LANGUAGE AND STYLE, 1980, 13 (01) : 26 - 34
  • [8] Hierarchical Neural Networks for Sequential Sentence Classification in Medical Scientific Abstracts
    Jin, Di
    Szolovits, Peter
    [J]. 2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018), 2018 : 3100 - 3109
  • [9] Label informed hierarchical transformers for sequential sentence classification in scientific abstracts
    Takola, Yaswanth Sri Sai Santosh
    Aluru, Sai Saketh
    Vallabhajosyula, Anoop
    Sanyal, Debarshi Kumar
    Das, Partha Pratim
    [J]. EXPERT SYSTEMS, 2023, 40 (06)
  • [10] Multilingual sentence categorization and novelty mining
    Zhang, Yi
    Tsai, Flora S.
    Kwee, Agus Trisnajaya
    [J]. INFORMATION PROCESSING & MANAGEMENT, 2011, 47 (05) : 667 - 675