Why Text Segment Classification Based on Part of Speech Feature Selection

Cited by: 0
Authors
Nagy, Iulia [1]
Tanaka, Katsuyuki [1]
Ariki, Yasuo [1]
Affiliation
[1] Kobe Univ, Nada Ku, Kobe, Hyogo 6578501, Japan
Source
DISCOVERY SCIENCE, DS 2010 | 2010 / Vol. 6332
Keywords
Question-answering; supervised learning; feature selection
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The aim of our research is to develop a scalable automatic why-question answering system for English based on a supervised method that uses part-of-speech analysis. The prior approach consisted of building a why-classifier using function words. This paper investigates the performance of combining supervised data mining methods with various feature selection strategies in order to obtain a more accurate why-classifier. Feature selection was performed a priori on the dataset to extract representative verbs and/or nouns and avoid the curse of dimensionality. LogitBoost and SVM were used for the classification process. Three methods of extending the initial "function words only" approach to handle context-dependent features are proposed and experimentally evaluated on various datasets. The first considers function words and context-independent adverbs; the second incorporates selected lemmatized verbs; the third contains selected lemmatized verbs and nouns. Experiments on web-extracted datasets showed that all methods performed better than the baseline, with slightly more reliable results for the third one.
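As a rough illustration of the feature sets the abstract describes, the sketch below builds count vectors for the "function words only" baseline and for the variants that add selected lemmatized verbs and nouns. The word lists, the function names `build_vocabulary` and `vectorize`, and the sample segment are all hypothetical; the paper's actual vocabularies and selection procedure are not given in this record. The input is assumed to be already tokenized and lemmatized, and the classification step (LogitBoost or SVM) is left out.

```python
from collections import Counter

# Hypothetical vocabularies for illustration only; the paper's real
# word lists are not part of this record.
FUNCTION_WORDS = ["because", "since", "so", "to", "of", "the", "in"]
SELECTED_VERBS = ["cause", "lead", "result"]
SELECTED_NOUNS = ["reason", "effect"]


def build_vocabulary(use_verbs=False, use_nouns=False):
    """Return the ordered feature list for one feature-selection strategy."""
    vocab = list(FUNCTION_WORDS)
    if use_verbs:
        vocab += SELECTED_VERBS
    if use_nouns:
        vocab += SELECTED_NOUNS
    return vocab


def vectorize(lemmas, vocab):
    """Count how often each vocabulary item occurs in a lemmatized segment."""
    counts = Counter(lemmas)
    return [counts[word] for word in vocab]


# A text segment, assumed to be tokenized and lemmatized beforehand.
segment = ["the", "flood", "result", "in", "damage",
           "because", "of", "the", "storm"]

baseline = vectorize(segment, build_vocabulary())
with_verbs = vectorize(segment, build_vocabulary(use_verbs=True))
with_verbs_nouns = vectorize(
    segment, build_vocabulary(use_verbs=True, use_nouns=True)
)
```

Restricting the vocabulary to a short, preselected list keeps each vector small, which is the point of performing feature selection before training. Vectors of this kind could then be passed to any supervised classifier.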
Pages: 87-101
Page count: 15
Related papers
50 in total
  • [1] Feature Selection for Text Classification Based on Part of Speech Filter and Synonym Merge
    Qin, Sijun
    Song, Jia
    Zhang, Pengzhou
    Tan, Yue
    [J]. 2015 12th International Conference on Fuzzy Systems and Knowledge Discovery (FSKD), 2015, : 681 - 685
  • [2] Feature Selection in Text Classification
    Sahin, Durmus Ozkan
    Ates, Nurullah
    Kilic, Erdal
    [J]. 2016 24TH SIGNAL PROCESSING AND COMMUNICATION APPLICATION CONFERENCE (SIU), 2016, : 1777 - 1780
  • [3] Utility-based feature selection for text classification
    Heyong Wang
    Ming Hong
    Raymond Yiu Keung Lau
    [J]. Knowledge and Information Systems, 2019, 61 : 197 - 226
  • [4] Utility-based feature selection for text classification
    Wang, Heyong
    Hong, Ming
    Lau, Raymond Yiu Keung
    [J]. KNOWLEDGE AND INFORMATION SYSTEMS, 2019, 61 (01) : 197 - 226
  • [5] Text classification based on feature selection and LDA model
    [J]. Zheng, C. (csahu@126.com), 1600, Binary Information Press, P.O. Box 162, Bethel, CT 06801-0162, United States (09):
  • [6] A Part of Speech Based Public Opinion Text Classification Method
    Liu, Rui
    Wei, Zhiqiang
    Liu, Hao
    Fu, Qianqian
    [J]. PROCEEDINGS OF THE 2015 INTERNATIONAL CONFERENCE ON HUMANITIES AND SOCIAL SCIENCE RESEARCH, 2015, 31 : 234 - 238
  • [7] Contextual feature selection for text classification
    Paradis, Francois
    Nie, Jian-Yun
    [J]. INFORMATION PROCESSING & MANAGEMENT, 2007, 43 (02) : 344 - 352
  • [8] Hybrid feature selection for text classification
    Gunal, Serkan
    [J]. TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES, 2012, 20 : 1296 - 1311
  • [9] Feature selection for text classification: A review
    Deng, Xuelian
    Li, Yuqing
    Weng, Jian
    Zhang, Jilian
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2019, 78 (03) : 3797 - 3816
  • [10] Dynamic feature selection in text classification
    Doan, Son
    Horiguchi, Susumu
    [J]. INTELLIGENT CONTROL AND AUTOMATION, 2006, 344 : 664 - 675