IRText: An Item Response Theory-Based Approach for Text Categorization

Author
Onder Coban
Affiliation
[1] Adiyaman University, Department of Computer Engineering
Keywords
Item response theory; Text categorization; Term weighting; Feature selection
DOI
Not available
Abstract
Text categorization (TC) is a machine learning task that assigns a text to one of a set of predefined categories. In a nutshell, texts are converted into numerical feature vectors in which each feature is associated with a weight value. A classifier is then trained on the vectorized texts and used to classify previously unseen documents. Feature selection (FS) is optionally applied to achieve better classification accuracy with fewer features. Item response theory (IRT), on the other hand, is a set of statistical models designed to understand persons based on their responses to questions, under the assumption that the response to a given item is a function of both person and item properties. Although many studies are devoted to understanding, exploring, and improving methods in each of these fields, no previous study has aimed at combining their strengths. This study therefore proposes an IRT-based approach that uses the IRT score of a feature in both term weighting and FS, two important intermediate steps of TC. The efficiency of the proposed approach is measured on two well-known benchmark datasets by comparing it with two traditional peers. Experimental results show that the IRT-based approach can be used for text FS and that there is room for further improvement. To the best of our knowledge, this study is the first to adapt IRT to classical TC.
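The pipeline summarized in the abstract (vectorize, weight, optionally select features) can be illustrated with a minimal sketch. It makes several assumptions that are not taken from the paper: plain TF-IDF stands in for the term-weighting step, the per-feature `scores` are hypothetical placeholders for whatever score drives selection, and the two-parameter logistic (2PL) function is shown only as one common IRT model. The abstract does not state which IRT model is used or how a feature's IRT score is computed, so none of this should be read as the author's method.

```python
import math
from collections import Counter


def irt_2pl(theta, a, b):
    """2PL IRT model: probability of a positive response given person
    ability `theta`, item discrimination `a`, and item difficulty `b`."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def tfidf_vectors(docs):
    """Turn tokenized documents into {term: weight} dicts using plain TF-IDF
    (raw term frequency times log(N / document frequency))."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return vectors


def select_top_k(vectors, scores, k):
    """Feature selection: keep only the k features with the highest score."""
    keep = set(sorted(scores, key=scores.get, reverse=True)[:k])
    return [{t: w for t, w in vec.items() if t in keep} for vec in vectors]


# Toy corpus of already-tokenized documents (illustrative only).
docs = [
    ["cheap", "pills", "buy"],
    ["meeting", "agenda", "buy"],
    ["cheap", "cheap", "offer"],
]
vectors = tfidf_vectors(docs)

# Hypothetical per-feature scores; in an IRT-based scheme these would come
# from a fitted IRT model, which is not reproduced here.
scores = {"cheap": 0.9, "pills": 0.7, "offer": 0.6,
          "buy": 0.2, "meeting": 0.1, "agenda": 0.1}
reduced = select_top_k(vectors, scores, k=3)
```

The reduced vectors would then be passed to any classifier. Keeping the scoring function separate from the selection step means a different feature score can be swapped in without touching the rest of the pipeline.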
Pages: 9423 - 9439
Page count: 16