Smoothing Temporal Difference for Text Categorization

Authors
Fukumoto, Fumiyo [1]
Suzuki, Yoshimi [1]
Affiliations
[1] Univ Yamanashi, Grad Fac Interdisciplinary Res, Kofu, Yamanashi, Japan
Keywords
Temporal adaptation; Term smoothing; Text categorization; Transfer learning
DOI
10.1007/978-3-319-28940-3_16
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper addresses the text categorization problem in which training data may come from a different time period than the test data. We present a text categorization method that minimizes the impact of temporal effects by combining term smoothing and transfer learning. We first use a technique called Temporal-based Term Smoothing (TTS) to replace time-sensitive features with representative terms, then apply a boosting-based transfer learning algorithm called TrAdaBoost for categorization. Results on a 21-year Japanese Mainichi Newspaper corpus show that integrating term smoothing and transfer learning improves overall performance, and that the method is especially effective when the creation period of the test data differs greatly from that of the training data.
Pages: 203-214
Number of pages: 12
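
The abstract names TrAdaBoost as the transfer learning step but gives no details. As a rough illustration only, the sketch below shows one instance-reweighting round in the commonly published formulation of TrAdaBoost, where "source" stands for training documents from a different time period and "target" for documents from the same period as the test data. The function name, argument names, and the clipping of the error rate are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def tradaboost_weight_update(weights, errors, n_source, n_rounds):
    """One TrAdaBoost-style reweighting round (illustrative sketch).

    weights  : current instance weights, source instances first
    errors   : |h(x_i) - y_i| in [0, 1] for every instance
    n_source : number of source (different-period) instances, assumed >= 1
    n_rounds : total number of boosting rounds planned
    """
    weights = np.asarray(weights, dtype=float)
    errors = np.asarray(errors, dtype=float)

    # Weighted error of the current hypothesis on target instances only.
    tgt_w = weights[n_source:]
    tgt_e = errors[n_source:]
    eps = np.sum(tgt_w * tgt_e) / np.sum(tgt_w)
    # Assumption: clip so that beta_t stays defined and below 1.
    eps = min(max(eps, 1e-10), 0.499)

    beta_t = eps / (1.0 - eps)
    beta = 1.0 / (1.0 + np.sqrt(2.0 * np.log(n_source) / n_rounds))

    new_weights = weights.copy()
    # Misclassified source instances lose weight (beta <= 1).
    new_weights[:n_source] *= beta ** errors[:n_source]
    # Misclassified target instances gain weight (beta_t < 1, negative exponent).
    new_weights[n_source:] *= beta_t ** (-tgt_e)
    return new_weights, eps
```

In this sketch, older documents that the current classifier gets wrong are down-weighted, while misclassified documents from the test period are up-weighted, which is the general mechanism by which boosting-based transfer learning shifts attention toward the target period.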