Text Document Classification

Cited by: 0
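Author
Novovicova, Jana [1]
Affiliation
[1] UTIA, CRCIM, Prague, Czech Republic
Source
ERCIM NEWS | 2005 / Issue 62
Keywords
DOI
Not available
Chinese Library Classification
TP39 [Computer applications];
Discipline classification codes
081203; 0835
Abstract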
During the last twenty years the number of text documents in digital form has grown enormously. As a consequence, it is of great practical importance to be able to organize and classify documents automatically. Research into text classification aims to partition unstructured sets of documents into groups that describe the contents of the documents. There are two main variants of text classification: text clustering and text categorization. The former is concerned with finding a latent group structure in the set of documents, while the latter (also known as text classification in the narrower sense) can be seen as the task of structuring the repository of documents according to a group structure that is known in advance.
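The distinction drawn in the abstract can be illustrated with a minimal sketch. It is not taken from the article: the bag-of-words representation, the cosine similarity measure, the centroid-based categorizer, the single-pass "leader" clustering, the threshold value, and all function names and sample texts are assumptions chosen only to keep the example short.

```python
import math
from collections import Counter


def bow(text):
    """Bag-of-words term counts for a lowercased, whitespace-tokenized text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def categorize(doc, labeled_docs):
    """Text categorization: the group structure (the labels) is known in advance.

    Builds one summed term-count vector per label and returns the label
    whose vector is most similar to the new document.
    """
    centroids = {}
    for text, label in labeled_docs:
        centroids.setdefault(label, Counter()).update(bow(text))
    vec = bow(doc)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


def cluster(docs, threshold=0.3):
    """Text clustering: no labels; the group structure is discovered.

    Single-pass leader clustering: a document joins the first group whose
    leader (its first member) is similar enough, otherwise it starts a new group.
    The threshold is an arbitrary illustrative value.
    """
    leaders, groups = [], []
    for text in docs:
        vec = bow(text)
        for i, leader in enumerate(leaders):
            if cosine(vec, leader) >= threshold:
                groups[i].append(text)
                break
        else:
            leaders.append(vec)
            groups.append([text])
    return groups


# Hypothetical sample data, for illustration only.
labeled = [
    ("stock market shares fall", "finance"),
    ("bank raises interest rates", "finance"),
    ("team wins football match", "sport"),
    ("player scores in final match", "sport"),
]
unlabeled = [
    "stock market shares fall",
    "football match tonight",
    "shares fall on stock market",
    "football match postponed",
]

predicted = categorize("football player injured before match", labeled)
groups = cluster(unlabeled)
```

Here `categorize` needs the labeled examples to define the groups, whereas `cluster` receives only the raw texts and returns whatever grouping the similarity threshold produces.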
Pages: 53-54
Page count: 2