Text Document Classification

Author
Novovicova, Jana [1]
Affiliation
[1] UTIA, CRCIM, Prague, Czech Republic
Source
ERCIM NEWS | 2005 / Issue 62
DOI
Not available
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203; 0835
Abstract
During the last twenty years the number of text documents in digital form has grown enormously. As a consequence, it is of great practical importance to be able to organize and classify documents automatically. Research into text classification aims to partition unstructured sets of documents into groups that describe the contents of the documents. There are two main variants of text classification: text clustering and text categorization. The former is concerned with finding a latent group structure in the set of documents, while the latter (also known as text classification) can be seen as the task of structuring the repository of documents according to a group structure that is known in advance.
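
The distinction between the two variants can be illustrated with a minimal Python sketch. It is not taken from the article: the bag-of-words representation, the cosine similarity measure, the nearest-centroid rule, the single-pass grouping, the `threshold` value, and all function names are assumptions chosen only to keep the example short.

```python
from collections import Counter
from math import sqrt

def bag_of_words(text):
    """Term-frequency vector of one document (assumed representation)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def categorize(doc, labeled_docs):
    """Text categorization: the groups are known in advance.
    Picks the label whose summed training vector is most similar to doc."""
    centroids = {}
    for text, label in labeled_docs:
        centroids.setdefault(label, Counter()).update(bag_of_words(text))
    vec = bag_of_words(doc)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))

def cluster(docs, threshold=0.3):
    """Text clustering: no labels are given.
    Single-pass grouping; a document joins the most similar existing
    cluster if the similarity reaches the (hypothetical) threshold."""
    clusters = []  # each entry: (summed term vector, list of document indices)
    for i, text in enumerate(docs):
        vec = bag_of_words(text)
        best = max(clusters, key=lambda c: cosine(vec, c[0]), default=None)
        if best is not None and cosine(vec, best[0]) >= threshold:
            best[0].update(vec)
            best[1].append(i)
        else:
            clusters.append((Counter(vec), [i]))
    return [members for _, members in clusters]

labeled = [
    ("stock market shares fall", "finance"),
    ("bank raises interest rates", "finance"),
    ("team wins football match", "sport"),
    ("player scores goal in match", "sport"),
]
unlabeled = [
    "stock market shares fall",
    "team wins football match",
    "stock market shares rise",
    "football match ends in draw",
]
predicted = categorize("football player injured before match", labeled)
groups = cluster(unlabeled)
```

In this sketch `categorize` needs labeled examples because the group structure is fixed beforehand, whereas `cluster` receives only raw documents and has to discover the groups itself.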
Pages: 53-54
Number of pages: 2