An approach for text categorization in digital library

Cited by: 0
Authors
Wang, Tao [1 ]
Desai, Bipin C. [1 ]
Affiliation
[1] Concordia Univ, Dept Comp Sci, Montreal, PQ H3G 1M8, Canada
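Keywords
DOI
Not available
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812 ;
Abstract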
Text categorization is an effective way to organize the enormous number of documents in digital libraries. Accurate classification of documents not only improves search precision but also supports browsing by topic. It is nonetheless difficult for automatic methods to reach a categorization accuracy comparable to that of professional catalogers. This is due largely to the complexity of large-scale, pre-defined category hierarchies, which makes it difficult for learning algorithms to distinguish among categories. This paper describes a top-down document classification approach that takes advantage of the hierarchical structure in two specific ways: to identify the number of independent local classifiers and to guide the top-down classification procedure. We evaluate the approach within the CINDI Digital Library, using the ACM Classification System as the target hierarchy. Experimental results show the promise of this approach.
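The abstract gives no implementation details, so the following is only a minimal sketch of what top-down classification with local classifiers can look like, not the authors' method. It assumes one local classifier per internal node of the hierarchy and uses a simple word-count profile as a stand-in for a real learning algorithm. The hierarchy, the training documents, and all function names are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical toy hierarchy: parent -> children. Leaves have no entry.
HIERARCHY = {
    "root": ["Software", "Hardware"],
    "Software": ["Databases", "Operating Systems"],
}

# Hypothetical labelled documents: (text, leaf category).
TRAINING = [
    ("sql query index transaction", "Databases"),
    ("relational schema query join", "Databases"),
    ("kernel scheduler process thread", "Operating Systems"),
    ("kernel memory paging process", "Operating Systems"),
    ("circuit transistor logic gate", "Hardware"),
    ("processor cache circuit pipeline", "Hardware"),
]


def path_to(leaf):
    """Return the nodes from just below the root down to the given leaf."""
    parent = {c: p for p, children in HIERARCHY.items() for c in children}
    path = [leaf]
    while parent[path[-1]] != "root":
        path.append(parent[path[-1]])
    return path[::-1]


def train_local_classifiers():
    """Build one local classifier per internal node (an assumption).

    Each local classifier is a word-count profile per child category,
    filled from the documents filed anywhere below that child.
    """
    profiles = {node: defaultdict(Counter) for node in HIERARCHY}
    for text, leaf in TRAINING:
        nodes = ["root"] + path_to(leaf)
        for parent, child in zip(nodes, nodes[1:]):
            profiles[parent][child].update(text.split())
    return profiles


def classify(text, profiles):
    """Descend from the root, letting each local classifier pick one child."""
    node, path = "root", []
    words = text.split()
    while node in HIERARCHY:  # stop once a leaf is reached
        scores = {
            child: sum(profile[w] for w in words)
            for child, profile in profiles[node].items()
        }
        node = max(scores, key=scores.get)
        path.append(node)
    return path


profiles = train_local_classifiers()
print(len(profiles))  # number of local classifiers in this toy hierarchy
print(classify("query transaction index", profiles))
print(classify("transistor circuit", profiles))
```

In this sketch the hierarchy plays both roles named in the abstract: its internal nodes determine how many local classifiers are trained, and its parent-child links determine the order in which they are consulted. Each local classifier only has to separate sibling categories, which is an easier problem than separating every leaf at once.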
Pages: 21-27
Number of pages: 7
Related papers
50 in total
  • [1] Hypatia Digital Library: A Text Classification Approach Based on Abstracts
    Vorgia, Frosso
    Triantafyllou, Ioannis
    Koulouris, Alexandros
    [J]. STRATEGIC INNOVATIVE MARKETING, 2017, : 727 - 733
  • [2] Text mining in a digital library
    Witten I.H.
    Don K.J.
    Dewsnip M.
    Tablan V.
    [J]. International Journal on Digital Libraries, 2004, 4 (1) : 56 - 59
  • [3] Digital library information categorization, visualization, and retrieval
    Chen, JX
    Alford, K
    Frieder, O
    [J]. INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED PROCESSING TECHNIQUES AND APPLICATIONS, VOLS I-V, PROCEEDINGS, 1999, : 1404 - 1409
  • [4] Computing with words for text processing: An approach to the text categorization
    Zadrozny, S
    Kacprzyk, J
    [J]. INFORMATION SCIENCES, 2006, 176 (04) : 415 - 437
  • [5] Modeling with words: an approach to text categorization
    Shanahan, J
    [J]. 10TH IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS, VOLS 1-3: MEETING THE GRAND CHALLENGE: MACHINES THAT SERVE PEOPLE, 2001, : 63 - 66
  • [6] A term weighting approach for text categorization
    Lee, KC
    Kang, SS
    Hahn, KS
    [J]. INFORMATION RETRIEVAL TECHNOLOGY, PROCEEDINGS, 2005, 3689 : 673 - 678
  • [7] A fuzzy-based approach for text representation in text categorization
    Doan, S
    [J]. FUZZ-IEEE 2005: PROCEEDINGS OF THE IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS: BIGGEST LITTLE CONFERENCE IN THE WORLD, 2005, : 1008 - 1013
  • [8] Metadata categorization for identifying search patterns in a digital library
    Bogaard, Tessel
    Hollink, Laura
    Wielemaker, Jan
    van Ossenbruggen, Jacco
    Hardman, Lynda
    [J]. JOURNAL OF DOCUMENTATION, 2019, 75 (02) : 270 - 286
  • [9] A New Approach of Feature Selection for Text Categorization
    CUI Zifeng
    [J]. Wuhan University Journal of Natural Sciences, 2006, (05) : 1335 - 1339
  • [10] An incremental approach to text representation, categorization, and retrieval
    ONeil, P
    [J]. PROCEEDINGS OF THE FOURTH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION, VOLS 1 AND 2, 1997, : 714 - 717