SF-CNN: Deep Text Classification and Retrieval for Text Documents

Cited by: 5
Authors
Sarasu, R. [1]
Thyagharajan, K. K. [2]
Shanker, N. R. [3]
Affiliations
[1] Anna Univ, Dhanalaksmi Coll Engn, Comp Sci & Engn, Chennai, India
[2] Anna Univ, RMD Engn Coll, Chennai, India
[3] Anna Univ, Aalim Muhammed Salegh Coll Engn, Comp Sci & Engn, Chennai, India
Source
Keywords
Semantic; classification; convolution neural networks; semantic enhancement
DOI
10.32604/iasc.2023.027429
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Researchers and scientists need rapid access to text documents such as research papers, source code and dissertations. Many research documents are available on the Internet, and retrieving the exact documents by keyword takes considerable time. An efficient classification algorithm for retrieving documents based on keywords is therefore required. Traditional algorithms perform poorly because they do not consider the polysemy of words or the relationships among the bag-of-words in the keywords. To solve this problem, Semantic Featured Convolution Neural Networks (SF-CNN) is proposed to capture the key relationships among the search keywords and to build a structure for matching words so that the correct text documents are retrieved. The proposed SF-CNN is based on a deep semantic bag-of-words representation for document retrieval. Traditional deep learning methods such as the Convolutional Neural Network and the Recurrent Neural Network do not use a semantic representation for bag-of-words. Experiments are performed on different document datasets to evaluate the performance of the proposed SF-CNN method. SF-CNN classifies the documents with an accuracy of 94%, which is reported to be higher than that of the traditional algorithms.
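The polysemy limitation mentioned in the abstract can be illustrated with a minimal sketch. This is not the SF-CNN method; it is a plain bag-of-words keyword matcher written for illustration only, and the function names, query, and example documents are hypothetical. It shows that a purely lexical match gives the same score to a document about the Python language and to one about the snake.

```python
from collections import Counter


def bag_of_words(text):
    """Lowercase, split on whitespace, and count tokens.

    Word order and word sense are discarded, which is the
    limitation the abstract attributes to traditional methods.
    """
    return Counter(text.lower().split())


def overlap(query, doc):
    """Count how many query tokens are matched in the document."""
    q, d = bag_of_words(query), bag_of_words(doc)
    return sum(min(q[t], d[t]) for t in q)


# Hypothetical query and documents, chosen so that "python" is polysemous.
query = "python memory model"
doc_code = "the python memory model governs object allocation"
doc_snake = "a python has a long memory according to one folk model"

# Both documents receive the same lexical score, although only
# doc_code is relevant to a programming query.
print(overlap(query, doc_code), overlap(query, doc_snake))
```

A semantic representation of the kind the abstract describes would need to separate these two documents, which a token count alone cannot do.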
Pages: 1799-1813
Number of pages: 15