Enhancing Text Categorization Using Sentence Semantics

Cited by: 0
Authors
Shehata, Shady [1]
Karray, Fakhri [1]
Kamel, Mohamed [1]
Affiliation
[1] Univ Waterloo, Pattern Anal & Machine Intelligence PAMI Res Grp, Waterloo, ON N2L 3G1, Canada
Keywords
Data mining; text categorization; concept-based model
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Most text categorization techniques are based on word and/or phrase analysis of the text. Statistical analysis of term frequency captures the importance of a term within a document only. However, two terms can have the same frequency in their documents while one contributes more to the meaning of its sentences than the other. Thus, the underlying model should indicate terms that capture the semantics of the text. In this case, the model can capture terms that present the concepts of the sentence, which leads to discovering the topic of the document. A new concept-based model is introduced that analyzes terms on the sentence and document levels, rather than the traditional analysis of the document only. The concept-based model can effectively discriminate between terms that are non-important with respect to sentence semantics and terms that hold the concepts representing the sentence meaning. A set of experiments using the proposed concept-based model on different datasets in text categorization is conducted. The experiments compare traditional weighting with concept-based weighting and demonstrate that concept-based weighting substantially enhances the categorization quality of sets of documents.
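The abstract's central observation, that two terms can share the same document-level frequency yet differ in how much they contribute to their sentences, can be illustrated with a small sketch. The abstract does not say how the concept-based model measures sentence-level contribution, so the score below is a purely hypothetical stand-in (each term's share of the tokens in every sentence it appears in), not the authors' method. The names `tokenize`, `split_sentences`, `document_tf`, `sentence_level_scores`, and the sample text `DOC` are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def split_sentences(document):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", document.strip()) if s]


def document_tf(document):
    """Traditional document-level term frequency (raw counts)."""
    return Counter(tokenize(document))


def sentence_level_scores(document):
    """Hypothetical sentence-level score, NOT the paper's concept-based weight.

    For every sentence, each term receives its share of that sentence's
    tokens; shares are summed over all sentences in the document.
    """
    scores = Counter()
    for sentence in split_sentences(document):
        tokens = tokenize(sentence)
        if not tokens:
            continue
        for term, count in Counter(tokens).items():
            scores[term] += count / len(tokens)
    return scores


# Illustrative sample text (an assumption, not taken from the paper's datasets).
DOC = (
    "Markets fell. "
    "The report that came out late on Friday also said that markets fell again."
)

tf = document_tf(DOC)
scores = sentence_level_scores(DOC)

# Both terms occur twice in the document, so raw frequency cannot separate them,
# while the sentence-level stand-in gives "markets" a larger score than "that".
for term in ("markets", "that"):
    print(term, tf[term], round(scores[term], 3))
```

A real concept-based model would presumably rely on the semantic structure of each sentence rather than token shares; the sketch only shows why adding a sentence-level signal can separate terms that document-level counts treat as equal.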
Pages: 87-98
Page count: 12