Clustering suggestion for Chinese news web pages from multi-media sources

Authors
Chiu, Deng-Yiv [1]
Pan, Ya-Chen [1]
Affiliation
[1] Chung Hua Univ, Dept Informat Management, Hsinchu, Taiwan
Keywords
Chinese web pages clustering; multi-class SVM; fuzzy c-means algorithm; genetic algorithm;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Chinese news portals often display articles that are clearly filed under the wrong category. Two likely causes are that automatic classification of Chinese news is difficult and that the news on a portal is retrieved from many media sources. In this study, we integrate a genetic algorithm with a multi-class support vector machine (SVM) classifier to construct a Chinese news classification method. We also find that some similar documents are scattered across different categories, most likely because the categories used by the original media sources differ from those of the news portal. Such similar news items should be gathered to form a new category. We therefore combine a genetic algorithm with the fuzzy c-means algorithm to propose an approach that offers clustering suggestions for news web pages that are scattered across categories and come from multiple media sources.
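As a rough illustration of the fuzzy c-means step named in the abstract, the following is a minimal sketch of the standard algorithm only. It is not the authors' method: the abstract does not say how the genetic algorithm is combined with fuzzy c-means, so that part is omitted. The function name, parameters, and the toy two-dimensional "document vectors" are all hypothetical.

```python
import random


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means (illustrative sketch, not the paper's method).

    points     -- list of equal-length numeric feature vectors (assumed input)
    n_clusters -- number of clusters to form
    m          -- fuzzifier (> 1); larger values give softer memberships
    Returns (centers, memberships); memberships[i][j] is the degree to which
    point i belongs to cluster j, and each row sums to 1.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships, each row normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(n_iter):
        # Center update: mean of all points weighted by membership ** m.
        centers = []
        for j in range(n_clusters):
            w = [u[i][j] ** m for i in range(n)]
            w_sum = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / w_sum
                 for d in range(dim)]
            )

        # Membership update from relative Euclidean distances to the centers.
        max_change = 0.0
        for i in range(n):
            dist = [
                max(sum((points[i][d] - centers[j][d]) ** 2
                        for d in range(dim)) ** 0.5, 1e-12)
                for j in range(n_clusters)
            ]
            for j in range(n_clusters):
                new = 1.0 / sum(
                    (dist[j] / dist[k]) ** (2.0 / (m - 1.0))
                    for k in range(n_clusters)
                )
                max_change = max(max_change, abs(new - u[i][j]))
                u[i][j] = new

        # Stop once no membership moved by more than tol.
        if max_change < tol:
            break
    return centers, u


# Hypothetical toy data: two loose groups of 2-D "document vectors".
docs = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
        [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
centers, memberships = fuzzy_c_means(docs, n_clusters=2)
for doc, row in zip(docs, memberships):
    print(doc, [round(v, 3) for v in row])
```

Because each document receives a graded membership in every cluster rather than a single label, soft assignments of this kind are one plausible way to surface similar items that a portal has filed under different categories.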
Pages: 183-187
Number of pages: 5