An application of the nearest correlation matrix on web document classification

Cited by: 6
Authors
Qi, Houduo [1]
Xia, Zhonghang
Xing, Guangming
Affiliations
[1] Univ Southampton, Sch Math, Southampton SO17 1BJ, Hants, England
[2] Western Kentucky Univ, Dept Comp Sci, Bowling Green, KY 42101 USA
Keywords
support vector machines; classification; kernel matrix; semidefinite programming
DOI
10.3934/jimo.2007.3.701
Chinese Library Classification number
T [Industrial Technology]
Discipline classification code
08
Abstract
A Web document organizes textual data according to a predefined logical structure, and it has been shown that grouping Web documents with similar structures can improve query efficiency. An XML document, however, has no vectorial representation, which most existing classification algorithms require. Kernel methods address this by representing structural data through pairwise similarities, so that a set of Web documents can be fed into a classification algorithm in the form of a kernel matrix. The difficulty is that the distance between a pair of Web documents is usually obtained only approximately, so the derived distance matrix is not a kernel matrix. In this paper, we propose using the nearest correlation matrix (of the estimated distance matrix) as the kernel matrix; it can be computed quickly by a Newton-type method. Experimental studies show that the classification accuracy can be significantly improved.
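To make the central idea concrete, the following is a minimal sketch of computing a nearest correlation matrix, that is, the symmetric positive semidefinite matrix with unit diagonal closest in Frobenius norm to a given matrix. It is not the Newton-type method referred to in the abstract: it uses the simpler alternating-projections scheme with Dykstra's correction, which is slower but short enough to read at a glance. The function names, the tolerance, the iteration cap, and the 3-by-3 input matrix are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def project_psd(a):
    """Project a symmetric matrix onto the positive semidefinite cone."""
    eigvals, eigvecs = np.linalg.eigh(a)
    # Zero out negative eigenvalues and rebuild the matrix.
    return (eigvecs * np.clip(eigvals, 0.0, None)) @ eigvecs.T


def nearest_correlation(g, tol=1e-8, max_iter=1000):
    """Approximate the nearest correlation matrix to g (Frobenius norm).

    Alternating projections with Dykstra's correction; an illustrative
    substitute for the Newton-type method mentioned in the abstract.
    """
    y = (g + g.T) / 2.0      # work with the symmetric part of the input
    ds = np.zeros_like(y)    # Dykstra correction for the PSD projection
    for _ in range(max_iter):
        r = y - ds
        x = project_psd(r)             # step 1: project onto the PSD cone
        ds = x - r
        y_new = x.copy()
        np.fill_diagonal(y_new, 1.0)   # step 2: project onto unit-diagonal matrices
        change = np.linalg.norm(y_new - y, "fro")
        y = y_new
        if change <= tol * max(1.0, np.linalg.norm(y, "fro")):
            break
    return y


# Hypothetical "estimated" similarity matrix: symmetric with unit diagonal,
# but indefinite, so it is not a valid kernel matrix as it stands.
G = np.array([[1.0, 1.0, 0.0],
              [1.0, 1.0, 1.0],
              [0.0, 1.0, 1.0]])

K = nearest_correlation(G)
print("smallest eigenvalue of G:", np.linalg.eigvalsh(G).min())
print("smallest eigenvalue of K:", np.linalg.eigvalsh(K).min())
```

The repaired matrix `K` could then be handed to any classifier that accepts a precomputed kernel, for example scikit-learn's `SVC(kernel="precomputed")`; whether that matches the experimental setup of the paper is not stated in the abstract.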
Pages: 701-713
Page count: 13