Asymptotic Evaluation of Distance Measure on High Dimensional Vector Spaces in Text Mining

Cited by: 0
Authors
Goto, Masayuki [1 ]
Ishida, Takashi [2 ]
Suzuki, Makoto [3 ]
Hirasawa, Shigeichi [2 ]
Affiliations
[1] Musashi Inst Technol, Fac Environm & Informat Studies, Tsuzuki Ku, Kanagawa 2240015, Japan
[2] Waseda Univ, Sch Creat Sci & Engn, Tokyo 1690015, Japan
[3] Shonan Inst Technol, Fac Engn, Fujisawa, Kanagawa 2518511, Japan
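Keywords
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract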
This paper discusses document classification problems in text mining from the viewpoint of asymptotic statistical analysis. In text mining, several heuristics are applied in practical analysis because of their experimental effectiveness in many case studies. A theoretical explanation of the performance of text mining techniques is required, and such an analysis should give a much clearer understanding. In this paper, the performance of distance measures used to classify documents is analyzed from the new viewpoint of asymptotic analysis. We also discuss the asymptotic performance of the IDF measure used in the information retrieval field.
Pages: 439 / +
Number of pages: 3