Trend-based Document Clustering for Sensitive and Stable Topic Detection

Citations: 0
Authors
Sato, Yoshihide [1]
Kawashima, Harumi [2]
Okuda, Hidenori [2]
Oku, Masahiro [2]
Affiliations
[1] NTT Corp, NTT West Corp, 1-1 Hikarino Oka, Yokosuka, Kanagawa 2390847, Japan
[2] NTT Corp, NTT Cyber Solut Labs, Yokosuka, Kanagawa 2390847, Japan
Keywords
trend; clustering; gradient model; word frequency
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The ability to detect new topics and track them is important given the huge volume of documents. This paper introduces a trend-based document clustering algorithm for analyzing them. Its key characteristic is that it scores words on the basis of fluctuations in word frequency. The algorithm generates clusters in a practical time, with O(n) processing cost due to preliminary calculation of document distances. This attribute allows the user to settle on the best level of granularity for identifying topics. Experiments show that our algorithm gathers relevant documents with an average F-measure of 63.0% from the beginning to the end of a topic's lifetime, and that it largely surpasses other algorithms.
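
The abstract does not give the scoring formula, so the following is only an illustrative sketch of what scoring words by frequency fluctuation could look like. It assumes documents are already tokenized and grouped into time-ordered windows, and it uses the least-squares slope of each word's relative frequency across windows as the score. The function name `trend_scores`, the windowing, and the slope-based score are assumptions made for illustration, not the authors' gradient model.

```python
from collections import Counter


def trend_scores(windows):
    """Score words by the trend of their relative frequency over time.

    windows: time-ordered list of windows; each window is a list of
             tokenized documents (lists of words).
    Returns {word: least-squares slope of relative frequency per window}.
    NOTE: hypothetical scoring rule, not the formula from the paper.
    """
    n = len(windows)
    if n < 2:
        raise ValueError("need at least two time windows")

    # Relative frequency of every word inside each window.
    freqs = []
    for docs in windows:
        counts = Counter(tok for doc in docs for tok in doc)
        total = sum(counts.values()) or 1  # avoid dividing by zero
        freqs.append({w: c / total for w, c in counts.items()})

    vocab = set().union(*freqs)
    x_mean = (n - 1) / 2
    denom = sum((x - x_mean) ** 2 for x in range(n))

    scores = {}
    for w in vocab:
        ys = [f.get(w, 0.0) for f in freqs]
        y_mean = sum(ys) / n
        num = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
        scores[w] = num / denom  # positive: rising word, negative: fading
    return scores


if __name__ == "__main__":
    sample = [
        [["quake", "news"]],
        [["quake", "quake", "news"]],
        [["quake", "quake", "quake", "news"]],
    ]
    for word, score in sorted(trend_scores(sample).items()):
        print(word, round(score, 3))
```

Under this sketch, words with a rising frequency receive positive scores and could be given more weight when comparing documents, which is one plausible way to make clusters sensitive to emerging topics.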
Pages: 331 / +
Page count: 2