Extracting and evaluating topics by region

Authors
Joonho Noh
Soowon Lee
Affiliation
[1] Soongsil University, Department of Computer Science and Engineering
Source
Multimedia Tools and Applications, 2016, 75 (20) : 12765 - 12777
Keywords
Topic extraction; Text mining; Clustering validity index
DOI
Not available
Abstract
Analyzing streaming data that carries regional information can reveal the interest trends of a region and how they differ from those of other regions. Such regional differences can inform important decisions in areas such as regional marketing and national policy making. In this paper, we propose a method for extracting topics that represent regional interests from news articles collected by region. The method consists of two steps: a novel word-weighting step that extracts regional keywords, and a word-clustering step that groups the extracted keywords into regional topics based on the associations between them. The validity of the extracted regional topics is evaluated by comparing them with a ground-truth topic set. Because each topic is represented by a set of words, the topics of one region form a family of sets; we therefore propose a new clustering validity index that compares families of sets across a given set of regions. Using this index, we determine through experiments the optimal parameters for the collected data.
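The abstract does not give the weighting formula or the definition of the validity index, so the following Python sketch is only an illustration of the general idea, not the authors' method. It assumes a term frequency times inverse region frequency weight as a stand-in for the word-weighting step, and a symmetric best-match Jaccard score as a stand-in for an index over families of sets. All function names are hypothetical, and the word-clustering step is omitted.

```python
from collections import Counter
from math import log


def regional_keyword_weights(docs_by_region):
    """Assumed weighting (not from the paper): a word's count inside a region
    times log(number of regions / number of regions that use the word)."""
    n_regions = len(docs_by_region)
    counts_by_region = {
        region: Counter(word for doc in docs for word in doc)
        for region, docs in docs_by_region.items()
    }
    # In how many regions does each word occur at least once?
    region_freq = Counter(
        word for counts in counts_by_region.values() for word in counts
    )
    return {
        region: {
            word: count * log(n_regions / region_freq[word])
            for word, count in counts.items()
        }
        for region, counts in counts_by_region.items()
    }


def jaccard(a, b):
    """Jaccard similarity of two word sets; two empty sets count as identical."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def family_similarity(extracted, truth):
    """Assumed index (not from the paper): average best-match Jaccard score,
    taken in both directions between two families of word sets."""
    if not extracted or not truth:
        return 0.0
    forward = sum(max(jaccard(e, t) for t in truth) for e in extracted) / len(extracted)
    backward = sum(max(jaccard(e, t) for e in extracted) for t in truth) / len(truth)
    return (forward + backward) / 2


def regional_validity(extracted_by_region, truth_by_region):
    """Mean family similarity over every region in the ground truth."""
    scores = [
        family_similarity(extracted_by_region.get(region, []), topics)
        for region, topics in truth_by_region.items()
    ]
    return sum(scores) / len(scores) if scores else 0.0
```

Under these assumptions, a word that appears in every region receives weight zero, so only region-specific vocabulary survives as keyword candidates. A parameter search like the one the abstract mentions could then keep the setting that maximizes `regional_validity` against the ground-truth topics.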
Pages: 12765 - 12777
Number of pages: 12
Related papers
50 in total
  • [1] Extracting and evaluating topics by region
    Noh, Joonho
    Lee, Soowon
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2016, 75 (20) : 12765 - 12777
  • [2] Extracting shared topics of multiple documents
    Ji, X
    Zha, HY
    [J]. ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, 2003, 2637 : 100 - 110
  • [3] Extracting Turkish Tweet Topics Using LDA
    Gemci, Fahriye
    Peker, Kadir A.
    [J]. 2013 8TH INTERNATIONAL CONFERENCE ON ELECTRICAL AND ELECTRONICS ENGINEERING (ELECO), 2013 : 531 - 534
  • [4] Customer Experience: Extracting Topics From Tweets
    Mishra, Manit
    [J]. INTERNATIONAL JOURNAL OF MARKET RESEARCH, 2022, 64 (03) : 334 - 353
  • [5] Extracting latent communities from blogs based on topics
    Wu, Jianjun
    Chen, Junjie
    Huang, Ruihong
    [J]. RECENT ADVANCE OF CHINESE COMPUTING TECHNOLOGIES, 2007 : 306 - 309
  • [6] Extracting nonlinear neural topics with neural variational bayes
    Yiming Wang
    Ximing Li
    Jihong Ouyang
    Zeqi Guo
    Yimeng Wang
    [J]. World Wide Web, 2022, 25 : 131 - 149
  • [7] Extracting Multilingual Topics from Unaligned Comparable Corpora
    Jagarlamudi, Jagadeesh
    Daume, Hal, III
    [J]. ADVANCES IN INFORMATION RETRIEVAL, PROCEEDINGS, 2010, 5993 : 444 - 456
  • [8] Extracting Topics with Focused Communities for Social Content Recommendation
    Georgiou, Theodore
    El Abbadi, Amr
    Yan, Xifeng
    [J]. CSCW'17: PROCEEDINGS OF THE 2017 ACM CONFERENCE ON COMPUTER SUPPORTED COOPERATIVE WORK AND SOCIAL COMPUTING, 2017 : 1432 - 1443
  • [9] Extracting nonlinear neural topics with neural variational bayes
    Wang, Yiming
    Li, Ximing
    Ouyang, Jihong
    Guo, Zeqi
    Wang, Yimeng
    [J]. WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS, 2022, 25 (01) : 131 - 149
  • [10] Extracting Topics Based on Authors, Recipients and Content in Microblogs
    Rajani, Nazneen Fatema
    McArdle, Kate
    Baldridge, Jason
    [J]. SIGIR'14: PROCEEDINGS OF THE 37TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2014 : 1171 - 1174