Hierarchical Topic Modeling for Urdu Text Articles

Cited by: 2
Authors
Rehman, Anwar Ur [1]
Khan, Ali Haider [2]
Aftab, Mustansar [3]
Rehman, Zobia [1]
Shah, Munam Ali [1]
Affiliations
[1] Comsats Univ Islamabad, Dept Comp Sci, Islamabad, Pakistan
[2] Univ Management & Technol, Dept Comp Sci, Lahore, Pakistan
[3] Natl Coll Business Adm & Econ, Lahore, Pakistan
Keywords
Hierarchical Topic Model; Hierarchical LDA; Urdu Topic Model; Urdu Hierarchical LDA; Natural Language Processing; Gibbs Sampling
DOI
10.23919/iconac.2019.8895047
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Digital text on the Internet is growing rapidly with the heavy use of social media. Extracting useful information from this text is challenging because of its high dimensionality, sparseness, and sheer volume. In this paper, we study a powerful nonparametric Bayesian topic model, Hierarchical Latent Dirichlet Allocation (hLDA), and address the problem of learning topic hierarchies from Urdu text data. The presented topic model for Urdu combines preprocessing steps, the hLDA model, and the Gibbs Sampling (GS) algorithm. We call this hLDA-based topic model Urdu Hierarchical Latent Dirichlet Allocation (uhLDA). An empirical study showed that uhLDA effectively learns topic hierarchies from 5000 Urdu text documents. Furthermore, we evaluated the results using Pointwise Mutual Information (PMI), and the evaluation shows that uhLDA outperforms the standard LDA topic model.
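The PMI evaluation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the authors computed PMI, so everything here is an assumption: the function name `pmi_coherence`, the use of document-level co-occurrence, the averaging over word pairs, and the small smoothing constant `eps` are all illustrative choices, and the toy tokens stand in for preprocessed Urdu words.

```python
import math
from itertools import combinations


def pmi_coherence(top_words, documents, eps=1e-12):
    """Average pairwise PMI of a topic's top words.

    Probabilities are estimated from document-level co-occurrence:
    p(w) is the fraction of documents containing w, and p(w1, w2) is
    the fraction containing both. This is an assumed, commonly used
    formulation, not necessarily the one used in the paper.
    """
    doc_sets = [set(doc) for doc in documents]
    n_docs = len(doc_sets)
    if n_docs == 0:
        return 0.0

    def prob(*words):
        # Fraction of documents that contain every word in `words`.
        hits = sum(1 for d in doc_sets if all(w in d for w in words))
        return hits / n_docs

    scores = []
    for w1, w2 in combinations(top_words, 2):
        p1, p2, p12 = prob(w1), prob(w2), prob(w1, w2)
        if p1 == 0 or p2 == 0:
            continue  # skip words that never occur in the corpus
        # eps keeps the logarithm finite when the pair never co-occurs.
        scores.append(math.log((p12 + eps) / (p1 * p2)))
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical tokenized corpus; real input would be preprocessed Urdu text.
toy_docs = [["a", "b"], ["a", "b"], ["c"], ["c"]]
coherent = pmi_coherence(["a", "b"], toy_docs)    # words that always co-occur
incoherent = pmi_coherence(["a", "c"], toy_docs)  # words that never co-occur
print(coherent > incoherent)
```

Under this formulation, a topic whose top words tend to appear in the same documents scores higher, which is the sense in which one topic model can be said to outperform another on PMI.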
Pages: 464-469
Page count: 6