Latent Dirichlet Allocation (LDA) for improving the topic modeling of the official bulletin of the Spanish state (BOE)

Cited by: 4
Authors
Bailon-Elvira, J. C. [1]
Cobo, M. J. [2]
Herrera-Viedma, E. [1]
Lopez-Herrera, A. G. [1]
Affiliations
[1] Univ Granada, Dept Comp Sci & Artificial Intelligence, Calle Daniel Saucedo Aranda S-N, E-18071 Granada, Spain
[2] Univ Cadiz, Dept Comp Sci & Engn, Ave Ramon Puyol, Cadiz 11202, Spain
Keywords
Recommender systems; BOE; LDA; Alerts; RECOMMENDER SYSTEM; HYBRID;
DOI
10.1016/j.procs.2019.11.277
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Since the Internet was born, most people have had free access to a vast number of information sources. Every day many web pages are created and new content is uploaded and shared. Never in history have humans been more informed, yet also more uninformed, owing to the huge amount of information that can be accessed. When we look for something in a search engine, the results are too many to read and filter one by one. Recommender Systems (RS) were created to help us discriminate and filter this information according to our preferences. This contribution analyses the RS of the official agency of publications in Spain (BOE), which is known as "Mi BOE". The way this RS works was analysed, and all the metadata of the published documents were examined in order to determine the coverage of the system. The results of our analysis show that more than 89% of the documents cannot be recommended because they are not well described at the documentary level: some of their key metadata are empty. This contribution therefore proposes a method to label documents automatically based on Latent Dirichlet Allocation (LDA). The results show that, with this approach, the system could recommend (from a theoretical point of view) more than twice as many documents as it does now: 23% after applying the approach versus 11% before. © 2020 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). Peer-review under responsibility of the scientific committee of the 7th International Conference on Information Technology and Quantitative Management (ITQM 2019).
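The coverage comparison reported in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `subjects` field name, the rule that a document is recommendable only when every key field is non-empty, and the toy documents are all assumptions made for illustration, and the predicted labels stand in for whatever an LDA-based labeler would produce.

```python
from typing import Dict, Iterable, List, Sequence

# Hypothetical key metadata field; the abstract does not name the fields
# that "Mi BOE" actually relies on.
KEY_FIELDS: Sequence[str] = ("subjects",)


def is_recommendable(doc: Dict, key_fields: Sequence[str] = KEY_FIELDS) -> bool:
    """Assumption: a document can be recommended only if every key field is non-empty."""
    return all(doc.get(field) for field in key_fields)


def coverage(docs: Iterable[Dict], key_fields: Sequence[str] = KEY_FIELDS) -> float:
    """Fraction of documents that can be recommended (0.0 for an empty collection)."""
    docs = list(docs)
    if not docs:
        return 0.0
    return sum(is_recommendable(d, key_fields) for d in docs) / len(docs)


def apply_labels(docs: Iterable[Dict], predicted: Dict[str, List[str]]) -> List[Dict]:
    """Return copies of the documents, filling empty 'subjects' with predicted labels.

    `predicted` maps a document id to labels, e.g. labels derived from the
    dominant LDA topic of each document (the labeling step itself is not shown).
    """
    labeled = []
    for doc in docs:
        new_doc = dict(doc)
        if not new_doc.get("subjects") and doc["id"] in predicted:
            new_doc["subjects"] = list(predicted[doc["id"]])
        labeled.append(new_doc)
    return labeled


# Toy data only; these are not BOE documents and do not reproduce the paper's figures.
toy_docs = [
    {"id": "a", "subjects": ["taxes"]},
    {"id": "b", "subjects": []},
    {"id": "c", "subjects": []},
    {"id": "d", "subjects": []},
]
toy_predicted = {"b": ["grants"]}  # hypothetical output of an automatic labeler

before = coverage(toy_docs)
after = coverage(apply_labels(toy_docs, toy_predicted))
print(f"coverage before: {before:.0%}, after: {after:.0%}")
```

On the toy data, one of four documents is recommendable before labeling and two of four afterwards, which mirrors the kind of before/after comparison the abstract reports (11% vs 23%) without reproducing it.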
Pages: 207-214
Page count: 8