Topic Modelling in Social Sciences - Case Study of Web of Science

Times cited: 0
Authors
Pandur, Maja Buhin [1]
Dobsa, Jasminka [1]
Kronegger, Luka [2]
Affiliations
[1] Univ Zagreb, Fac Org & Informat, Pavlinska 2, Varazhdin, Croatia
[2] Univ Ljubljana, Fac Social Sci, Kardeljeva Ploscad 5, Ljubljana, Slovenia
Keywords
topic modelling; Latent Dirichlet Allocation; Structural Topic Model; social sciences; interdisciplinarity
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Topic modelling is one of the most widely investigated areas of Natural Language Processing. One of the techniques used for topic modelling is Latent Dirichlet Allocation (LDA), an unsupervised machine learning technique that derives topics from a collection of documents by grouping words or n-grams with similar meaning. In this paper, we applied a Structural Topic Model with LDA to extract topics from scientific papers in Social Science. Structural topic modelling was conducted on 3663 articles from the Web of Science Core Collection from 1999 to 2019. The results indicate that the optimal number of topics coincides with the number of research areas defined in Social Science, or with an integer multiple of it. This opens an area for research into comparing the existing taxonomy with the taxonomy proposed by the LDA model, and into the future identification of interdisciplinarity.
Pages: 211-218
Number of pages: 8