Mixed-membership models of scientific publications

Cited by: 195
Authors
Erosheva, E [1]
Fienberg, S
Lafferty, J
Affiliations
[1] Univ Washington, Dept Stat, Sch Social Work, Seattle, WA 98195 USA
[2] Univ Washington, Ctr Stat & Social Sci, Seattle, WA 98195 USA
[3] Carnegie Mellon Univ, Dept Stat, Pittsburgh, PA 15213 USA
[4] Carnegie Mellon Univ, Dept Comp Sci, Pittsburgh, PA 15213 USA
[5] Carnegie Mellon Univ, Ctr Automated Learning & Discovery, Pittsburgh, PA 15213 USA
Keywords
DOI
10.1073/pnas.0307760101
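Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code
07; 0710; 09
Abstract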
PNAS is one of the world's most cited multidisciplinary scientific journals. The official PNAS subject classification is reflected in topic labels submitted by the authors of articles, and these labels largely follow traditionally established disciplines. They include broad field classifications into physical sciences, biological sciences, and social sciences, as well as further subtopic classifications within each field. Focusing on biological sciences, we explore an internal soft-classification structure of articles based only on semantic decompositions of abstracts and bibliographies, and compare it with the formal discipline classifications. Our model assumes that there is a fixed number of internal categories, each characterized by multinomial distributions over words (in abstracts) and references (in bibliographies). Soft classification for each article is based on the proportions of the article's content coming from each category. We discuss the appropriateness of the model for the PNAS database as well as other features of the data relevant to soft classification.
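The generative idea described in the abstract can be illustrated with a small simulation. The following is only a minimal sketch, not the authors' implementation: it assumes a Dirichlet prior on each article's category proportions and assumes that every word and every reference draws its own category, neither of which is stated in the abstract. All names (`sample_dirichlet`, `generate_article`, `alpha`, `word_dists`, `ref_dists`) and all numeric values are hypothetical.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one probability vector from a Dirichlet via normalized gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_article(alpha, word_dists, ref_dists, n_words, n_refs, rng):
    """Simulate one article under an assumed mixed-membership process.

    alpha      : Dirichlet parameters, one per internal category (assumption)
    word_dists : per-category multinomial probabilities over word indices
    ref_dists  : per-category multinomial probabilities over reference indices
    """
    k = len(alpha)
    # Soft classification: the article's proportions over the k categories.
    membership = sample_dirichlet(alpha, rng)

    words = []
    for _ in range(n_words):
        # Pick a category for this word, then a word from that category.
        z = rng.choices(range(k), weights=membership)[0]
        vocab = range(len(word_dists[z]))
        words.append(rng.choices(vocab, weights=word_dists[z])[0])

    refs = []
    for _ in range(n_refs):
        # Same two-step draw for each bibliography entry.
        z = rng.choices(range(k), weights=membership)[0]
        cited = range(len(ref_dists[z]))
        refs.append(rng.choices(cited, weights=ref_dists[z])[0])

    return membership, words, refs


# Toy setup: 3 categories, 4 words, 2 references (all values are made up).
rng = random.Random(0)
alpha = [0.5, 0.5, 0.5]
word_dists = [
    [0.7, 0.2, 0.1, 0.0],
    [0.1, 0.1, 0.4, 0.4],
    [0.25, 0.25, 0.25, 0.25],
]
ref_dists = [[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]]

membership, words, refs = generate_article(
    alpha, word_dists, ref_dists, n_words=20, n_refs=5, rng=rng
)
print("membership proportions:", [round(p, 3) for p in membership])
print("word indices:", words)
print("reference indices:", refs)
```

In this sketch the `membership` vector plays the role of the soft classification: an article is not assigned to a single category but carries a proportion for each one. Fitting such a model to real abstracts and bibliographies is a separate inference problem that the sketch does not address.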
Pages: 5220-5227
Number of pages: 8
Related papers
50 in total
  • [31] A Tensor Approach to Learning Mixed Membership Community Models
    Anandkumar, Animashree
    Ge, Rong
    Hsu, Daniel
    Kakade, Sham M.
    JOURNAL OF MACHINE LEARNING RESEARCH, 2014, 15 : 2239 - 2312
  • [32] Conferences, publications, and membership activities
    He, Bin
    IEEE ENGINEERING IN MEDICINE AND BIOLOGY MAGAZINE, 2010, 29 (01): : 5 - +
  • [33] A tensor approach to learning mixed membership community models
    Anandkumar, Animashree
    Ge, Rong
    Hsu, Daniel
    Kakade, Sham M.
    Journal of Machine Learning Research, 2014, 15 : 2239 - 2312
  • [34] Models for public health policy analysis reported in scientific publications
    Martinez, Gino Montenegro
    Montoya, Adiley Carmona
    Franco-Giraldo, Alvaro
    GACETA SANITARIA, 2021, 35 (03) : 270 - 281
  • [35] Estimating Identification Disclosure Risk Using Mixed Membership Models
    Manrique-Vallier, Daniel
    Reiter, Jerome P.
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2012, 107 (500) : 1385 - 1394
  • [36] Phenotype Inference with Semi-Supervised Mixed Membership Models
    Rodriguez, Victor A.
    Perotte, Adler
    MACHINE LEARNING FOR HEALTHCARE CONFERENCE, VOL 106, 2019, 106
  • [37] LONGITUDINAL MIXED MEMBERSHIP TRAJECTORY MODELS FOR DISABILITY SURVEY DATA
    Manrique-Vallier, Daniel
    ANNALS OF APPLIED STATISTICS, 2014, 8 (04): : 2268 - 2291
  • [38] Spatio-temporal mixed membership models for criminal activity
    Virtanen, Seppo
    Girolami, Mark
    JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES A-STATISTICS IN SOCIETY, 2021, 184 (04) : 1220 - 1244
  • [39] Information services; membership publications, the web
    Voll, SK
    INTERNATIONAL JOURNAL OF CANCER, 2002, : 143 - 143
  • [40] Meetings, membership, and publications: The fabric of ASHS
    Davis, FS
    HORTSCIENCE, 2005, 40 (07) : 1937 - 1938