On Propagated Scoring for Semisupervised Additive Models

Cited by: 5
Author
Culp, Mark [1]
Affiliation
[1] W Virginia Univ, Dept Stat, Morgantown, WV 26506 USA
Keywords
Additive model; Fixed-point optimization; Semisupervised learning;
DOI
10.1198/jasa.2011.tm09316
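Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Discipline classification code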
020208 ; 070103 ; 0714 ;
Abstract
This article presents a semisupervised modeling framework that combines feature-based (x) data and graph-based (G) data for classification/regression of the response Y. In this semisupervised setting, Y is observed for a subset of the observations (labeled) and missing for the remainder (unlabeled). The Propagated Scoring algorithm proposed for fitting this model is a semisupervised fixed-point regularization approach that essentially extends the generalized additive model into the semisupervised setting. I first articulate when semisupervised degeneracies are expected within my framework, and then provide a general regularization strategy to address such circumstances. For statistical analysis, I establish that the approach uses shrinking smoothers, provide circumstances in which the result is consistent, provide measures of inference and description, and establish clear connections to supervised models. Several semisupervised approaches have been considered for the classification problem posed, typically motivated from an energy optimization perspective. In this work, I rigorously connect the statistically based propagated scoring framework to this class of approaches. This connection is particularly insightful, especially with regard to supervised comparisons, because this type of analysis is lacking in the previous work. Two applications are presented, one involving classification of protein location on a cell using a network of protein interaction data and the other involving classification of text documents with citation network information and text data. This article has supplementary material online.
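The abstract describes a fixed-point scheme built on shrinking smoothers but does not give the algorithm itself. The following is only a generic sketch of that idea and is not the article's Propagated Scoring procedure: it assumes a single graph-based linear smoother S = (I + lam * Laplacian)^{-1}, fills in the unlabeled responses with the current fitted values, and repeats until they stop changing. The function names `graph_smoother` and `fixed_point_fit`, the parameter `lam`, and the toy path graph are all hypothetical choices made for illustration.

```python
import numpy as np


def graph_smoother(W, lam):
    """Assumed smoother: S = (I + lam * L)^{-1}, with L the graph Laplacian of W."""
    laplacian = np.diag(W.sum(axis=1)) - W
    return np.linalg.inv(np.eye(W.shape[0]) + lam * laplacian)


def fixed_point_fit(S, y_labeled, labeled, n_iter=1000, tol=1e-12):
    """Iterate y_U <- (S y)_U with y_L held at the observed labels.

    Illustrative only: a plain fixed-point loop, not the published algorithm.
    """
    n = S.shape[0]
    labeled = np.asarray(labeled)
    unlabeled = np.setdiff1d(np.arange(n), labeled)
    y = np.zeros(n)
    y[labeled] = y_labeled          # observed responses stay fixed
    for _ in range(n_iter):
        f = S @ y                   # smooth the current working response
        step = np.max(np.abs(f[unlabeled] - y[unlabeled]))
        y[unlabeled] = f[unlabeled]  # replace missing responses by fitted values
        if step < tol:
            break
    return S @ y


# Toy example: a path graph 0 - 1 - 2 - 3 with labels on the two end nodes.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
S = graph_smoother(W, lam=1.0)
f = fixed_point_fit(S, np.array([1.0, -1.0]), labeled=[0, 3])
print(np.round(f, 4))
```

Under these assumptions the iteration converges because the unlabeled block of S has spectral radius below one, and at the fixed point the fitted values are harmonic on the unlabeled nodes (the Laplacian applied to the fit vanishes there). That is one simple way to see how a smoother-based fit can relate to the energy-optimization approaches the abstract mentions.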
Pages: 248-259
Number of pages: 12