Linear concepts and hidden variables

Cited by: 7
Authors
Grove, AJ
Roth, D
Institutions
[1] NECI, Princeton, NJ USA
[2] Univ Illinois, Dept Comp Sci, Urbana, IL 61801 USA
Funding
US National Science Foundation
Keywords
linear functions; Winnow; expectation-maximization; Naive Bayes
DOI
10.1023/A:1007655119445
CLC number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We study a learning problem which allows for a "fair" comparison between unsupervised learning methods (probabilistic model construction) and more traditional algorithms that directly learn a classification. The merits of each approach are intuitively clear: inducing a model is more expensive computationally, but may support a wider range of predictions. Its performance, however, will depend on how well the postulated probabilistic model fits the data. To compare the paradigms we consider a model which postulates a single binary-valued hidden variable on which all other attributes depend. In this model, finding the most likely value of any one variable (given known values for the others) reduces to testing a linear function of the observed values. We learn the model with two techniques: the standard EM algorithm, and a new algorithm we develop based on covariances. We compare these, in a controlled fashion, against an algorithm (a version of Winnow) that attempts to find a good linear classifier directly. Our conclusions help delimit the fragility of using a model that is even "slightly" simpler than the distribution actually generating the data, vs. the relative robustness of directly searching for a good predictor.
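Note on the reduction mentioned in the abstract: in a two-class Naive Bayes model over Boolean attributes, the log-odds of the hidden variable given the observed attributes is an affine function of those attributes, so maximum-likelihood prediction is a linear threshold test. The sketch below is illustrative only; the function names, the choice of predicting the hidden variable rather than an observed attribute, and the textbook Winnow variant are assumptions, not the paper's exact formulation.

```python
import numpy as np

def nb_to_linear(prior, p1, p0):
    """Convert Bernoulli Naive Bayes parameters into a linear threshold test.

    prior = P(h=1); p1[i] = P(x_i=1 | h=1); p0[i] = P(x_i=1 | h=0).
    Returns (w, b) such that predicting h=1 iff w @ x + b >= 0 is the
    maximum-likelihood decision, illustrating the reduction to a linear
    function of the observed values.
    """
    p1, p0 = np.asarray(p1, float), np.asarray(p0, float)
    w = np.log(p1 / p0) - np.log((1 - p1) / (1 - p0))
    b = np.log(prior / (1 - prior)) + np.sum(np.log((1 - p1) / (1 - p0)))
    return w, b

def winnow(X, y, alpha=2.0, epochs=10):
    """Littlestone-style Winnow: multiplicative updates on mistakes only.

    X is an (m, n) 0/1 array, y an (m,) 0/1 label array. This is the
    classic variant; the paper uses its own version of Winnow.
    """
    m, n = X.shape
    w = np.ones(n)        # weights start at 1
    theta = float(n)      # fixed threshold n, as in the classic analysis
    for _ in range(epochs):
        for x, label in zip(X, y):
            pred = int(w @ x >= theta)
            if label == 1 and pred == 0:
                w[x == 1] *= alpha   # promotion on a false negative
            elif label == 0 and pred == 1:
                w[x == 1] /= alpha   # demotion on a false positive
    return w, theta
```

The contrast the paper studies corresponds to fitting (prior, p1, p0) generatively, e.g. with EM or the covariance-based algorithm, and then applying a reduction like nb_to_linear, versus letting a Winnow-style learner search for the weight vector directly.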
Pages: 123-141
Page count: 19