Unsupervised methods for developing taxonomies by combining syntactic and statistical information

Cited by: 0
Author
Widdows, D [1]
Affiliation
[1] Stanford Univ, Ctr Study Language & Informat, Stanford, CA 94305 USA
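Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract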
This paper describes an unsupervised algorithm for placing unknown words into a taxonomy and evaluates its accuracy on a large and varied sample of words. The algorithm works by first using a large corpus to find semantic neighbors of the unknown word, which we accomplish by combining latent semantic analysis with part-of-speech information. We then place the unknown word in the part of the taxonomy where these neighbors are most concentrated, using a class-labelling algorithm developed especially for this task. This method is used to reconstruct parts of the existing WordNet database, obtaining results for common nouns, proper nouns and verbs. We evaluate the contribution made by part-of-speech tagging and show that automatic filtering using the class-labelling algorithm gives a fourfold improvement in accuracy.
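The placement step described in the abstract (put the unknown word where its semantic neighbors are most concentrated) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the abstract: the toy taxonomy `PARENT`, the function names `hypernym_chain` and `best_class`, the affinity of 1/distance² for a class that subsumes a neighbor, and the fixed penalty of 0.25 for a class that does not. The list of neighbors is assumed to have been found already, for example by the corpus-based similarity step.

```python
from collections import defaultdict

# Toy is-a taxonomy (child -> parent). Hypothetical data, for illustration only.
PARENT = {
    "oak": "tree",
    "pine": "tree",
    "tree": "plant",
    "rose": "flower",
    "flower": "plant",
    "plant": "organism",
    "dog": "animal",
    "animal": "organism",
}
KNOWN = set(PARENT) | set(PARENT.values())


def hypernym_chain(word):
    """Map each proper ancestor of `word` to its distance (parent = 1)."""
    chain, dist = {}, 1
    node = PARENT.get(word)
    while node is not None:
        chain[node] = dist
        node = PARENT.get(node)
        dist += 1
    return chain


def best_class(neighbors, miss_penalty=0.25):
    """Pick the class that best covers the neighbors (assumed scoring).

    A class gains 1 / distance**2 for every neighbor it subsumes and
    loses `miss_penalty` for every neighbor it does not subsume.
    """
    chains = {n: hypernym_chain(n) for n in neighbors if n in KNOWN}
    if not chains:
        raise ValueError("no neighbor is present in the taxonomy")
    candidates = set().union(*chains.values())
    scores = defaultdict(float)
    for h in candidates:
        for chain in chains.values():
            if h in chain:
                scores[h] += 1.0 / chain[h] ** 2
            else:
                scores[h] -= miss_penalty
    return max(scores, key=scores.get), dict(scores)


# Neighbors of some unknown word, assumed to come from the corpus step.
label, scores = best_class(["oak", "pine", "rose"])
print(label, scores)
```

The penalty term is what keeps very general classes such as `organism` from winning merely because they cover everything: a class close to many neighbors outscores a distant class that subsumes them all.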
Pages: 276-283
Page count: 8