Similarity between the Association Measures: a Case Study of Noun Phrases

Cited by: 0
Authors
Khokhlova, Maria [1 ]
Affiliation
[1] St Petersburg State Univ, Dept Math Linguist, Universitetskaya Emb 11, St Petersburg 199034, Russia
Keywords
collocability; collocations; corpora; statistics; statistical measures; gold standard
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Collocation extraction has gained much attention in natural language processing, and its results are important in various areas of applied linguistics. This research compares over a dozen association measures on a subset of the Russian Web corpus, focusing on automatically extracted Adj-Noun collocations. The aim of the experiments is twofold: first, to examine the differences between the statistical measures, and second, to find the most effective one for the Russian data. The former involves calculating Spearman's rank correlation coefficient; the latter involves evaluating the extracted lists against a Russian dictionary, i.e. identifying the collocations that were both automatically extracted and manually collected. The results are not straightforward; however, one can distinguish groups of measures that demonstrate relative interchangeability. In addition, the extracted bigrams can be considered collocations by experts and thus may enrich dictionaries.
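The two-step evaluation described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the author's implementation: the abstract does not name the measures, so PMI and t-score are assumed here as two common examples, and the bigrams, counts, corpus size, and gold-standard set are invented toy values. All function names are hypothetical.

```python
import math

def pmi(f_xy, f_x, f_y, n):
    """Pointwise mutual information (log base 2) of a bigram."""
    return math.log2(f_xy * n / (f_x * f_y))

def t_score(f_xy, f_x, f_y, n):
    """t-score: (observed - expected) / sqrt(observed)."""
    expected = f_x * f_y / n
    return (f_xy - expected) / math.sqrt(f_xy)

def average_ranks(values):
    """Rank values from 1 (smallest); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)

def precision_at_k(scores, gold, k):
    """Share of the k top-scored bigrams that appear in the gold standard."""
    top = sorted(scores, key=scores.get, reverse=True)[:k]
    return sum(1 for bigram in top if bigram in gold) / k

if __name__ == "__main__":
    # Invented toy data: bigram -> (f_xy, f_adj, f_noun); N is the corpus size.
    N = 100_000
    counts = {
        ("strong", "tea"): (30, 200, 150),
        ("black", "tea"): (50, 1000, 150),
        ("new", "house"): (40, 5000, 2000),
        ("big", "house"): (20, 3000, 2000),
    }
    gold = {("strong", "tea"), ("black", "tea")}  # stand-in for dictionary entries

    pmi_scores = {b: pmi(*c, N) for b, c in counts.items()}
    t_scores = {b: t_score(*c, N) for b, c in counts.items()}

    bigrams = list(counts)
    rho = spearman([pmi_scores[b] for b in bigrams],
                   [t_scores[b] for b in bigrams])
    print("Spearman rho (PMI vs t-score):", round(rho, 3))
    print("Precision@2, PMI:", precision_at_k(pmi_scores, gold, 2))
    print("Precision@2, t-score:", precision_at_k(t_scores, gold, 2))
```

A high rho between two measures would suggest that they rank candidates similarly and are to some degree interchangeable, while precision against the gold standard indicates how well each ranking matches the dictionary.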
Pages: 21-27
Number of pages: 7