Language independent statistical software for corpus exploration

Cited by: 3
Authors
Sinclair, J [1]
Mason, O [1]
Ball, J [1]
Barnbrook, G [1]
Affiliations
[1] Univ Birmingham, Sch English, Corpus Res, Birmingham B15 2TT, W Midlands, England
Source
COMPUTERS AND THE HUMANITIES | 1997 / Vol. 31 / Issue 03
Keywords
collocation; concordance lines; language independent software; lexical statistics
DOI
10.1023/A:1000911520943
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
In this report two programs for statistical analysis of concordance lines are described. The programs have been developed for analysing the lexical context of a given word. It is shown how different parameter settings influence the outcome of collocational analysis, and how the concept of collocation can be extended to allow the extraction of lines typical for a word from a set of concordance lines. Even though all the examples are for English, the software is completely language independent and only requires minimal linguistic resources.
Pages: 229-255
Page count: 27