Syntactic and semantic disambiguation of numeral strings using an n-gram method

Times cited: 0
Authors
Min, KH [1]
Wilson, WH
Moon, YJ
Affiliations
[1] AUT, Sch Comp & Informat Sci, Auckland, New Zealand
[2] UNSW, Sch Engn & Comp Sci, Sydney, NSW, Australia
[3] Hankuk Univ Foreign Studies, Dept Management Informat Syst, Yongin, Kyonggi, South Korea
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper describes the interpretation of numerals and of strings that include numerals, that is, strings composed of a number together with words or symbols that indicate the category of the string, such as SPEED or LENGTH. Interpretation is performed at three levels: lexical, syntactic, and semantic. The system employs three interpretation processes: a word trigram constructor with a tokeniser, a rule-based processor of number strings, and n-gram-based disambiguation of meanings. We extracted numeral strings from 378 online newspaper articles and found that, on average, they made up about 2.2% of the words in the articles. We used 287 of these articles as unseen test data (3251 numeral strings) and the remaining 91 articles as sample data, which supplied 886 numeral strings from which n-gram constraints for disambiguating the meanings of numeral strings were extracted manually. We implemented six disambiguation methods based on category frequency statistics collected from the sample data and on the number of word trigram constraints for each category. On the test data, the precision of the six methods ranged from 85.6% to 87.9%.
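The pipeline outlined in the abstract (tokenise, build a word trigram around each numeral, then choose a category by counting matching constraints and falling back on category frequency) might look roughly like the sketch below. It is not the authors' implementation: the constraint table, the category frequencies, the `UNKNOWN` label, and every function name are hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical hand-written constraints, NOT taken from the paper.
# Each pattern is (left word, right word); None matches any word.
CONSTRAINTS = {
    "SPEED": [(None, "km/h"), (None, "mph")],
    "LENGTH": [(None, "km"), (None, "metres")],
    "MONEY": [("$", None)],
}

# Hypothetical category frequencies from sample data, used only to break ties.
CATEGORY_FREQ = Counter({"MONEY": 3, "LENGTH": 2, "SPEED": 1})


def tokenise(text):
    """Split text into numbers, words (optionally joined by '/'), and symbols."""
    return re.findall(r"\d+(?:\.\d+)?|[A-Za-z]+(?:/[A-Za-z]+)?|\S", text)


def numeral_trigrams(tokens):
    """Return (left, numeral, right) for every token that starts with a digit."""
    padded = [None] + tokens + [None]  # None marks the sentence boundary
    return [
        (padded[i - 1], padded[i], padded[i + 1])
        for i in range(1, len(padded) - 1)
        if padded[i][0].isdigit()
    ]


def disambiguate(trigram):
    """Pick the category with the most matching constraints for this trigram."""
    left, _, right = trigram
    scores = Counter()
    for category, patterns in CONSTRAINTS.items():
        for want_left, want_right in patterns:
            if want_left in (None, left) and want_right in (None, right):
                scores[category] += 1
    if not scores:
        return "UNKNOWN"
    # Most matching constraints wins; category frequency breaks ties.
    return max(scores, key=lambda c: (scores[c], CATEGORY_FREQ[c]))


def interpret(text):
    """Label every numeral string found in the text."""
    return [(t[1], disambiguate(t)) for t in numeral_trigrams(tokenise(text))]


if __name__ == "__main__":
    print(interpret("The car reached 120 km/h on a 5 km track and cost $40000."))
```

The six methods mentioned in the abstract presumably weight these two signals differently; this sketch shows only one plausible combination, in which the constraint count decides and frequency acts as a tie-breaker.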
Pages: 82-91
Number of pages: 10