A Grammar-Based Semantic Similarity Algorithm for Natural Language Sentences

Cited by: 32
Authors
Lee, Ming Che [1]
Chang, Jia Wei [2]
Hsieh, Tung Cheng [3]
Affiliations
[1] Ming Chuan Univ, Dept Comp & Commun Engn, Taoyuan 333, Taiwan
[2] Natl Cheng Kung Univ, Dept Engn Sci, Tainan 701, Taiwan
[3] Hsuan Chuang Univ, Dept Visual Commun Design, Hsinchu 300, Taiwan
Source
Keywords
INFORMATION; PRINCIPLES; EXTRACTION; RETRIEVAL; WORDNET
DOI
10.1155/2014/437162
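Chinese Library Classification (CLC)
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Discipline classification codes
07; 0710; 09
Abstract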
This paper presents a grammar and semantic corpus based similarity algorithm for natural language sentences. Natural language, in opposition to "artificial language", such as computer programming languages, is the language used by the general public for daily communication. Traditional information retrieval approaches, such as vector models, LSA, HAL, or even the ontology-based approaches that extend to include concept similarity comparison instead of cooccurrence terms/words, may not always determine the perfect matching while there is no obvious relation or concept overlap between two natural language sentences. This paper proposes a sentence similarity algorithm that takes advantage of corpus-based ontology and grammatical rules to overcome the addressed problems. Experiments on two famous benchmarks demonstrate that the proposed algorithm has a significant performance improvement in sentences/short-texts with arbitrary syntax and structure.
Pages: 17
Related papers
50 in total
  • [21] GGREADA: A graph grammar-based machine design algorithm
    Schmidt, LC
    Cagan, J
    RESEARCH IN ENGINEERING DESIGN-THEORY APPLICATIONS AND CONCURRENT ENGINEERING, 1997, 9 (04): 195 - 213
  • [22] SEMANTIC SIMILARITY BETWEEN SENTENCES
    HONECK, RP
    JOURNAL OF PSYCHOLINGUISTIC RESEARCH, 1973, 2 (02) : 137 - 151
  • [23] Selected Challenges in Grammar-Based Text Generation from the Semantic Web
    Mille, Simon
    ARTIFICIAL INTELLIGENCE, 2019, 11866 : 85 - 95
  • [24] A Grammar-Based Multi-Agent System for Language Evolution
    Dolores Jimenez-Lopez, Ma
    HIGHLIGHTS ON PRACTICAL APPLICATIONS OF AGENTS AND MULTI-AGENT SYSTEMS, 2012, 156 : 45 - 52
  • [25] Grammar-based Fuzzing
    Sargsyan, Sevak
    Kurmangaleev, Shamil
    Mehrabyan, Matevos
    Mishechkin, Maksim
    Ghukasyan, Tsolak
    Asryan, Sergey
    2018 IVANNIKOV MEMORIAL WORKSHOP (IVMEM 2018), 2018: 32 - 35
  • [26] Grammar-based layout for a visual programming language generation system
    Zhang, KB
    Zhang, K
    Orgun, MA
    DIAGRAMMATIC REPRESENTATION AND INFERENCE, 2002, 2317 : 106 - 108
  • [27] A Space-Saving Approximation Algorithm for Grammar-Based Compression
    Sakamoto, Hiroshi
    Maruyama, Shirou
    Kida, Takuya
    Shimozono, Shinichi
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2009, E92D (02): 158 - 165
  • [28] Lexicon-Grammar based open information extraction from natural language sentences in Italian
    Guarasci, Raffaele
    Damiano, Emanuele
    Minutolo, Aniello
    Esposito, Massimo
    De Pietro, Giuseppe
    EXPERT SYSTEMS WITH APPLICATIONS, 2020, 143
  • [29] Measuring the Similarity of Proteomes using Grammar-based Compression via Domain Combinations
    Hayashida, Morihiro
    Koyano, Hitoshi
    Nacher, Jose C.
    PROCEEDINGS OF THE 13TH INTERNATIONAL JOINT CONFERENCE ON BIOMEDICAL ENGINEERING SYSTEMS AND TECHNOLOGIES, VOL 3: BIOINFORMATICS, 2020: 117 - 122
  • [30] An effective grammar-based compression algorithm for tree structured data
    Yamagata, K
    Uchida, T
    Shoudai, T
    Nakamura, Y
    INDUCTIVE LOGIC PROGRAMMING, PROCEEDINGS, 2003, 2835 : 383 - 400