Vari-gram Language Model Based On Category

Cited by: 0

Author:

Yuan, Lichi [1]

Affiliation:

[1] Jiangxi Univ Finance & Econ Nanchang, Sch Informat Technol, Nanchang 330013, Peoples R China

Source:

INFORMATION TECHNOLOGY FOR MANUFACTURING SYSTEMS II, PTS 1-3 | 2011 / Vols. 58-60

Keywords:

Word clustering; statistical language model; vari-gram language model

DOI:

10.4028/www.scientific.net/AMM.58-60.995

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

Category-based statistical language models are an important way to address the data sparsity problem. However, such models face two bottlenecks: (1) word clustering, where it is hard to find a clustering method that performs well without a large amount of computation; and (2) class-based methods always lose some predictive ability when adapting to text from different domains. This paper attempts to solve both problems. It presents a novel definition of word similarity and, based on it, defines the similarity of word sets. Experiments show that the word clustering algorithm based on this similarity is better than the conventional greedy clustering method in both speed and performance. The paper also presents a new method for building the vari-gram model.

Pages: 995-1000

Page count: 6
