Combining probability models and web mining models: a framework for proper name transliteration

Cited by: 5
Authors
Zhou, Yilu [1 ]
Huang, Feng [2 ]
Chen, Hsinchun [3 ]
Affiliations
[1] George Washington Univ, Dept Informat Syst & Management, Washington, DC 20052 USA
[2] Adv Micro Devices Inc, Handheld Div, Consumer Elect Grp, Sunnyvale, CA 94088 USA
[3] Univ Arizona, Dept Management Informat Syst, Tucson, AZ 85721 USA
Source
INFORMATION TECHNOLOGY & MANAGEMENT | 2008 / Vol. 9 / Issue 02
Keywords
name transliteration; Hidden Markov model; web mining
DOI
10.1007/s10799-007-0031-9
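Chinese Library Classification (CLC) number
G25 [Library science, librarianship]; G35 [Information science, information work]
Discipline classification code
1205; 120501
Abstract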
The rapid growth of the Internet has created a tremendous number of multilingual resources. However, language boundaries prevent information sharing and discovery across countries. Proper names play an important role in search queries and knowledge discovery. When foreign names are involved, proper names are often translated phonetically, a process referred to as transliteration. In this research, we propose a generic transliteration framework that incorporates an enhanced Hidden Markov Model (HMM) and a Web mining model. We improved on traditional statistical transliteration in three areas: (1) incorporating a simple phonetic transliteration knowledge base; (2) incorporating a bigram and a trigram HMM; and (3) incorporating a Web mining model that uses word frequency-of-occurrence information from the Web. We evaluated the framework on English-Arabic back transliteration. Experiments showed that, when using HMM alone, a combination of the bigram and trigram HMM approaches performed best for English-Arabic transliteration. While the bigram model alone achieved fairly good performance, the trigram model alone did not. The Web mining approach boosted the performance by 79.05%. Overall, our framework achieved a precision of 0.72 when the eight best transliterations were considered. Our results show promise for using transliteration techniques to improve multilingual Web retrieval.
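The abstract does not say how the bigram and trigram HMM scores are combined or how Web frequencies enter the ranking, so the following is only a minimal sketch under stated assumptions: linear interpolation of the two model probabilities, add-one smoothing of Web hit counts, and a log-linear mix of the two signals. The function names, the weights, the candidate spellings, and all numbers are hypothetical and are not taken from the paper.

```python
import math

def interpolate(p_bigram, p_trigram, lam=0.5):
    """Assumed combination: linear interpolation of bigram and trigram scores."""
    return lam * p_bigram + (1.0 - lam) * p_trigram

def rerank(model_probs, web_counts, weight=0.5, top_k=8):
    """Re-rank candidates by mixing model probability with Web frequency.

    model_probs: candidate -> probability from the statistical model
    web_counts:  candidate -> occurrence count on the Web (hypothetical data)
    Returns the top_k candidates, best first.
    """
    total = sum(web_counts.get(c, 0) for c in model_probs)
    scored = []
    for cand, p_model in model_probs.items():
        # Add-one smoothing so unseen candidates keep a nonzero Web probability.
        p_web = (web_counts.get(cand, 0) + 1) / (total + len(model_probs))
        score = (weight * math.log(max(p_model, 1e-12))
                 + (1.0 - weight) * math.log(p_web))
        scored.append((score, cand))
    # Sort by descending score; break ties alphabetically for determinism.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [cand for _, cand in scored[:top_k]]

# Made-up (bigram, trigram) scores for three candidate spellings.
raw = {"mohammed": (0.25, 0.15), "muhamed": (0.35, 0.25), "mohamad": (0.30, 0.20)}
model_probs = {c: interpolate(b, t) for c, (b, t) in raw.items()}
# Made-up Web occurrence counts.
web_counts = {"mohammed": 9000, "muhamed": 100, "mohamad": 900}

ranked = rerank(model_probs, web_counts)
```

In this toy data the statistical model alone would put "muhamed" first, while the far larger Web count moves "mohammed" to the top, which is the kind of correction a frequency-based re-ranking step is meant to provide. The default `top_k=8` mirrors the eight-best setting mentioned in the abstract.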
Pages: 91-103
Number of pages: 13