OOV Words in an English-Arabic CLIR System

Authors
Bellaachia, Abdelghani [1]
Amor-Tijani, Ghita [1]
Affiliation
[1] George Washington Univ, Dept Comp Sci, Washington, DC 20052 USA
DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Proper nouns are usually primary keys in a query, and translating them correctly may be necessary to maintain good retrieval performance in a Cross Language Information Retrieval (CLIR) system. However, dictionaries include only the most commonly used proper nouns, such as major countries and capitals. Because proper nouns are spelling variants of each other in most languages, the common approach to finding the target-language correspondents of the original query key is to apply an approximate string matching technique against the target database index. The n-gram technique has proved to be the most effective of the approximate string matching techniques. Because we are dealing with an English-Arabic CLIR system, which involves two languages with different alphabets, we decided to combine transliteration with the n-gram technique to generate the different spelling variants of Out Of Vocabulary (OOV) words. We call this technique Transliteration Ngram (TNG). One issue that arises with the Arabic language is that words that are spelled similarly can have different meanings depending on the context of the sentence. This is particularly true for proper names, which usually have a meaning if used as a verb or adjective. To further enhance our transliteration approach, we chose to use Part Of Speech (POS) disambiguation to reduce the number of unrelated words in the set of transliterations obtained using TNG.
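The n-gram matching step described in the abstract can be illustrated with a minimal sketch. It is not the authors' TNG implementation: it assumes the transliteration step has already produced an Arabic candidate string, and it scores that candidate against index terms with a Dice coefficient over character bigrams, which is one common n-gram similarity measure. The function names, the sample index terms, and the 0.5 threshold are all hypothetical choices made for illustration.

```python
def char_ngrams(word, n=2):
    """Return the set of character n-grams of `word`, padded with one space on each side."""
    padded = f" {word} "
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def dice(a, b, n=2):
    """Dice coefficient between the n-gram sets of two strings (0.0 to 1.0)."""
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    if not grams_a or not grams_b:
        return 0.0
    return 2 * len(grams_a & grams_b) / (len(grams_a) + len(grams_b))


def rank_candidates(candidate, index_terms, n=2, threshold=0.5):
    """Return (term, score) pairs at or above `threshold`, best match first."""
    scored = [(term, dice(candidate, term, n)) for term in index_terms]
    kept = [pair for pair in scored if pair[1] >= threshold]
    return sorted(kept, key=lambda pair: -pair[1])


# Hypothetical data: one transliteration of "Washington" and a toy target index
# holding two spelling variants of that name plus an unrelated word ("London").
candidate = "واشنطن"
index_terms = ["لندن", "واشنجتون", "واشنطن"]

for term, score in rank_candidates(candidate, index_terms):
    print(f"{term}\t{score:.3f}")
```

In this toy index the exact spelling and the variant spelling both pass the threshold, while the unrelated term is filtered out. A POS-based filter of the kind the abstract mentions would then be applied to the surviving terms; that step is not shown here.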
Pages: 886-894
Number of pages: 9