Low-Density Language Bootstrapping: The Case of Tajiki Persian

Times cited: 0
Authors
Megerdoomian, Karine [1]
Parvaz, Dan [1]
Affiliations
[1] MITRE Corp, McLean, VA 22102 USA
Source
SIXTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, LREC 2008 | 2008
Keywords
DOI
Not available
Chinese Library Classification number
H0 [Linguistics];
Discipline classification code
030303 ; 0501 ; 050102 ;
Abstract
Low-density languages raise difficulties for standard approaches to natural language processing that depend on large online corpora. Using Persian as a case study, we propose a novel method for bootstrapping MT capability for a low-density language in the case where it relates to a higher density variant. Tajiki Persian is a low-density language that uses the Cyrillic alphabet, while Iranian Persian (Farsi) is written in an extended version of the Arabic script and has many computational resources available. Despite the orthographic differences, the two languages have literary written forms that are almost identical. The paper describes the development of a comprehensive finite-state transducer that converts Tajik text to Farsi script and runs the resulting transliterated document through an existing Persian-to-English MT system. Due to divergences that arise in mapping the two writing systems and phonological and lexical distinctions, the system uses contextual cues (such as the position of a phoneme in a word) as well as available Farsi resources (such as a morphological analyzer to deal with differences in the affixal structures and a lexicon to disambiguate the analyses) to control the potential combinatorial explosion. The results point to a valuable strategy for the rapid prototyping of MT packages for languages of similar uneven density.
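The approach summarized in the abstract (overgenerate script candidates using positional cues, then prune them with a Farsi lexicon) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the five-letter mapping table, the word-initial override, the two-word lexicon, and the function names `candidates` and `transliterate` are hypothetical and are not the paper's transducer, rule set, or resources. It also uses plain Python enumeration rather than a real finite-state toolkit.

```python
from itertools import product

# Perso-Arabic letters, written as escapes so the code points are unambiguous.
KAF, TE, TA = "\u06a9", "\u062a", "\u0637"      # kaf, te, emphatic ta
BE, YE = "\u0628", "\u06cc"                     # be, Farsi ye
ALEF, ALEF_MADDA = "\u0627", "\u0622"           # alef, alef with madda

# Toy Cyrillic-to-Farsi mapping (illustrative assumption, not the paper's rules).
# One Tajik letter may map to several Farsi spellings, or to nothing at all
# when the vowel is normally left unwritten in Farsi script.
CANDIDATES = {
    "к": [KAF],
    "т": [TE, TA],
    "б": [BE],
    "и": ["", YE],
    "о": [ALEF],
}

# Contextual cue: position in the word. Word-initially, "о" is spelled with
# alef madda instead of a plain alef.
INITIAL_OVERRIDES = {"о": [ALEF_MADDA]}

# Stand-in for a Farsi lexicon (hypothetical two-entry word list).
TOY_LEXICON = {
    KAF + TE + ALEF + BE,   # "book"
    ALEF_MADDA + BE,        # "water"
}


def candidates(word):
    """Return every Farsi-script spelling the toy mapping allows for `word`."""
    options = []
    for i, ch in enumerate(word):
        if i == 0 and ch in INITIAL_OVERRIDES:
            options.append(INITIAL_OVERRIDES[ch])
        else:
            # Unmapped characters pass through unchanged.
            options.append(CANDIDATES.get(ch, [ch]))
    return ["".join(parts) for parts in product(*options)]


def transliterate(word, lexicon=TOY_LEXICON):
    """Keep lexicon-attested candidates; fall back to all candidates if none match."""
    cands = candidates(word)
    attested = [c for c in cands if c in lexicon]
    return attested or cands


if __name__ == "__main__":
    for tajik in ("китоб", "об"):
        print(tajik, len(candidates(tajik)), transliterate(tajik))
```

In this sketch, "китоб" yields four candidate spellings, and the lexicon lookup keeps only the attested one. That is the pruning step the abstract refers to as controlling the combinatorial explosion, shown here at toy scale.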
Pages: 3293-3298
Number of pages: 6