Sentence splitting in Arabic to Spanish translation

Authors
Roldan, Juan [1]
Garcia, Manuel Feria [1, 2]
Affiliations
[1] Univ Granada, Granada, Spain
[2] Univ Granada, Fac Traducc & Interpretac, C-Buensuceso 11, Granada 18002, Spain
Keywords
Arabic to Spanish translation; sentence splitting; sentence boundary detection
DOI
10.1075/resla.21008.rol
Chinese Library Classification number
H0 [Linguistics]
Discipline classification code
030303; 0501; 050102
Abstract
Modern Standard Arabic makes extensive use of coordination particles, whereas punctuation marks are scarce and erratic, leading to long clauses. This is generally assumed to hinder Sentence Boundary Detection and to promote sentence splitting when translating from Arabic into English. Previous literature on translation from Arabic to Spanish is practically nonexistent. We tested this hypothesis for translation from Arabic to Spanish on a sample of 282,714 graphic words extracted from a bilingual corpus of 8,681,110 graphic words and found that each Arabic sentence yielded an average of 1.5 Spanish sentences. Furthermore, our data show the potential impact of directionality, in that sentence splitting is 50% more frequent when translating from Arabic into Spanish than from English into Arabic. We also determined statistically that five elements (wa [و], haythu [حيث], kama [كما], wa-qad [وقد], and wa-dhalika [وذلك]) are the most salient potential markers for sentence splitting in the resulting Spanish translations. Our findings should be of particular interest for Computational Linguistics and translator training.
Pages: 585-614
Page count: 30