Longest Sorted Sequence algorithm for parallel text alignment

Authors
Ildefonso, T [1]
Lopes, GP [1]
Affiliation
[1] Univ Nova Lisboa, Fac Ciencias & Tecnol, CITI, P-2829516 Caparica, Portugal
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper describes a statistically supported, language-independent method for aligning parallel texts (texts that are translations of each other, or of a common source text). The approach is inspired by previous work by Ribeiro et al. (2000). The second statistical filter proposed by Ribeiro et al., which is based on Confidence Bands (CB), is replaced by the Longest Sorted Sequence algorithm (LSSA), which is described in this paper. As a result, a 35% decrease in processing time and an 18% increase in the number of aligned segments were obtained for Portuguese-French alignments. Similar results were obtained for Portuguese-English alignments. Both methods are compared and evaluated on a large parallel corpus of Portuguese, English and French parallel texts (approximately 250 Mb of text per language).
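The abstract does not spell out how LSSA works, so the following is only a minimal sketch of one plausible reading: treat candidate correspondence points as (source position, target position) pairs and keep the longest subset whose positions increase in both texts. The function name `longest_sorted_sequence`, the point representation, and the use of the standard longest-increasing-subsequence technique (binary search over run tails) are all assumptions made for illustration, not the authors' published procedure.

```python
from bisect import bisect_left


def longest_sorted_sequence(points):
    """Return a longest subset of (src_pos, tgt_pos) pairs that is strictly
    increasing in both coordinates.

    Hypothetical helper: this is the textbook longest-increasing-subsequence
    method, assumed here as a stand-in for the filter named in the abstract.
    """
    # Sort by source position; break ties by descending target position so
    # that two points sharing a source position can never both be kept.
    pts = sorted(points, key=lambda p: (p[0], -p[1]))

    tail_idx = []   # tail_idx[k]: index in pts ending the best run of length k + 1
    tail_val = []   # target positions parallel to tail_idx (kept sorted)
    prev = [None] * len(pts)  # back-pointers used to rebuild the chosen run

    for i, (_, tgt) in enumerate(pts):
        k = bisect_left(tail_val, tgt)  # first run whose tail is >= tgt
        prev[i] = tail_idx[k - 1] if k > 0 else None
        if k == len(tail_idx):
            tail_idx.append(i)
            tail_val.append(tgt)
        else:
            tail_idx[k] = i
            tail_val[k] = tgt

    # Walk the back-pointers from the end of the longest run.
    result = []
    i = tail_idx[-1] if tail_idx else None
    while i is not None:
        result.append(pts[i])
        i = prev[i]
    return result[::-1]


# Illustrative candidate points (made-up offsets, not data from the paper).
candidates = [(10, 12), (20, 95), (30, 33), (40, 41), (50, 8), (60, 64)]
kept = longest_sorted_sequence(candidates)
print(kept)
```

In this toy input, the points (20, 95) and (50, 8) cross the others and are discarded, which is the kind of outlier filtering the abstract attributes to the second statistical filter. The sketch runs in O(n log n) time; whether the paper's algorithm has the same complexity is not stated in the abstract.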
Pages: 81-90
Number of pages: 10