Syllable-Based Text Compression: A Language Case Study

Cited by: 2
Authors
Adubi, Stephen A. [1]
Misra, Sanjay [1, 2]
Affiliations
[1] Covenant Univ, Dept Comp & Informat Sci, Ota, Nigeria
[2] Atilim Univ, Dept Comp Engn, Ankara, Turkey
Keywords
Syllables; Syllable-based compression; Text compression; Syllabification; ALGORITHM;
DOI
10.1007/s13369-016-2070-1
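Chinese Library Classification (CLC) number
O [Mathematical and physical sciences and chemistry]; P [Astronomy and Earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09
Abstract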
Text compression has been widely studied, and many algorithms have been proposed. Compression based on the syllabic structure of words in syllable-based languages has emerged as another dimension of the problem. Because the existing "universal" syllable-extraction algorithms are ineffective, an algorithm for extracting syllables from words should be designed around the structure of the language in question. This work reviews several syllable-based compression methods proposed by different authors, including the methodologies used to achieve text compression. Finally, an algorithm for syllable extraction from words in the Yoruba language is presented and compared with four universal algorithms; it records the best result (100% accuracy) among the five. The significance of this result is that a dictionary of common syllables does not need to be created to achieve syllable-based text compression for the Yoruba language.
Pages: 3089-3097
Page count: 9
Related papers
50 in total
  • [1] Syllable-Based Text Compression: A Language Case Study
    Stephen A. Adubi
    Sanjay Misra
    Arabian Journal for Science and Engineering, 2016, 41 : 3089 - 3097
  • [2] Genetic Algorithms in Syllable-Based Text Compression
    Kuthan, Tomas
    Lansky, Jan
    DATESO 2007 - DATABASES, TEXTS, SPECIFICATIONS, OBJECTS: PROCEEDINGS OF THE 7TH ANNUAL INTERNATIONAL WORKSHOP, 2007, 235 : 21 - 34
  • [3] A Syllable-Based Technique for Uyghur Text Compression
    Abliz, Wayit
    Wu, Hao
    Maimaiti, Maihemuti
    Wushouer, Jiamila
    Abiderexiti, Kahaerjiang
    Yibulayin, Tuergen
    Wumaier, Aishan
    INFORMATION, 2020, 11 (03)
  • [4] A genetic algorithm approach for verification of the syllable-based text compression technique
    Ucoluk, G
    Toroslu, IH
    JOURNAL OF INFORMATION SCIENCE, 1997, 23 (05) : 365 - 372
  • [5] Improved Syllable-Based Text to Speech Synthesis for Tone Language Systems
    Ekpenyong, Moses
    Udoh, EmemObong
    Udosen, Escor
    Urua, Eno-Abasi
    HUMAN LANGUAGE TECHNOLOGY CHALLENGES FOR COMPUTER SCIENCE AND LINGUISTICS, 2014, 8387 : 3 - 15
  • [6] Syllable-based Compression for XML Documents
    Chernik, Katsiaryna
    Lansky, Jan
    Galambos, Leo
    DATESO 2006 - DATABASES, TEXTS, SPECIFICATIONS, OBJECTS: PROCEEDINGS OF THE 6TH ANNUAL INTERNATIONAL WORKSHOP, 2006, 176 : 21 - 31
  • [7] Syllable-based Myanmar Language Model for Speech Recognition
    Soe, Wunna
    Thein, Yadana
    2015 IEEE/ACIS 14TH INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION SCIENCE (ICIS), 2015, : 291 - 296
  • [8] Syllable-Based Concatenative Speech Synthesis for Marathi Language
    Ghate, Pravin M.
    Shirbahadurkar, Suresh D.
    INFORMATION AND COMMUNICATION TECHNOLOGY FOR COMPETITIVE STRATEGIES, 2019, 40 : 615 - 624
  • [9] A Novel Text-to-Speech Synthesis System Using Syllable-Based HMM for Tamil Language
    Manoharan, J. Samuel
    PROCEEDINGS OF SECOND INTERNATIONAL CONFERENCE ON SUSTAINABLE EXPERT SYSTEMS (ICSES 2021), 2022, 351 : 305 - 314
  • [10] Development of syllable-based text to speech synthesis system in Bengali
    Narendra, N.
    Rao, K.
    Ghosh, Krishnendu
    Vempada, Ramu
    Maity, Sudhamay
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2011, 14 (03) : 167 - 181