Transliteration Based Bengali Text Compression Using Huffman Principle

Cited by: 0
Authors
Hossain, Md Mamun [1]
Habib, Ahsan [1]
Rahman, Mohammad Shahidur [1]
Affiliation
[1] Shahjalal Univ Sci & Technol, Sylhet, Bangladesh
Keywords
Data compression; ASCII code; UNICODE; Huffman principle; Avro; Bengali text; Transliteration;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we propose a new technique for compressing text in a language with a large symbol set, such as Bengali, by way of a language with a smaller symbol set, such as English, using the Huffman principle. First, we transliterate the text from the language with more symbols into the language with fewer symbols; then we apply the Huffman principle to the transliterated text. We also show that the proposed transliteration-based method outperforms the existing basic Huffman technique for every piece of Bengali text and that a significant compression ratio can be achieved.
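The two-step pipeline described in the abstract (transliterate, then Huffman-code the result) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the table `TOY_MAP` is a hypothetical handful of letter mappings invented for this example and is not the Avro scheme named in the keywords, and the helper names `transliterate`, `huffman_code`, `encode`, and `decode` are likewise assumptions.

```python
import heapq
from collections import Counter

# Hypothetical toy table (NOT the real Avro scheme): a few Bengali
# code points mapped to Latin letters, just enough for one word.
TOY_MAP = {
    "\u09ac": "b",   # BENGALI LETTER BA
    "\u09be": "a",   # BENGALI VOWEL SIGN AA
    "\u0982": "ng",  # BENGALI SIGN ANUSVARA
    "\u09b2": "l",   # BENGALI LETTER LA
}


def transliterate(text, table=TOY_MAP):
    """Replace each mapped character; leave unmapped characters unchanged."""
    return "".join(table.get(ch, ch) for ch in text)


def huffman_code(text):
    """Build a prefix-free code table {symbol: bit string} from symbol counts."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tiebreaker, partial code table).
    heap = [(n, i, {sym: ""}) for i, (sym, n) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # Merge the two lightest subtrees, prefixing 0 and 1 respectively.
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


def encode(text, codes):
    return "".join(codes[ch] for ch in text)


def decode(bits, codes):
    inverse = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return "".join(out)


bengali = "\u09ac\u09be\u0982\u09b2\u09be"  # the word "Bangla" in Bengali script
latin = transliterate(bengali)
codes = huffman_code(latin)
bits = encode(latin, codes)
```

The sketch covers only the forward direction plus Huffman decoding back to Latin text. Recovering the original Bengali would additionally need a reversible transliteration, which this toy table does not attempt to provide.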
Pages: 6