Research on Multilingual Indexing and Query Processing in Uyghur, Kazak, and Kyrgyz Multilingual Information Retrieval System

Cited by: 0
Authors
Tursun, Dilmurat [1]
Tohti, Turdi [1]
Hamdulla, Askar [1]
Affiliation
[1] Xinjiang Univ, Sch Informat Sci & Engn, Urumqi 830046, Xinjiang, Peoples R China
Source
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The Uyghur, Kazak, and Kyrgyz languages have no language ID, and some of their letters share code points in the Unicode area. It is therefore difficult to distinguish Uyghur, Kazak, and Kyrgyz letters in information exchange, automatic word segmentation, and retrieval applications, which leads to linguistic ambiguity. In addition, within the region ordered by the Arabic alphabet, the Uyghur, Kazak, and Kyrgyz letters are out of order, which creates great difficulties for multilingual data indexing, query processing, and sorting. This paper studies these practical problems and proposes solutions to them. For linguistic ambiguity, it proposes a Relocated Unicode Format (RuniForm) encoding method. For multilingual indexing, it proposes an indexing technology based on MD5 encryption, together with a related query processing approach, for the Uyghur, Kazak, and Kyrgyz information retrieval system (UKKIRS). The experimental results indicate that the proposed algorithms solve the above problems well and are well suited to UKKIRS.
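The abstract does not say how the MD5-based index is built, so the following is only a minimal sketch of one possible reading, not the authors' method. It assumes that each index key is the MD5 digest of a language tag joined to the term, so that terms made of identical code points are kept apart by language. The names `index_key` and `MultilingualIndex`, the `ug`/`kk`/`ky` tags, and the sample string are all hypothetical. MD5 is used here as a hash digest.

```python
import hashlib
from collections import defaultdict


def index_key(lang: str, term: str) -> str:
    """Hypothetical key: MD5 hex digest of '<lang>:<term>' encoded as UTF-8."""
    return hashlib.md5(f"{lang}:{term}".encode("utf-8")).hexdigest()


class MultilingualIndex:
    """Toy inverted index keyed by the digest instead of the raw term."""

    def __init__(self) -> None:
        self._postings = defaultdict(set)

    def add(self, lang: str, term: str, doc_id: int) -> None:
        # Record that the document contains this term in this language.
        self._postings[index_key(lang, term)].add(doc_id)

    def lookup(self, lang: str, term: str) -> list:
        # The query is hashed the same way as the indexed term was.
        return sorted(self._postings.get(index_key(lang, term), set()))


# Placeholder Arabic-script string; the same code points are indexed
# under two different (assumed) language tags.
term = "\u0643\u0649\u062a\u0627\u0628"

index = MultilingualIndex()
index.add("ug", term, 1)
index.add("kk", term, 2)
index.add("ug", term, 3)

print(index.lookup("ug", term))
print(index.lookup("kk", term))
print(index.lookup("ky", term))
```

Under this assumption every key is a fixed-length ASCII string, so key comparison and storage no longer depend on the order of the underlying code points. Sorting results for display would still need a separate, language-specific collation step.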
Pages: 263-271
Page count: 9