A Discourse-Based Information Retrieval for Tamil Literary Texts

Cited by: 0
Authors
Ramalingam, Anita [1]
Navaneethakrish, Subalalitha Chinnaudayar [1]
Affiliations
[1] SRM Inst Sci & Technol, Dept Comp Sci & Engn, Chennai, Tamil Nadu, India
Keywords
Discourse parser; Morphological analyzer; Inverted indexing; Ranking; Tamil information retrieval
DOI
10.32890/jict2021.20.3.4
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Tamil literature contains many valuable ideas that can help people lead successful and happy lives. Tamil literary works are abundantly available and searchable on the Internet. However, existing search systems rely on a keyword-matching strategy that fails to satisfy user needs. This creates a need for a focused information retrieval system that analyzes Tamil literary text semantically and thereby improves search performance. This paper proposes a novel information retrieval framework that uses discourse processing techniques to aid the semantic analysis and representation of Tamil literary texts. The proposed framework was tested on two ancient literary works, the Thirukkural and the Naladiyar, which were written in 300 Before the Common Era (BCE). The Thirukkural comprises 1,330 couplets, each 7 words long, while the Naladiyar consists of 400 quatrains, each 15 words long. Tested on all 1,330 Thirukkural couplets and 400 Naladiyar quatrains, the proposed system achieved a mean average precision (MAP) score of 89 percent. Its performance was compared with Google Tamil search and with a keyword-based search, a pared-down version of the proposed framework. Google Tamil search achieved a MAP score of 56 percent, while the keyword-based method achieved a MAP score of 62 percent. These results indicate that discourse processing techniques can improve the search performance of an information retrieval system.
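The abstract reports results as mean average precision (MAP) but does not describe the evaluation procedure. As a minimal sketch of the standard MAP definition only, and not the authors' evaluation code, the metric can be computed as follows. The function names and the toy document ids are illustrative assumptions.

```python
from typing import Hashable, Iterable, Sequence, Set, Tuple


def average_precision(ranked: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Average precision of one ranked result list.

    Precision is sampled at each rank where a relevant item appears,
    then divided by the total number of relevant items for the query.
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(
    runs: Iterable[Tuple[Sequence[Hashable], Set[Hashable]]]
) -> float:
    """Mean of the per-query average precision over all queries."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Toy data (hypothetical ids, not taken from the paper).
toy_runs = [
    (["a", "b", "c"], {"a", "c"}),  # relevant items at ranks 1 and 3
    (["x", "y"], {"y"}),            # relevant item at rank 2
]
toy_map = mean_average_precision(toy_runs)
```

A score of 89 percent under this definition would mean that, averaged over queries, relevant verses tend to appear near the top of the ranked list.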
Pages: 353-389
Number of pages: 37