A Discourse-Based Information Retrieval for Tamil Literary Texts

Cited by: 0
Authors
Ramalingam, Anita [1]
Navaneethakrish, Subalalitha Chinnaudayar [1]
Affiliations
[1] SRM Inst Sci & Technol, Dept Comp Sci & Engn, Chennai, Tamil Nadu, India
Keywords
Discourse parser; Morphological analyzer; Inverted indexing; Ranking; Tamil information retrieval
DOI
10.32890/jict2021.20.3.4
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Tamil literature contains many valuable ideas that can help people lead successful and happy lives. Tamil literary works are abundantly available and searchable on the Internet. However, existing search systems follow a keyword-based matching strategy that fails to satisfy user needs. This creates a need for a focused information retrieval system that semantically analyzes Tamil literary text and thereby improves search performance. This paper proposes a novel information retrieval framework that uses discourse processing techniques to aid the semantic analysis and representation of Tamil literary texts. The proposed framework was tested using two ancient literary works, the Thirukkural and the Naladiyar, which were written in 300 Before the Common Era (BCE). The Thirukkural comprises 1,330 couplets, each 7 words long, while the Naladiyar consists of 400 quatrains, each 15 words long. The proposed system, tested with all 1,330 Thirukkural couplets and 400 Naladiyar quatrains, achieved a mean average precision (MAP) score of 89 percent. Its performance was compared with Google Tamil search and with a keyword-based search that was a reduced version of the proposed framework. Google Tamil search achieved a MAP score of 56 percent, while the keyword-based method achieved a MAP score of 62 percent. The results showed that discourse processing techniques could improve the search performance of an information retrieval system.
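For readers unfamiliar with the metric, the following is a minimal sketch of how a mean average precision (MAP) score is commonly computed. It uses the standard textbook definition; the abstract does not describe the paper's exact evaluation protocol, so the function names and the couplet identifiers (`k1`, `k3`, and so on) are hypothetical and purely illustrative.

```python
from typing import Hashable, Iterable, Sequence, Set, Tuple


def average_precision(ranked: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Average precision for one query (standard definition, not the paper's code).

    `ranked` is the system's result list in rank order; `relevant` is the set
    of items judged relevant for the query.
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            # Precision at this rank, counted only where a relevant item appears.
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(
    runs: Iterable[Tuple[Sequence[Hashable], Set[Hashable]]]
) -> float:
    """Mean of the per-query average precision values."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Hypothetical example: two queries over made-up couplet identifiers.
example_runs = [
    (["k1", "k7", "k3"], {"k1", "k3"}),  # relevant items at ranks 1 and 3
    (["k2", "k5"], {"k5"}),              # relevant item at rank 2
]
map_score = mean_average_precision(example_runs)
```

A reported MAP of 89 percent corresponds to a value of 0.89 from a function like `mean_average_precision`, averaged over the evaluation queries.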
Pages: 353-389
Page count: 37