A Discourse-Based Information Retrieval for Tamil Literary Texts

Cited by: 0

Authors:

Ramalingam, Anita [1]

Navaneethakrish, Subalalitha Chinnaudayar [1]

Affiliations:

[1] SRM Inst Sci & Technol, Dept Comp Sci & Engn, Chennai, Tamil Nadu, India

Source:

JOURNAL OF INFORMATION AND COMMUNICATION TECHNOLOGY-MALAYSIA | 2021 / Vol. 20 / No. 03

Keywords:

Discourse parser; Morphological Analyzer; Inverted indexing; Ranking; Tamil information retrieval;

DOI:

10.32890/jict2021.20.3.4

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Tamil literature contains many valuable ideas that can help people lead successful and happy lives. Tamil literary works are abundantly available and searchable on the Internet. However, existing search systems follow a keyword-based matching strategy that fails to satisfy user needs. This creates a demand for a focused information retrieval system that semantically analyzes Tamil literary text and thereby improves search performance. This paper proposes a novel information retrieval framework that uses discourse processing techniques to aid the semantic analysis and representation of Tamil literary texts. The proposed framework was tested using two ancient literary works, the Thirukkural and the Naladiyar, which were written in 300 Before the Common Era (BCE). The Thirukkural comprises 1,330 couplets, each 7 words long, while the Naladiyar consists of 400 quatrains, each 15 words long. The proposed system, tested with all 1,330 Thirukkural couplets and 400 Naladiyar quatrains, achieved a mean average precision (MAP) score of 89 percent. The performance of the proposed framework was compared with Google Tamil search and with a keyword-based search, which was a substandard version of the proposed framework. Google Tamil search achieved a MAP score of 56 percent, while the keyword-based method achieved a MAP score of 62 percent. These results show that discourse processing techniques can improve the search performance of an information retrieval system.

Pages: 353-389

Page count: 37
