An Efficient Machine Learning-based Text Summarization in the Malayalam Language

Authors
Haroon, Rosna P. [1]
Abdul Gafur, M. [2]
Barakkath Nisha, U. [3]
Affiliations
[1] APJ Abdul Kalam Technol Univ, Dept CSE, Ilahia Coll Engn & Technol, Thiruvananthapuram, Kerala, India
[2] APJ Abdul Kalam Technol Univ, Ilahia Coll Engn & Technol, Thiruvananthapuram, Kerala, India
[3] Sri Krishna Coll Engn & Technol, Dept IT, Coimbatore, Tamil Nadu, India
Keywords
Malayalam Text Summarization; Supervised Machine Learning; SVM; Text Mining; Sentence Extraction; Summary Generation;
DOI
10.3837/tiis.2022.06.001
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Automatic text summarization is a procedure that condenses a large body of content into a shorter text that retains the significant information. Malayalam is one of the most difficult languages spoken in certain regions of India, most commonly in Kerala and Lakshadweep. Natural language processing research in Malayalam is relatively limited owing to the complexity of the language and the scarcity of available resources. This paper proposes an approach to summarizing Malayalam documents by training a model based on the Support Vector Machine (SVM) classification algorithm. Several features of the text are used in training so that the system can output the most important information from the input text. The classifier assigns sentences to four separate classes (most important, important, average, and least significant), and the system builds a summary of the input document from this classification. The user can select a compression ratio, and the system outputs a summary of the corresponding size. Model performance is measured on Malayalam documents of different genres as well as on documents from the same domain. The model is evaluated using the content evaluation measures precision, recall, F-score, and relative utility. The precision and recall values obtained indicate that the model is reliable and produces more relevant summaries than the other summarizers compared.
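The selection and evaluation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not list the text features or the SVM configuration, so the sketch starts from sentence class labels that a trained classifier is assumed to have already produced. The label strings, the function names `select_sentence_indices` and `precision_recall_f`, the rounding-up of the sentence budget, and the tie-breaking by document position are all assumptions made for illustration. Relative utility is not covered.

```python
import math

# Hypothetical label names for the four classes named in the abstract,
# ranked from most to least significant (lower rank = picked first).
CLASS_RANK = {
    "most_important": 0,
    "important": 1,
    "average": 2,
    "least_significant": 3,
}


def select_sentence_indices(labels, compression_ratio):
    """Return the indices of the sentences kept in the summary.

    Assumption: the compression ratio is the fraction of the document's
    sentences to keep, rounded up to a whole sentence.
    """
    if not 0 < compression_ratio <= 1:
        raise ValueError("compression_ratio must be in (0, 1]")
    budget = math.ceil(compression_ratio * len(labels))
    # sorted() is stable, so sentences of equal class keep document order.
    ranked = sorted(range(len(labels)), key=lambda i: CLASS_RANK[labels[i]])
    # Restore document order so the summary reads naturally.
    return sorted(ranked[:budget])


def precision_recall_f(system_indices, reference_indices):
    """Sentence-level precision, recall, and F-score against a reference."""
    system, reference = set(system_indices), set(reference_indices)
    overlap = len(system & reference)
    precision = overlap / len(system) if system else 0.0
    recall = overlap / len(reference) if reference else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score


# Labels a classifier might assign to a six-sentence document (made up).
labels = [
    "average",
    "most_important",
    "least_significant",
    "important",
    "most_important",
    "average",
]
kept = select_sentence_indices(labels, compression_ratio=0.5)
scores = precision_recall_f(kept, reference_indices=[1, 2, 4])
print(kept, scores)
```

Here the reference indices stand in for a human-written extractive summary; in practice they would come from annotated Malayalam documents.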
Pages: 1778-1799
Page count: 22