Multi-label Log-Loss function using L-BFGS for document categorization

Cited by: 16
Authors
Borhani, Mostafa [1]
Affiliations
[1] Shahid Beheshti Univ, Quran Miracle Res Inst, Tehran 1983963113, Iran
Keywords
Multi-label classification; Text mining; Quasi-Newton method; Holy Quran; Corpus analysis; BFGS; Scikit-learn; Artificial neural networks; Text classification;
DOI
10.1016/j.engappai.2020.103623
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Text mining, which fundamentally involves quantitative techniques for analyzing textual data, can be used to discover knowledge and to achieve scholarly research goals. For large-scale data such as corpus text, intelligent learning methods have been applied effectively. In this paper, an artificial neural network with a quasi-Newton updating procedure is presented for multi-label multi-class text classification. This numerical unconstrained training technique, the Multi-Label extension of the Log-Loss function used in the Limited-memory Broyden-Fletcher-Goldfarb-Shanno algorithm (ML4BFGS), provides a noteworthy opportunity for text mining and leads to a significant improvement in text classification performance. The ML4BFGS training approach is applied to assign one or more of the available labels (classes) to each sentence. We evaluate this method on English translations of the Holy Quran. These religious texts were chosen for the experiments because each verse (sentence) usually has multiple labels (topics) and different translations of each verse should have the same labels. Experimental results show that ML4BFGS is well suited to multi-label multi-class classification in the Quranic corpus. Several advanced updating methods such as ITCG, BFGS, L-BFGS-B, and L3BFGS, as well as other multi-label approaches such as ML-k-NN and the well-known SVM, are compared with the proposed ML4BFGS on these evaluation criteria, and the outcomes are fully described in this study. The performance measures, including the Hamming loss, recall, precision, and F1 score, show that ML4BFGS achieves the best results in extracting the related classes for each verse, while the proposed network takes the fewest epochs of all the training approaches to complete the training phase. At the same time, the elapsed time (in seconds) for ML4BFGS is just 78% of the best elapsed time among the other approaches.
Compared with some state-of-the-art algorithms, ML4BFGS has a lower computational cost, a faster convergence rate, and higher accuracy in corpus analysis.
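The general setup described in the abstract (a neural network trained with a limited-memory quasi-Newton solver on multi-label targets, then scored with Hamming loss, precision, recall, and F1) can be sketched with scikit-learn, which the keywords mention. This is only an illustrative sketch, not the ML4BFGS method itself: it uses scikit-learn's stock `lbfgs` solver, synthetic data in place of the Quranic corpus, and a hidden-layer size, iteration limit, and micro-averaging chosen purely as assumptions.

```python
from sklearn.datasets import make_multilabel_classification
from sklearn.metrics import f1_score, hamming_loss, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for sentence features and topic labels (assumption:
# the real experiments use features derived from Quran translations).
X, Y = make_multilabel_classification(
    n_samples=300, n_features=20, n_classes=5, n_labels=2, random_state=0
)
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.25, random_state=0
)

# With a binary indicator matrix as target, MLPClassifier uses one logistic
# output per label and minimizes a log-loss; solver="lbfgs" selects a
# limited-memory quasi-Newton optimizer. Hyperparameters are illustrative.
clf = MLPClassifier(
    hidden_layer_sizes=(32,), solver="lbfgs", max_iter=500, random_state=0
)
clf.fit(X_train, Y_train)
Y_pred = clf.predict(X_test)  # indicator matrix, one column per label

# The four measures named in the abstract (micro-averaging is an assumption).
scores = {
    "hamming_loss": hamming_loss(Y_test, Y_pred),
    "precision": precision_score(Y_test, Y_pred, average="micro", zero_division=0),
    "recall": recall_score(Y_test, Y_pred, average="micro", zero_division=0),
    "f1": f1_score(Y_test, Y_pred, average="micro", zero_division=0),
}
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

Hamming loss is the fraction of individual label decisions that are wrong, so lower is better, whereas precision, recall, and F1 are better when higher.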
Pages: 7
Related papers
46 in total
  • [1] Document transformation for multi-label feature selection in text categorization
    Chen, Weizhu
    Yan, Jun
    Zhang, Benyu
    Chen, Zheng
    Yang, Qiang
    [J]. ICDM 2007: PROCEEDINGS OF THE SEVENTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, 2007, : 451 - +
  • [2] Multi-label Categorization of Accounts of Sexism using a Neural Framework
    Parikh, Pulkit
    Abburi, Harika
    Badjatiya, Pinkesh
    Krishnan, Radhika
    Chhaya, Niyati
    Gupta, Manish
    Varma, Vasudeva
    [J]. 2019 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING AND THE 9TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (EMNLP-IJCNLP 2019): PROCEEDINGS OF THE CONFERENCE, 2019, : 1642 - 1652
  • [3] Multi-Label Object Categorization Using Histograms of Global Relations
    Mustafa, Wail
    Xiong, Hanchen
    Kraft, Dirk
    Szedmak, Sandor
    Piater, Justus
    Kruger, Norbert
    [J]. 2015 INTERNATIONAL CONFERENCE ON 3D VISION, 2015, : 309 - 317
  • [4] Exploiting Label Dependencies for Multi-Label Document Classification Using Transformers
    Fallah, Haytame
    Bruno, Emmanuel
    Bellot, Patrice
    Murisasco, Elisabeth
    [J]. PROCEEDINGS OF THE 2023 ACM SYMPOSIUM ON DOCUMENT ENGINEERING, DOCENG 2023, 2023,
  • [5] Loss Function Approaches for Multi-label Music Tagging
    Knox, Dillon
    Greer, Timothy
    Ma, Benjamin
    Kuo, Emily
    Somandepalli, Krishna
    Narayanan, Shrikanth
    [J]. 2021 INTERNATIONAL CONFERENCE ON CONTENT-BASED MULTIMEDIA INDEXING (CBMI), 2021, : 191 - 194
  • [6] Choosing the right loss function for multi-label Emotion Classification
    Hurtado, Lluis-E
    Gonzalez, Jose-Angel
    Pla, Ferran
    [J]. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2019, 36 (05) : 4697 - 4708
  • [7] Solving multi-label text categorization problem using support vector machine approach with membership function
    Department of Industrial and Information Management, National Cheng Kung University, 1 Ta-Shueh Road, Tainan City 70101, Taiwan
    Unknown
    [J]. Neurocomputing, 1600, 17 (3682-3689):
  • [8] Multi-label text categorization using L21-norm minimization extreme learning machine
    Jiang, Mingchu
    Pan, Zhisong
    Li, Na
    [J]. NEUROCOMPUTING, 2017, 261 : 4 - 10
  • [9] Multi-label Text Categorization Using L21-norm Minimization Extreme Learning Machine
    Jiang, Mingchu
    Li, Na
    Pan, Zhisong
    [J]. PROCEEDINGS OF ELM-2015, VOL 1: THEORY, ALGORITHMS AND APPLICATIONS (I), 2016, 6 : 121 - 133
  • [10] Solving multi-label text categorization problem using support vector machine approach with membership function
    Wang, Tai-Yue
    Chiang, Huei-Min
    [J]. NEUROCOMPUTING, 2011, 74 (17) : 3682 - 3689