Machine Learning for Imbalanced Datasets of Recognizing Inference in Text with Linguistic Phenomena

Cited by: 0
Authors
Day, Min-Yuh [1]
Tsai, Cheng-Chia [1]
Affiliations
[1] Tamkang Univ, Dept Informat Management, New Taipei, Taiwan
Keywords
Imbalanced Datasets; Linguistic Phenomena; Machine Learning; Recognizing Inference in Text; Textual Entailment
DOI
10.1109/IRI.2015.99
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recognizing inference in text (RITE) plays an important role in the answer validation modules of a Question Answering (QA) system. The problem of class imbalance has received increased attention in the machine learning community. In recent years, several attempts have been made at linguistic phenomena analysis; however, little is known about the effects of imbalanced datasets with linguistic phenomena on recognizing inference in text. The objective of this paper is to provide an empirical study of learning from imbalanced RITE datasets with linguistic phenomena, in order to better understand those effects. We present an analysis of imbalanced RITE datasets with linguistic phenomena using the NTCIR-11 RITE-VAL gold standard dataset and development dataset. The experimental results suggest that the class distribution of imbalanced RITE datasets with linguistic phenomena can dramatically affect the performance of a machine learning classifier.
Pages: 562-568
Page count: 7