Machine Learning for Imbalanced Datasets of Recognizing Inference in Text with Linguistic Phenomena

Cited by: 0
Authors
Day, Min-Yuh [1]
Tsai, Cheng-Chia [1]
Affiliations
[1] Tamkang Univ, Dept Informat Management, New Taipei, Taiwan
Keywords
Imbalanced Datasets; Linguistic Phenomena; Machine Learning; Recognizing Inference in Text; Textual Entailment
DOI
10.1109/IRI.2015.99
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recognizing inference in text (RITE) plays an important role in the answer validation modules of Question Answering (QA) systems. The problem of class imbalance has received increasing attention in the machine learning community. In recent years, several attempts have been made to analyze linguistic phenomena; however, little is known about the effects of imbalanced datasets with linguistic phenomena on recognizing inference in text. This paper provides an empirical study of learning from imbalanced RITE datasets with linguistic phenomena in order to better understand these effects. We present an analysis of imbalanced RITE datasets with linguistic phenomena using the NTCIR-11 RITE-VAL gold standard dataset and development dataset. The experimental results suggest that the distribution of imbalanced RITE datasets with linguistic phenomena can dramatically change the performance of a machine learning classifier.
Pages: 562-568
Page count: 7
Related papers
50 items in total
  • [41] Constructing support vector machine ensemble with segmentation for imbalanced datasets
    Li, Qian
    Yang, Bing
    Li, Yi
    Deng, Naiyang
    Jing, Ling
    NEURAL COMPUTING & APPLICATIONS, 2013, 22 : S249 - S256
  • [42] Deep Learning Applied to Imbalanced Malware Datasets Classification
    Salas, Marcelo Palma
    de Geus, Paulo Licio
    JOURNAL OF INTERNET SERVICES AND APPLICATIONS, 2024, 15 (01) : 342 - 359
  • [43] Constructing support vector machine ensemble with segmentation for imbalanced datasets
    Qian Li
    Bing Yang
    Yi Li
    Naiyang Deng
    Ling Jing
    Neural Computing and Applications, 2013, 22 : 249 - 256
  • [44] Fuzzy support vector machine with graph for classifying imbalanced datasets
    Chen, Baihua
    Fan, Yuling
    Lan, Weiyao
    Liu, Jinghua
    Cao, Chao
    Gao, Yunlong
    NEUROCOMPUTING, 2022, 514 : 296 - 312
  • [45] RDPVR: Random Data Partitioning with Voting Rule for Machine Learning from Class-Imbalanced Datasets
    Hassanat, Ahmad B.
    Tarawneh, Ahmad S.
    Abed, Samer Subhi
    Altarawneh, Ghada Awad
    Alrashidi, Malek
    Alghamdi, Mansoor
    ELECTRONICS, 2022, 11 (02)
  • [46] Exploring Early Prediction of Chronic Kidney Disease Using Machine Learning Algorithms for Small and Imbalanced Datasets
    da Silveira, Andressa C. M.
    Sobrinho, Alvaro
    da Silva, Leandro Dias
    Costa, Evandro de Barros
    Pinheiro, Maria Eliete
    Perkusich, Angelo
    APPLIED SCIENCES-BASEL, 2022, 12 (07)
  • [47] Modifying the learning rate of FLNG dealing with imbalanced datasets
    Machon-Gonzalez, Ivan
    Lopez-Garcia, Hilario
    Luis Calvo-Rolle, Jose
    2010 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS IJCNN 2010, 2010
  • [48] RSG: A Simple but Effective Module for Learning Imbalanced Datasets
    Wang, Jianfeng
    Lukasiewicz, Thomas
    Hu, Xiaolin
    Cai, Jianfei
    Xu, Zhenghua
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021 : 3783 - 3792
  • [49] Prediction of toxicity: Deep learning with small and imbalanced datasets
    Ecker, Gerhard
    Hemmerich, Jennifer
    Asilar, Ece
    ABSTRACTS OF PAPERS OF THE AMERICAN CHEMICAL SOCIETY, 2019, 257
  • [50] A GENETIC RULE LEARNING APPROACH TO DEAL WITH IMBALANCED DATASETS
    Mahani, Aouatef
    Benkhider, Sadjia
    Baba-Ali, Ahmed Riadh
    PROCEEDINGS OF THE EUROPEAN CONFERENCE ON DATA MINING 2015 AND INTERNATIONAL CONFERENCES ON INTELLIGENT SYSTEMS AND AGENTS 2015 AND THEORY AND PRACTICE IN MODERN COMPUTING 2015, 2015 : 151 - 156