Mining border descriptions of emerging patterns from dataset pairs

Cited by: 0
Authors
Guozhu Dong
Jinyan Li
Affiliations
[1] Wright State University, Department of Computer Science and Engineering
[2] Institute for Infocomm Research
Keywords
Border algorithms; Border descriptions; Changes; Classification rules; Contrasts; Differences; Emerging patterns; Minimal/maximal patterns; Trends
DOI
Not available
Abstract
Mining changes, differences, or other comparative patterns from a pair of datasets is an interesting problem. This paper focuses on mining one type of comparative pattern, called emerging patterns (EPs): patterns whose support increases from one dataset to the other by a large ratio. The number of EPs is sometimes huge. To give the mining results a good structure and to reduce their size, we use borders to describe large collections of EPs concisely and without loss of information. Such a border consists of only the minimal (under set inclusion) and the maximal EPs in the collection. We also present an algorithm that efficiently computes the borders of some desired EPs by manipulating only the input borders. Our experience with many datasets from the UCI Repository and with recent cancer diagnosis datasets demonstrates that both the EP pattern type and our algorithm are useful for building accurate classifiers and for mining multifactor interactions, for example, minimal gene groups potentially responsible for the development of cancer.
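The definitions in the abstract can be illustrated with a minimal Python sketch. It is not the paper's border algorithm: it enumerates every itemset by brute force, which is feasible only for toy inputs, and then extracts the minimal and maximal EPs. The function names, the toy datasets D1 and D2, the threshold rho, and the convention for zero support (0/0 gives 0, x/0 gives infinity) are assumptions made for illustration and are not taken from the record above.

```python
from itertools import combinations


def support(pattern, dataset):
    """Fraction of transactions in `dataset` that contain every item of `pattern`."""
    return sum(pattern <= t for t in dataset) / len(dataset)


def growth_rate(pattern, d1, d2):
    """Support ratio from d1 to d2 (assumed convention: 0/0 -> 0, x/0 -> infinity)."""
    s1, s2 = support(pattern, d1), support(pattern, d2)
    if s1 == 0:
        return 0.0 if s2 == 0 else float("inf")
    return s2 / s1


def emerging_patterns(d1, d2, rho):
    """Brute-force enumeration of all itemsets with growth rate >= rho (toy sizes only)."""
    items = sorted(set().union(*d1, *d2))
    candidates = (
        frozenset(c)
        for k in range(1, len(items) + 1)
        for c in combinations(items, k)
    )
    return {p for p in candidates if growth_rate(p, d1, d2) >= rho}


def border(patterns):
    """Return the (minimal, maximal) patterns of a collection under set inclusion."""
    left = {p for p in patterns if not any(q < p for q in patterns)}
    right = {p for p in patterns if not any(p < q for q in patterns)}
    return left, right


def covered(pattern, left, right):
    """True if some minimal pattern is a subset and some maximal pattern a superset."""
    return any(x <= pattern for x in left) and any(pattern <= z for z in right)


# Hypothetical toy data: each transaction is a set of single-letter items.
D1 = [frozenset(t) for t in ("a", "ab", "bc", "c")]
D2 = [frozenset(t) for t in ("ab", "abc", "ab", "bc")]

eps = emerging_patterns(D1, D2, rho=2.0)
left, right = border(eps)
print("minimal:", sorted("".join(sorted(p)) for p in left))
print("maximal:", sorted("".join(sorted(p)) for p in right))
```

On this toy input the minimal EPs are {b} and {a, c}, the only maximal EP is {a, b, c}, and the five EPs are exactly the itemsets lying between a minimal and a maximal one. That coincidence holds for this example; the sketch does not show that it holds for arbitrary data.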
Pages: 178-202
Page count: 24