QuMinS: Fast and scalable querying, mining and summarizing multi-modal databases

Cited by: 0
Authors
Cordeiro, Robson L. F. [1 ]
Guo, Fan [2 ]
Haverkamp, Donna S. [3 ]
Horne, James H. [3 ]
Hughes, Ellen K. [3 ]
Kim, Gunhee [2 ]
Romani, Luciana A. S. [4 ]
Coltri, Priscila P. [5 ]
Souza, Tamires T. [1 ]
Traina, Agma J. M. [1 ]
Traina, Caetano, Jr. [1 ]
Faloutsos, Christos [2 ]
Affiliations
[1] Univ Sao Paulo, BR-13560970 Sao Carlos, SP, Brazil
[2] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[3] Sci Applicat Int Corp, Mclean, VA 22102 USA
[4] Embrapa Agr Informat, BR-13083886 Campinas, SP, Brazil
[5] Univ Estadual Campinas, BR-13083970 Campinas, SP, Brazil
Funding
São Paulo Research Foundation, Brazil; National Science Foundation, USA;
Keywords
Low-labor labeling; Summarization; Outlier detection; Query by example; Clustering; Satellite imagery; IMAGE ANNOTATION; RANDOM-WALK; CLASSIFICATION; RECOGNITION; OBJECT; GRAPH;
DOI
10.1016/j.ins.2013.11.013
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Given a large image set in which very few images have labels, how can we guess labels for the remaining majority? How can we spot images that need brand-new labels, different from the predefined ones? How can we summarize these data to route the user's attention to what really matters? Here we answer all these questions. Specifically, we propose QuMinS, a fast, scalable solution to two problems: (i) Low-labor labeling (LLL): given an image set in which very few images have labels, find the most appropriate labels for the rest; and (ii) Mining and attention routing: in the same setting, find clusters, the top-N_O outlier images, and the N_R images that best represent the data. Experiments on satellite images spanning up to 2.25 GB show that, in contrast to state-of-the-art labeling techniques, QuMinS scales linearly with the data size and is up to 40 times faster than top competitors (GCap) while achieving equal or better accuracy; it spots images that potentially require unpredicted labels; and it works even with tiny initial label sets, i.e., nearly five examples. We also report a case study of the method's practical usage, showing that QuMinS is a viable tool for automatic coffee crop detection from remote sensing images. © 2013 Elsevier Inc. All rights reserved.
Pages: 211 - 229
Page count: 19
Related papers
50 items in total
  • [41] A Scalable Platform for Integrated Multi-Modal Neuroimaging Data Processing and Analysis Across Psychiatric Studies
    Ji, Jie Lisa
    Demsar, Jure
    Fonteneau, Clara
    Warrington, Shaun
    Tamayo, Zailyn
    Kraljic, Aleksij
    Matkovic, Andraz
    Purg, Nina
    Helmer, Markus
    Sotiropoulos, Stamatios
    Murray, John
    Anticevic, Alan
    Repovs, Grega
    BIOLOGICAL PSYCHIATRY, 2022, 91 (09) : S321 - S322
  • [42] Mining Multi-Modal Crime Patterns At Different Levels of Granularity Using Hierarchical Clustering
    Boo, Yee Ling
    Alahakoon, Damminda
    2008 INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE FOR MODELLING CONTROL & AUTOMATION, VOLS 1 AND 2, 2008, : 1268 - 1273
  • [43] An Abnormal Behavior Detection Method Leveraging Multi-modal Data Fusion and Deep Mining
    Tian, Xinyu
    Zheng, Qinghe
    Jiang, Nan
    IAENG International Journal of Applied Mathematics, 2021, 51 (01)
  • [44] A Multi-modal Data Mining Algorithm for Corner Case of Automatic Driving Road Scene
    Wang, Hai
    Zhang, Guirong
    Luo, Tong
    Qiu, Meng
    Cai, Yingfeng
    Chen, Long
    Qiche Gongcheng/Automotive Engineering, 2024, 46 (07) : 1239 - 1248
  • [45] Editorial: Special Issue on Multi-modal Information mining and Analytics for Environmental Technology & Innovation
    Xu, Zheng
    Yen, Neil
    Sugumaran, Vijayan
    ENVIRONMENTAL TECHNOLOGY & INNOVATION, 2022, 28
  • [46] Fast Multi-Task SCCA Learning with Feature Selection for Multi-Modal Brain Imaging Genetics
    Du, Lei
    Liu, Kefei
    Yao, Xiaohui
    Risacher, Shannon L.
    Han, Junwei
    Guo, Lei
    Saykin, Andrew J.
    Shen, Li
    Weiner, Michael
    Aisen, Paul
    Petersen, Ronald
    Jack, Clifford R., Jr.
    Jagust, William
    Trojanowski, John Q.
    Toga, Arthur W.
    Beckett, Laurel
    Green, Robert C.
    Saykin, Andrew J.
    Morris, John
    Liu, Enchi
    Montine, Tom
    Gamst, Anthony
    Thomas, Ronald G.
    Donohue, Michael
    Walter, Sarah
    Gessert, Devon
    Sather, Tamie
    Harvey, Danielle
    Kornak, John
    Dale, Anders
    Bernstein, Matthew
    Felmlee, Joel
    Fox, Nick
    Thompson, Paul
    Schuff, Norbert
    Alexander, Gene
    DeCarli, Charles
    Bandy, Dan
    Koeppe, Robert A.
    Foster, Norm
    Reiman, Eric M.
    Chen, Kewei
    Mathis, Chet
    Cairns, Nigel J.
    Taylor-Reinwald, Lisa
    Shaw, Les
    Lee, Virginia M. Y.
    Korecka, Magdalena
    Crawford, Karen
    Neu, Scott
    PROCEEDINGS 2018 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM), 2018, : 356 - 361
  • [47] Fast image registration via joint gradient maximization: Application to multi-modal data
    Mei, Xue
    Porikli, Fatih
    ELECTRO-OPTICAL AND INFRARED SYSTEMS: TECHNOLOGY AND APPLICATIONS III, 2006, 6395
  • [48] Binary multi-modal matrix factorization for fast item cold-start recommendation
    Peng, Chengmei
    Zhu, Lei
    Xu, Yang
    Li, Yaping
    Guo, Lei
    NEUROCOMPUTING, 2022, 507 : 145 - 156
  • [49] Efficient and fast multi-modal foreground-background segmentation using RGBD data
    Trabelsi, Rim
    Jabri, Issam
    Smach, Fethi
    Bouallegue, Ammar
    PATTERN RECOGNITION LETTERS, 2017, 97 : 13 - 20