Automatic Removal of Visual Stop-Words

Cited by: 3
Authors
Roman-Rangel, Edgar [1 ]
Marchand-Maillet, Stephane [1 ]
Affiliation
[1] Univ Geneva, CVMLab, Geneva, Switzerland
Funding
Swiss National Science Foundation;
Keywords
Bag-of-visual-words; stop-words; entropy; Bhattacharyya coefficient; content-based image retrieval; image classification;
DOI
10.1145/2647868.2655005
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202 ;
Abstract
This paper presents a new methodology for automatically estimating the optimal number of visual words that can be removed from a visual dictionary without harming the discriminative potential of the resulting bag-of-visual-words representations. The proposed approach relies on a special definition of the entropy of each visual word, treated as a random variable, and on a new definition of the overlap between class models, computed with a normalized Bhattacharyya coefficient. We combine the proposed methodology with a recent approach that labels visual words as stop-words, and show that this combination reduces the dimensionality of bag representations while still obtaining good classification accuracy and retrieval performance.
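The abstract does not spell out the paper's "special definition" of entropy or how the Bhattacharyya coefficient is normalized, so the following is only a minimal sketch of the two ingredients in their textbook forms. It assumes Shannon entropy of a visual word's distribution over classes (scaled by the log of the number of classes) and the standard Bhattacharyya coefficient between L1-normalized class histograms. All function names and the toy data are hypothetical and are not taken from the paper.

```python
import math
from itertools import combinations


def normalize(hist):
    """L1-normalize a histogram of non-negative counts."""
    total = sum(hist)
    if total == 0:
        raise ValueError("histogram has no mass")
    return [h / total for h in hist]


def word_entropy(class_counts):
    """Shannon entropy of one visual word over classes, scaled to [0, 1].

    Assumption: textbook entropy, not the paper's special definition.
    Values near 1 mean the word occurs about equally in every class.
    """
    p = normalize(class_counts)
    if len(p) < 2:
        return 0.0
    h = sum(x * math.log(1.0 / x) for x in p if x > 0)
    return h / math.log(len(p))


def bhattacharyya(p_hist, q_hist):
    """Standard Bhattacharyya coefficient between two histograms."""
    p, q = normalize(p_hist), normalize(q_hist)
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


def mean_class_overlap(class_models, keep):
    """Average pairwise overlap of class models restricted to kept words."""
    pairs = list(combinations(class_models, 2))
    total = sum(
        bhattacharyya([a[i] for i in keep], [b[i] for i in keep])
        for a, b in pairs
    )
    return total / len(pairs)


# Toy class models: one row per class, one column per visual word.
# Word 0 is shared by all classes; words 1 to 3 are class-specific.
class_models = [
    [10, 8, 0, 0],
    [10, 0, 8, 0],
    [10, 0, 0, 8],
]
n_words = len(class_models[0])

entropies = [word_entropy([m[w] for m in class_models]) for w in range(n_words)]
# Highest-entropy words are the first stop-word candidates.
ranked = sorted(range(n_words), key=lambda w: entropies[w], reverse=True)

overlap_before = mean_class_overlap(class_models, list(range(n_words)))
overlap_after = mean_class_overlap(class_models, ranked[1:])
print(entropies, ranked, overlap_before, overlap_after)
```

In this toy setting, dropping the single highest-entropy word lowers the average overlap between class models, which is the kind of signal one could monitor when deciding how many words to remove. How the paper turns such quantities into an estimate of the optimal number is not stated in the abstract.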
Pages: 1145-1148
Page count: 4