Text-based approaches for non-topical image categorization

Cited by: 8
Authors
Sable C.L. [1]
Hatzivassiloglou V. [1]
Affiliation
[1] Department of Computer Science, Columbia University, 450 Computer Science Building, New York, NY 10027
Keywords
Evaluation in the presence of uncertainty; High-level image features; Image categorization; Probabilistic TF*IDF; Text similarity features
DOI
10.1007/s007990000038
Abstract
The rapid expansion of multimedia digital collections brings to the fore the need for classifying not only text documents but their embedded non-textual parts as well. We propose a model for basing classification of multimedia on broad, non-topical features, and show how information on targeted nearby pieces of text can be used to effectively classify photographs on a first such feature, distinguishing between indoor and outdoor images. We examine several variations to a TF*IDF-based approach for this task, empirically analyze their effects, and evaluate our system on a large collection of images from current news newsgroups. In addition, we investigate alternative classification and evaluation methods, and the effects that secondary features have on indoor/outdoor classification. Using density estimation over the raw TF*IDF values, we obtain a classification accuracy of 82%, a number that outperforms baseline estimates and earlier, image-based approaches, at least in the domain of news articles, and that nears the accuracy of humans who perform the same task with access to comparable information. © Springer-Verlag 2000.
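As a rough illustration of the general idea in the abstract (classifying an image as indoor or outdoor from nearby text using TF*IDF weights), a minimal sketch might look like the following. This is not the authors' system: the toy captions, the function names `tokenize`, `train`, and `classify`, and the plain dot-product scoring are all assumptions made for illustration, and the density-estimation step mentioned in the abstract is not reproduced here.

```python
import math
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase, whitespace split, alphabetic tokens only.
    return [w for w in text.lower().split() if w.isalpha()]


def train(docs_by_category):
    """Build IDF weights and one TF*IDF vector per category.

    docs_by_category maps a category name to a list of caption strings.
    """
    all_docs = [tokenize(d) for docs in docs_by_category.values() for d in docs]
    n_docs = len(all_docs)

    # Document frequency: number of captions containing each word.
    df = Counter()
    for tokens in all_docs:
        df.update(set(tokens))
    idf = {w: math.log(n_docs / df[w]) for w in df}

    # Category vector: term frequency over all captions of the category,
    # weighted by IDF.
    vectors = {}
    for cat, docs in docs_by_category.items():
        tf = Counter()
        for d in docs:
            tf.update(tokenize(d))
        vectors[cat] = {w: tf[w] * idf[w] for w in tf}
    return idf, vectors


def classify(text, idf, vectors):
    """Score a caption against each category vector by dot product."""
    tf = Counter(tokenize(text))
    scores = {
        cat: sum(tf[w] * idf.get(w, 0.0) * vec.get(w, 0.0) for w in tf)
        for cat, vec in vectors.items()
    }
    return max(scores, key=scores.get), scores


# Toy training captions (invented for this sketch, not from the paper's corpus).
training = {
    "indoor": [
        "the senator speaks at a press conference in the hall",
        "the chairman testifies before the committee in the hearing room",
    ],
    "outdoor": [
        "protesters march on the street under the rain",
        "firefighters battle a blaze in the forest near the highway",
    ],
}

idf, vectors = train(training)
label, scores = classify("crowd gathers on the street near the forest", idf, vectors)
print(label, scores)
```

Words that occur in every training caption (here, "the") receive an IDF of zero and so contribute nothing to either score, which is the usual motivation for IDF weighting. A fuller treatment along the lines of the abstract would replace the simple argmax with a decision rule fitted to the distribution of raw scores.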
Pages: 261-275
Page count: 14