Improving automatic query classification via semi-supervised learning

Cited by: 31
Authors
Beitzel, SM
Jensen, EC
Frieder, O
Lewis, DD
Chowdhury, A
Kolcz, A
DOI
10.1109/ICDM.2005.80
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Accurate topical classification of user queries allows for increased effectiveness and efficiency in general-purpose web search systems. Such classification becomes critical if the system is to return results not just from a general web collection but from topic-specific back-end databases as well. Maintaining sufficient classification recall is very difficult as web queries are typically short, yielding few features per query. This feature sparseness coupled with the high query volumes typical for a large-scale search service makes manual and supervised learning approaches alone insufficient. We use an application of computational linguistics to develop an approach for mining the vast amount of unlabeled data in web query logs to improve automatic topical web query classification. We show that our approach in combination with manual matching and supervised learning allows us to classify a substantially larger proportion of queries than any single technique. We examine the performance of each approach on a real web query stream and show that our combined method accurately classifies 46% of queries, outperforming the recall of the best single approach by nearly 20% with a 7% improvement in overall effectiveness.
Pages: 42 - 49
Number of pages: 8
Related papers
50 in total
  • [31] Semi-supervised tensor learning for image classification
    Jianguang Zhang
    Yahong Han
    Jianmin Jiang
    Multimedia Systems, 2017, 23 : 63 - 73
  • [32] Semi-supervised learning for question classification in CQA
    Yiyang Li
    Lei Su
    Jun Chen
    Liwei Yuan
    Natural Computing, 2017, 16 : 567 - 577
  • [33] VideoSSL: Semi-Supervised Learning for Video Classification
    Jing, Longlong
    Parag, Toufiq
    Wu, Zhe
    Tian, Yingli
    Wang, Hongcheng
    2021 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2021), 2021, : 1109 - 1118
  • [34] Semi-supervised Learning for Image Modality Classification
    de Herrera, Alba Garcia Seco
    Markonis, Dimitrios
    Joyseeree, Ranveer
    Schaer, Roger
    Foncubierta-Rodriguez, Antonio
    Mueller, Henning
    MULTIMODAL RETRIEVAL IN THE MEDICAL DOMAIN, MRMD 2015, 2015, 9059 : 85 - 98
  • [35] Semi-Supervised Classification Based on Transformed Learning
    Kang Z.
    Liu L.
    Han M.
    Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2023, 60 (01): : 103 - 111
  • [36] A review of semi-supervised learning for text classification
    Duarte, Jose Marcio
    Berton, Lilian
    ARTIFICIAL INTELLIGENCE REVIEW, 2023, 56 (09) : 9401 - 9469
  • [37] Safe semi-supervised learning for pattern classification
    Ma, Jun
    Yu, Guolin
    Xiong, Weizhi
    Zhu, Xiaolong
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 121
  • [38] Semi-supervised learning for question classification in CQA
    Li, Yiyang
    Su, Lei
    Chen, Jun
    Yuan, Liwei
    NATURAL COMPUTING, 2017, 16 (04) : 567 - 577
  • [39] SEMI-SUPERVISED LEARNING FOR MARS IMAGERY CLASSIFICATION
    Wang, Wenjing
    Lin, Lilang
    Fan, Zejia
    Liu, Baying
    2021 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2021, : 499 - 503
  • [40] Integrated Semi-Supervised Model for Learning and Classification
    Bhalla, Vandna
    Chaudhury, Santanu
    PROCEEDINGS OF 3RD INTERNATIONAL CONFERENCE ON COMPUTER VISION AND IMAGE PROCESSING, CVIP 2018, VOL 1, 2020, 1022 : 183 - 195