Improving automatic query classification via semi-supervised learning

Cited by: 31
Authors
Beitzel, SM
Jensen, EC
Frieder, O
Lewis, DD
Chowdhury, A
Kolcz, A
DOI
10.1109/ICDM.2005.80
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Accurate topical classification of user queries allows for increased effectiveness and efficiency in general-purpose web search systems. Such classification becomes critical if the system is to return results not just from a general web collection but from topic-specific back-end databases as well. Maintaining sufficient classification recall is very difficult because web queries are typically short, yielding few features per query. This feature sparseness, coupled with the high query volumes typical of a large-scale search service, makes manual and supervised learning approaches alone insufficient. We apply computational linguistics to develop an approach for mining the vast amount of unlabeled data in web query logs to improve automatic topical web query classification. We show that our approach, in combination with manual matching and supervised learning, allows us to classify a substantially larger proportion of queries than any single technique. We examine the performance of each approach on a real web query stream and show that our combined method accurately classifies 46% of queries, outperforming the recall of the best single approach by nearly 20%, with a 7% improvement in overall effectiveness.
Pages: 42-49
Page count: 8
Related papers
50 in total
  • [1] DualGraph: Improving Semi-supervised Graph Classification via Dual Contrastive Learning
    Luo, Xiao
    Ju, Wei
    Qu, Meng
    Chen, Chong
    Deng, Minghua
    Hua, Xian-Sheng
    Zhang, Ming
    2022 IEEE 38TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2022), 2022, : 699 - 712
  • [2] Improving Semi-Supervised Learning for Audio Classification with FixMatch
    Grollmisch, Sascha
    Cano, Estefania
    ELECTRONICS, 2021, 10 (15)
  • [3] Semi-supervised classification with active query selection
    Wang, Jiao
    Luo, Siwei
    STRUCTURAL, SYNTACTIC, AND STATISTICAL PATTERN RECOGNITION, PROCEEDINGS, 2006, 4109 : 741 - 746
  • [4] Plusmine: Dynamic Active Learning with Semi-Supervised Learning for Automatic Classification
    Klein, Jan
    Bhulai, Sandjai
    Hoogendoorn, Mark
    van der Mei, Rob
    2021 IEEE/WIC/ACM INTERNATIONAL CONFERENCE ON WEB INTELLIGENCE AND INTELLIGENT AGENT TECHNOLOGY (WI-IAT 2021), 2021, : 146 - 153
  • [5] Mutual Learning of Complementary Networks via Residual Correction for Improving Semi-Supervised Classification
    Wu, Si
    Li, Jichang
    Liu, Cheng
    Yu, Zhiwen
    Wong, Hau-San
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 6493 - 6502
  • [6] Improving classification with semi-supervised and fine-grained learning
    Lai, Danyu
    Tian, Wei
    Chen, Long
    PATTERN RECOGNITION, 2019, 88 : 547 - 556
  • [7] GraphixMatch: Improving semi-supervised learning for graph classification with FixMatch
    Koh, Eunji
    Lee, Young Jae
    Kim, Seoung Bum
    NEUROCOMPUTING, 2024, 607
  • [8] Malware classification for the cloud via semi-supervised transfer learning
    Gao, Xianwei
    Hu, Changzhen
    Shan, Chun
    Liu, Baoxu
    Niu, Zequn
    Xie, Hui
    JOURNAL OF INFORMATION SECURITY AND APPLICATIONS, 2020, 55
  • [9] Automatic Defect Classification Using Semi-Supervised Learning With Defect Localization
    Kim, Yusung
    Lee, Jin-Seop
    Lee, Jee-Hyong
    IEEE TRANSACTIONS ON SEMICONDUCTOR MANUFACTURING, 2023, 36 (03) : 476 - 485
  • [10] Semi-Supervised Learning for ECG Classification
    Rodrigues, Rui
    Couto, Paula
    2021 COMPUTING IN CARDIOLOGY (CINC), 2021,