Mining Domain Terminologies Using Search Engine's Query Log

Cited by: 0

Authors:

Ni, Weijian [1]

Liu, Tong [1]

Zeng, Qingtian [1]

Xie, Nengfu [2]

Affiliations:

[1] Shandong Univ Sci & Technol, Qingdao, Peoples R China

[2] Chinese Acad Agr Sci, Beijing, Peoples R China

Source:

ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING | 2021 / Vol. 20 / No. 06

Funding:

National Natural Science Foundation of China;

Keywords:

Domain terminology; search engine; query log; network embedding; transductive learning; EXTRACTION; TERMS;

DOI:

10.1145/3462327

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Domain terminologies are a basic resource for various natural language processing tasks. To automatically discover terminologies for a domain of interest, most traditional approaches rely on a domain-specific corpus given in advance; thus, their performance can only be guaranteed when a high-quality domain-specific corpus has been collected, which requires extensive human involvement and domain expertise. In this article, we propose a novel approach that is capable of automatically mining domain terminologies using a search engine's query log, a type of domain-independent corpus of higher availability, coverage, and timeliness than a manually collected domain-specific corpus. In particular, we represent the query log as a heterogeneous network and formulate the task of mining domain terminology as transductive learning on the heterogeneous network. In the proposed approach, the manifold structure of domain-specificity inherent in the query log is captured by using a novel network embedding algorithm and further exploited to reduce the need for manual annotation effort in domain terminology classification. We select Agriculture and Healthcare as the target domains and experiment using a real query log from a commercial search engine. Experimental results show that the proposed approach outperforms several state-of-the-art approaches.


Pages: 32

