Data Mining From Web Search Queries: A Comparison of Google Trends and Baidu Index

Cited by: 74
Authors
Vaughan, Liwen [1, 2]
Chen, Yue [2]
Affiliations
[1] Univ Western Ontario, Fac Informat & Media Studies, London, ON N6A 5B7, Canada
[2] Dalian Univ Technol, Sch Publ Adm, Inst Sci Studies & S&T Management, WISELAB, Dalian 116085, Liaoning Provin, Peoples R China
Keywords
web mining; webometrics
DOI
10.1002/asi.23201
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Numerous studies have explored the possibility of uncovering information from web search queries but few have examined the factors that affect web query data sources. We conducted a study that investigated this issue by comparing Google Trends and Baidu Index. Data from these two services are based on queries entered by users into Google and Baidu, two of the largest search engines in the world. We first compared the features and functions of the two services based on documents and extensive testing. We then carried out an empirical study that collected query volume data from the two sources. We found that data from both sources could be used to predict the quality of Chinese universities and companies. Despite the differences between the two services in terms of technology, such as differing methods of language processing, the search volume data from the two were highly correlated and combining the two data sources did not improve the predictive power of the data. However, there was a major difference between the two in terms of data availability. Baidu Index was able to provide more search volume data than Google Trends did. Our analysis showed that the disadvantage of Google Trends in this regard was due to Google's smaller user base in China. The implication of this finding goes beyond China. Google's user bases in many countries are smaller than that in China, so the search volume data related to those countries could result in the same issue as that related to China.
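The comparison described in the abstract, checking how closely search volumes from two services track each other, can be illustrated with a rank correlation. The sketch below is only an illustration: the abstract does not state which statistic the authors used, Spearman's rho is assumed here, and the function names and all numbers are hypothetical rather than data from the study.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of values tied with the one at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical search volumes for five institutions (NOT data from the paper).
google_volume = [12, 30, 7, 55, 41]
baidu_volume = [340, 910, 150, 1300, 2200]

rho = spearman(google_volume, baidu_volume)
print(f"Spearman rho between the two sources: {rho:.2f}")
```

A rank-based measure is a natural choice here because the two services report volumes on different scales, so only the ordering of entities is directly comparable.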
Pages: 13-22
Page count: 10
Related papers
50 in total
  • [41] Deriving query intents from web search engine queries
    Lewandowski, Dirk
    Drechsler, Jessica
    von Mach, Sonja
    JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, 2012, 63 (09): 1773-1788
  • [43] Analysis of characteristics and trends of Web queries submitted to NAVER, a major Korean search engine
    Park, Soyeon
    LIBRARY & INFORMATION SCIENCE RESEARCH, 2009, 31 (02): 126-133
  • [44] Using Search Trends to Analyze Web-Based Interest in Lower Urinary Tract Symptoms-Related Inquiries, Diagnoses, and Treatments in Mainland China: Infodemiology Study of Baidu Index Data
    Wei, Shanzun
    Ma, Ming
    Wu, Changjing
    Yu, Botao
    Jiang, Lisha
    Wen, Xi
    Fu, Fudong
    Shi, Ming
    JOURNAL OF MEDICAL INTERNET RESEARCH, 2021, 23 (07)
  • [45] Referred by Google: mining Google Trends data to identify patterns in and correlates to searches for dermatological concerns and providers
    Ransohoff, J. D.
    Sarin, K. Y.
    BRITISH JOURNAL OF DERMATOLOGY, 2018, 178 (03): 794-795
  • [46] How often people google for vaccination: Qualitative and quantitative insights from a systematic search of the web-based activities using Google Trends
    Bragazzi, Nicola Luigi
    Barberis, Ilaria
    Rosselli, Roberto
    Gianfredi, Vincenza
    Nucci, Daniele
    Moretti, Massimo
    Salvatori, Tania
    Martucci, Gianfranco
    Martini, Mariano
    HUMAN VACCINES & IMMUNOTHERAPEUTICS, 2017, 13 (02): 464-469
  • [47] New Horizons in Web Search, Web Data Mining, and Web-Based Applications
    Zhang, Jing
    Qiang, Jipeng
    Zhou, Cangqi
    APPLIED SCIENCES-BASEL, 2024, 14 (02)
  • [48] Using Baidu search values to monitor and predict the confirmed cases of COVID-19 in China: - evidence from Baidu index
    Tu, Bizhi
    Wei, Laifu
    Jia, Yaya
    Qian, Jun
    BMC INFECTIOUS DISEASES, 2021, 21 (01)
  • [50] Web-Based Content on Diet and Nutrition Written in Japanese: Infodemiology Study Based on Google Trends and Google Search
    Murakami, Kentaro
    Shinozaki, Nana
    Kimoto, Nana
    Onodera, Hiroko
    Oono, Fumi
    McCaffrey, Tracy A.
    Livingstone, M. Barbara E.
    Okuhara, Tsuyoshi
    Matsumoto, Mai
    Katagiri, Ryoko
    Ota, Erika
    Chiba, Tsuyoshi
    Nishida, Yuki
    Sasaki, Satoshi
    JMIR FORMATIVE RESEARCH, 2023, 7