Data Mining From Web Search Queries: A Comparison of Google Trends and Baidu Index

Cited by: 74
Authors
Vaughan, Liwen [1, 2]
Chen, Yue [2]
Affiliations
[1] Univ Western Ontario, Fac Informat & Media Studies, London, ON N6A 5B7, Canada
[2] Dalian Univ Technol, Sch Publ Adm, Inst Sci Studies & S&T Management, WISELAB, Dalian 116085, Liaoning Provin, Peoples R China
Keywords
web mining; webometrics
DOI
10.1002/asi.23201
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Numerous studies have explored the possibility of uncovering information from web search queries but few have examined the factors that affect web query data sources. We conducted a study that investigated this issue by comparing Google Trends and Baidu Index. Data from these two services are based on queries entered by users into Google and Baidu, two of the largest search engines in the world. We first compared the features and functions of the two services based on documents and extensive testing. We then carried out an empirical study that collected query volume data from the two sources. We found that data from both sources could be used to predict the quality of Chinese universities and companies. Despite the differences between the two services in terms of technology, such as differing methods of language processing, the search volume data from the two were highly correlated and combining the two data sources did not improve the predictive power of the data. However, there was a major difference between the two in terms of data availability. Baidu Index was able to provide more search volume data than Google Trends did. Our analysis showed that the disadvantage of Google Trends in this regard was due to Google's smaller user base in China. The implication of this finding goes beyond China. Google's user bases in many countries are smaller than that in China, so the search volume data related to those countries could result in the same issue as that related to China.
Pages: 13-22
Page count: 10