A Structured Analysis of Unstructured Big Data by Leveraging Cloud Computing

Cited by: 107
Authors
Liu, Xiao [1]
Singh, Param Vir [2]
Srinivasan, Kannan [3]
Affiliations
[1] NYU, Stern Sch Business, 550 1St Ave, New York, NY 10012 USA
[2] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[3] Carnegie Mellon Univ, Tepper Sch Business, Pittsburgh, PA 15213 USA
Keywords
big data; cloud computing; text mining; user generated content; Twitter; Google Trends; WORD-OF-MOUTH; USER-GENERATED CONTENT; BOX-OFFICE; PANEL-DATA; GMM ESTIMATION; DYNAMICS; REVIEWS; CHATTER; IMPACT; MATTER;
DOI
10.1287/mksc.2015.0972
Chinese Library Classification (CLC) number
F [Economics];
Discipline classification code
02;
Abstract
Accurate forecasting of sales/consumption is particularly important for marketing because this information can be used to adjust marketing budget allocations and overall marketing strategies. Recently, online social platforms have produced an unparalleled amount of data on consumer behavior. However, two challenges have limited the use of these data in obtaining meaningful business marketing insights. First, the data are typically in an unstructured format, such as texts, images, audio, and video. Second, the sheer volume of the data makes standard analysis procedures computationally unworkable. In this study, we combine methods from cloud computing, machine learning, and text mining to illustrate how online platform content, such as Twitter, can be effectively used for forecasting. We conduct our analysis on a significant volume of nearly two billion Tweets and 400 billion Wikipedia pages. Our main findings emphasize that, by contrast to basic surface-level measures such as the volume of or sentiments in Tweets, the information content of Tweets and their timeliness significantly improve forecasting accuracy. Our method endogenously summarizes the information in Tweets. The advantage of our method is that the classification of the Tweets is based on what is in the Tweets rather than preconceived topics that may not be relevant. We also find that, by contrast to Twitter, other online data (e.g., Google Trends, Wikipedia views, IMDB reviews, and Huffington Post news) are very weak predictors of TV show demand because users tweet about TV shows before, during, and after a TV show, whereas Google searches, Wikipedia views, IMDB reviews, and news posts typically lag behind the show.
Pages: 363-388
Number of pages: 26
Related papers
50 in total
  • [41] Cloud Computing: The Future of Big Data Management
    Ouf, Shimaa
    Nasr, Mona
    [J]. INTERNATIONAL JOURNAL OF CLOUD APPLICATIONS AND COMPUTING, 2015, 5 (02) : 53 - 61
  • [42] Challenges and Opportunities in Big Data and Cloud Computing
    Sohail, Hassan
    Zameer, Zeenia
    Ahmed, Hafiz Farhan
    Iqbal, Usama
    Shah, Pir Amad Ali
    [J]. FUTURE INTELLIGENT VEHICULAR TECHNOLOGIES, FUTURE 5V 2016, 2017, 185 : 175 - 181
  • [43] Big data analytics in Cloud computing: an overview
    Blend Berisha
    Endrit Mëziu
    Isak Shabani
    [J]. Journal of Cloud Computing, 11
  • [44] A survey of big data for IoT in cloud computing
    Cao, Junkuo
    Lin, Mingcai
    Ma, Xiaojin
    [J]. Ma, Xiaojin (xjma@shu.edu.cn), 1600, International Association of Engineers (47): : 585 - 592
  • [45] Modeling and simulation of cloud computing and big data
    Karatza, Helen D.
    Stavrinides, Georgios L.
    [J]. SIMULATION MODELLING PRACTICE AND THEORY, 2019, 93 (1-2) : 1 - 2
  • [46] Big Data Cleaning Algorithms in Cloud Computing
    Feng, Zhang
    Hui-Feng, Xue
    Dong-Sheng, Xu
    Yong-Heng, Zhang
    Fei, You
    [J]. INTERNATIONAL JOURNAL OF ONLINE ENGINEERING, 2013, 9 (03) : 77 - 81
  • [47] Big Data with Cloud Computing: Discussions and Challenges
    Sandhu, Amanpreet Kaur
    [J]. BIG DATA MINING AND ANALYTICS, 2022, 5 (01) : 32 - 40
  • [48] 'Big data', Hadoop and cloud computing in genomics
    O'Driscoll, Aisling
    Daugelaite, Jurate
    Sleator, Roy D.
    [J]. JOURNAL OF BIOMEDICAL INFORMATICS, 2013, 46 (05) : 774 - 781
  • [49] Big Data Processing in Cloud Computing Environments
    Ji, Changqing
    Li, Yu
    Qiu, Wenming
    Awada, Uchechukwu
    Li, Keqiu
    [J]. PROCEEDINGS OF THE 2012 12TH INTERNATIONAL SYMPOSIUM ON PERVASIVE SYSTEMS, ALGORITHMS, AND NETWORKS (I-SPAN 2012), 2012, : 17 - 23
  • [50] Big Data Analytic Using Cloud Computing
    Jain, Vinay Kumar
    Kumar, Shishir
    [J]. 2015 SECOND INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING AND COMMUNICATION ENGINEERING ICACCE 2015, 2015, : 667 - 672