Extracting insights from social media with large-scale matrix approximations

Cited by: 2

Authors:

Sindhwani, V. [1]

Ghoting, A. [1]

Ting, E. [2]

Lawrence, R. [1]

Affiliations:

[1] IBM Corp, Div Res, Thomas J Watson Res Ctr, Yorktown Hts, NY 10598 USA

[2] IBM Software Grp, Silicon Valley Lab, San Jose, CA 95141 USA

Source:

IBM JOURNAL OF RESEARCH AND DEVELOPMENT | 2011 / Volume 55 / Issue 5

Keywords:

FACTORIZATION; ALGORITHM;

DOI:

10.1147/JRD.2011.2163281

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

Social media platforms such as blogs, Twitter® accounts, and online discussion sites are large-scale forums where every individual can potentially voice an influential public opinion. According to recent surveys, a massive number of Internet users are turning to such forums to collect recommendations and reviews for products and services, and to shape their individual choices and stances by the commentary of the online community as a whole. The unsupervised extraction of insight from unstructured user-generated web content requires new methodologies that are likely to be rooted in natural language processing and machine-learning techniques. Furthermore, the unprecedented scale of data begging to be analyzed necessitates the implementation of these methodologies on modern distributed computing platforms. In this paper, we describe a flexible new family of low-rank matrix approximation algorithms for modeling topics in a given corpus of documents (e.g., blog posts and tweets). We benchmark distributed optimization algorithms for running these models in a Hadoop™-enabled cluster environment. We describe online learning strategies for tracking the evolution of ongoing topics and rapidly detecting the emergence of new themes in a streaming setting.

Pages: 13

