A distributed architecture for large scale news and social media processing

Authors
Varlamis, Iraklis [1]
Michail, Dimitrios [1]
Polydoras, Pavlos [2]
Tsantilas, Panagiotis [2]
Affiliations
[1] Department of Informatics and Telematics, Harokopio University of Athens, Athens, Greece
[2] Palo Ltd, Kokkoni, Greece
Keywords
Data handling - Middleware - Computer architecture - Pipeline processing systems - Social networking (online) - Text processing
DOI
10.1504/IJWET.2020.114029
Abstract
When designing a data processing and analytics pipeline for data streams, it is important to anticipate the data load and to balance it successfully over the available resources. This is easier to achieve when small processing modules, each requiring limited resources, replace large monolithic processing software. In this work, we present the case of a social media and news analytics platform, called PaloAnalytics, which performs a series of content aggregation, information extraction (e.g., NER and sentiment tagging) and visualisation tasks on a large amount of data, on a daily basis. We demonstrate the architecture of the platform, which relies on micro-modules and message-oriented middleware to deliver distributed content processing. Early results show that the proposed architecture can easily withstand the increased content load that occasionally occurs in social media (e.g., when a major event takes place) and quickly release unused resources when the content load returns to its normal flow. Copyright © 2021 Inderscience Enterprises Ltd.
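The abstract does not say which middleware or module boundaries PaloAnalytics uses, so the following is only a minimal sketch of the general idea: small modules that consume messages from one queue and publish to the next. Python's in-process `queue.Queue` stands in for the message-oriented middleware, and every name here (`run_module`, `tag_entities`, `tag_sentiment`, `process`, `workers_per_stage`) is a hypothetical choice for illustration, not part of the platform. The "NER" and "sentiment" handlers are toy stand-ins.

```python
import queue
import threading

STOP = object()  # sentinel telling a worker to shut down


def run_module(handler, inbox, outbox):
    """Consume messages from inbox, apply handler, publish results to outbox."""
    while True:
        msg = inbox.get()
        if msg is STOP:
            break
        outbox.put(handler(msg))


def tag_entities(doc):
    # Toy stand-in for NER (assumption): capitalised tokens count as entities.
    entities = [w for w in doc["text"].split() if w[:1].isupper()]
    return {**doc, "entities": entities}


def tag_sentiment(doc):
    # Toy stand-in for sentiment tagging (assumption): tiny word lists.
    words = set(doc["text"].lower().split())
    score = len(words & {"good", "great"}) - len(words & {"bad", "poor"})
    return {**doc, "sentiment": score}


def process(docs, workers_per_stage=2):
    """Run docs through both stages; more workers per stage absorbs a burst."""
    raw, tagged, done = queue.Queue(), queue.Queue(), queue.Queue()
    stages = [(tag_entities, raw, tagged), (tag_sentiment, tagged, done)]

    # Start every worker up front so the stages run concurrently.
    pools = []
    for handler, inbox, outbox in stages:
        pool = [
            threading.Thread(target=run_module, args=(handler, inbox, outbox))
            for _ in range(workers_per_stage)
        ]
        for t in pool:
            t.start()
        pools.append((inbox, pool))

    for doc in docs:
        raw.put(doc)

    # Shut down stage by stage: a stage is stopped only after the previous
    # one has finished, so no message is left behind in a queue.
    for inbox, pool in pools:
        for _ in pool:
            inbox.put(STOP)
        for t in pool:
            t.join()

    results = []
    while not done.empty():
        results.append(done.get())
    return sorted(results, key=lambda d: d["id"])


if __name__ == "__main__":
    sample = [
        {"id": 1, "text": "Athens had a great day"},
        {"id": 2, "text": "bad weather in Kokkoni"},
    ]
    for item in process(sample, workers_per_stage=3):
        print(item)
```

In this sketch, raising `workers_per_stage` during a burst and lowering it afterwards mimics the scale-up and release of resources that the abstract describes; a real deployment would presumably do this across machines through a message broker rather than threads.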
Pages: 383-406