Near Real-Time Big Data Stream Processing Platform Using Cassandra

Cited by: 0
Authors
Pal, Gautam [1]
Li, Gangmin [2]
Atkinson, Katie [1]
Affiliations
[1] Univ Liverpool, Dept Comp Sci, Liverpool, Merseyside, England
[2] Xian Jiaotong Liverpool Univ, Dept Comp Sci, Suzhou, Peoples R China
Keywords
Real-Time Big Data Analytics; Real-Time Data Ingestion; Cassandra; Datastax
DOI
Not available
Chinese Library Classification number
T [Industrial Technology]
Discipline classification code
08
Abstract
Users expect instant answers from analytics systems. If time to insight exceeds tens of milliseconds, the value is lost. Applications such as stock market analysis, sensor monitoring, Twitter feed processing, and fraud detection cannot afford to wait. This often means analyzing incoming data before it is even stored in the database of record. Coupled with zero tolerance for data loss, the challenge becomes even more daunting. In real-time Big Data scenarios, rather than waiting for data to be collected as a whole at long periodic intervals, streaming analysis lets us identify patterns and make informed decisions based on them as data start arriving. When data are non-stationary and patterns change with time, streaming systems adapt themselves. This work describes near real-time data storage and processing approaches for analyzing streams of data with the Cassandra NoSQL datastore. It provides insight into optimizing Cassandra on a multi-data-center setup for near real-time responses. The classic trade-off between low latency and high accuracy is conceptualized. The theoretical claims are corroborated by several thorough experimental analyses on the Apache and DataStax distributions of Cassandra.
Pages: 7
Related papers
50 in total
  • [1] Real-time stream processing for Big Data
    Wingerath, Wolfram
    Gessert, Felix
    Friedrich, Steffen
    Ritter, Norbert
    [J]. IT-INFORMATION TECHNOLOGY, 2016, 58 (04): 186 - 194
  • [2] SpeedStream: A Real-Time Stream Data Processing Platform in The Cloud
    Li Zhao
    Zhang Chuang
    Xu Ke-fu
    [J]. 2015 IEEE 34TH INTERNATIONAL PERFORMANCE COMPUTING AND COMMUNICATIONS CONFERENCE (IPCCC), 2015,
  • [3] Stream Processing For Near Real-Time Scientific Data Analysis
    Choi, Jong Youl
    Kurc, Tahsin
    Logan, Jeremy
    Wolf, Matthew
    Suchyta, Eric
    Kress, James
    Pugmire, David
    Podhorszki, Norbert
    Byun, Eun-Kyu
    Ainsworth, Mark
    Parashar, Manish
    Klasky, Scott
    [J]. 2016 NEW YORK SCIENTIFIC DATA SUMMIT (NYSDS), 2016,
  • [4] Real-time intelligent big data processing: technology, platform, and applications
    Tongya Zheng
    Gang Chen
    Xinyu Wang
    Chun Chen
    Xingen Wang
    Sihui Luo
    [J]. Science China Information Sciences, 2019, 62
  • [5] Real-time intelligent big data processing: technology, platform, and applications
    Tongya ZHENG
    Gang CHEN
    Xinyu WANG
    Chun CHEN
    Xingen WANG
    Sihui LUO
    [J]. Science China(Information Sciences), 2019, 62 (08) : 102 - 113
  • [6] Real-time intelligent big data processing: technology, platform, and applications
    Zheng, Tongya
    Chen, Gang
    Wang, Xinyu
    Chen, Chun
    Wang, Xingen
    Luo, Sihui
    [J]. SCIENCE CHINA-INFORMATION SCIENCES, 2019, 62 (08)
  • [7] Real-Time Big Data Stream Processing Using GPU with Spark Over Hadoop Ecosystem
    M. Mazhar Rathore
    Hojae Son
    Awais Ahmad
    Anand Paul
    Gwanggil Jeon
    [J]. International Journal of Parallel Programming, 2018, 46 : 630 - 646
  • [8] Real-Time Big Data Stream Processing Using GPU with Spark Over Hadoop Ecosystem
    Rathore, M. Mazhar
    Son, Hojae
    Ahmad, Awais
    Paul, Anand
    Jeon, Gwanggil
    [J]. INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING, 2018, 46 (03) : 630 - 646
  • [9] A survey on data stream, big data and real-time
    Gomes, Eliza H.A.
    Plentz, Patrícia D.M.
    De Rolt, Carlos R.
    Dantas, Mario A.R.
    [J]. International Journal of Networking and Virtual Organisations, 2019, 20 (02): 143 - 167
  • [10] Near real-time big-data processing for data driven applications
    Kampars, Janis
    Grabis, Janis
    [J]. 2017 3RD INTERNATIONAL CONFERENCE ON BIG DATA INNOVATIONS AND APPLICATIONS (INNOVATE-DATA), 2017: 35 - 42