Performance Analysis of Java Virtual Machine for Machine Learning Workloads using Apache Spark

Authors
Hema, N. [1]
Srinivasa, K. G. [1]
Chidambaram, Saravanan [2]
Saraswat, Sandeep [2]
Saraswati, Sujoy [2]
Ramachandra, Ranganath [2]
Huttanagoudar, Jayashree B. [3]
Affiliations
[1] MSRIT, Dept CSE, Bangalore 54, Karnataka, India
[2] Hewlett Packard Enterprise, Bangalore 560048, Karnataka, India
[3] RVCE, Dept CSE, Bangalore 59, Karnataka, India
Keywords
Big data; Machine Learning (ML); Apache Spark; Hadoop; Java Virtual Machine (JVM)
DOI
10.1145/2980258.2982117
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Data is growing very rapidly today, and processing and analyzing it to extract useful information is the main task. There are many big data processing tools and frameworks, such as Hadoop, Hive, and Cassandra. Spark is one of the fastest big data processing frameworks for cluster computation. The basic idea is to analyze the performance of the Java Virtual Machine (JVM) [1] by characterizing it with the SparkBench benchmark on Apache Spark (TM) [2]. The JVM is the core execution platform for Spark applications. Running a Spark application affects the behavior of the JVM, so that behavior needs to be monitored in order to analyze JVM performance. Here we consider three machine learning workloads: K-Means, Matrix Factorization, and Logistic Regression. The main goal is to analyze these workloads end to end across the cluster with respect to garbage collection, memory (such as heap usage), and CPU process time. The JVM is characterized on a Spark cluster setup, with HDFS used as storage on a distributed Hadoop cluster setup.
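The abstract does not say which tooling was used to collect these metrics. As a minimal sketch only, the three parameters it names (garbage collection, heap usage, CPU process time) can be read from inside any JVM through the standard java.lang.management API. The class name JvmMetricsSnapshot and its fields are hypothetical, and the CPU reading assumes a HotSpot/OpenJDK runtime that exposes com.sun.management.OperatingSystemMXBean.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.lang.management.OperatingSystemMXBean;

// Hypothetical helper, not taken from the paper: one point-in-time reading
// of the JVM parameters listed in the abstract.
final class JvmMetricsSnapshot {
    final long heapUsedBytes;    // heap currently in use
    final long gcCount;          // collections summed over all collectors
    final long gcTimeMillis;     // approximate accumulated GC time
    final long processCpuNanos;  // process CPU time, or -1 if unavailable

    JvmMetricsSnapshot(long heapUsedBytes, long gcCount,
                       long gcTimeMillis, long processCpuNanos) {
        this.heapUsedBytes = heapUsedBytes;
        this.gcCount = gcCount;
        this.gcTimeMillis = gcTimeMillis;
        this.processCpuNanos = processCpuNanos;
    }

    static JvmMetricsSnapshot capture() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();

        long count = 0;
        long time = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both getters return -1 when the value is undefined for a collector.
            count += Math.max(0, gc.getCollectionCount());
            time += Math.max(0, gc.getCollectionTime());
        }

        // Assumption: the platform bean is the HotSpot/OpenJDK extension type.
        long cpu = -1;
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.OperatingSystemMXBean) {
            cpu = ((com.sun.management.OperatingSystemMXBean) os).getProcessCpuTime();
        }

        return new JvmMetricsSnapshot(heap.getUsed(), count, time, cpu);
    }

    public static void main(String[] args) {
        JvmMetricsSnapshot s = capture();
        System.out.println("heap used (bytes): " + s.heapUsedBytes);
        System.out.println("GC count:          " + s.gcCount);
        System.out.println("GC time (ms):      " + s.gcTimeMillis);
        System.out.println("process CPU (ns):  " + s.processCpuNanos);
    }
}
```

In a Spark deployment, a reading like this would have to be taken separately in each driver and executor JVM and then aggregated to obtain the cluster-wide, end-to-end view the abstract describes.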
Pages: 7