From a300f3b988b8cf1d47c44ad7d9315392af7656bf Mon Sep 17 00:00:00 2001
From: Luis Ponce
Date: Wed, 5 Jun 2019 12:54:37 -0500
Subject: [PATCH] Update build hibench readme

* docs/build-hibench.md:
  * Update 2.4 version to specify Spark Version.
  * Add Specify Hadoop version documentation.
  * Add Build using JDK 11 documentation.
* README.md:
  * Update Supported Hadoop/Spark releases to hadoop 3.2 and spark 2.4

Signed-off-by: Luis Ponce
---
 README.md             | 10 +++++-----
 docs/build-hibench.md | 17 ++++++++++++++++-
 2 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/README.md b/README.md
index e3c433eb0..7dc09a259 100644
--- a/README.md
+++ b/README.md
@@ -135,12 +135,12 @@ There are totally 19 workloads in HiBench. The workloads are divided into 6 cate
 4. Fixwindow (fixwindow)
 
   The workloads performs a window based aggregation. It tests the performance of window operation in the streaming frameworks.
 
-
-
-### Supported Hadoop/Spark/Flink/Storm/Gearpump releases: ###
-  - Hadoop: Apache Hadoop 2.x, CDH5, HDP
-  - Spark: Spark 1.6.x, Spark 2.0.x, Spark 2.1.x, Spark 2.2.x
+### Supported Hadoop/Spark releases: ###
+  - Hadoop: Apache Hadoop 2.x, 3.2, CDH5, HDP
+  - Spark: Spark 1.6.x, Spark 2.0.x, Spark 2.1.x, Spark 2.2.x, Spark 2.4.x
+
+### Supported Flink/Storm/Gearpump releases: ###
   - Flink: 1.0.3
   - Storm: 1.0.1
   - Gearpump: 0.8.1
diff --git a/docs/build-hibench.md b/docs/build-hibench.md
index 4709f90cf..7c44e99ba 100644
--- a/docs/build-hibench.md
+++ b/docs/build-hibench.md
@@ -28,7 +28,7 @@ Because some Maven plugins cannot support Scala version perfectly, there are som
 ### Specify Spark Version ###
 
-To specify the spark version, use -Dspark=xxx(1.6, 2.0, 2.1 or 2.2). By default, it builds for spark 2.0
+To specify the spark version, use -Dspark=xxx(1.6, 2.0, 2.1, 2.2 or 2.4). By default, it builds for spark 2.0
 
     mvn -Psparkbench -Dspark=1.6 -Dscala=2.11 clean package
 
 tips:
@@ -37,6 +37,11 @@ default .
 For example , if we want use spark2.0 and scala2.11 to build hibench , we just use the command `mvn -Dspark=2.0 clean package` , but for spark2.0 and scala2.10 , we need use the command `mvn -Dspark=2.0 -Dscala=2.10 clean package` . Similarly , the spark1.6 is associated with the scala2.10 by default.
 
+### Specify Hadoop Version ###
+To specify the hadoop version, use -Dhadoop=xxx(3.2). By default, it builds for hadoop 2.4
+
+    mvn -Psparkbench -Dhadoop=3.2 -Dspark=2.4 clean package
+
 ### Build a single module ###
 If you are only interested in a single workload in HiBench. You can build a single module. For example, the below command only builds the SQL workloads for Spark.
 
@@ -48,3 +53,13 @@ Supported modules includes: micro, ml(machine learning), sql, websearch, graph,
 For Spark 2.0 and Spark 2.1, we add the benchmark support for Structured Streaming. This is a new module which cannot be compiled in Spark 1.6. And it won't get compiled by default even if you specify the spark version as 2.0 or 2.1. You must explicitly specify it like this:
 
     mvn -Psparkbench -Dmodules -PstructuredStreaming clean package
+
+### Build using JDK 11 ###
+**Java 11 builds are supported only with Spark 2.4 _(compiled with Scala 2.12)_ and Hadoop 3.2**
+
+If you build with Java 11, the streaming benchmarks are not compiled; specify the scala, spark and hadoop versions as below
+
+    mvn clean package -Psparkbench -Phadoopbench -Dhadoop=3.2 -Dspark=2.4 -Dscala=2.12 -Dexclude-streaming
+
+Supported frameworks: hadoopbench and sparkbench only (flinkbench, stormbench and gearpumpbench are not supported)
+Supported modules include: micro, ml(machine learning), websearch and graph (streaming, structuredStreaming and sql are not supported)
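The JDK 11 build that this patch documents can be exercised as a short shell sketch. The `JAVA_HOME` path below is illustrative only (it is not part of the patch); adjust it to your own JDK 11 install:

```shell
# Point the build at a JDK 11 toolchain; this path is only an example.
export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64
export PATH="$JAVA_HOME/bin:$PATH"

# Confirm the active JDK really is 11 before starting a long build.
java -version

# Per docs/build-hibench.md, only hadoopbench and sparkbench build under
# Java 11, and the streaming modules must be excluded.
mvn clean package -Psparkbench -Phadoopbench \
    -Dhadoop=3.2 -Dspark=2.4 -Dscala=2.12 -Dexclude-streaming
```

These are the same flags the patch adds to the docs; the only additions are the environment setup and the version check.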