DatasetSuiteBase
DatasetSuiteBase enables you to check whether two Datasets are equal. It also provides easy access to the SparkContext and sqlContext. Because DatasetSuiteBase extends DataFrameSuiteBase, you can also use it to check DataFrame equality. The SparkContext and SQLContext are initialized before any test case runs, so you can access them inside any test case.
For Java users, the same functionality is provided by JavaDatasetSuiteBase.
You can assert that two Datasets are equal with the assertDatasetEquals method. To customize the comparison, override the equals method of the Dataset's element class.
Example:
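import com.holdenkarau.spark.testing.DatasetSuiteBase
// FunSuite is the ScalaTest 3.0.x name; later versions use org.scalatest.funsuite.AnyFunSuite
import org.scalatest.FunSuite

class test extends FunSuite with DatasetSuiteBase {
  test("simple test") {
    // Import the implicit encoders so that .toDS is available
    val sqlCtx = sqlContext
    import sqlCtx.implicits._

    val input1 = sc.parallelize(List(1, 2, 3)).toDS
    assertDatasetEquals(input1, input1) // equal

    val input2 = sc.parallelize(List(4, 5, 6)).toDS
    intercept[org.scalatest.exceptions.TestFailedException] {
      assertDatasetEquals(input1, input2) // not equal
    }
  }
}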
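To illustrate customizing the comparison through equals, the sketch below defines a hypothetical element class, Measurement, whose equals ignores a timestamp field. The class and its fields are assumptions made for this example and are not part of the library. The block uses plain Scala so the effect of the override is visible without a Spark session.

```scala
// Hypothetical element type: two measurements count as equal when their
// id and value match, regardless of when they were recorded.
case class Measurement(id: Int, value: Double, recordedAt: Long) {
  override def equals(other: Any): Boolean = other match {
    case m: Measurement => id == m.id && value == m.value
    case _              => false
  }

  // Keep hashCode consistent with the custom equals.
  override def hashCode(): Int = (id, value).hashCode()
}

object MeasurementEqualityDemo {
  val a = Measurement(1, 2.5, recordedAt = 1000L)
  val b = Measurement(1, 2.5, recordedAt = 2000L)
  val c = Measurement(2, 2.5, recordedAt = 1000L)

  def main(args: Array[String]): Unit = {
    println(a == b) // true: recordedAt is ignored
    println(a == c) // false: the ids differ
  }
}
```

Assuming the comparison delegates to equals as described above, a Dataset[Measurement] containing a would be expected to compare equal to one containing b.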
When Datasets contain doubles, you can compare them within an acceptable tolerance, for example treating 5 and 4.999 as equal. You can assert that two Datasets are approximately equal with the assertDatasetApproximateEquals method.
Example:
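import com.holdenkarau.spark.testing.DatasetSuiteBase
// FunSuite is the ScalaTest 3.0.x name; later versions use org.scalatest.funsuite.AnyFunSuite
import org.scalatest.FunSuite

class test extends FunSuite with DatasetSuiteBase {
  test("simple test") {
    // Import the implicit encoders so that .toDS is available
    val sqlCtx = sqlContext
    import sqlCtx.implicits._

    // The double values in input1 and input2 differ by roughly 0.1
    val input1 = sc.parallelize(List[(Int, Double)]((1, 1.1), (2, 2.2), (3, 3.3))).toDS
    val input2 = sc.parallelize(List[(Int, Double)]((1, 1.2), (2, 2.3), (3, 3.4))).toDS
    assertDatasetApproximateEquals(input1, input2, 0.11) // equal

    intercept[org.scalatest.exceptions.TestFailedException] {
      assertDatasetApproximateEquals(input1, input2, 0.05) // not equal
    }
  }
}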
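The idea behind the tolerance argument can be sketched in plain Scala. This is only an illustration of tolerance-based comparison, not the library's implementation. The helper names approxEqual and rowsApproxEqual are hypothetical, and treating a difference exactly equal to the tolerance as a match is an assumption of this sketch.

```scala
object ToleranceDemo {
  // Two doubles are treated as equal when they differ by at most `tol`
  // (inclusive bound assumed for this sketch).
  def approxEqual(a: Double, b: Double, tol: Double): Boolean =
    math.abs(a - b) <= tol

  // Compare (Int, Double) rows pairwise: ints exactly, doubles within `tol`.
  def rowsApproxEqual(
      xs: Seq[(Int, Double)],
      ys: Seq[(Int, Double)],
      tol: Double): Boolean =
    xs.length == ys.length && xs.zip(ys).forall {
      case ((i1, d1), (i2, d2)) => i1 == i2 && approxEqual(d1, d2, tol)
    }

  def main(args: Array[String]): Unit = {
    val input1 = Seq((1, 1.1), (2, 2.2), (3, 3.3))
    val input2 = Seq((1, 1.2), (2, 2.3), (3, 3.4))
    println(rowsApproxEqual(input1, input2, 0.11)) // true
    println(rowsApproxEqual(input1, input2, 0.05)) // false
  }
}
```

Floating-point subtraction of values such as 3.4 and 3.3 does not yield exactly 0.1, which is one reason to leave some headroom in the tolerance rather than setting it to the nominal difference.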