The missing SparklyR EDA toolkit (for use in R). Quick, efficient, and easy to use

GabeChurch/sparkedatools

sparkedatools

The missing SparklyR EDA toolkit (for use in R). Quick, efficient, and easy to use.

Wrap and graph the outputs from the SparkEDA package for Spark.

You will need a few things to get started.

  1. Grab the SparkEDA Jar from my website

  2. Download this repository to your R home or another location. Note: You will need devtools installed in R, as this package is not yet on CRAN.

install.packages("devtools")
library("devtools")

Then install the package from GitHub:

devtools::install_github('GabeChurch/sparkedatools')
  3. Edit your SparklyR Configuration (in R)

You need to add the SparkEDA jar for the package to work in R.

library(sparklyr)
conf = spark_config()

This is the important line

#This is the configuration option
conf$'sparklyr.jars.default'= "/system/path/to/sparkeda_2.11-2.07.jar"
sc = spark_connect(master = "yarn-client", config = conf, version = '2.3.2')

The ORDER IS IMPORTANT. Set the jar option AFTER you have instantiated your spark_config in R and BEFORE you call spark_connect.

  4. Enjoy being able to visualize and understand your giant data-sets like never before!
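
A wrong jar path is an easy mistake to make in the configuration step above, so it can help to check the path before connecting. The following is a minimal sketch in base R. `check_jar_path` is a hypothetical helper written for illustration and is not part of sparkedatools or sparklyr.

```r
# Hypothetical helper (not part of sparkedatools): validate the jar path
# before it is placed in the sparklyr config.
check_jar_path <- function(jar_path) {
  # Reject anything that does not look like a jar file.
  if (!grepl("\\.jar$", jar_path)) {
    stop("Expected a path ending in .jar, got: ", jar_path)
  }
  # Fail early, before spark_connect() is ever called.
  if (!file.exists(jar_path)) {
    stop("SparkEDA jar not found at: ", jar_path)
  }
  normalizePath(jar_path)
}

# Usage, in the same order as the configuration step
# (requires sparklyr and a reachable Spark cluster):
# library(sparklyr)
# conf = spark_config()
# conf$'sparklyr.jars.default' = check_jar_path("/system/path/to/sparkeda_2.11-2.07.jar")
# sc = spark_connect(master = "yarn-client", config = conf, version = '2.3.2')
```

The check runs after `spark_config()` and before `spark_connect()`, so it preserves the required ordering and stops with a readable error instead of producing a session without the jar.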
