This repository has been archived by the owner on May 27, 2020. It is now read-only.

Getting started

Alberto Rodriguez edited this page May 11, 2017 · 68 revisions


The aim of this Getting Started guide is to help Khermes users start playing around with the tool by setting up an environment and walking through an example data-generation use case. The use case generates data for a pretend online music platform: we will simulate users playing songs by creating a template that includes song title, artist, genre, album, duration, rating, city and so on.

You have two options to get Khermes up & running:

  1. Docker Compose: we have "cooked" a docker-compose file for you that will download and start all the components needed for a full demo.
  2. Standalone: if you do not want to use Docker, we have created several scripts that will help you deploy your Khermes cluster locally (you will need ZooKeeper and Kafka clusters running).

1. Docker-compose

The easiest way to start playing around with Khermes is to take advantage of the docker-compose file we have created for you. As a requirement you will need to install Docker; see the official Docker documentation for further details: Install Docker

Once you have cloned the project and installed docker, execute the following command in the project's root folder:

docker-compose -f docker/landoop-demo-compose.yml up
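Downloading the images can take a while on the first run. If you want to script the wait instead of refreshing your browser, here is a minimal sketch of a port-polling helper; ports 3030 (Landoop UI), 9200 (Elasticsearch) and 5601 (Kibana) are the ones used later in this guide:

```shell
# Poll a TCP port until it accepts connections, or give up after N retries.
# Uses bash's /dev/tcp redirection, so run this with bash (not plain sh).
wait_for_port() {
  local host=$1 port=$2 retries=${3:-60}
  until (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; do
    retries=$((retries - 1))
    if [ "$retries" -le 0 ]; then
      echo "timed out waiting for $host:$port" >&2
      return 1
    fi
    sleep 1
  done
  return 0
}

# Example: block until the Landoop UI is reachable.
# wait_for_port localhost 3030 && echo "Landoop UI is reachable"
```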

Once the Docker images have been downloaded (it might take a while), you will have the following environment up & running on your box:

getting-started-environment

1.1. Setting up Khermes

To find out how to set up khermes please read through the Set up khermes section.

The remaining sections of the docker-compose tutorial will guide you through configuring a Kafka connector that redirects the data produced by Khermes to an Elasticsearch cluster and shows the information using Kibana. If you are not interested in this set-up, you can stop reading here.

1.2. Setting up an Elasticsearch connector

First we need to create the Elasticsearch index and its mapping. Open a terminal and run the following two commands:

curl -XPUT 'localhost:9200/khermes?pretty' -H 'Content-Type: application/json' -d'
{
    "settings" : {
        "index" : {
            "number_of_shards" : 1,
            "number_of_replicas" : 1
        }
    }
}'
curl -XPOST 'localhost:9200/khermes/khermes/_mapping' -H 'Content-Type: application/json' -d'
{
    "khermes" : {
        "properties" : {
            "song" : { "type" : "string" },
            "artist" : { "type" : "string" },
            "album" : { "type" : "string" },
            "genre" : { "type" : "string" },
            "playduration" : { "type" : "integer" },
            "rating" : { "type" : "integer" },
            "user" : { "type" : "string" },
            "usertype" : { "type" : "string" },
            "city" : { "type" : "string" },
            "location" : { "type" : "geo_point" },
            "starttime" : { "type" : "string" }
        }
    }
}'
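Typos in inline JSON are easy to make. One way to guard against them is to keep the mapping body in a file and lint it before sending; a sketch using python3's built-in `json.tool` (the file path is arbitrary):

```shell
# Keep the mapping body in a file so it can be linted and reused.
cat > /tmp/khermes-mapping.json <<'EOF'
{
    "khermes" : {
        "properties" : {
            "song" : { "type" : "string" },
            "artist" : { "type" : "string" },
            "album" : { "type" : "string" },
            "genre" : { "type" : "string" },
            "playduration" : { "type" : "integer" },
            "rating" : { "type" : "integer" },
            "user" : { "type" : "string" },
            "usertype" : { "type" : "string" },
            "city" : { "type" : "string" },
            "location" : { "type" : "geo_point" },
            "starttime" : { "type" : "string" }
        }
    }
}
EOF

# Lint the JSON first; only send it to Elasticsearch once it parses cleanly.
python3 -m json.tool /tmp/khermes-mapping.json > /dev/null && echo "mapping JSON is valid"
# curl -XPOST 'localhost:9200/khermes/khermes/_mapping' \
#      -H 'Content-Type: application/json' -d @/tmp/khermes-mapping.json
```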

Now we should create the Elasticsearch connector. We will use the Landoop interface to do so; go to the following URL in your browser:

http://localhost:3030/

You should see the Landoop Kafka development environment UI. Go to the Kafka Connect section -> click on New -> select Elasticsearch in the list of available connectors and copy the following configuration into the text area:

connector.class=com.datamountaineer.streamreactor.connect.elastic.ElasticSinkConnector
type.name=khermes
topics=khermes
tasks.max=1
name=elastic-khermes
connection.url=http://localhost:9200
connect.elastic.sink.kcql=INSERT INTO khermes SELECT song, artist, album, genre, playduration, rating, user, usertype, city, location, starttime FROM khermes

Click on the Validate and create button.

After that, the data produced by Khermes should be written to the khermes Elasticsearch index.
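If you prefer to keep the connector configuration under version control rather than pasting it into the UI, you can store it as a properties file and sanity-check it before use. A sketch (the file path is only an example):

```shell
# Save the connector configuration shown above to a file.
cat > /tmp/elastic-khermes.properties <<'EOF'
connector.class=com.datamountaineer.streamreactor.connect.elastic.ElasticSinkConnector
type.name=khermes
topics=khermes
tasks.max=1
name=elastic-khermes
connection.url=http://localhost:9200
connect.elastic.sink.kcql=INSERT INTO khermes SELECT song, artist, album, genre, playduration, rating, user, usertype, city, location, starttime FROM khermes
EOF

# Sanity check: the keys the connector cannot work without are all present.
for key in connector.class topics name connection.url; do
  grep -q "^$key=" /tmp/elastic-khermes.properties || echo "missing key: $key"
done
```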

1.3. Showing the data using Kibana

Go to the following url using your favourite browser:

http://localhost:5601

You should see the Kibana interface. Go to the settings section of Kibana and configure your index (khermes).

Let's visualize the data on a Kibana tile map. To do so, go to the Visualize section -> Tile map -> From a new search -> select Geo coordinates -> select the location field and click on the Play button. If everything is working fine you should see a map like the following:

getting-started-kibana

2. Standalone

⚠️ To run Khermes in standalone mode you will need ZooKeeper and Kafka clusters running. To complete this guide we have used Apache ZooKeeper 3.4.9 and Apache Kafka 0.10.1.1.
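If you do not already have them running, ZooKeeper and Kafka can be started with the scripts shipped inside the Kafka distribution. A sketch, assuming a local Kafka 0.10.1.1 install whose location you point `KAFKA_HOME` at (the default path below is only an example):

```shell
# Location of the local Kafka distribution; adjust to match your install.
KAFKA_HOME=${KAFKA_HOME:-"$HOME/kafka_2.11-0.10.1.1"}

start_local_stack() {
  # Both start scripts ship with the Kafka distribution and accept -daemon
  # to run the process in the background.
  "$KAFKA_HOME/bin/zookeeper-server-start.sh" -daemon "$KAFKA_HOME/config/zookeeper.properties"
  "$KAFKA_HOME/bin/kafka-server-start.sh" -daemon "$KAFKA_HOME/config/server.properties"
}

# start_local_stack
```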

If you want to start a Khermes cluster without using Docker, you first need to generate the Khermes artifact. To do so, go to the root folder of the project and execute the following command:

mvn clean package

After executing the command, move to the scripts directory. There you will find several shell scripts that will help you start a Khermes cluster. Let's start by running a cluster seed with the following command:

./seed

To check that the seed is running fine, go to the following URL in your favourite browser:

http://localhost:8080/console

And type the ls command in the web console. You should see output like the following:

khermes-console-seed

Cool! You have a Khermes cluster running locally. It's time to add nodes to the cluster. Open a new terminal, go to the scripts directory of the Khermes project and execute the following command:

./nodes

If you go back to the web console and execute ls again, you should see that four more nodes have joined the cluster:

Nice! You now have a Khermes cluster up & running with one seed and four nodes.

To start producing random data, please continue reading the Set-up Khermes section.