Supports OpenSearch V2 through ES_* environment variables (#3765)
The Elasticsearch storage now supports OpenSearch as a backend as well.
The implementation relies on the `distribution` property that is returned as
part of the Elasticsearch / OpenSearch HTTP `GET /` API. Please note that
the storage version is now abstracted as `BaseVersion` with two
implementations: `ElasticsearchVersion` and `OpensearchVersion`.

Although OpenSearch is a fork of Elasticsearch as of 7.10.2, the
projects have since diverged considerably. Luckily, `Zipkin`
relies on features that have not been impacted (so far) and work the
same way across both projects.
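
As a rough illustration of the detection described above, here is a minimal sketch that classifies a `GET /` response body by its `distribution` property. It is not the code from this commit: the class `DistributionSketch`, its enum and its methods are hypothetical names, and the regular expressions stand in for a real JSON parser to keep the example self-contained. It assumes that OpenSearch reports `"distribution": "opensearch"` under `version`, and that a response without that property comes from Elasticsearch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch; not Zipkin's BaseVersion, ElasticsearchVersion or OpensearchVersion. */
final class DistributionSketch {
  enum Distribution { ELASTICSEARCH, OPENSEARCH }

  // Finds "distribution": "<value>" in the body returned by `GET /`.
  static final Pattern DISTRIBUTION =
      Pattern.compile("\"distribution\"\\s*:\\s*\"([^\"]+)\"");
  // Finds "number": "<major>.<minor>..." and captures the major and minor parts.
  static final Pattern NUMBER =
      Pattern.compile("\"number\"\\s*:\\s*\"(\\d+)\\.(\\d+)");

  /** Assumption: a missing "distribution" property means Elasticsearch. */
  static Distribution distribution(String rootResponse) {
    Matcher m = DISTRIBUTION.matcher(rootResponse);
    if (m.find() && "opensearch".equalsIgnoreCase(m.group(1))) {
      return Distribution.OPENSEARCH;
    }
    return Distribution.ELASTICSEARCH;
  }

  /** Returns the major version parsed from the "number" property. */
  static int majorVersion(String rootResponse) {
    Matcher m = NUMBER.matcher(rootResponse);
    if (!m.find()) throw new IllegalArgumentException("no version number in response");
    return Integer.parseInt(m.group(1));
  }

  public static void main(String[] args) {
    // Abbreviated, hand-written response bodies used only for illustration.
    String openSearch = "{\"version\":{\"distribution\":\"opensearch\",\"number\":\"2.13.0\"}}";
    String elasticsearch = "{\"version\":{\"number\":\"7.17.0\"}}";
    System.out.println(distribution(openSearch) + " " + majorVersion(openSearch));
    System.out.println(distribution(elasticsearch) + " " + majorVersion(elasticsearch));
  }
}
```

Keeping the distribution separate from the version number matters because OpenSearch 2.x and Elasticsearch 2.x would otherwise look alike when only `number` is compared.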

---------

Signed-off-by: Andriy Redko <[email protected]>
reta authored May 25, 2024
1 parent 00b1355 commit ed594e7
Showing 40 changed files with 1,041 additions and 297 deletions.
3 changes: 2 additions & 1 deletion README.md
@@ -149,7 +149,8 @@ Note: This store requires a [job to aggregate](https://github.com/openzipkin/zip

### Elasticsearch
The [Elasticsearch](zipkin-server#elasticsearch-storage) component uses
Elasticsearch 5+ features, but is tested against Elasticsearch 7-8.x.
Elasticsearch 5+ features, but is tested against Elasticsearch 7-8.x and
OpenSearch 2.x.

It stores spans as Zipkin v2 json so that integration with other tools is
straightforward. To help with scale, this uses a combination of custom
2 changes: 1 addition & 1 deletion benchmarks/pom.xml
@@ -11,7 +11,7 @@
<parent>
<groupId>io.zipkin</groupId>
<artifactId>zipkin-parent</artifactId>
<version>3.3.2-SNAPSHOT</version>
<version>3.4.0-SNAPSHOT</version>
</parent>

<artifactId>benchmarks</artifactId>
2 changes: 1 addition & 1 deletion pom.xml
@@ -10,7 +10,7 @@

<groupId>io.zipkin</groupId>
<artifactId>zipkin-parent</artifactId>
<version>3.3.2-SNAPSHOT</version>
<version>3.4.0-SNAPSHOT</version>
<packaging>pom</packaging>

<modules>
2 changes: 1 addition & 1 deletion zipkin-collector/activemq/pom.xml
@@ -11,7 +11,7 @@
<parent>
<groupId>io.zipkin.zipkin2</groupId>
<artifactId>zipkin-collector-parent</artifactId>
<version>3.3.2-SNAPSHOT</version>
<version>3.4.0-SNAPSHOT</version>
</parent>

<artifactId>zipkin-collector-activemq</artifactId>
2 changes: 1 addition & 1 deletion zipkin-collector/core/pom.xml
@@ -11,7 +11,7 @@
<parent>
<groupId>io.zipkin.zipkin2</groupId>
<artifactId>zipkin-collector-parent</artifactId>
<version>3.3.2-SNAPSHOT</version>
<version>3.4.0-SNAPSHOT</version>
</parent>

<artifactId>zipkin-collector</artifactId>
2 changes: 1 addition & 1 deletion zipkin-collector/kafka/pom.xml
@@ -11,7 +11,7 @@
<parent>
<groupId>io.zipkin.zipkin2</groupId>
<artifactId>zipkin-collector-parent</artifactId>
<version>3.3.2-SNAPSHOT</version>
<version>3.4.0-SNAPSHOT</version>
</parent>

<artifactId>zipkin-collector-kafka</artifactId>
2 changes: 1 addition & 1 deletion zipkin-collector/pom.xml
@@ -11,7 +11,7 @@
<parent>
<groupId>io.zipkin</groupId>
<artifactId>zipkin-parent</artifactId>
<version>3.3.2-SNAPSHOT</version>
<version>3.4.0-SNAPSHOT</version>
</parent>

<groupId>io.zipkin.zipkin2</groupId>
2 changes: 1 addition & 1 deletion zipkin-collector/rabbitmq/pom.xml
@@ -11,7 +11,7 @@
<parent>
<groupId>io.zipkin.zipkin2</groupId>
<artifactId>zipkin-collector-parent</artifactId>
<version>3.3.2-SNAPSHOT</version>
<version>3.4.0-SNAPSHOT</version>
</parent>

<artifactId>zipkin-collector-rabbitmq</artifactId>
2 changes: 1 addition & 1 deletion zipkin-collector/scribe/pom.xml
@@ -11,7 +11,7 @@
<parent>
<groupId>io.zipkin.zipkin2</groupId>
<artifactId>zipkin-collector-parent</artifactId>
<version>3.3.2-SNAPSHOT</version>
<version>3.4.0-SNAPSHOT</version>
</parent>

<artifactId>zipkin-collector-scribe</artifactId>
2 changes: 1 addition & 1 deletion zipkin-junit5/pom.xml
@@ -11,7 +11,7 @@
<parent>
<groupId>io.zipkin</groupId>
<artifactId>zipkin-parent</artifactId>
<version>3.3.2-SNAPSHOT</version>
<version>3.4.0-SNAPSHOT</version>
</parent>

<groupId>io.zipkin.zipkin2</groupId>
2 changes: 1 addition & 1 deletion zipkin-lens/pom.xml
@@ -11,7 +11,7 @@
<parent>
<groupId>io.zipkin</groupId>
<artifactId>zipkin-parent</artifactId>
<version>3.3.2-SNAPSHOT</version>
<version>3.4.0-SNAPSHOT</version>
</parent>

<artifactId>zipkin-lens</artifactId>
19 changes: 10 additions & 9 deletions zipkin-server/README.md
@@ -251,15 +251,16 @@ $ STORAGE_TYPE=cassandra3 java -jar zipkin.jar \

### Elasticsearch Storage
Zipkin's [Elasticsearch storage component](../zipkin-storage/elasticsearch)
supports versions 7-8.x and applies when `STORAGE_TYPE` is set to `elasticsearch`
supports versions Elasticsearch 7-8.x and OpenSearch 2.x and applies when
`STORAGE_TYPE` is set to `elasticsearch`

The following apply when `STORAGE_TYPE` is set to `elasticsearch`:

* `ES_HOSTS`: A comma separated list of elasticsearch base urls to connect to ex. http://host:9200.
Defaults to "http://localhost:9200".
* `ES_PIPELINE`: Indicates the ingest pipeline used before spans are indexed. No default.
* `ES_TIMEOUT`: Controls the connect, read and write socket timeouts (in milliseconds) for
Elasticsearch API. Defaults to 10000 (10 seconds)
Elasticsearch / OpenSearch API. Defaults to 10000 (10 seconds)
* `ES_INDEX`: The index prefix to use when generating daily index names. Defaults to zipkin.
* `ES_DATE_SEPARATOR`: The date separator to use when generating daily index names. Defaults to '-'.
* `ES_INDEX_SHARDS`: The number of shards to split the index into. Each shard and its replicas
@@ -278,9 +279,9 @@ The following apply when `STORAGE_TYPE` is set to `elasticsearch`:
you set this to false, you choose to troubleshoot your own data or
migration problems as opposed to relying on the community for this.
Defaults to true.
* `ES_USERNAME` and `ES_PASSWORD`: Elasticsearch basic authentication, which defaults to empty string.
* `ES_USERNAME` and `ES_PASSWORD`: Elasticsearch / OpenSearch basic authentication, which defaults to empty string.
Use when X-Pack security (formerly Shield) is in place.
* `ES_CREDENTIALS_FILE`: The location of a file containing Elasticsearch basic authentication
* `ES_CREDENTIALS_FILE`: The location of a file containing Elasticsearch / OpenSearch basic authentication
credentials, as properties. The username property is
`zipkin.storage.elasticsearch.username`, password `zipkin.storage.elasticsearch.password`.
This file is reloaded periodically, using `ES_CREDENTIALS_REFRESH_INTERVAL`
@@ -289,7 +290,7 @@ The following apply when `STORAGE_TYPE` is set to `elasticsearch`:
* `ES_CREDENTIALS_REFRESH_INTERVAL`: Credentials refresh interval in seconds, which defaults to
1 second. This is the maximum amount of time spans will drop due to stale
credentials. Any errors reading the credentials file occur in logs at this rate.
* `ES_HTTP_LOGGING`: When set, controls the volume of HTTP logging of the Elasticsearch API.
* `ES_HTTP_LOGGING`: When set, controls the volume of HTTP logging of the Elasticsearch / OpenSearch API.
Options are BASIC, HEADERS, BODY
* `ES_SSL_NO_VERIFY`: When true, disables the verification of server's key certificate chain.
This is not appropriate for production. Defaults to false.
@@ -303,13 +304,13 @@ To connect normally:
$ STORAGE_TYPE=elasticsearch ES_HOSTS=http://myhost:9200 java -jar zipkin.jar
```

To log Elasticsearch API requests:
To log Elasticsearch / OpenSearch API requests:
```bash
$ STORAGE_TYPE=elasticsearch ES_HTTP_LOGGING=BASIC java -jar zipkin.jar
```

#### Using a custom Key Store or Trust Store (SSL)
If your Elasticsearch endpoint uses a customized SSL configuration (for example, self-signed certificates),
If your Elasticsearch / OpenSearch endpoint uses a customized SSL configuration (for example, self-signed certificates),
you can use any of the following [subset of JSSE properties](https://docs.oracle.com/javase/8/docs/technotes/guides/security/jsse/JSSERefGuide.html#T6) to connect.

* javax.net.ssl.keyStore
@@ -326,13 +327,13 @@ $ STORAGE_TYPE=elasticsearch java $JAVA_OPTS -jar zipkin.jar
```

Behind the scenes, these map to properties prefixed `zipkin.storage.elasticsearch.ssl.`, which affect
the Armeria client used to connect to Elasticsearch.
the Armeria client used to connect to Elasticsearch / OpenSearch.

The above properties allow the most common SSL setup to work out of box. If you need more
customization, please make a comment in [this issue](https://github.com/openzipkin/zipkin/issues/2774).

#### Automatic Index Creation
Zipkin will automatically create new indices as needed. Elasticsearch by default [allows](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-index_.html#index-creation) automatic creation of said indices, though your local install may have been configured to disallow it. You can verify this in the cluster settings: `action.auto_create_index: false`.
Zipkin will automatically create new indices as needed. Elasticsearch / OpenSearch by default [allows](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-index_.html#index-creation) automatic creation of said indices, though your local install may have been configured to disallow it. You can verify this in the cluster settings: `action.auto_create_index: false`.

### Legacy (v1) storage components
The following components are no longer encouraged, but exist to help aid
2 changes: 1 addition & 1 deletion zipkin-server/pom.xml
@@ -11,7 +11,7 @@
<parent>
<groupId>io.zipkin</groupId>
<artifactId>zipkin-parent</artifactId>
<version>3.3.2-SNAPSHOT</version>
<version>3.4.0-SNAPSHOT</version>
</parent>

<artifactId>zipkin-server</artifactId>
2 changes: 1 addition & 1 deletion zipkin-storage/cassandra/README.md
@@ -21,7 +21,7 @@ If you want to see requests and latency, set the logging category
"com.datastax.oss.driver.internal.core.tracker.RequestLogger" to DEBUG.
TRACE includes query values.

See [Request Logger](https://docs.datastax.com/en/developer/java-driver/4.9/manual/core/request_tracker/#request-logger) for more details.
See [Request Logger](https://github.com/apache/cassandra-java-driver/tree/4.x/manual/core/request_tracker#request-logger) for more details.

## Testing
This module conditionally runs integration tests against a local Cassandra instance.
2 changes: 1 addition & 1 deletion zipkin-storage/cassandra/pom.xml
@@ -11,7 +11,7 @@
<parent>
<groupId>io.zipkin.zipkin2</groupId>
<artifactId>zipkin-storage-parent</artifactId>
<version>3.3.2-SNAPSHOT</version>
<version>3.4.0-SNAPSHOT</version>
</parent>

<artifactId>zipkin-storage-cassandra</artifactId>
34 changes: 24 additions & 10 deletions zipkin-storage/elasticsearch/README.md
@@ -3,7 +3,7 @@
This is a plugin to the Elasticsearch storage component, which uses
HTTP by way of [Armeria](https://github.com/line/armeria) and
[Jackson](https://github.com/FasterXML/jackson). This uses Elasticsearch 5+
features, but is tested against Elasticsearch 7-8.x.
features, but is tested against Elasticsearch 7-8.x and OpenSearch 2.x.

## Multiple hosts
Most users will supply a DNS name that's mapped to multiple A or AAAA
@@ -26,9 +26,9 @@ with one difference described below. We add a "timestamp_millis" field
to aid in integration with other tools.

### Timestamps
Zipkin's timestamps are in epoch microseconds, which is not a supported date type in Elasticsearch.
Zipkin's timestamps are in epoch microseconds, which is not a supported date type in Elasticsearch / OpenSearch.
In consideration of tools like Kibana, this component adds "timestamp_millis" when writing
spans. This is mapped to the Elasticsearch date type, so can be used in any date-based queries.
spans. This is mapped to the Elasticsearch / OpenSearch date type, so can be used in any date-based queries.
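
To make the conversion concrete, here is a minimal sketch of deriving a "timestamp_millis" value from an epoch-microsecond timestamp. The class and method names are hypothetical and this is not taken from the storage component; it only assumes the value is the timestamp truncated to whole milliseconds.

```java
import java.time.Instant;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of the microseconds-to-milliseconds conversion. */
final class TimestampMillisSketch {
  /** Converts epoch microseconds to epoch milliseconds, dropping the sub-millisecond part. */
  static long toTimestampMillis(long timestampMicros) {
    return TimeUnit.MICROSECONDS.toMillis(timestampMicros);
  }

  public static void main(String[] args) {
    long timestampMicros = 1502409600123456L; // example span timestamp in epoch microseconds
    long timestampMillis = toTimestampMillis(timestampMicros);
    System.out.println(timestampMillis);
    // Epoch milliseconds map directly onto a date, which is what date-based queries need.
    System.out.println(Instant.ofEpochMilli(timestampMillis));
  }
}
```

The original microsecond timestamp is kept alongside the derived field, so no precision is lost by adding it.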

## Indexes
Spans are stored into daily indices, for example spans with a timestamp
@@ -68,8 +68,9 @@ $ curl -s 'localhost:9200/zipkin*span-2017-08-11/_search?q=_q:error=500'

The reason for special casing is around dotted name constraints. Tags
are stored as a dictionary. Some keys include inconsistent number of dots
(ex "error" and "error.message"). Elasticsearch cannot index these as it
interprets them as fields, and dots in fields imply an object path.
(ex "error" and "error.message"). Elasticsearch / OpenSearch cannot index
these as it interprets them as fields, and dots in fields imply an object
path.
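
The conflict can be illustrated with a small sketch that expands dotted keys into nested objects, the way a dot in a field name implies an object path. This is a hypothetical model of the constraint, not code from Zipkin, Elasticsearch or OpenSearch; `DottedTagSketch` and `expand` are made-up names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical model of why "error" and "error.message" cannot both be indexed as fields. */
final class DottedTagSketch {
  /** Expands dotted keys into nested maps and fails when a key is both a value and an object. */
  @SuppressWarnings("unchecked")
  static Map<String, Object> expand(Map<String, String> tags) {
    Map<String, Object> root = new LinkedHashMap<>();
    for (Map.Entry<String, String> tag : tags.entrySet()) {
      String[] path = tag.getKey().split("\\.");
      Map<String, Object> node = root;
      // Walk or create the intermediate objects implied by each dot.
      for (int i = 0; i < path.length - 1; i++) {
        Object child = node.get(path[i]);
        if (child == null) {
          child = new LinkedHashMap<String, Object>();
          node.put(path[i], child);
        } else if (!(child instanceof Map)) {
          throw new IllegalStateException("'" + path[i] + "' is a value, so it cannot also be an object");
        }
        node = (Map<String, Object>) child;
      }
      String leaf = path[path.length - 1];
      if (node.get(leaf) instanceof Map) {
        throw new IllegalStateException("'" + leaf + "' is an object, so it cannot also be a value");
      }
      node.put(leaf, tag.getValue());
    }
    return root;
  }

  public static void main(String[] args) {
    Map<String, String> tags = new LinkedHashMap<>();
    tags.put("error", "500");
    tags.put("error.message", "boom");
    try {
      expand(tags);
    } catch (IllegalStateException e) {
      System.out.println("conflict: " + e.getMessage());
    }
  }
}
```

Keys with a consistent number of dots, such as "http.method" and "http.path", expand cleanly in this model; only the mixed case fails.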

### Trace Identifiers
Unless `ElasticsearchStorage.Builder.strictTraceId` is set to false,
@@ -122,9 +123,9 @@ be written, nor analyzed.

### Composable Index Template
Elasticsearch 7.8 introduces [composable templates](https://www.elastic.co/guide/en/elasticsearch/reference/current/index-templates.html) and
deprecates [legacy/v1 templates](https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-templates-v1.html) used in version prior.
Merging of multiple templates with matching index patterns is no longer allowed, and Elasticsearch will return error on PUT of the second template
with matching index pattern and priority. Templates with matching index patterns are required to have different priorities, and Elasticsearch will
deprecates [legacy/v1 templates](https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-templates-v1.html) used in version prior (fully supported by OpenSearch).
Merging of multiple templates with matching index patterns is no longer allowed, and Elasticsearch / OpenSearch will return error on PUT of the second template
with matching index pattern and priority. Templates with matching index patterns are required to have different priorities, and Elasticsearch / OpenSearch will
only use the template with the highest priority. This also means that [secondary template](https://gist.github.com/codefromthecrypt/1af1259102e7a2da1b3c9103565165d7)
is no longer achievable.

@@ -133,8 +134,16 @@ providing `ES_TEMPLATE_PRIORITY` environment variable.

## Customizing the ingest pipeline

### Elasticsearch

You can setup an [ingest pipeline](https://www.elastic.co/guide/en/elasticsearch/reference/master/pipeline.html) to perform custom processing.

### OpenSearch

You can setup an [ingest pipeline](https://opensearch.org/docs/latest/ingest-pipelines/) to perform custom processing.

### Setting up ingest pipeline

Here's an example, which you'd setup prior to configuring Zipkin to use
it via `ElasticsearchStorage.Builder.pipeline`

@@ -162,7 +171,12 @@ to reduce load. This is implemented by
[DelayLimiter](../../zipkin/src/main/java/zipkin2/internal/DelayLimiter.java)

## Data retention
Zipkin-server does not handle retention management of the trace data. Use the tools recommended by ElasticSearch to manage data retention, or your cluster
will grow indefinitely!
Zipkin-server does not handle retention management of the trace data. Use the tools recommended by Elasticsearch or OpenSearch (listed below) to manage data retention, or your cluster will grow indefinitely!

### Elasticsearch
* [Curator](https://www.elastic.co/guide/en/elasticsearch/client/curator/current/index.html)
* [Index Lifecycle Management](https://www.elastic.co/guide/en/elasticsearch/reference/7.3/index-lifecycle-management.html)

### OpenSearch
* [Curator](https://github.com/flant/curator-opensearch)
* [Index Lifecycle Management](https://opensearch.org/docs/latest/im-plugin/ism/index/)
4 changes: 2 additions & 2 deletions zipkin-storage/elasticsearch/pom.xml
@@ -11,11 +11,11 @@
<parent>
<groupId>io.zipkin.zipkin2</groupId>
<artifactId>zipkin-storage-parent</artifactId>
<version>3.3.2-SNAPSHOT</version>
<version>3.4.0-SNAPSHOT</version>
</parent>

<artifactId>zipkin-storage-elasticsearch</artifactId>
<name>Storage: Elasticsearch (V2)</name>
<name>Storage: Elasticsearch / OpenSearch (V2)</name>

<properties>
<main.basedir>${project.basedir}/../..</main.basedir>