Snowbridge v3 #1068

Open · wants to merge 15 commits into base: main
There are several different failures that Snowbridge may hit.

### Target failure

This is where a request to the destination technology fails or is rejected - for example an HTTP 400 response is received.

Note that this means that for failures on the receiving end (e.g. if an endpoint is unavailable), Snowbridge will continue to attempt to process the data until the issue is fixed.
Retry behaviour for target failures is determined by the retry configuration. You can find details of this in the [configuration section](/docs/destinations/forwarding-events/snowbridge/configuration/retries/index.md).

As of Snowbridge 2.4.2, the kinesis target does not treat kinesis write throughput exceptions as this type of failure. Rather, it has an in-built backoff and retry, which will persist until each event in the batch either succeeds, or fails for a different reason.

Before version 3.0.0, Snowbridge treats every kind of target failure the same - it will retry 5 times. If all 5 attempts fail, it will be reported as a 'MsgFailed' for monitoring purposes, and will proceed without acking the failed Messages. As long as the source's acking model allows for it, these will be re-processed through Snowbridge again.

**Contributor:**

> If all 5 attempts fail, it will be reported as a 'MsgFailed' for monitoring purposes,

Is it true though? It sounds like MsgFailed is reported only when ALL 5 attempts fail, but I think it's reported after each write failure, no?


### Oversized data

Targets have limits to the size of a single message. Where the destination technology has a hard limit, targets are hardcoded to that limit. Otherwise, this is a configurable option in the target configuration. When a message's data is above this limit, Snowbridge will produce a [size violation failed event](/docs/understanding-your-pipeline/failed-events/index.md#size-violation), and emit it to the failure target.
Writes of oversized messages to the failure target will be recorded with 'Oversi…

In the unlikely event that Snowbridge encounters data which is invalid for the target destination (for example empty data is invalid for pubsub), it will create a [generic error failed event](/docs/understanding-your-pipeline/failed-events/index.md#generic-error), emit it to the failure target, and ack the original message.

As of version 3.0.0, the http target may produce invalid failures. This occurs when a POST request body cannot be formed, when the templating feature's attempt to template the data results in an error, or when the response conforms to a response rules configuration which specifies that the failure is to be treated as invalid. You can find more details in the [configuration section](/docs/destinations/forwarding-events/snowbridge/configuration/targets/http/index.md).

Transformation failures are also treated as invalid, as described below.

Writes of invalid messages to the failure target will be recorded with 'InvalidMsg' statistics in monitoring. Any failure to write to the failure target will cause a [fatal failure](#fatal-failure).
```hcl reference
https://github.com/snowplow/snowbridge/blob/master/assets/docs/configuration/monitoring/sentry-example.hcl
```

### StatsD stats receiver configuration

```hcl reference
https://github.com/snowplow/snowbridge/blob/master/assets/docs/configuration/monitoring/statsd-example.hcl
---
title: "Retries (beta)"
description: "Configure retry behaviour."
---

:::note
This feature was added in version 3.0.0

This feature is in beta status because we may make breaking changes in future versions.
:::

This feature allows you to configure the retry behaviour when the target encounters a failure in sending the data. There are two types of failure you can define:

A transient failure is a failure which we expect to succeed on retry. For example, a temporary network error, or throttling. Typically you would configure a short backoff for this type of failure. When we encounter a transient failure, we keep processing the rest of the data as normal, under the expectation that everything is operating as normal. The failed data is retried after a backoff.

A setup failure is one which we don't expect to be immediately resolved, for example an incorrect address, or an invalid API key. Typically you would configure a long backoff for this type of failure, under the assumption that the issue needs to be fixed with either a configuration change or a change to the target itself (e.g. permissions need to be granted). When we encounter a setup error, we stop attempting to process any data, and the whole app waits for the backoff period before trying again. Setup errors will be retried 5 times before the app crashes.
**Contributor:**

> we stop attempting to process any data

We don't do anything explicit to stop processing in this case. Right now setup is pretty much like transient, but with a much longer backoff. In the future we might add a monitoring/alerts/health toggle for setup errors, but it's not there now.

In practice, if you mark your HTTP response as a setup error in config, it probably means nothing gets through and we indeed 'stop' processing anything. But there is no code in Snowbridge that would say "stop pulling from the source now, we hit a setup error!".

Theoretically it's possible to have both setup and transient errors simultaneously. In that case, your setup error probably shouldn't be configured as a setup error.


As of v3.0.0, only the http target can be configured to return setup errors, via the response rules feature - configuration details for response rules can be found in [the http target configuration section](/docs/destinations/forwarding-events/snowbridge/configuration/targets/http/index.md). For all other targets, all errors returned will be considered transient, and behaviour can be configured using the `transient` block of the retry configuration.

Retries will be attempted with an exponential backoff - in other words, on each subsequent failure, the backoff time will double. You can configure transient failures to retry indefinitely by setting `max_attempts` to 0.

## Configuration options

```hcl reference
https://github.com/snowplow/snowbridge/blob/master/assets/docs/configuration/retry-example.hcl
```
Stdin source is the default, and has one optional configuration to set the concurrency.

Stdin source simply treats stdin as the input.

## Configuration options

Here is an example of the minimum required configuration:

Snowbridge supports sending authorized requests to OAuth2-compliant HTTP targets.

Like in the case of basic authentication, we recommend using environment variables for sensitive values.

## Dynamic Headers

:::note
This feature was added in version 2.3.0
:::

When enabled, the dynamic headers feature attaches a header to the data according to what your transformation provides in the `HTTPHeaders` field of `engineProtocol`. Data is batched according to the dynamic header value before requests are sent.

## Request templating

:::note
This feature was added in version 3.0.0
:::

This feature allows you to provide a [Golang text template](https://pkg.go.dev/text/template) to construct a request body from a batch of data. This feature should be useful in constructing requests to send to an API, for example.

Input data must be valid JSON; any message that fails to be marshaled to JSON will be treated as invalid and sent to the failure target. Equally, if an attempt to template a batch of data results in an error, then all messages in the batch will be considered invalid and sent to the failure target.

Where the dynamic headers feature is enabled, data is split into batches according to the provided header value, and the templater will operate on each batch separately.

### Helper functions

In addition to all base functions available in the Go text/template package, the following custom functions are available for convenience:

`prettyPrint` - Because the input to the templater is a Go data structure, simply providing a reference to an object field won't produce a JSON object in the output of the template. `prettyPrint` converts the data to prettified JSON (by marshaling it to JSON). Use it wherever you expect a JSON object in the output. This is compatible with any data type, but it shouldn't be necessary if the data is not an object.

`env` - Allows you to set and refer to an env var in your template. Use it when your request body must contain sensitive data, for example an API key.
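
For instance, a template fragment along these lines (the variable name `MY_API_KEY` is just an illustrative example) pulls the key from the environment rather than hardcoding it in the template:

```
{"api_key": "{{ env "MY_API_KEY" }}"}
```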

### Template example

The following example provides an API key via environment variable, and iterates the batch to provide JSON-formatted data one by one into a new key, inserting a comma before all but the first event.

```hcl reference
https://github.com/snowplow/snowbridge/blob/master/assets/docs/configuration/targets/http-template-full-example.file
```

### Default behaviour, and breaking changes in v3

Where no template is configured, the POST request body will contain an array of JSON containing the data for the whole batch. Data must be valid JSON or it will be considered invalid and sent to the failure target.

Note that this is a breaking change to the pre-v3 default behaviour, in two ways:

1. Prior to v3, we sent one request per message

This means that where no template is provided, request bodies will be arrays of JSON rather than individual JSON objects.

For example, pre-v3, a request body might look like this:

```
{"foo": "bar"}
```

But it will now look like this:

```
[{"foo": "bar"}]
```

If you need to preserve the previous behaviour (as long as your data is valid JSON), you can set `request_max_messages` to 1, and provide this template:

```go reference
https://github.com/snowplow/snowbridge/blob/master/assets/docs/configuration/targets/http-template-unwrap-example.file
```

2. Non-JSON data is not supported

While the intention was never to support non-JSON data, prior to v3 the request body was simply populated with whatever bytes were found in the message data, regardless of whether they were valid JSON.

From v3 on, only valid JSON will work, otherwise the message will be considered invalid and sent to the failure target.

## Response rules (beta)

:::note
This feature was added in version 3.0.0

This feature is in beta status because we may make breaking changes in future versions.
:::

Response rules allow you to configure how the app deals with failures in sending the data. You can configure a response code and an optional string match on the response body to determine how a failure response is handled. Response codes between 200 and 299 are considered successful, and are not handled by this feature.

There are three categories of failure:

`invalid` means that the data is considered incompatible with the target for some reason. For example, you may have defined a mapping for a given API, but the event being processed happens to have null data for a field that is required by the API. In this instance, retrying the data won't fix the issue, so you would configure an invalid response rule, which identifies responses which indicate this scenario.

Data that matches an invalid response rule is sent to the failure target.

`setup` means that this error is not retryable, but is something which can only be resolved by a change in configuration or a change to the target. An example of this is an authentication failure - retrying will not fix the issue; the resolution is to grant the appropriate permissions, or provide the correct API key.

Data that matches a setup response rule is handled by a retry as determined in the `setup` configuration block of [retry configuration](/docs/destinations/forwarding-events/snowbridge/configuration/retries/index.md).

`transient` errors are everything else - we assume that the issue is temporary and retrying will resolve the problem. An example of this is being throttled by an API because too much data is being sent at once. There is no explicit configuration for transient - rather, anything that is not configured as one of the other types is considered transient.
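
To make the shape concrete, a response rules block might look something like the sketch below. The field names here are illustrative assumptions, not the authoritative schema - check the configuration example in this section for the real syntax:

```hcl
# Illustrative sketch only - field names are assumptions.
response_rules {
  # Treat a 400 whose body mentions a missing required field as invalid:
  # send to the failure target, don't retry.
  invalid {
    http_codes = [400]
    body       = "required field is missing"
  }
  # Treat auth failures as setup errors: retry on the long `setup` backoff.
  setup {
    http_codes = [401, 403]
  }
  # Anything else (e.g. 429 throttling) is transient by default.
}
```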

## Configuration options

Here is an example of the minimum required configuration:
description: "Write data to stdout."
---

Stdout target doesn't have any configurable options - when configured it simply outputs the messages to stdout.

## Configuration options

Here is an example of the configuration:

sidebar_position: 500

You can read about our telemetry principles [here](/docs/getting-started-on-community-edition/telemetry/index.md).

## Configuration options

Enabling telemetry:

This transformation base64 decodes the message's data from a base64 byte array,

`base64Decode` has no options.

## Configuration options

```hcl reference
https://github.com/snowplow/snowbridge/blob/master/assets/docs/configuration/transformations/builtin/base64Decode-minimal-example.hcl
This transformation base64 encodes the message's data to a base64 byte array.

`base64Encode` has no options.

## Configuration options

```hcl reference
https://github.com/snowplow/snowbridge/blob/master/assets/docs/configuration/transformations/builtin/base64Encode-minimal-example.hcl
# jq

:::note
This transformation was added in version 3.0.0
:::

[jq](https://github.com/jqlang/jq) is a lightweight and flexible command-line JSON processor, akin to sed, awk, grep, and friends for JSON data. Snowbridge's jq features utilise the [gojq](https://github.com/itchyny/gojq) package, which is a pure Go implementation of jq. jq is Turing complete, so these features allow you to configure arbitrary logic upon JSON data structures.

jq supports formatting values, mathematical operations, boolean comparisons, regex matches, and many more useful features. To get started with jq, see the [tutorial](https://jqlang.github.io/jq/tutorial/) and [full reference manual](https://jqlang.github.io/jq/manual/). While you are unlikely to encounter them in practice, note that there are [some small differences](https://github.com/itchyny/gojq?tab=readme-ov-file#difference-to-jq) between jq and gojq.

`jq` runs a jq command on the message data, and outputs the result of the command. While jq supports multi-element results, commands must output only a single element - this single element can be an array data type.


If the provided jq command results in an error, the message will be considered invalid, and will be sent to the failure target.

The minimal example here returns the input data as a single element array, and the full example maps the data to a new data structure.

The jq transformation will remove any keys with null values from the data.
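
For example, a command along the lines of the following (the field choices are illustrative) maps Snowplow enriched data to a new structure, using the `epoch` helper function:

```
{ id: .event_id, site: .app_id, ts: (.collector_tstamp | epoch | todateiso8601) }
```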

## Configuration options

Minimal configuration:

```hcl reference
https://github.com/snowplow/snowbridge/blob/master/assets/docs/configuration/transformations/builtin/jq-minimal-example.hcl
```

Every configuration option:

```hcl reference
https://github.com/snowplow/snowbridge/blob/master/assets/docs/configuration/transformations/builtin/jq-full-example.hcl
```

## Helper functions

```mdx-code-block
import JQHelpersSharedBlock from "./reusable/_jqHelpers.md"

<JQHelpersSharedBlock/>
```
# jqFilter

:::note
This transformation was added in version 3.0.0
:::

[jq](https://github.com/jqlang/jq) is a lightweight and flexible command-line JSON processor, akin to sed, awk, grep, and friends for JSON data. Snowbridge's jq features utilise the [gojq](https://github.com/itchyny/gojq) package, which is a pure Go implementation of jq. jq is Turing complete, so these features allow you to configure arbitrary logic upon JSON data structures.

jq supports formatting values, mathematical operations, boolean comparisons, regex matches, and many more useful features. To get started with jq, see the [tutorial](https://jqlang.github.io/jq/tutorial/) and [full reference manual](https://jqlang.github.io/jq/manual/). While you are unlikely to encounter them in practice, note that there are [some small differences](https://github.com/itchyny/gojq?tab=readme-ov-file#difference-to-jq) between jq and gojq.

`jqFilter` filters messages based on the output of a jq command which is run against the data. The provided command must return a boolean result. `false` filters the message out, `true` keeps it.

If the provided jq command returns a non-boolean value, or results in an error, then the message will be considered invalid, and will be sent to the failure target.

This example filters out all data that doesn't have an `app_id` key.
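
As an illustration of the command itself, jq's built-in `has` function returns a boolean, so a filter along these lines would implement the `app_id` check described above:

```
has("app_id")
```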

## Configuration options

Minimal configuration:

```hcl reference
https://github.com/snowplow/snowbridge/blob/master/assets/docs/configuration/transformations/builtin/jqFilter-minimal-example.hcl
```

Every configuration option:

```hcl reference
https://github.com/snowplow/snowbridge/blob/master/assets/docs/configuration/transformations/builtin/jqFilter-full-example.hcl
```

## Helper functions

```mdx-code-block
import JQHelpersSharedBlock from "./reusable/_jqHelpers.md"

<JQHelpersSharedBlock/>
```
In addition to the native functions available in the jq language, the following helper functions are available for use in a jq query:

`epoch` - converts a time.Time to an epoch in seconds, as integer type. jq's native timestamp based functions expect integer input, but the Snowplow Analytics SDK provides base level timestamps as time.Time. This function can be chained with jq native functions to get past this limitation. For example:

```
{ foo: .collector_tstamp | epoch | todateiso8601 }
```

`epochMillis` - converts a time.Time to an epoch in milliseconds, as unsigned integer type. Because of how integers are handled in Go, unsigned integers aren't compatible with jq's native timestamp functions, so the `epoch` function truncates to seconds. This function cannot be chained with native jq functions, but where milliseconds matter for a value, use this function.

```
{ foo: .collector_tstamp | epochMillis }
```
Filters can be used in one of two ways, which is determined by the `filter_action` option.

This example filters out all data whose `platform` value does not match either `web` or `mobile`.

## Configuration options

Minimal configuration:

```hcl reference
Filters can be used in one of two ways, which is determined by the `filter_action` option.

The below example keeps messages which contain `prod` in the `environment` field of the `contexts_com_acme_env_context_1` context. Note that the `contexts_com_acme_env_context_1` context is attached more than once; if _any_ of the values at `dev` don't match `environment`, the message will be kept.

## Configuration options

Minimal configuration:

```hcl reference
The path to the field to match against must be provided as a jsonpath (dot notation).

Filters can be used in one of two ways, which is determined by the `filter_action` option. `filter_action` determines the behavior of the app when the regex provided evaluates to `true`. If it's set to `"keep"`, the app will complete the remaining transformations and send the message to the destination (unless a subsequent filter determines otherwise). If it's set to `"drop"`, the message will be acked and discarded, without continuing to the next transformation or target.

## Configuration options

This example keeps all events whose `add_to_cart` event data at the `sku` field matches `test-data`.

Minimal configuration:

`spEnrichedSetPk`: Specific to Snowplow data. Sets the message's destination partition key to an atomic field from a Snowplow Enriched tsv string. The input data must be a valid Snowplow enriched TSV.

## Configuration options

`SpEnrichedSetPk` only takes one option — the field to use for the partition key.

```hcl reference
https://github.com/snowplow/snowbridge/blob/master/assets/docs/configuration/transformations/snowplow-builtin/spEnrichedSetPk-minimal-example.hcl