A Terraform module which deploys the Snowplow Stream Collector on Google Compute Engine (CE). To use a custom image for this deployment, ensure it is based on Ubuntu 20.04.
By default, this module collects and forwards telemetry information to Snowplow so that we can understand how our applications are being used. No identifying information about your sub-account or account fingerprints is ever forwarded to us; it is very simple information about which modules and applications are deployed and active.
If you wish to subscribe to our mailing list for updates to these modules or security advisories, please set the `user_provided_id` variable to include a valid email address which we can reach you at.
To disable telemetry simply set variable `telemetry_enabled = false`.
For details on what information is collected please see this module: https://github.com/snowplow-devops/terraform-snowplow-telemetry
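
As an illustration of the two telemetry settings described above, here is a minimal sketch of a module call. It reuses the `var.*` inputs and topic modules from the usage example in this README; the email address is a placeholder, not a real contact.

```hcl
module "collector_pubsub" {
  source = "snowplow-devops/collector-pubsub-ce/google"

  accept_limited_use_license = true

  name       = "collector-server"
  project_id = var.project_id
  network    = var.network
  region     = var.region

  topic_project_id = var.project_id
  good_topic_name  = module.raw_topic.name
  bad_topic_name   = module.bad_1_topic.name

  # Option A: keep telemetry on and attach a contact address (placeholder value).
  user_provided_id = "ops@example.com"

  # Option B: opt out of telemetry entirely by uncommenting the line below.
  # telemetry_enabled = false
}
```

Only one of the two options is needed; leaving both unset keeps the defaults listed in the inputs table (`telemetry_enabled = true`, empty `user_provided_id`).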
A collector requires two output PubSub Topics and a Load Balancer which is deployed upstream. The Load Balancer ensures we can easily configure TLS termination later in the setup and provides a simple mechanism for setting up DNS.

    module "raw_topic" {
      source  = "snowplow-devops/pubsub-topic/google"
      version = "0.3.0"

      name = "raw-topic"
    }

    module "bad_1_topic" {
      source  = "snowplow-devops/pubsub-topic/google"
      version = "0.3.0"

      name = "bad-1-topic"
    }

    module "collector_pubsub" {
      source = "snowplow-devops/collector-pubsub-ce/google"

      accept_limited_use_license = true

      name = "collector-server"

      # Listed as a required input of this module
      project_id = var.project_id

      network    = var.network
      subnetwork = var.subnetwork
      region     = var.region

      ssh_ip_allowlist = ["0.0.0.0/0"]
      ssh_key_pairs    = []

      topic_project_id = var.project_id
      good_topic_name  = module.raw_topic.name
      bad_topic_name   = module.bad_1_topic.name
    }

    module "collector_lb" {
      source  = "snowplow-devops/lb/google"
      version = "0.3.0"

      name = "collector-lb"

      instance_group_named_port_http = module.collector_pubsub.named_port_http
      instance_group_url             = module.collector_pubsub.instance_group_url
      health_check_self_link         = module.collector_pubsub.health_check_self_link
    }
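
The usage example references `var.project_id`, `var.region`, `var.network` and `var.subnetwork` without declaring them. A minimal sketch of matching declarations might look like the following; the descriptions are borrowed from the inputs table, and the empty default for `subnetwork` is an assumption that mirrors the module's own default.

```hcl
variable "project_id" {
  description = "The project ID in which the stack is being deployed"
  type        = string
}

variable "region" {
  description = "The name of the region to deploy within"
  type        = string
}

variable "network" {
  description = "The name of the network to deploy within"
  type        = string
}

variable "subnetwork" {
  description = "The name of the sub-network to deploy within; if populated will override the 'network' setting"
  type        = string
  default     = "" # assumption: same default as the module input
}
```

Values can then be supplied through a `.tfvars` file or `-var` flags in the usual Terraform way.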
Name | Version |
---|---|
terraform | >= 1.0.0 |
google | >= 3.44.0 |

Name | Version |
---|---|
google | >= 3.44.0 |
Name | Source | Version |
---|---|---|
service | snowplow-devops/service-ce/google | 0.1.0 |
telemetry | snowplow-devops/telemetry/snowplow | 0.5.0 |
Name | Type |
---|---|
google_compute_firewall.egress | resource |
google_compute_firewall.ingress | resource |
google_compute_firewall.ingress_ssh | resource |
google_project_iam_member.sa_logging_log_writer | resource |
google_project_iam_member.sa_pubsub_publisher | resource |
google_project_iam_member.sa_pubsub_viewer | resource |
google_service_account.sa | resource |
Name | Description | Type | Default | Required |
---|---|---|---|---|
bad_topic_name | The name of the bad pubsub topic that the collector will insert data into | string | n/a | yes |
good_topic_name | The name of the good pubsub topic that the collector will insert data into | string | n/a | yes |
name | A name which will be pre-pended to the resources created | string | n/a | yes |
network | The name of the network to deploy within | string | n/a | yes |
project_id | The project ID in which the stack is being deployed | string | n/a | yes |
region | The name of the region to deploy within | string | n/a | yes |
topic_project_id | The project ID in which the topics are deployed | string | n/a | yes |
accept_limited_use_license | Acceptance of the SLULA terms (https://docs.snowplow.io/limited-use-license-1.0/) | bool | false | no |
app_version | App version to use. This variable facilitates dev flow, the modules may not work with anything other than the default value. | string | "3.0.1" | no |
associate_public_ip_address | Whether to assign a public ip address to this instance; if false this instance must be behind a Cloud NAT to connect to the internet | bool | true | no |
byte_limit | The amount of bytes to buffer events before pushing them to PubSub | number | 1000000 | no |
cookie_domain | Optional first party cookie domain for the collector to set cookies on (e.g. acme.com) | string | "" | no |
custom_paths | Optional custom paths that the collector will respond to, typical paths to override are '/com.snowplowanalytics.snowplow/tp2', '/com.snowplowanalytics.iglu/v1' and '/r/tp2'. e.g. { "/custom/path/" : "/com.snowplowanalytics.snowplow/tp2"} | map(string) | {} | no |
gcp_logs_enabled | Whether application logs should be reported to GCP Logging | bool | true | no |
health_check_path | The path to bind for health checks | string | "/health" | no |
ingress_port | The port that the collector will be bound to and expose over HTTP | number | 8080 | no |
java_opts | Custom JAVA Options | string | "-XX:InitialRAMPercentage=75 -XX:MaxRAMPercentage=75" | no |
labels | The labels to append to this resource | map(string) | {} | no |
machine_type | The machine type to use | string | "e2-small" | no |
network_project_id | The project ID of the shared VPC in which the stack is being deployed | string | "" | no |
record_limit | The number of events to buffer before pushing them to PubSub | number | 500 | no |
ssh_block_project_keys | Whether to block project wide SSH keys | bool | true | no |
ssh_ip_allowlist | The list of CIDR ranges to allow SSH traffic from | list(any) | [ | no |
ssh_key_pairs | The list of SSH key-pairs to add to the servers | list(object({ | [] | no |
subnetwork | The name of the sub-network to deploy within; if populated will override the 'network' setting | string | "" | no |
target_size | The number of servers to deploy | number | 1 | no |
telemetry_enabled | Whether or not to send telemetry information back to Snowplow Analytics Ltd | bool | true | no |
time_limit_ms | The amount of time to buffer events before pushing them to PubSub | number | 500 | no |
ubuntu_20_04_source_image | The source image to use which must be based off of Ubuntu 20.04; by default the latest community version is used | string | "" | no |
user_provided_id | An optional unique identifier to identify the telemetry events emitted by this stack | string | "" | no |
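
To show how the optional inputs above fit together, here is a hedged sketch that overrides a handful of them. All values are illustrative assumptions rather than recommendations: the CIDR range is a documentation-only placeholder, the label and cookie domain are made up, and the buffer settings are simply different from the defaults.

```hcl
module "collector_pubsub" {
  source = "snowplow-devops/collector-pubsub-ce/google"

  accept_limited_use_license = true

  name       = "collector-server"
  project_id = var.project_id
  network    = var.network
  region     = var.region

  topic_project_id = var.project_id
  good_topic_name  = module.raw_topic.name
  bad_topic_name   = module.bad_1_topic.name

  # Capacity: two servers on a larger machine type (illustrative values).
  target_size  = 2
  machine_type = "e2-medium"

  # Buffer limits applied before events are pushed to PubSub (illustrative values).
  byte_limit    = 500000
  record_limit  = 250
  time_limit_ms = 250

  # Serve the standard tracker endpoint on a custom path, as in the
  # custom_paths description.
  custom_paths = {
    "/custom/path/" = "/com.snowplowanalytics.snowplow/tp2"
  }

  # First party cookie domain (placeholder).
  cookie_domain = "acme.com"

  # Restrict SSH to a single range instead of 0.0.0.0/0 (placeholder range).
  ssh_ip_allowlist = ["203.0.113.0/24"]

  labels = {
    environment = "dev" # hypothetical label
  }
}
```

Any input left out keeps the default shown in the table.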
Name | Description |
---|---|
health_check_id | Identifier for the health check on the instance group |
health_check_self_link | The URL for the health check on the instance group |
instance_group_url | The full URL of the instance group created by the manager |
manager_id | Identifier for the instance group manager |
manager_self_link | The URL for the instance group manager |
named_port_http | The name of the port exposed by the instance group |
named_port_value | The named port value (e.g. 8080) |
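
The outputs above can be consumed by other modules (as the load balancer in the usage example does) or re-exported from a root module. A minimal sketch of the latter, where the output names on the left are hypothetical and only the `module.collector_pubsub.*` references come from this table:

```hcl
output "collector_instance_group_url" {
  description = "The full URL of the instance group created by the manager"
  value       = module.collector_pubsub.instance_group_url
}

output "collector_health_check_self_link" {
  description = "The URL for the health check on the instance group"
  value       = module.collector_pubsub.health_check_self_link
}

output "collector_named_port_http" {
  description = "The name of the port exposed by the instance group"
  value       = module.collector_pubsub.named_port_http
}
```

After `terraform apply`, these values can be read with `terraform output`.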
Copyright 2021-present Snowplow Analytics Ltd.
Licensed under the Snowplow Limited Use License Agreement. (If you are uncertain how it applies to your use case, check our answers to frequently asked questions.)