Skip to content

Commit

Permalink
Add support for CloudWatch logs (closes #4)
Browse files Browse the repository at this point in the history
  • Loading branch information
jbeemster committed Aug 13, 2021
1 parent e6779ac commit 410f088
Show file tree
Hide file tree
Showing 4 changed files with 67 additions and 10 deletions.
23 changes: 13 additions & 10 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -195,6 +195,7 @@ module "pipeline_rds" {
| Name | Type |
|------|------|
| [aws_autoscaling_group.asg](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/autoscaling_group) | resource |
| [aws_cloudwatch_log_group.log_group](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/cloudwatch_log_group) | resource |
| [aws_dynamodb_table.kcl](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/dynamodb_table) | resource |
| [aws_iam_instance_profile.instance_profile](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_instance_profile) | resource |
| [aws_iam_policy.iam_policy](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_policy) | resource |
Expand All @@ -216,18 +217,26 @@ module "pipeline_rds" {

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| <a name="input_amazon_linux_2_ami_id"></a> [amazon\_linux\_2\_ami\_id](#input\_amazon\_linux\_2\_ami\_id) | The AMI ID to use which must be based of of Amazon Linux 2; by default the latest community version is used | `string` | `""` | no |
| <a name="input_associate_public_ip_address"></a> [associate\_public\_ip\_address](#input\_associate\_public\_ip\_address) | Whether to assign a public ip address to this instance | `bool` | `true` | no |
| <a name="input_custom_iglu_resolvers"></a> [custom\_iglu\_resolvers](#input\_custom\_iglu\_resolvers) | The custom Iglu Resolvers that will be used by Enrichment to resolve and validate events | <pre>list(object({<br> name = string<br> priority = number<br> uri = string<br> api_key = string<br> vendor_prefixes = list(string)<br> }))</pre> | `[]` | no |
| <a name="input_db_host"></a> [db\_host](#input\_db\_host) | The hostname of the database to connect to | `string` | n/a | yes |
| <a name="input_db_name"></a> [db\_name](#input\_db\_name) | The name of the database to connect to | `string` | n/a | yes |
| <a name="input_db_password"></a> [db\_password](#input\_db\_password) | The password to use to connect to the database | `string` | n/a | yes |
| <a name="input_db_port"></a> [db\_port](#input\_db\_port) | The port the database is running on | `number` | n/a | yes |
| <a name="input_db_sg_id"></a> [db\_sg\_id](#input\_db\_sg\_id) | The ID of the RDS security group that sits downstream of the webserver | `string` | n/a | yes |
| <a name="input_db_username"></a> [db\_username](#input\_db\_username) | The username to use to connect to the database | `string` | n/a | yes |
| <a name="input_in_stream_name"></a> [in\_stream\_name](#input\_in\_stream\_name) | The name of the input kinesis stream that the Enricher will pull data from | `string` | n/a | yes |
| <a name="input_name"></a> [name](#input\_name) | A name which will be pre-pended to the resources created | `string` | n/a | yes |
| <a name="input_purpose"></a> [purpose](#input\_purpose) | The type of data the loader will be pulling which can be one of ENRICHED\_EVENTS or JSON (Note: JSON can be used for loading bad rows) | `string` | n/a | yes |
| <a name="input_schema_name"></a> [schema\_name](#input\_schema\_name) | The database schema to load data into (e.g atomic \| atomic\_bad) | `string` | n/a | yes |
| <a name="input_ssh_key_name"></a> [ssh\_key\_name](#input\_ssh\_key\_name) | The name of the SSH key-pair to attach to all EC2 nodes deployed | `string` | n/a | yes |
| <a name="input_subnet_ids"></a> [subnet\_ids](#input\_subnet\_ids) | The list of subnets to deploy the Postgres Loader across | `list(string)` | n/a | yes |
| <a name="input_vpc_id"></a> [vpc\_id](#input\_vpc\_id) | The VPC to deploy the Postgres Loader within | `string` | n/a | yes |
| <a name="input_amazon_linux_2_ami_id"></a> [amazon\_linux\_2\_ami\_id](#input\_amazon\_linux\_2\_ami\_id) | The AMI ID to use which must be based of of Amazon Linux 2; by default the latest community version is used | `string` | `""` | no |
| <a name="input_associate_public_ip_address"></a> [associate\_public\_ip\_address](#input\_associate\_public\_ip\_address) | Whether to assign a public ip address to this instance | `bool` | `true` | no |
| <a name="input_cloudwatch_logs_enabled"></a> [cloudwatch\_logs\_enabled](#input\_cloudwatch\_logs\_enabled) | Whether application logs should be reported to CloudWatch; by default they are only on the server | `bool` | `false` | no |
| <a name="input_cloudwatch_logs_retention_days"></a> [cloudwatch\_logs\_retention\_days](#input\_cloudwatch\_logs\_retention\_days) | The length of time in days to retain logs for | `number` | `7` | no |
| <a name="input_custom_iglu_resolvers"></a> [custom\_iglu\_resolvers](#input\_custom\_iglu\_resolvers) | The custom Iglu Resolvers that will be used by Enrichment to resolve and validate events | <pre>list(object({<br> name = string<br> priority = number<br> uri = string<br> api_key = string<br> vendor_prefixes = list(string)<br> }))</pre> | `[]` | no |
| <a name="input_default_iglu_resolvers"></a> [default\_iglu\_resolvers](#input\_default\_iglu\_resolvers) | The default Iglu Resolvers that will be used by Enrichment to resolve and validate events | <pre>list(object({<br> name = string<br> priority = number<br> uri = string<br> api_key = string<br> vendor_prefixes = list(string)<br> }))</pre> | <pre>[<br> {<br> "api_key": "",<br> "name": "Iglu Central",<br> "priority": 10,<br> "uri": "http://iglucentral.com",<br> "vendor_prefixes": []<br> },<br> {<br> "api_key": "",<br> "name": "Iglu Central - Mirror 01",<br> "priority": 20,<br> "uri": "http://mirror01.iglucentral.com",<br> "vendor_prefixes": []<br> }<br>]</pre> | no |
| <a name="input_iam_permissions_boundary"></a> [iam\_permissions\_boundary](#input\_iam\_permissions\_boundary) | The permissions boundary ARN to set on IAM roles created | `string` | `""` | no |
| <a name="input_in_stream_name"></a> [in\_stream\_name](#input\_in\_stream\_name) | The name of the input kinesis stream that the Enricher will pull data from | `string` | n/a | yes |
| <a name="input_initial_position"></a> [initial\_position](#input\_initial\_position) | Where to start processing the input Kinesis Stream from (TRIM\_HORIZON or LATEST) | `string` | `"TRIM_HORIZON"` | no |
| <a name="input_instance_type"></a> [instance\_type](#input\_instance\_type) | The instance type to use | `string` | `"t3.micro"` | no |
| <a name="input_kcl_read_max_capacity"></a> [kcl\_read\_max\_capacity](#input\_kcl\_read\_max\_capacity) | The maximum READ capacity for the KCL DynamoDB table | `number` | `10` | no |
Expand All @@ -236,16 +245,10 @@ module "pipeline_rds" {
| <a name="input_kcl_write_min_capacity"></a> [kcl\_write\_min\_capacity](#input\_kcl\_write\_min\_capacity) | The minimum WRITE capacity for the KCL DynamoDB table | `number` | `1` | no |
| <a name="input_max_size"></a> [max\_size](#input\_max\_size) | The maximum number of servers in this server-group | `number` | `2` | no |
| <a name="input_min_size"></a> [min\_size](#input\_min\_size) | The minimum number of servers in this server-group | `number` | `1` | no |
| <a name="input_name"></a> [name](#input\_name) | A name which will be pre-pended to the resources created | `string` | n/a | yes |
| <a name="input_purpose"></a> [purpose](#input\_purpose) | The type of data the loader will be pulling which can be one of ENRICHED\_EVENTS or JSON (Note: JSON can be used for loading bad rows) | `string` | n/a | yes |
| <a name="input_schema_name"></a> [schema\_name](#input\_schema\_name) | The database schema to load data into (e.g atomic \| atomic\_bad) | `string` | n/a | yes |
| <a name="input_ssh_ip_allowlist"></a> [ssh\_ip\_allowlist](#input\_ssh\_ip\_allowlist) | The list of CIDR ranges to allow SSH traffic from | `list(any)` | <pre>[<br> "0.0.0.0/0"<br>]</pre> | no |
| <a name="input_ssh_key_name"></a> [ssh\_key\_name](#input\_ssh\_key\_name) | The name of the SSH key-pair to attach to all EC2 nodes deployed | `string` | n/a | yes |
| <a name="input_subnet_ids"></a> [subnet\_ids](#input\_subnet\_ids) | The list of subnets to deploy the Postgres Loader across | `list(string)` | n/a | yes |
| <a name="input_tags"></a> [tags](#input\_tags) | The tags to append to this resource | `map(string)` | `{}` | no |
| <a name="input_telemetry_enabled"></a> [telemetry\_enabled](#input\_telemetry\_enabled) | Whether or not to send telemetry information back to Snowplow Analytics Ltd | `bool` | `true` | no |
| <a name="input_user_provided_id"></a> [user\_provided\_id](#input\_user\_provided\_id) | An optional unique identifier to identify the telemetry events emitted by this stack | `string` | `""` | no |
| <a name="input_vpc_id"></a> [vpc\_id](#input\_vpc\_id) | The VPC to deploy the Postgres Loader within | `string` | n/a | yes |

## Outputs

Expand Down
27 changes: 27 additions & 0 deletions main.tf
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,8 @@ locals {
var.tags,
local.local_tags
)

cloudwatch_log_group_name = "/aws/ec2/${var.name}"
}

data "aws_region" "current" {}
Expand Down Expand Up @@ -90,6 +92,17 @@ module "kcl_autoscaling" {
write_max_capacity = var.kcl_write_max_capacity
}

# --- CloudWatch: Logging

resource "aws_cloudwatch_log_group" "log_group" {
count = var.cloudwatch_logs_enabled ? 1 : 0

name = local.cloudwatch_log_group_name
retention_in_days = var.cloudwatch_logs_retention_days

tags = local.tags
}

# --- IAM: Roles & Permissions

resource "aws_iam_role" "iam_role" {
Expand Down Expand Up @@ -146,6 +159,17 @@ resource "aws_iam_policy" "iam_policy" {
"Resource": [
"${aws_dynamodb_table.kcl.arn}"
]
},
{
"Effect": "Allow",
"Action": [
"logs:PutLogEvents",
"logs:CreateLogStream",
"logs:DescribeLogStreams"
],
"Resource": [
"arn:aws:logs:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:log-group:${local.cloudwatch_log_group_name}:*"
]
}
]
}
Expand Down Expand Up @@ -298,6 +322,9 @@ locals {
schema_name = var.schema_name

telemetry_script = join("", module.telemetry.*.amazon_linux_2_user_data)

cloudwatch_logs_enabled = var.cloudwatch_logs_enabled
cloudwatch_log_group_name = local.cloudwatch_log_group_name
})
}

Expand Down
15 changes: 15 additions & 0 deletions templates/user-data.sh.tmpl
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,10 @@ function install_docker_ce() {
sudo usermod -a -G docker ec2-user
}

function get_instance_id() {
curl --silent --location "http://169.254.169.254/latest/meta-data/instance-id/"
}

install_docker_ce

sudo mkdir -p $${CONFIG_DIR}
Expand All @@ -35,6 +39,11 @@ sudo docker run \
-d \
--name create_schema \
--network host \
%{ if cloudwatch_logs_enabled ~}
--log-driver awslogs \
--log-opt awslogs-group=${cloudwatch_log_group_name} \
--log-opt awslogs-stream=$(get_instance_id) \
%{ endif ~}
-v $${CONFIG_DIR}:/snowplow/config \
-e 'PGUSER=${db_username}' \
-e 'PGPASSWORD=${db_password}' \
Expand All @@ -47,8 +56,14 @@ sudo docker run \
--name postgres_loader \
--restart always \
--network host \
%{ if cloudwatch_logs_enabled ~}
--log-driver awslogs \
--log-opt awslogs-group=${cloudwatch_log_group_name} \
--log-opt awslogs-stream=$(get_instance_id) \
%{ else ~}
--log-opt max-size=10m \
--log-opt max-file=5 \
%{ endif ~}
-v $${CONFIG_DIR}:/snowplow/config \
-e 'JAVA_OPTS=-Dorg.slf4j.simpleLogger.defaultLogLevel=info' \
snowplow/snowplow-postgres-loader:${version} \
Expand Down
12 changes: 12 additions & 0 deletions variables.tf
Original file line number Diff line number Diff line change
Expand Up @@ -90,6 +90,18 @@ variable "tags" {
type = map(string)
}

variable "cloudwatch_logs_enabled" {
description = "Whether application logs should be reported to CloudWatch; by default they are only on the server"
default = false
type = bool
}

variable "cloudwatch_logs_retention_days" {
description = "The length of time in days to retain logs for"
default = 7
type = number
}

# --- Configuration options

variable "in_stream_name" {
Expand Down

0 comments on commit 410f088

Please sign in to comment.