First version of CODE #2

Merged
merged 3 commits into from
Oct 1, 2024
25 changes: 25 additions & 0 deletions Dockerfile
Original file line number Diff line number Diff line change
@@ -0,0 +1,25 @@
# Build the manager binary
FROM golang:1.23.0 AS builder

WORKDIR /workspace
# Copy the Go Modules manifests
COPY go.mod go.mod
COPY go.sum go.sum
COPY vendor/ vendor/

# Copy the go source
COPY cmd cmd
#COPY api/ api/
COPY pkg/ pkg/

# Build
RUN CGO_ENABLED=0 GOOS=linux GOARCH=amd64 GO111MODULE=on go build -mod=vendor -a -o manager ./cmd/traffic-controller

# Use distroless as minimal base image to package the manager binary
# Refer to https://github.com/GoogleContainerTools/distroless for more details
FROM gcr.io/distroless/static:nonroot
WORKDIR /
COPY --from=builder /workspace/manager .
USER nonroot:nonroot

ENTRYPOINT ["/manager"]
64 changes: 64 additions & 0 deletions Makefile
@@ -0,0 +1,64 @@

# Image URL to use all building/pushing image targets
IMG ?= controller:latest
# Produce CRDs that work back to Kubernetes 1.11 (no version conversion)
CRD_OPTIONS ?= "crd:trivialVersions=true"

# Get the currently used golang install path (in GOPATH/bin, unless GOBIN is set)
ifeq (,$(shell go env GOBIN))
GOBIN=$(shell go env GOPATH)/bin
else
GOBIN=$(shell go env GOBIN)
endif

all: manager

# Run tests
test: generate fmt vet
	go test ./... -coverprofile cover.out

# Build manager binary (same package the Dockerfile builds)
manager: generate fmt vet
	go build -o bin/manager ./cmd/traffic-controller

# Run against the configured Kubernetes cluster in ~/.kube/config
run: generate fmt vet
	go run ./cmd/traffic-controller


# Run go fmt against code
fmt:
	go fmt ./...

# Run go vet against code
vet:
	go vet ./...

# Generate code
generate: controller-gen
	$(CONTROLLER_GEN) object:headerFile="hack/boilerplate.go.txt" paths="./..."

# Build the docker image
docker-build: test
	docker build . -t ${IMG}

# Push the docker image
docker-push:
	docker push ${IMG}

# Find controller-gen, downloading it if necessary
controller-gen:
ifeq (, $(shell which controller-gen))
	@{ \
	set -e ;\
	CONTROLLER_GEN_TMP_DIR=$$(mktemp -d) ;\
	cd $$CONTROLLER_GEN_TMP_DIR ;\
	go mod init tmp ;\
	go install sigs.k8s.io/controller-tools/cmd/[email protected] ;\
	rm -rf $$CONTROLLER_GEN_TMP_DIR ;\
	}
CONTROLLER_GEN=$(GOBIN)/controller-gen
else
CONTROLLER_GEN=$(shell which controller-gen)
endif
114 changes: 113 additions & 1 deletion README.md
@@ -1 +1,113 @@
# k8s-traffic-controller
An operator that enables flexible multi-cluster routing using DNS.

This operator listens for changes to External DNS endpoints (currently taking no action on them) and to Ingress objects.

Ingress objects are filtered by domain (see `binding-domain` under Command line parameters) and optionally by an annotation (see `annotation-filter` in the same table).
After filtering, [Endpoints](https://github.com/kubernetes-sigs/external-dns/blob/master/docs/contributing/crd-source.md) matching the hosts specified in the Ingresses are created. These endpoints are configured with a Route53-specific parameter that sets their weight. The weight can be provided from:
- The command line (using the "fake" config backend and specifying a weight)
- A DynamoDB table
- Annotations
- A Route53 health check, which re-routes traffic from failing clusters to other clusters that have the same Ingress

The AWS DynamoDB table format is given as follows:

|ClusterName| CurrentWeight| DesiredWeight|
|:---| :---| :---|


Here `ClusterName` is the name of the cluster, `CurrentWeight` is the last weight read or set by the traffic controller, and `DesiredWeight` is the target weight.
When `DesiredWeight` changes, the traffic controller tries to update the External DNS endpoint and then writes the table entry back with `CurrentWeight = DesiredWeight`, acknowledging the change.
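
The acknowledgement flow described above can be sketched roughly as follows. This is an illustration only, not the controller's actual code: the `ClusterWeight` type, the `Reconcile` function and the `applyDNS` callback are hypothetical names, and the sketch assumes the table entry is acknowledged only when the endpoint update succeeds.

```go
package main

import "fmt"

// ClusterWeight mirrors one row of the DynamoDB table described above.
type ClusterWeight struct {
	ClusterName   string
	CurrentWeight int
	DesiredWeight int
}

// Reconcile applies a pending weight change. applyDNS is a hypothetical
// callback standing in for the External DNS endpoint update. The row is
// acknowledged (CurrentWeight = DesiredWeight) only if that update succeeds,
// which is an assumption of this sketch.
func Reconcile(row ClusterWeight, applyDNS func(weight int) error) (ClusterWeight, error) {
	if row.CurrentWeight == row.DesiredWeight {
		return row, nil // nothing pending
	}
	if err := applyDNS(row.DesiredWeight); err != nil {
		return row, fmt.Errorf("updating endpoint for %s: %w", row.ClusterName, err)
	}
	row.CurrentWeight = row.DesiredWeight
	return row, nil
}

func main() {
	row := ClusterWeight{ClusterName: "cluster-a", CurrentWeight: 100, DesiredWeight: 50}
	updated, err := Reconcile(row, func(weight int) error {
		fmt.Println("setting DNS weight to", weight)
		return nil
	})
	fmt.Println(updated, err)
}
```

Returning the updated row rather than mutating shared state keeps the write-back to the table an explicit, separate step.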

## Metrics Exposed

|Metric name|Help text|Type|Purpose|
|:---| :---| :---| :---|
|cluster_traffic_controller_ingress_weight_desired|The desired weight of the ingress|Gauge|Exposes the value obtained from the Storage Backend for Desired weight of this cluster.|
|cluster_traffic_controller_ingress_weight_current|The current weight of the cluster|Gauge|Exposes the value obtained from the Storage Backend for Current weight of this cluster.|

Under normal conditions, both values come from DynamoDB and should be equal. They may briefly differ if a scrape happens just after a new weight has been fetched from DynamoDB but before the Reconciler has applied it.

### Possible alerting
Alert if desired != current for a sustained period (for example, more than 15 minutes).

# How to configure the weights

## DynamoDB

The traffic controller needs AWS IAM permissions that allow it to write to the configured DynamoDB table.

On initialization, the traffic controller tries to read the current and desired weights from DynamoDB. If no entry exists in the table, one is created and set to `initial-weight`.

Writes to DynamoDB use transactions that lock the table until the operation finishes. If a traffic controller tries to access the table while a transaction is ongoing,
an exception is raised and the operation is skipped (failed operations are not rescheduled).

## Route53 HealthCheck

This method enables or disables traffic to a particular cluster according to the health of that cluster. You need to provide an endpoint in the cluster
for this purpose; see the official [AWS documentation](https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/dns-failover.html) for details.

## Annotations

You can further configure the weight of a single Ingress by annotating it. When the annotation is present, the final weight is `cluster_weight * annotation_weight`, with the annotation weight taken as a percentage.

Use `dns.adevinta.com/traffic-weight` with a value from 0 to 100 to set the weight.

Here `0` means that traffic is disabled, whereas `100` means that the application receives all the traffic available to this cluster. For example, if the cluster is getting 5% of the total traffic, the application gets that whole 5%.

This annotation can be useful for canary deployments, migrations, etc. It is an advanced feature and should be fully understood before it is used in production.
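
As an illustration of how such an annotation might be read and validated, here is a minimal sketch. The `AnnotationWeight` function is hypothetical and not taken from the controller; rejecting out-of-range values with an error is an assumption, since the README does not say how invalid values are handled.

```go
package main

import (
	"fmt"
	"strconv"
)

// AnnotationWeight looks up "<prefix>/traffic-weight" in an Ingress's
// annotations. The boolean result is false when the annotation is absent.
// Values outside 0-100 are rejected (an assumption of this sketch).
func AnnotationWeight(annotations map[string]string, prefix string) (int, bool, error) {
	raw, found := annotations[prefix+"/traffic-weight"]
	if !found {
		return 0, false, nil
	}
	weight, err := strconv.Atoi(raw)
	if err != nil {
		return 0, false, fmt.Errorf("invalid traffic-weight %q: %w", raw, err)
	}
	if weight < 0 || weight > 100 {
		return 0, false, fmt.Errorf("traffic-weight %d is outside 0-100", weight)
	}
	return weight, true, nil
}

func main() {
	annotations := map[string]string{"dns.adevinta.com/traffic-weight": "25"}
	weight, present, err := AnnotationWeight(annotations, "dns.adevinta.com")
	fmt.Println(weight, present, err)
}
```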


## Examples

In the following three cases the application receives no traffic in this cluster. Traffic is never stopped in absolute terms; it is only routed elsewhere.

| cluster-weight | healthcheck | annotation weight | expected outcome |
|:---| :---| :---| :---|
| 100 | DOWN | 100 | no traffic for app |
| 0 | UP | 100| no traffic for app |
| 100 | UP | 0 | no traffic for app |
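
The three rows above can be read as a single rule: the application receives traffic in a cluster only if the health check is up and both weights are non-zero. A minimal sketch of that rule follows; `ReceivesTraffic` is a hypothetical helper, and modelling the health check as a boolean is a simplification, since in practice Route53 evaluates it.

```go
package main

import "fmt"

// ReceivesTraffic reports whether an application gets any traffic in a
// cluster, given the cluster weight (0-100), the health check result, and
// the Ingress annotation weight (0-100).
func ReceivesTraffic(clusterWeight int, healthy bool, annotationWeight int) bool {
	return healthy && clusterWeight > 0 && annotationWeight > 0
}

func main() {
	fmt.Println(ReceivesTraffic(100, false, 100)) // false: health check DOWN
	fmt.Println(ReceivesTraffic(0, true, 100))    // false: cluster weight 0
	fmt.Println(ReceivesTraffic(100, true, 0))    // false: annotation weight 0
	fmt.Println(ReceivesTraffic(100, true, 100))  // true
}
```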

### Weights in different configurations

If the weight is configured in both DynamoDB and annotations, the final result is:

`dnsWeight = weight_in_dynamodb (as %) * weight_in_annotation`

Examples:

| Cluster 1 weight | Cluster 2 weight | Annotation 1 | Annotation 2 | #1 DNS weight | #2 DNS weight | #1 traffic % | #2 traffic % |
| :----------- | :----------- | :----------- | :----------- | :------------- | :------------- | :--- | :--- |
| 75 | 25 | 25 | 75 | 18.75 | 18.75 | 50% | 50% |
| 50 | 50 | 25 | 75 | 12.5 | 37.5 | 25% | 75% |
| 10 | 90 | 50 | 50 | 5 | 45 | 10% | 90% |
| 0 | 100 | 50 | 50 | 0 | 50 | 0% | 100% |
| 5 | 95 | 100 | 0 | 5 | 0 | 100% | 0% |
| 100 | 100 | N/A (empty) | 50 | 100 | 50 | ~67% | ~33% |
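
The arithmetic behind the table can be reproduced with a short sketch. The function names `DNSWeight` and `TrafficShares` are hypothetical; treating a missing annotation as "leave the cluster weight unchanged" is inferred from the last row of the table.

```go
package main

import "fmt"

// DNSWeight scales the cluster weight by the annotation weight, taken as a
// percentage. A nil annotation leaves the cluster weight unchanged.
func DNSWeight(clusterWeight float64, annotation *float64) float64 {
	if annotation == nil {
		return clusterWeight
	}
	return clusterWeight * *annotation / 100
}

// TrafficShares converts DNS weights into percentages of the total traffic.
// If every weight is zero, every share is zero.
func TrafficShares(weights ...float64) []float64 {
	shares := make([]float64, len(weights))
	var total float64
	for _, w := range weights {
		total += w
	}
	if total == 0 {
		return shares
	}
	for i, w := range weights {
		shares[i] = 100 * w / total
	}
	return shares
}

func main() {
	// First row of the table: cluster weights 75/25, annotations 25/75.
	a1, a2 := 25.0, 75.0
	w1 := DNSWeight(75, &a1)
	w2 := DNSWeight(25, &a2)
	fmt.Println(w1, w2, TrafficShares(w1, w2))
}
```

The traffic percentage depends only on the ratio between the DNS weights, which is why 18.75 against 18.75 yields an even split.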

# Command line parameters

| Flag | Default value | Description |
|:------|:---------------:|:----|
| `metrics-addr` | 8080 | Prometheus metrics endpoint port |
| `cluster-name` | None | Cluster name, used to look up the right value in the DynamoDB table |
| `aws-region` | eu-west-1 | AWS region for the Route53 provider |
| `binding-domain` | | Domain for creating DNS entries; endpoints not matching this domain are skipped |
| `backend-type` | fake | Config backend used for configuring the DNS weight; possible values are "fake" and "dynamodb" |
| `annotation-filter` | none | If an annotation is given, it is used to filter Ingress objects, skipping those that do not match |
| `table-name` | traffic-controller | DynamoDB table read by the dynamodb backend |
| `initial-weight` | 0 | DNS weight for this cluster; when the fake backend is specified, this is the only weight used |
| `enable-leader-election` | false | Enable leader election for this controller (if you run more than one instance) |
| `dev-mode` | false | Enables development mode (useful for testing or developing locally). This instructs the controller to react to Ingresses even when their status has not been properly updated, for example when they define external load balancers that require the controller to run inside a Kubernetes cluster on AWS |
| `annotation-prefix` | dns.adevinta.com | The prefix for the `traffic-weight` annotation. The default annotation is `dns.adevinta.com/traffic-weight` |
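
For orientation, the flags and defaults in the table could be declared with Go's standard `flag` package roughly as below. This is a sketch under assumptions, not the controller's real code: the `Options` struct and `ParseOptions` function are hypothetical, the field types are guesses, and rejecting unknown backend types is an added check.

```go
package main

import (
	"flag"
	"fmt"
)

// Options holds the parsed command line parameters (hypothetical type).
type Options struct {
	MetricsAddr          string
	ClusterName          string
	AWSRegion            string
	BindingDomain        string
	BackendType          string
	AnnotationFilter     string
	TableName            string
	InitialWeight        int
	EnableLeaderElection bool
	DevMode              bool
	AnnotationPrefix     string
}

// ParseOptions parses args using the defaults listed in the table above.
func ParseOptions(args []string) (Options, error) {
	var o Options
	fs := flag.NewFlagSet("traffic-controller", flag.ContinueOnError)
	fs.StringVar(&o.MetricsAddr, "metrics-addr", "8080", "Prometheus metrics endpoint port")
	fs.StringVar(&o.ClusterName, "cluster-name", "", "cluster name used to look up the DynamoDB entry")
	fs.StringVar(&o.AWSRegion, "aws-region", "eu-west-1", "AWS region for the Route53 provider")
	fs.StringVar(&o.BindingDomain, "binding-domain", "", "domain for creating DNS entries")
	fs.StringVar(&o.BackendType, "backend-type", "fake", `config backend: "fake" or "dynamodb"`)
	fs.StringVar(&o.AnnotationFilter, "annotation-filter", "", "annotation used to filter Ingress objects")
	fs.StringVar(&o.TableName, "table-name", "traffic-controller", "DynamoDB table name")
	fs.IntVar(&o.InitialWeight, "initial-weight", 0, "initial DNS weight for this cluster")
	fs.BoolVar(&o.EnableLeaderElection, "enable-leader-election", false, "enable leader election")
	fs.BoolVar(&o.DevMode, "dev-mode", false, "enable development mode")
	fs.StringVar(&o.AnnotationPrefix, "annotation-prefix", "dns.adevinta.com", "prefix of the traffic-weight annotation")
	if err := fs.Parse(args); err != nil {
		return Options{}, err
	}
	// Extra validation, assumed for this sketch.
	if o.BackendType != "fake" && o.BackendType != "dynamodb" {
		return Options{}, fmt.Errorf("unknown backend-type %q", o.BackendType)
	}
	return o, nil
}

func main() {
	opts, err := ParseOptions([]string{"-cluster-name=blue", "-backend-type=dynamodb"})
	fmt.Printf("%+v %v\n", opts, err)
}
```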

# Testing

## Prerequisites

By default, integration tests are disabled.
To run them, set the `RUN_INTEGRATION_TESTS` environment variable to `true`: `export RUN_INTEGRATION_TESTS=true`

Then run `make test`.
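
One common way to implement this kind of gate is to check the variable before doing any integration work. The sketch below assumes an exact comparison against the string `true`; the `IntegrationTestsEnabled` helper is hypothetical, and the repository's tests may perform the check differently.

```go
package main

import (
	"fmt"
	"os"
)

// IntegrationTestsEnabled reports whether RUN_INTEGRATION_TESTS is set to
// "true". The lookup function is injected so the check can be exercised
// without touching the real environment.
func IntegrationTestsEnabled(getenv func(string) string) bool {
	return getenv("RUN_INTEGRATION_TESTS") == "true"
}

func main() {
	if !IntegrationTestsEnabled(os.Getenv) {
		fmt.Println("skipping integration tests; set RUN_INTEGRATION_TESTS=true to enable them")
		return
	}
	fmt.Println("running integration tests")
}
```

In a real `_test.go` file the same check would typically guard a call to `t.Skip`.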

2 changes: 2 additions & 0 deletions cfn/k8s-traffic-controller/config/dev/config.yaml
@@ -0,0 +1,2 @@
project_code: k8s-traffic-controller
region: eu-west-1
1 change: 1 addition & 0 deletions cfn/k8s-traffic-controller/config/dev/dynamodb.yaml
@@ -0,0 +1 @@
template_path: templates/dynamodb.yaml
2 changes: 2 additions & 0 deletions cfn/k8s-traffic-controller/config/dev/iam-role.yaml
@@ -0,0 +1,2 @@
stack_name: k8s-traffic-controller-iam-role
template_path: templates/iam-role.yaml
104 changes: 104 additions & 0 deletions cfn/k8s-traffic-controller/templates/dynamodb.yaml
@@ -0,0 +1,104 @@
Resources:
  DDBTable:
    Type: AWS::DynamoDB::Table
    Properties:
      TableName: "k8s-traffic-controller"
      # Only key attributes are declared here. CurrentWeight and DesiredWeight
      # are non-key number attributes and need no definition.
      AttributeDefinitions:
        -
          AttributeName: "ClusterName"
          AttributeType: "S"
      # ClusterName is the sole (HASH) key; one attribute cannot serve as
      # both the HASH and the RANGE key.
      KeySchema:
        -
          AttributeName: "ClusterName"
          KeyType: "HASH"
      ProvisionedThroughput:
        ReadCapacityUnits: 5
        WriteCapacityUnits: 5
  ReadCapacityScalableTarget:
    Type: AWS::ApplicationAutoScaling::ScalableTarget
    Properties:
      MaxCapacity: 15
      MinCapacity: 5
      ResourceId: !Join
        - /
        - - table
          - !Ref DDBTable
      RoleARN: !GetAtt ScalingRole.Arn

Reviewer comment:
Is this valid CloudFormation?

Collaborator (Author):
If you look in the private repo, it is exactly the same code.

      ScalableDimension: dynamodb:table:ReadCapacityUnits
      ServiceNamespace: dynamodb
  WriteCapacityScalableTarget:
    Type: AWS::ApplicationAutoScaling::ScalableTarget
    Properties:
      MaxCapacity: 15
      MinCapacity: 5
      ResourceId: !Join
        - /
        - - table
          - !Ref DDBTable
      RoleARN: !GetAtt ScalingRole.Arn
      ScalableDimension: dynamodb:table:WriteCapacityUnits
      ServiceNamespace: dynamodb
  ScalingRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          -
            Effect: "Allow"
            Principal:
              Service:
                - application-autoscaling.amazonaws.com
            Action:
              - "sts:AssumeRole"
      Path: "/"
      Policies:
        -
          PolicyName: "root"
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              -
                Effect: "Allow"
                Action:
                  - "dynamodb:DescribeTable"
                  - "dynamodb:UpdateTable"
                  - "cloudwatch:PutMetricAlarm"
                  - "cloudwatch:DescribeAlarms"
                  - "cloudwatch:GetMetricStatistics"
                  - "cloudwatch:SetAlarmState"
                  - "cloudwatch:DeleteAlarms"
                Resource: "*"
  WriteScalingPolicy:
    Type: AWS::ApplicationAutoScaling::ScalingPolicy
    Properties:
      PolicyName: WriteAutoScalingPolicy
      PolicyType: TargetTrackingScaling
      ScalingTargetId: !Ref WriteCapacityScalableTarget
      TargetTrackingScalingPolicyConfiguration:
        TargetValue: 50.0
        ScaleInCooldown: 60
        ScaleOutCooldown: 60
        PredefinedMetricSpecification:
          PredefinedMetricType: DynamoDBWriteCapacityUtilization
  ReadScalingPolicy:
    Type: AWS::ApplicationAutoScaling::ScalingPolicy
    Properties:
      PolicyName: ReadAutoScalingPolicy
      PolicyType: TargetTrackingScaling
      ScalingTargetId: !Ref ReadCapacityScalableTarget
      TargetTrackingScalingPolicyConfiguration:
        TargetValue: 50.0
        ScaleInCooldown: 60
        ScaleOutCooldown: 60
        PredefinedMetricSpecification:
          PredefinedMetricType: DynamoDBReadCapacityUtilization
51 changes: 51 additions & 0 deletions cfn/k8s-traffic-controller/templates/iam-role.yaml
@@ -0,0 +1,51 @@
AWSTemplateFormatVersion: "2010-09-09"

Description:
  Dummy IAM Role for the traffic-controller app

Outputs:
  TrafficControllerRoleArn:
    Value: !GetAtt K8sTrafficControllerRole.Arn
    Description: IAM role with required access for traffic-controller
  TrafficControllerRoleName:
    Value: !Ref K8sTrafficControllerRole
    Description: IAM role name with required access for traffic-controller

Resources:
  K8sTrafficControllerRole:
    Type: "AWS::IAM::Role"
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: "Allow"
            Principal:
              AWS:
                - !Sub "arn:aws:iam::${AWS::AccountId}:root" # Self account
            Action:
              - "sts:AssumeRole"
      Path: "/"
      ManagedPolicyArns:
        - !Ref 'IAMManagedPolicyTrafficControl'
      Policies:
        - PolicyName: "AssumeRolePolicyDocument"
          PolicyDocument:
            Statement:
              - Action:
                  - "iam:GetRole"
                Effect: "Allow"
                Resource: '*'

  IAMManagedPolicyTrafficControl:
    Type: AWS::IAM::ManagedPolicy
    Properties:
      Description: 'IAMManagedPolicyTrafficControl'
      Path: /
      PolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Action:
              - dynamodb:*
            Effect: Allow
            Resource: !Sub "arn:aws:dynamodb:eu-west-1:${AWS::AccountId}:table/k8s-traffic-controller"
