Allow upgrades on Azure without Terraform changes on LBs created from within Kubernetes #3257

Merged
merged 6 commits into from Oct 9, 2024

Changes from 5 commits
14 changes: 13 additions & 1 deletion docs/docs/reference/migration.md
@@ -3,7 +3,19 @@
This document describes breaking changes and migrations between Constellation releases.
Use [`constellation config migrate`](./cli.md#constellation-config-migrate) to automatically update an old config file to a new format.

## Migrating from Azure's service principal authentication to managed identity authentication

## Migrations to v2.19.0

### Azure

* To allow seamless upgrades on Azure when Kubernetes services of type `LoadBalancer` are deployed, the target
load balancer in which the `cloud-controller-manager` creates the load balancing rules was changed. Instead of using the load balancer
created and maintained by the CLI's Terraform code, the `cloud-controller-manager` now creates its own load balancer in Azure.
If there are services of type `LoadBalancer` inside your Constellation, please remove them before the upgrade and re-apply them
afterward, for example as sketched below.
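
For example, if such a service is managed with Terraform's `kubernetes` provider, a minimal sketch could look as follows (the service name, selector, and ports are illustrative; a plain YAML manifest is handled the same way):

```hcl
# Hypothetical example of a service that needs to be removed before the
# upgrade and re-applied afterward. Any service of type "LoadBalancer" is
# affected, regardless of how it was created.
resource "kubernetes_service" "example" {
  metadata {
    name = "my-app"
  }

  spec {
    type = "LoadBalancer"
    selector = {
      app = "my-app"
    }
    port {
      port        = 80
      target_port = 8080
    }
  }
}
```

In this setup, running `terraform destroy -target=kubernetes_service.example` before the upgrade and a regular `terraform apply` afterward re-creates the service against the new load balancer managed by the `cloud-controller-manager`.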


## Migrating from Azure's service principal authentication to managed identity authentication (during the upgrade to Constellation v2.8.0)

- The `provider.azure.appClientID` and `provider.azure.appClientSecret` fields are no longer supported and should be removed.
- To keep using an existing UAMI, add the `Owner` permission with the scope of your `resourceGroup`.
2 changes: 1 addition & 1 deletion internal/constellation/helm/overrides.go
@@ -243,7 +243,7 @@ func getCCMConfig(azureState state.Azure, serviceAccURI string) ([]byte, error)
ResourceGroup: azureState.ResourceGroup,
LoadBalancerSku: "standard",
SecurityGroupName: azureState.NetworkSecurityGroupName,
LoadBalancerName: azureState.LoadBalancerName,
LoadBalancerName:      "kubernetes-lb", // load balancer created and managed by the cloud-controller-manager itself
UseInstanceMetadata: true,
VMType: "vmss",
Location: creds.Location,
7 changes: 7 additions & 0 deletions terraform/infrastructure/aws/main.tf
@@ -55,6 +55,13 @@ locals {

in_cluster_endpoint = aws_lb.front_end.dns_name
out_of_cluster_endpoint = var.internal_load_balancer && var.debug ? module.jump_host[0].ip : local.in_cluster_endpoint
revision = 1
}

# A way to force replacement of resources if the provider does not want to replace them
# see: https://developer.hashicorp.com/terraform/language/resources/terraform-data#example-usage-data-for-replace_triggered_by
resource "terraform_data" "replacement" {
input = local.revision
}
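
# Illustrative, hypothetical consumer of the trigger above (not part of this
# diff): any resource can opt in to forced replacement by referencing
# terraform_data.replacement in replace_triggered_by. Bumping local.revision
# then replaces that resource on the next apply, even if none of its own
# arguments changed.
resource "terraform_data" "consumer_example" {
  input = "stand-in for a resource that must be recreated"

  lifecycle {
    replace_triggered_by = [terraform_data.replacement]
  }
}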

resource "random_id" "uid" {
43 changes: 32 additions & 11 deletions terraform/infrastructure/azure/main.tf
@@ -37,7 +37,6 @@ locals {
{ name = "kubernetes", port = "6443", health_check_protocol = "Https", path = "/readyz", priority = 100 },
{ name = "bootstrapper", port = "9000", health_check_protocol = "Tcp", path = null, priority = 101 },
{ name = "verify", port = "30081", health_check_protocol = "Tcp", path = null, priority = 102 },
{ name = "konnectivity", port = "8132", health_check_protocol = "Tcp", path = null, priority = 103 },
{ name = "recovery", port = "9999", health_check_protocol = "Tcp", path = null, priority = 104 },
{ name = "join", port = "30090", health_check_protocol = "Tcp", path = null, priority = 105 },
var.debug ? [{ name = "debugd", port = "4000", health_check_protocol = "Tcp", path = null, priority = 106 }] : [],
@@ -53,6 +52,13 @@ locals {

in_cluster_endpoint = var.internal_load_balancer ? azurerm_lb.loadbalancer.frontend_ip_configuration[0].private_ip_address : azurerm_public_ip.loadbalancer_ip[0].ip_address
out_of_cluster_endpoint = var.debug && var.internal_load_balancer ? module.jump_host[0].ip : local.in_cluster_endpoint
revision = 1
}

# A way to force replacement of resources if the provider does not want to replace them
# see: https://developer.hashicorp.com/terraform/language/resources/terraform-data#example-usage-data-for-replace_triggered_by
resource "terraform_data" "replacement" {
input = local.revision
}

resource "random_id" "uid" {
@@ -223,10 +229,13 @@ resource "azurerm_network_security_group" "security_group" {
tags = local.tags

dynamic "security_rule" {
for_each = concat(
local.ports,
[{ name = "nodeports", port = local.ports_node_range, priority = 200 }]
)
# We keep this rule for one last release since the azurerm provider does not
# support moving inline security rules (like this one) to standalone
# azurerm_network_security_rule resources. Even worse, defining the
# azurerm_network_security_group without a "security_rule" block does NOT
# remove the existing rules; it does nothing.
# TODO(@3u13r): remove the "security_rule" block in the next release after this code has landed,
# i.e., after 2.19, or after 2.18.X in case of a cherry-picked release.
for_each = [{ name = "konnectivity", priority = 1000, port = 8132 }]
content {
name = security_rule.value.name
priority = security_rule.value.priority
@@ -241,6 +250,24 @@ resource "azurerm_network_security_group" "security_group" {
}
}

resource "azurerm_network_security_rule" "nsg_rule" {
for_each = {
for o in local.ports : o.name => o
}

name = each.value.name
priority = each.value.priority
direction = "Inbound"
access = "Allow"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = each.value.port
source_address_prefix = "*"
destination_address_prefix = "*"
resource_group_name = var.resource_group
network_security_group_name = azurerm_network_security_group.security_group.name
}

module "scale_set_group" {
source = "./modules/scale_set"
for_each = var.node_groups
@@ -268,12 +295,6 @@ module "scale_set_group" {
subnet_id = azurerm_subnet.node_subnet.id
backend_address_pool_ids = each.value.role == "control-plane" ? [module.loadbalancer_backend_control_plane.backendpool_id] : []
marketplace_image = var.marketplace_image

# We still depend on the backends, since we are not sure if the VMs inside the VMSS have been
# "updated" to the new version (note: this is the update in Azure which "refreshes" the NICs and not
# our Constellation update).
# TODO(@3u13r): Remove this dependency after v2.18.0 has been released.
depends_on = [module.loadbalancer_backend_worker, azurerm_lb_backend_address_pool.all]
}

module "jump_host" {
1 change: 1 addition & 0 deletions terraform/infrastructure/azure/modules/scale_set/main.tf
@@ -122,6 +122,7 @@ resource "azurerm_linux_virtual_machine_scale_set" "scale_set" {
instances, # required. autoscaling modifies the instance count externally
source_image_id, # required. update procedure modifies the image id externally
source_image_reference, # required. update procedure modifies the image reference externally
network_interface[0].ip_configuration[0].load_balancer_backend_address_pool_ids, # required. the cloud-controller-manager attaches the backend pool of its own load balancer externally
]
}
}
7 changes: 7 additions & 0 deletions terraform/infrastructure/gcp/main.tf
@@ -60,6 +60,13 @@ locals {
]
in_cluster_endpoint = var.internal_load_balancer ? google_compute_address.loadbalancer_ip_internal[0].address : google_compute_global_address.loadbalancer_ip[0].address
out_of_cluster_endpoint = var.debug && var.internal_load_balancer ? module.jump_host[0].ip : local.in_cluster_endpoint
revision = 1
}

# A way to force replacement of resources if the provider does not want to replace them
# see: https://developer.hashicorp.com/terraform/language/resources/terraform-data#example-usage-data-for-replace_triggered_by
resource "terraform_data" "replacement" {
input = local.revision
}

resource "random_id" "uid" {
7 changes: 7 additions & 0 deletions terraform/infrastructure/openstack/main.tf
@@ -59,6 +59,13 @@ locals {
cloudsyaml_path = length(var.openstack_clouds_yaml_path) > 0 ? var.openstack_clouds_yaml_path : "~/.config/openstack/clouds.yaml"
cloudsyaml = yamldecode(file(pathexpand(local.cloudsyaml_path)))
cloudyaml = local.cloudsyaml.clouds[var.cloud]
revision = 1
}

# A way to force replacement of resources if the provider does not want to replace them
# see: https://developer.hashicorp.com/terraform/language/resources/terraform-data#example-usage-data-for-replace_triggered_by
resource "terraform_data" "replacement" {
input = local.revision
}

resource "random_id" "uid" {
7 changes: 7 additions & 0 deletions terraform/infrastructure/qemu/main.tf
@@ -23,6 +23,13 @@ locals {
cidr_vpc_subnet_nodes = "10.42.0.0/22"
cidr_vpc_subnet_control_planes = "10.42.1.0/24"
cidr_vpc_subnet_worker = "10.42.2.0/24"
revision = 1
}

# A way to force replacement of resources if the provider does not want to replace them
# see: https://developer.hashicorp.com/terraform/language/resources/terraform-data#example-usage-data-for-replace_triggered_by
resource "terraform_data" "replacement" {
input = local.revision
}

resource "random_password" "init_secret" {