FIX: Reconcile traefik service with canary at 0
Setting the weight to 100 on both services sends 50% of the traffic
to each service. This made our canary enter an infinite loop while
promoting a new version, and the traefik service got altered.
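
To make the 50% figure concrete, here is a minimal sketch of how a weighted round-robin split turns weights into traffic shares. `trafficShare` is a hypothetical helper written for this illustration; it is not part of flagger or Traefik, and it assumes traffic is divided in proportion to each weight over the sum of all weights.

```go
package main

import "fmt"

// trafficShare returns the percentage of requests a backend receives when
// traffic is split in proportion to the configured weights.
// Hypothetical helper for illustration only.
func trafficShare(weight, total int) float64 {
	if total == 0 {
		return 0 // no weights configured, nothing is routed
	}
	return 100 * float64(weight) / float64(total)
}

func main() {
	// Before this commit: primary and canary are both reconciled at 100.
	fmt.Println(trafficShare(100, 100+100)) // prints 50

	// After this commit: primary stays at 100, canary is reconciled at 0.
	fmt.Println(trafficShare(0, 100+0)) // prints 0
}
```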

The traefik service should not be changed, as it is managed by flagger,
but getting stuck in an infinite loop is not great. The loop happened
because, during promotion with `StepWeightPromotion`, the weights are
reset when the traefik service gets reconciled. After that, `GetRoutes`
makes [this
calculation](https://github.com/fluxcd/flagger/blob/9a224a0c906354fcfcbc01d4d2df987389301e68/pkg/router/traefik.go#L163-L164)
for the weights, which returns 0 for the canary, and the promotion is
then unable to exit
[this block](https://github.com/fluxcd/flagger/blob/v1.36.1/pkg/controller/scheduler.go#L491-L546).
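
The mismatch described above can be sketched as follows. This is a simplified illustration, not the linked flagger code: `readBackWeights` is a hypothetical function that assumes the read-back treats the stored primary weight as a percentage and derives the canary weight as the remainder.

```go
package main

import "fmt"

// readBackWeights mimics a read-back that treats the stored primary weight
// as a percentage and derives the canary weight as 100 minus that value.
// Assumption: simplified from the commit description; the real GetRoutes
// may differ in detail.
func readBackWeights(storedPrimary int) (primary, canary int) {
	primary = storedPrimary
	canary = 100 - primary
	return primary, canary
}

func main() {
	// Reconcile with the old default stored 100 for both services,
	// which a proportional balancer splits 50/50.
	storedPrimary, storedCanary := 100, 100

	// The read-back ignores the stored canary weight entirely.
	p, c := readBackWeights(storedPrimary)

	fmt.Printf("stored:    primary=%d canary=%d\n", storedPrimary, storedCanary)
	fmt.Printf("read back: primary=%d canary=%d\n", p, c)
}
```

Under these assumptions the controller sees a canary weight of 0 while half of the traffic still reaches the canary. Reconciling the canary at 0 keeps the stored weights and the read-back consistent.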

Besides this change, do you know why we are treating the weights as
percentages? Should I also change the `GetRoutes` function to calculate
the percentage based on the weights, or is it coded like that because
flagger is expected to keep the weights within those constraints?
joaosilva15 committed Jul 31, 2024
1 parent b6ac5e1 commit f4b2c37
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion pkg/router/traefik.go
@@ -108,7 +108,7 @@ func (tr *TraefikRouter) Reconcile(canary *flaggerv1.Canary) error {
 	Name: canaryName,
 	Namespace: canary.Namespace,
 	Port: canary.Spec.Service.Port,
-	Weight: 100,
+	Weight: 0,
 },
 )
 }
