add health checks to controller and router charts #107

Open
qrkourier opened this issue May 30, 2023 · 2 comments

Comments

@qrkourier
Member

K8s will use each type of probe that we make available. A successful probe means:

  1. startup: the app has finished starting up, so K8s can go ahead and run the readiness and liveness probes
  2. readiness: the app is ready to receive incoming requests, so tell the LB to begin forwarding to the pod
  3. liveness: the app is still healthy, so don't restart the container right now

Ref: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/

@qrkourier
Member Author

qrkourier commented Mar 4, 2024

The router config supports a health check section (link to reference) that we currently use to verify the health of a router, e.g., ctrlPingCheck verifies that the router is connected to the controller.

It doesn't specifically show that the router is "ready to receive traffic", because that could mean many things: the router could be connected on certain links while others are having issues or are down. It is just a general check to ensure the router receives updates from the controller.

The most recent addition to the health checks is linkCheck, which lets you verify that the router has a link, or even a specific link, to another router, indicating it's ready to transport traffic.

To add the health checks to the ER (edge router), add something like this to the ER config:

healthChecks:
  ctrlPingCheck:
    interval: 30s
    timeout: 15s
    initialDelay: 15s
  linkCheck:
    minLinks: 1
    interval: 5s
    initialDelay: 5s

Then add a web section to the ER config like this:

web:
  - name: health-check
    bindPoints:
      - interface: 0.0.0.0:8081
        address: 0.0.0.0:8081
    apis:
      - binding: health-checks

With both sections in place, you can GET https://localhost:8081/health-checks

That request would produce output something like this:

{
    "data": {
        "checks": [
            {
                "details": null,
                "healthy": true,
                "id": "controllerPing",
                "lastCheckDuration": "4.344µs",
                "lastCheckTime": "2024-02-13T18:40:01Z"
            },
            {
                "details": [
                    {
                        "linkId": "lndLOpwd7yOcSXtCcPwWf",
                        "destRouterId": "j.LOxzd9A",
                        "latency": 3271785.96875,
                        "addresses": {
                            "ack": {
                                "localAddr": "tcp:10.19.116.60:443",
                                "remoteAddr": "tcp:34.199.168.165:61031"
                            },
                            "payload": {
                                "localAddr": "tcp:10.19.116.60:443",
                                "remoteAddr": "tcp:34.199.168.165:65235"
                            }
                        }
                    },
                    {
                        "linkId": "3W72EY2a0inbyCHIdYk6Gd",
                        "destRouterId": "f9fs.nvej",
                        "latency": 84934025.1015625,
                        "addresses": {
                            "ack": {
                                "localAddr": "tcp:10.19.116.60:443",
                                "remoteAddr": "tcp:35.181.192.76:42764"
                            },
                            "payload": {
                                "localAddr": "tcp:10.19.116.60:443",
                                "remoteAddr": "tcp:35.181.192.76:42750"
                            }
                        }
                    },
                    {
                        "linkId": "1yp3sDwqj6CHui4Zmt89wB",
                        "destRouterId": "PKud5nLtj",
                        "latency": 188746151.2265625,
                        "addresses": {
                            "ack": {
                                "localAddr": "tcp:10.19.116.60:58616",
                                "remoteAddr": "tcp:52.66.46.9:443"
                            },
                            "payload": {
                                "localAddr": "tcp:10.19.116.60:52392",
                                "remoteAddr": "tcp:52.66.46.9:443"
                            }
                        }
                    },
                    {
                        "linkId": "5wjStaisGHkT5Xu0fQrfEq",
                        "destRouterId": "bnq85xLt3",
                        "latency": 201215039.2109375,
                        "addresses": {
                            "ack": {
                                "localAddr": "tcp:10.19.116.60:443",
                                "remoteAddr": "tcp:18.61.94.28:48878"
                            },
                            "payload": {
                                "localAddr": "tcp:10.19.116.60:443",
                                "remoteAddr": "tcp:18.61.94.28:48864"
                            }
                        }
                    },
                    {
                        "linkId": "4e0RACw8dsguV43BrAsQfc",
                        "destRouterId": "u6Q1QSPulm",
                        "latency": 3872621.265625,
                        "addresses": {
                            "ack": {
                                "localAddr": "tcp:10.19.116.60:443",
                                "remoteAddr": "tcp:132.145.157.243:48868"
                            },
                            "payload": {
                                "localAddr": "tcp:10.19.116.60:443",
                                "remoteAddr": "tcp:132.145.157.243:48864"
                            }
                        }
                    },
                    {
                        "linkId": "6AmAWcfB50a2OzFvlsr1vn",
                        "destRouterId": "7fTQPzdt7d",
                        "latency": 3430749.8671875,
                        "addresses": {
                            "ack": {
                                "localAddr": "tcp:10.19.116.60:443",
                                "remoteAddr": "tcp:3.217.193.94:50371"
                            },
                            "payload": {
                                "localAddr": "tcp:10.19.116.60:443",
                                "remoteAddr": "tcp:3.217.193.94:25962"
                            }
                        }
                    },
                    {
                        "linkId": "1CnhJJ73e1AiKjoVoRy3tt",
                        "destRouterId": "oWeCqGOcJ",
                        "latency": 2906363.9140625,
                        "addresses": {
                            "ack": {
                                "localAddr": "tcp:10.19.116.60:443",
                                "remoteAddr": "tcp:52.54.127.95:56812"
                            },
                            "payload": {
                                "localAddr": "tcp:10.19.116.60:443",
                                "remoteAddr": "tcp:52.54.127.95:56796"
                            }
                        }
                    },
                    {
                        "linkId": "3vfjSpoNfoYbyqIBbT7ZKx",
                        "destRouterId": "R7nKHgLtj",
                        "latency": 66181343.484375,
                        "addresses": {
                            "ack": {
                                "localAddr": "tcp:10.19.116.60:45646",
                                "remoteAddr": "tcp:44.225.183.166:443"
                            },
                            "payload": {
                                "localAddr": "tcp:10.19.116.60:45640",
                                "remoteAddr": "tcp:44.225.183.166:443"
                            }
                        }
                    },
                    {
                        "linkId": "2MaMaVsqFNtvhRejCpyR7y",
                        "destRouterId": "s3FjWqdlS",
                        "latency": 70628346.7578125,
                        "addresses": {
                            "ack": {
                                "localAddr": "tcp:10.19.116.60:443",
                                "remoteAddr": "tcp:54.77.98.202:57722"
                            },
                            "payload": {
                                "localAddr": "tcp:10.19.116.60:443",
                                "remoteAddr": "tcp:54.77.98.202:57720"
                            }
                        }
                    },
                    {
                        "linkId": "2EVj6GaGBr1KEFXSeypc3i",
                        "destRouterId": "joI2Wqdlb",
                        "latency": 187742628.4765625,
                        "addresses": {
                            "ack": {
                                "localAddr": "tcp:10.19.116.60:38150",
                                "remoteAddr": "tcp:15.207.241.220:443"
                            },
                            "payload": {
                                "localAddr": "tcp:10.19.116.60:38136",
                                "remoteAddr": "tcp:15.207.241.220:443"
                            }
                        }
                    }
                ],
                "healthy": true,
                "id": "link.health",
                "lastCheckDuration": "120.997µs",
                "lastCheckTime": "2024-02-13T18:40:06Z"
            }
        ],
        "healthy": true
    },
    "meta": {}
}
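
To tie this back to the three probe types in the first comment, here is a rough sketch of how the router chart's container spec might point all three probes at this endpoint. It is not taken from the charts. The container name, the port name `health`, and every timing value are placeholders chosen for illustration. It also assumes the endpoint answers with a status outside 200-399 when the top-level `healthy` is false, which should be confirmed against the router before relying on it.

```yaml
# Sketch only: container-level fragment of a router pod template.
# Assumes the web section above is in place (health-checks API on 0.0.0.0:8081)
# and that an unhealthy router answers with a non-2xx/3xx status (unverified).
containers:
  - name: ziti-router            # hypothetical container name
    ports:
      - name: health             # hypothetical port name
        containerPort: 8081
    startupProbe:
      httpGet:
        path: /health-checks
        port: health
        scheme: HTTPS            # kubelet skips certificate verification for HTTPS probes
      periodSeconds: 5
      failureThreshold: 24       # allows up to 5s x 24 = 120s for startup
    readinessProbe:
      httpGet:
        path: /health-checks
        port: health
        scheme: HTTPS
      periodSeconds: 10
      failureThreshold: 2        # pod is removed from Service endpoints after 2 misses
    livenessProbe:
      httpGet:
        path: /health-checks
        port: health
        scheme: HTTPS
      periodSeconds: 30
      timeoutSeconds: 5
      failureThreshold: 3        # container is restarted after 3 consecutive misses
```

One design point to weigh: if the endpoint reports the combined result of ctrlPingCheck and linkCheck, then a router that has lost its links would also fail the liveness probe and be restarted, which may not help when the cause is on the far side of the link. Using this endpoint for startup and readiness only, and something narrower for liveness, is one possible alternative.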

@qrkourier
Member Author

Link to controller health-checks reference: https://openziti.io/docs/reference/configuration/controller#healthchecks
