Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: logging and pf improvements, skip namespaces & non-ClusterIP svcs #249

Merged
merged 3 commits into from
Aug 3, 2023

Conversation

jaredallard
Copy link
Contributor

This PR contains a number of reliability and UX improvements for using
localizer in a larger cluster.

  • When creating a port-forward, we don't pass a stop channel anymore.
    This was previously passed ctx.Done() which would silently stop the
    port-forward on ^C. This caused invalid "use of closed connection"
    logging that was confusing to the user but also possibly could end up
    in a port-forward not being fully stopped.
  • When shutting down, pass a temporary context that has a timeout of 30
    seconds. This ensures that /etc/host modifications, ip pool
    cleanups, and other shutdown functions properly finish instead of
    possibly being terminated midway through during normal shutdown.
  • Enabled delibird to remove tracing/metrics information being sent by
    default in localizer binaries. This is a OSS friendly way of us
    getting telemetry by logging to disk instead, which a user can opt to
    send us when reporting bugs (more on that in future work!)
  • Upgraded all dependencies, including client-go. We still need a
    fork, but thankfully our fork is a single line now so it's much more
    likely we can upstream the change we made (the other change was
    upstreamed a year ago by another user!)
  • Exposes stderr from the port-forwarder to the console with a colored
    prefix. This makes it easier to tell when a port-forward was busted
    but localizer didn't detect it (or if it did, why it failed!)
  • Skipped namespaces by default (kube-system) and enabled the user to
    pass --skip-namespace to provide more (helps Run against multiple namespaces, but not all #212).
  • Skip non-clusterIP services. These aren't addressable in the cluster,
    so we shouldn't create a tunnel for them. This helps with NodePort
    and LoadBalancer service overhead that some larger clusters may
    have.
  • Reduced logging to be easier to read, while keeping a lot of helpful
    logs in the debug level. Defaults to info logging instead of
    debug logging (pass --log-level debug for that!)

This PR contains a number of reliability and UX improvements for using
localizer in a larger cluster.

* When creating a port-forward, we don't pass a stop channel anymore.
  This was previously passed `ctx.Done()` which would silently stop the
  port-forward on ^C. This caused invalid "use of closed connection"
  logging that was confusing to the user but also possibly could end up
  in a port-forward not being fully stopped.
* When shutting down, pass a temporary context that has a timeout of 30
  seconds. This ensures that `/etc/host` modifications, ip pool
  cleanups, and other shutdown functions properly finish instead of
  possibly being terminated midway through during normal shutdown.
* Enabled delibird to remove tracing/metrics information being sent by
  default in localizer binaries. This is a OSS friendly way of us
  getting telemetry by logging to disk instead, which a user can opt to
  send us when reporting bugs (more on that in future work!)
* Upgraded all dependencies, including `client-go`. We still need a
  fork, but thankfully our fork is a single line now so it's much more
  likely we can upstream the change we made (the other change was
  upstreamed a year ago by another user!)
* Exposes `stderr` from the port-forwarder to the console with a colored
  prefix. This makes it easier to tell when a port-forward was busted
  but localizer didn't detect it (or if it did, why it failed!)
* Skipped namespaces by default (`kube-system`) and enabled the user to
  pass `--skip-namespace` to provide more (helps #212).
* Skip non-clusterIP services. These aren't addressable in the cluster,
  so we shouldn't create a tunnel for them. This helps with `NodePort`
  and `LoadBalancer` service overhead that some larger clusters may
  have.
* Reduced logging to be easier to read, while keeping a lot of helpful
  logs in the `debug` level. Defaults to `info` logging instead of
  `debug` logging (pass `--log-level debug` for that!)
@jaredallard jaredallard requested a review from a team as a code owner August 3, 2023 20:10
return nil
}

var skip bool
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is inefficent, but I figured this shouldn't be in a hot enough code path that this would matter.

Copy link

@kaldorn kaldorn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this mostly makes sense. I have a comment/question.

internal/ssh/client.go Show resolved Hide resolved
internal/server/grpc.go Outdated Show resolved Hide resolved
@jaredallard jaredallard merged commit 09e601e into main Aug 3, 2023
1 check passed
@jaredallard jaredallard deleted the jaredallard/fix/improvements branch August 3, 2023 20:57
@getoutreach-ci-1
Copy link
Contributor

🎉 This PR is included in version 1.15.2 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀

@camerondavison
Copy link

It does not seem like the --skip-namespace flag actually got hooked up to anything. is that on purpose?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants