
hotfix: create new kubernetes client per request #163

Conversation

@joshuarli commented Dec 29, 2022

If more than one job is started simultaneously with the same secretConfig (easy to reproduce: pretty much any stage with two or more jobs and no k8s secrets referenced at the job level), the client is reused, but the first job that succeeds closes it, so the remaining client calls invariably fail because the underlying okhttp thread pool has been shut down.

if more than one job is started simultaneously with the same secretConfig,
the client will be reused, but the first job that succeeds will close it
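
To illustrate the race, here is a minimal standalone sketch (not the plugin's actual code; the client and config names are made up). The shared client is closed in a finally block by whichever job finishes first, and the other job then fails:

```java
// Hypothetical sketch of the failure mode, not the plugin's real code.
// FakeK8sClient stands in for a fabric8-style client whose close() shuts down its okhttp pool.
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class ClientReuseBug {
    // one client cached per secret-config key, shared by all jobs in the stage
    private static final Map<String, FakeK8sClient> CLIENTS = new ConcurrentHashMap<>();

    static String lookupSecret(String secretConfigKey, String secretName) {
        FakeK8sClient client = CLIENTS.computeIfAbsent(secretConfigKey, k -> new FakeK8sClient());
        try {
            return client.getSecret(secretName);
        } finally {
            // BUG: the first job to finish closes the *shared* client;
            // a concurrent job using the same client then fails.
            client.close();
        }
    }

    // stand-in for a real Kubernetes client
    static class FakeK8sClient implements AutoCloseable {
        private volatile boolean closed;
        String getSecret(String name) {
            if (closed) throw new IllegalStateException("client already closed (okhttp pool shut down)");
            return "value-of-" + name;
        }
        @Override public void close() { closed = true; }
    }

    public static void main(String[] args) throws InterruptedException {
        // two "jobs" sharing the same secretConfig; depending on timing, one fails
        Runnable job = () -> System.out.println(lookupSecret("pipeline-level-secret-config", "db-password"));
        Thread a = new Thread(job);
        Thread b = new Thread(job);
        a.start(); b.start();
        a.join(); b.join();
    }
}
```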
@joshuarli (Author)
Oops, I meant to open this PR on our own fork: getsentry#1

Now that this is open, does anyone have ideas on what a more proper fix would look like? Ideally clients would be reused; I'm thinking maybe a job identifier or something could be used in addition to secretConfig.

@chadlwilson (Member) commented Dec 30, 2022

The code does seem problematic at first glance. Without having actively tried this plugin recently, I am a bit surprised it works at all after the first request, since it seems to close the client regardless of whether secrets retrieval succeeds or fails. What am I missing?

I'd have thought the client should just not be closed by the caller at all, given it is intended to be re-used. The very heavily used plugin at https://github.com/gocd/kubernetes-elastic-agents doesn't close the client except when the factory decides to clear it out: https://github.com/gocd/kubernetes-elastic-agents/blob/d898615d213ea5467b0e1960064c4227fa10b4f8/src/main/java/cd/go/contrib/elasticagent/KubernetesClientFactory.java#L86-L92.

Have you tried removing the finally block that closes the client?
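
Roughly the shape of what that factory does (a sketch only; SecretConfig and buildClient are placeholder names here, not this plugin's API, and it assumes a fabric8-style client):

```java
// Loosely modeled on kubernetes-elastic-agents' KubernetesClientFactory:
// keep one client and only close it when the configuration changes or the
// factory is explicitly cleared. Callers never close the client themselves.
import io.fabric8.kubernetes.client.KubernetesClient;

class KubernetesClientFactorySketch {
    private KubernetesClient client;
    private SecretConfig clientConfig;

    public synchronized KubernetesClient client(SecretConfig config) {
        if (client != null && config.equals(clientConfig)) {
            return client;               // reuse; caller must NOT close it
        }
        clearOutExistingClient();        // close the stale client before building a new one
        client = buildClient(config);
        clientConfig = config;
        return client;
    }

    // the only place the client is ever closed
    public synchronized void clearOutExistingClient() {
        if (client != null) {
            client.close();
            client = null;
        }
    }

    private KubernetesClient buildClient(SecretConfig config) {
        // plugin-specific construction (cluster URL, token, CA cert) would go here
        throw new UnsupportedOperationException("sketch only");
    }

    // placeholder for whatever config object the plugin passes around
    record SecretConfig(String clusterUrl, String token, String caCert) {}
}
```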

@joshuarli (Author)

You aren't missing anything; I just think that secretConfig is scoped to a stage execution more than you might imagine:

SecretConfig secretConfig = request.getConfiguration();

Our feedback loop is kind of laborious, but we had a stage with 2+ jobs and only k8s secrets defined at the pipeline level, which resulted in the same secretConfig for each job. Maybe this just isn't a common enough scenario? I was also surprised not to find anything in the issue tracker or the GoCD Google group discussions.

Removing the finally is another approach, but you'd also have to close the old client before a new one is created; otherwise you'd start leaking connections and memory over time.

I think a better fix here would be to key client reuse only on the relevant pieces of the secretConfig, and better still, maintain a mapping so multiple clients can be cached. We're happy with the hotfix for now though, and I know basically zero Java, haha.
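
Something like this is what I have in mind (a rough sketch; the config field names are guesses about what the secret config carries, and the builder calls assume the fabric8 client 6.x API):

```java
// Sketch of a "map of clients" cache: key only on the fields that affect the
// connection, never close from request handlers, and evict (closing) explicitly.
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import io.fabric8.kubernetes.client.Config;
import io.fabric8.kubernetes.client.ConfigBuilder;
import io.fabric8.kubernetes.client.KubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClientBuilder;

class MultiClientCacheSketch {
    // cache key built only from the connection-relevant pieces of the config
    record ClientKey(String clusterUrl, String caCert, String token) {}

    private final Map<ClientKey, KubernetesClient> clients = new ConcurrentHashMap<>();

    KubernetesClient clientFor(String clusterUrl, String caCert, String token) {
        ClientKey key = new ClientKey(clusterUrl, caCert, token);
        return clients.computeIfAbsent(key, k -> {
            Config config = new ConfigBuilder()
                    .withMasterUrl(k.clusterUrl())
                    .withCaCertData(k.caCert())
                    .withOauthToken(k.token())
                    .build();
            return new KubernetesClientBuilder().withConfig(config).build();
        });
    }

    // close a client only when its configuration is removed or replaced,
    // so connections aren't leaked as configs change over time
    void evict(String clusterUrl, String caCert, String token) {
        KubernetesClient old = clients.remove(new ClientKey(clusterUrl, caCert, token));
        if (old != null) {
            old.close();
        }
    }
}
```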

@chadlwilson (Member)

Alright, thanks. Let me take a closer look at the secret config later and set up a local env. It should be possible to release at least the simple fix so you don't need to maintain a fork.

@chadlwilson (Member)

Can you have a try with https://github.com/gocd/gocd-kubernetes-based-secrets-plugin/releases/tag/v1.2.2-169 and create a new issue if you see any problems?

I was easily able to replicate the issues here with simpler setups. Leaking clients is still potentially an issue (#166), but I note the much more heavily used plugin at https://github.com/gocd/kubernetes-elastic-agents suffers from the same problem. At least that plugin's config didn't suffer from the same "uniqueness" challenges as this one's, which I have changed here to mitigate the leakage.

I can only conclude this plugin isn't so frequently used :-/

@chadlwilson closed this Jan 8, 2023
@joshuarli deleted the hotfix-create-new-client-for-each-request branch Jan 24, 2023
@joshuarli (Author)

@chadlwilson sorry for the delay; I'd been meaning to try out your new release. It works on my end! Thanks for your time.
