Performance problems with shared_client option #148

Open
oliveigah opened this issue Jul 18, 2024 · 1 comment

Comments

@oliveigah
Contributor

oliveigah commented Jul 18, 2024

Some time ago I implemented the shared_client option (#133). At the time, my understanding was that it would almost never impact performance, but I was wrong: it can severely impact performance if your pipeline produces Broadway messages from multiple topics/partitions.

I'm using this feature on some high-throughput topics in production and it all works fine.

But recently I added it to a Broadway pipeline that is responsible for multiple topics, each with multiple partitions, and performance dropped enormously.

From my investigation, AFAICT it boils down to the kafka_protocol connection implementation, which makes a GenServer call to the connection pid for each fetch request. That call blocks the Broadway producer issuing it on the poll event, and since multiple producers use the same connection, the producers block each other.

One workaround is to split your topics into one pipeline each, but depending on how you run it in production, the same pipeline may still be responsible for many partitions, so you can still see some performance impact.

The performance drop is most noticeable if you have high-throughput topics alongside very low-volume topics, because of batch waiting times. (This was my case.)
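To make the blocking effect concrete, here is a rough back-of-the-envelope model in Python. It is not broadway_kafka or brod code, just a sketch under stated assumptions: each fetch occupies the shared connection for its full duration, producers take turns fairly, and a fetch on a quiet partition lasts as long as the batch wait. The partition names and the 5 ms / 500 ms timings are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partition:
    name: str
    fetch_ms: float  # assumed time one fetch request occupies a connection


def dedicated_fetch_rate(partition):
    """Fetches per second when the producer owns its connection.

    Nothing else competes for the connection, so the rate depends
    only on this partition's own fetch time.
    """
    return 1000.0 / partition.fetch_ms


def shared_fetch_rate(partitions):
    """Fetches per second for each partition on one shared connection.

    Assumes blocking calls are served one at a time in fair rotation,
    so every partition gets one fetch per full round of all partitions.
    """
    round_ms = sum(p.fetch_ms for p in partitions)
    return 1000.0 / round_ms


# Hypothetical mix: one busy partition that answers quickly and two
# quiet partitions whose fetches sit out the whole batch wait.
busy = Partition("busy-topic/0", fetch_ms=5.0)
quiet_a = Partition("quiet-topic/0", fetch_ms=500.0)
quiet_b = Partition("quiet-topic/1", fetch_ms=500.0)
partitions = [busy, quiet_a, quiet_b]

print(f"busy partition, dedicated connection: {dedicated_fetch_rate(busy):.1f} fetches/s")
print(f"busy partition, shared connection:    {shared_fetch_rate(partitions):.2f} fetches/s")
```

In this model the busy partition is throttled to the pace of the slowest fetches sharing the connection, which matches the pattern described above. Real numbers will depend on the actual wait settings and on how the connection schedules requests.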

In summary, the current implementation of the shared client can drastically reduce connection and memory usage but, contrary to what I thought, the performance impact can be quite high!

I'm not sure how to tackle this problem, and I'm fairly sure it cannot be solved on the broadway_kafka side, as it seems to be caused by an implementation detail inside brod/kafka_protocol.

I'll try to research brod to see if I can find a way to do it differently here, but I have very little free time right now, so it may be slow.

I'm sharing this to help others who may have the same problem and also to gather ideas on how we can approach it. Thanks!

@josevalim
Member

Can you please send a PR to the docs with some notes? Thanks!
