engine: fix several file descriptor leaks. #8393
Signed-off-by: Phillip Whelan <[email protected]>
@@ -1135,6 +1135,12 @@ int flb_engine_shutdown(struct flb_config *config)
        flb_hs_destroy(config->http_ctx);
    }
#endif
    if (config->evl) {
It seems to me that the event loop is not being destroyed before restart. I would suggest identifying the root cause of this.
I will look into this. This specific piece of code was added to mitigate crashes in the following tests:
- `flb-rt-config_map_opts`
- `flb-rt-custom_calyptia_test`
- `flb-rt-filter_stdout`
  - `case_insensitive_name`

The other `flb-rt-filter_stdout` test, `json_multiple`, does not crash in the same manner as `case_insensitive_name`.
This is most likely due to the manner in which these tests use the fluent-bit API to instantiate their pipelines. The `case_insensitive_name` test executes `flb_destroy` without first calling `flb_stop`. When I tried calling `flb_stop` in the test, it crashed at the `pthread_join` call inside `flb_stop`. This is most likely because the test also never calls `flb_start` to initialize the thread.
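The failure mode described above is consistent with joining a thread that was never created. As a minimal sketch only (this is not fluent-bit code; `toy_ctx`, `toy_start`, and `toy_stop` are hypothetical names invented for illustration), a context can record whether its worker thread was started and refuse to join otherwise:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a library context; NOT the real flb_ctx_t. */
struct toy_ctx {
    pthread_t worker;
    int worker_started;   /* set only after pthread_create() succeeds */
};

/* Trivial worker body: returns immediately. */
void *toy_worker(void *arg)
{
    (void) arg;
    return NULL;
}

/* Start the worker thread. Returns 0 on success, -1 on failure. */
int toy_start(struct toy_ctx *ctx)
{
    assert(ctx != NULL);
    if (pthread_create(&ctx->worker, NULL, toy_worker, NULL) != 0) {
        return -1;
    }
    ctx->worker_started = 1;
    return 0;
}

/*
 * Join the worker only if it was actually started. Returns 0 when a
 * thread was joined and -1 when there was nothing to join, instead of
 * passing an uninitialized pthread_t to pthread_join().
 */
int toy_stop(struct toy_ctx *ctx)
{
    assert(ctx != NULL);
    if (!ctx->worker_started) {
        return -1;
    }
    if (pthread_join(ctx->worker, NULL) != 0) {
        return -1;
    }
    ctx->worker_started = 0;
    return 0;
}
```

With a zero-initialized `struct toy_ctx`, calling `toy_stop` before `toy_start` returns -1 rather than crashing. Whether a similar flag would suit `flb_stop` is an open question that this sketch does not answer.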
The easiest way out of this would be to simply leave the check there. Another alternative would be to move the deletion of the event channel into `flb_stop`, where it might be better placed. This would of course break the symmetry with the channel being created in the `flb_engine_start` function inside flb_engine.c. If we move the destruction of the channel to `flb_stop`, we should probably also move its creation to `flb_start`. At the moment I have no idea what the consequences of this would be. The most obvious one is that the channel would be linked to `ctx->event_channel` inside the `flb_ctx_t` instead of to `config->event_thread_init`, which is linked to the configuration.
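To illustrate the first option (keeping the check), here is a small sketch of a paired start/shutdown in which the shutdown side tolerates a start that never happened. The types and functions (`toy_evl`, `toy_config`, `toy_engine_start`, `toy_engine_shutdown`) are hypothetical stand-ins, not the real fluent-bit structures, and the body of the guarded branch is an assumption, since the diff only shows the `if (config->evl)` line:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; NOT the real flb_config or event loop types. */
struct toy_evl {
    int fd_count;
};

struct toy_config {
    struct toy_evl *evl;   /* stays NULL until toy_engine_start() runs */
};

/* Creation side of the pair. Returns 0 on success, -1 on failure. */
int toy_engine_start(struct toy_config *config)
{
    assert(config != NULL);
    if (config->evl != NULL) {
        return 0;          /* already created, nothing to do */
    }
    config->evl = calloc(1, sizeof(struct toy_evl));
    return config->evl != NULL ? 0 : -1;
}

/*
 * Destruction side of the pair. The NULL check makes it safe to call
 * even when toy_engine_start() never ran. Returns 1 if something was
 * released and 0 if there was nothing to release.
 */
int toy_engine_shutdown(struct toy_config *config)
{
    assert(config != NULL);
    if (config->evl) {
        free(config->evl);
        config->evl = NULL;   /* makes a second shutdown a no-op */
        return 1;
    }
    return 0;
}
```

Creation and destruction stay in the same module here, and the guard covers callers that destroy a context without ever starting it.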
This could also be due to the fact that `flb_config_init` sets `config->is_running` to `TRUE`, instead of where I would expect it to be set, in `flb_engine_start`. If it were set there instead, `flb_engine_shutdown` would not get called in `flb_destroy`.
Moving `config->is_running` into `flb_engine_started` avoids the SIGSEGV caused by destroying the `ch_self_events` channel, but it causes the memory used by all the custom, input, output and filter plugins to be leaked in several tests. The deallocation of plugins could be moved to `flb_config_exit` or similar, but that seems to me to be a bit out of scope. If that is the approach we want to take, I can open a new PR later with the code to do so.
Summary
This is a reworking of the earlier PR #8371, which included all of these fixes in a single commit. In this PR, a commit spans several files only when the changes belong to the same logical component, i.e. a single commit contains all the changes needed to free the timer used by the scheduler.
Enter [N/A] in the box if an item is not applicable to your change.
Testing
Before we can approve your change, please submit the following in a comment:
If this is a change to packaging of containers or native binaries then please confirm it works for all targets.
ok-package-test label to test for all targets (requires maintainer to do).
Documentation
Backporting
Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.