engine: fix several file descriptor leaks. #8393

Merged
merged 10 commits into from
Jan 22, 2024
1 change: 1 addition & 0 deletions include/fluent-bit/flb_log.h
@@ -225,6 +225,7 @@ static inline int flb_log_suppress_check(int log_suppress_interval, const char *
 #endif

 int flb_log_worker_init(struct flb_worker *worker);
+int flb_log_worker_destroy(struct flb_worker *worker);
 int flb_errno_print(int errnum, const char *file, int line);

 #ifdef __FLB_FILENAME__
3 changes: 1 addition & 2 deletions src/flb_config.c
@@ -417,8 +417,7 @@ void flb_config_exit(struct flb_config *config)

     /* Pipe */
     if (config->ch_data[0]) {
-        mk_event_closesocket(config->ch_data[0]);
-        mk_event_closesocket(config->ch_data[1]);
+        flb_pipe_destroy(config->ch_data);
     }

     /* Channel manager */
12 changes: 9 additions & 3 deletions src/flb_engine.c
@@ -726,9 +726,9 @@ int flb_engine_start(struct flb_config *config)
      * to the local event loop 'evl'.
      */
     ret = mk_event_channel_create(config->evl,
-                                  &config->ch_self_events[0],
-                                  &config->ch_self_events[1],
-                                  &config->event_thread_init);
+                                  &config->ch_self_events[0],
+                                  &config->ch_self_events[1],
+                                  &config->event_thread_init);
     if (ret == -1) {
         flb_error("[engine] could not create engine thread channel");
         return -1;
@@ -1135,6 +1135,12 @@ int flb_engine_shutdown(struct flb_config *config)
         flb_hs_destroy(config->http_ctx);
     }
 #endif
+    if (config->evl) {
Member

It seems to me that the event loop is not being destroyed before restart. I would suggest identifying the root cause of this.

@pwhelan (Contributor, Author), Jan 19, 2024

I will look into this. This specific piece of code was added to mitigate crashes in the following tests:

  • flb-rt-config_map_opts
  • flb-rt-custom_calyptia_test
  • flb-rt-filter_stdout
    • case_insensitive_name

The other flb-rt-filter_stdout test, json_multiple, does not crash in the same manner.

This is most likely due to how these tests use the fluent-bit API to instantiate their pipelines. The case_insensitive_name test calls flb_destroy without first calling flb_stop. Adding a call to flb_stop in the test makes it crash in pthread_join inside flb_stop, most likely because the test also never calls flb_start to initialize the thread.

The easiest way out would be to simply leave the check in place. An alternative would be to move the destruction of the event channel into flb_stop, where it might be better placed. That would break the symmetry with the channel's creation in flb_engine_start in flb_engine.c, so if we move the destruction to flb_stop we should probably also move the creation to flb_start. At the moment I have no idea what the consequences of this would be. The most obvious one is that the channel would be linked to ctx->event_channel inside the flb_ctx_t instead of to config->event_thread_init, which is linked to the configuration.

This could also be because config->is_running is set to TRUE in flb_config_init rather than in flb_engine_start, where I would expect it. If it were set in flb_engine_start instead, flb_engine_shutdown would not get called in flb_destroy.
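To make the role of the guard concrete, here is a minimal sketch in plain POSIX C. It is not fluent-bit code: the `demo_engine` struct and the `demo_*` functions are hypothetical stand-ins for the config, flb_engine_start and flb_engine_shutdown, and the channel is modelled with an ordinary `pipe()`. The idea is that shutdown releases the channel only if start actually created it, so destroying a context that was never started does nothing harmful.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical stand-in for the engine state; not a fluent-bit type. */
struct demo_engine {
    int ch[2];    /* channel descriptors, -1 while not created */
};

/* Mark the channel as "not created" so shutdown can tell the difference. */
static void demo_engine_init(struct demo_engine *e)
{
    e->ch[0] = -1;
    e->ch[1] = -1;
}

/* Create the channel. Returns 0 on success, -1 on failure. */
static int demo_engine_start(struct demo_engine *e)
{
    if (pipe(e->ch) == -1) {
        e->ch[0] = -1;
        e->ch[1] = -1;
        return -1;
    }
    return 0;
}

/* Safe to call whether or not start ran, and safe to call twice. */
static int demo_engine_shutdown(struct demo_engine *e)
{
    int i;

    for (i = 0; i < 2; i++) {
        if (e->ch[i] != -1) {
            close(e->ch[i]);
            e->ch[i] = -1;    /* prevents a second close() on the same fd */
        }
    }
    return 0;
}

/* Returns 1 if fd refers to an open descriptor, 0 otherwise. */
static int demo_fd_is_open(int fd)
{
    return fd >= 0 && fcntl(fd, F_GETFD) != -1;
}
```

Calling `demo_engine_shutdown` right after `demo_engine_init` is a no-op, which is the situation the case_insensitive_name test appears to be in. Calling it after `demo_engine_start` closes both descriptors exactly once.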

@pwhelan (Contributor, Author), Jan 19, 2024

Moving the assignment of config->is_running into flb_engine_start avoids the SIGSEGV caused by destroying the ch_self_events channel, but it causes the memory used by all the custom, input, output and filter plugins to be leaked in several tests. The deallocation of plugins could be moved to flb_config_exit or similar, but that seems to me to be a bit out of scope. If that is the approach we want to take, I can open a new PR later with the code to do so.
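The trade-off can be sketched with a toy model in C. Everything here is hypothetical (`demo_ctx`, `demo_destroy` and the other `demo_*` names are not fluent-bit identifiers), and it assumes, as the comment above implies, that plugin memory is released only on a shutdown path gated on the running flag.

```c
#include <assert.h>
#include <stdlib.h>

#define DEMO_MAX_PLUGINS 4

/* Hypothetical context; the fields only loosely mirror the discussion. */
struct demo_ctx {
    int is_running;
    void *plugins[DEMO_MAX_PLUGINS];
    int n_plugins;
};

/* running_at_init selects where the flag is set: at init (1) or at start (0). */
static void demo_init(struct demo_ctx *c, int running_at_init)
{
    c->is_running = running_at_init;
    c->n_plugins = 0;
}

/* Allocate one plugin instance. Returns 0 on success, -1 on failure. */
static int demo_add_plugin(struct demo_ctx *c)
{
    void *p;

    if (c->n_plugins >= DEMO_MAX_PLUGINS) {
        return -1;
    }
    p = malloc(16);
    if (!p) {
        return -1;
    }
    c->plugins[c->n_plugins++] = p;
    return 0;
}

static void demo_start(struct demo_ctx *c)
{
    c->is_running = 1;
}

/* Unconditional release, e.g. what a config-exit style function could do. */
static void demo_release_plugins(struct demo_ctx *c)
{
    while (c->n_plugins > 0) {
        free(c->plugins[--c->n_plugins]);
    }
}

/*
 * Assumed behaviour: plugins are released only on the shutdown path, which
 * runs only when is_running is set. Returns the number of plugins still
 * allocated afterwards, i.e. the number that would leak.
 */
static int demo_destroy(struct demo_ctx *c)
{
    if (c->is_running) {
        demo_release_plugins(c);
        c->is_running = 0;
    }
    return c->n_plugins;
}
```

In this model, a context whose flag is set at init releases everything in `demo_destroy` even if it was never started. A context whose flag is set only in `demo_start` skips the release when start never ran, unless the release is moved to an unconditional path such as `demo_release_plugins`.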

+        mk_event_channel_destroy(config->evl,
+                                 config->ch_self_events[0],
+                                 config->ch_self_events[1],
+                                 &config->event_thread_init);
+    }

     return 0;
 }
14 changes: 10 additions & 4 deletions src/flb_log.c
@@ -303,6 +303,12 @@ int flb_log_cache_check_suppress(struct flb_log_cache *cache, char *msg_buf, siz
     return FLB_TRUE;
 }

+int flb_log_worker_destroy(struct flb_worker *worker)
+{
+    flb_pipe_destroy(worker->log);
+    return 0;
+}
+
 int flb_log_worker_init(struct flb_worker *worker)
 {
     int ret;
@@ -321,16 +327,14 @@ int flb_log_worker_init(struct flb_worker *worker)
     ret = mk_event_add(log->evl, worker->log[0],
                        FLB_LOG_EVENT, MK_EVENT_READ, &worker->event);
     if (ret == -1) {
-        close(worker->log[0]);
-        close(worker->log[1]);
+        flb_pipe_destroy(worker->log);
         return -1;
     }

     /* Log cache to reduce noise */
     cache = flb_log_cache_create(10, FLB_LOG_CACHE_ENTRIES);
     if (!cache) {
-        close(worker->log[0]);
-        close(worker->log[1]);
+        flb_pipe_destroy(worker->log);
         return -1;
     }
     worker->log_cache = cache;
@@ -688,7 +692,9 @@ int flb_log_destroy(struct flb_log *log, struct flb_config *config)
     flb_pipe_destroy(log->ch_mng);
     if (log->worker->log_cache) {
         flb_log_cache_destroy(log->worker->log_cache);
+        log->worker->log_cache = NULL;
     }
+    flb_log_worker_destroy(log->worker);
     flb_free(log->worker);
     flb_free(log);

10 changes: 10 additions & 0 deletions src/flb_output_thread.c
@@ -344,6 +344,10 @@ static void output_thread(void *data)
         }
     }

+    mk_event_channel_destroy(th_ins->evl,
+                             th_ins->ch_thread_events[0],
+                             th_ins->ch_thread_events[1],
+                             &event_local);
     /*
      * Final cleanup, destroy all resources associated with:
      *
@@ -363,6 +367,12 @@ static void output_thread(void *data)
     if (params) {
         flb_free(params);
     }
+
+    mk_event_channel_destroy(th_ins->evl,
+                             th_ins->ch_parent_events[0],
+                             th_ins->ch_parent_events[1],
+                             th_ins);
+
     mk_event_loop_destroy(th_ins->evl);
     flb_bucket_queue_destroy(th_ins->evl_bktq);

1 change: 1 addition & 0 deletions src/flb_scheduler.c
@@ -603,6 +603,7 @@ int flb_sched_destroy(struct flb_sched *sched)
     /* Delete timers */
     mk_list_foreach_safe(head, tmp, &sched->timers) {
         timer = mk_list_entry(head, struct flb_sched_timer, _head);
+        mk_event_timeout_destroy(sched->evl, &timer->event);
         flb_sched_timer_destroy(timer);
         c++;
     }
2 changes: 2 additions & 0 deletions src/flb_worker.c
@@ -144,7 +144,9 @@ void flb_worker_destroy(struct flb_worker *worker)

     if (worker->log_cache) {
         flb_log_cache_destroy(worker->log_cache);
+        worker->log_cache = NULL;
     }
+    flb_log_worker_destroy(worker);

     mk_list_del(&worker->_head);
     flb_free(worker);