[newrelic-logging] Default resource limits cause out of memory errors #1500
Labels
bug
triage/pending
Description
An issue about this has been opened before, and that reporter was instructed to make sure they had upgraded their chart so that the memory limit config on the input was present:
helm-charts/charts/newrelic-logging/values.yaml (line 104 at commit ab2d1ba)
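For context, the setting referenced there is the memory buffer limit on the Fluent Bit tail input. A rough sketch of what that part of the chart's values.yaml looks like; the key names and the default value here are from memory and should be checked against the chart at that commit:

```yaml
fluentBit:
  config:
    inputs: |
      [INPUT]
          Name              tail
          Tag               kube.*
          Path              /var/log/containers/*.log
          # Caps how much log data the tail input may buffer in memory
          # before backpressure kicks in (illustrative value, not
          # necessarily the chart default).
          Mem_Buf_Limit     7MB
```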
We have been struggling with OOM errors and restarts on our pods despite having this config present and raising the pod's memory allowances. We have about 50 pods per node.
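Raising the memory allowance is done through the chart's resources block; a minimal sketch of that kind of override, assuming the standard nri-bundle subchart layout (the numbers here are illustrative, not our exact values):

```yaml
newrelic-logging:
  resources:
    limits:
      # Raised from the chart default; illustrative value only.
      memory: 512Mi
    requests:
      cpu: 250m
      memory: 128Mi
```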
The Helm config provided for this was:
Versions
Helm v3.14.4
Kubernetes (AKS) 1.29.2
Chart: nri-bundle-5.0.81
FluentBit: newrelic/newrelic-fluentbit-output:2.0.0
What happened?
The Fluent Bit pods were repeatedly killed for using more memory than their limit, which is set very low by default. Their CPU was never highly utilised, which suggests the memory growth was not caused by throttling or by the pods failing to keep up.
What you expected to happen?
The Fluent Bit pods should have few or no restarts, and should never reach 1.5 GB of memory used per container.
How to reproduce it?
Using the same versions listed above and the same Helm values.yaml, deploy an AKS cluster with about 50 production workloads per node (2 vCPU, 8 GB) and observe whether memory issues occur.