fluentd-Kubernetes-sumologic, a container to ship logs to SumoLogic
This is a fluentd container, designed to run as a Kubernetes DaemonSet. It will run an instance of this container on each physical underlying host in the cluster. The goal is to pull all the kubelet, docker daemon and container logs from the host then to ship them off to SumoLogic in json or text format.
First things first, you need a HTTP collector in SumoLogic that the container can send logs to via HTTP.
In Sumo, Manage -> Collection -> Add Collector -> Hosted Collector
Then you need to add a source to that collector, which would be a new HTTP source
. This will give you a unique URL that can receive logs.
More details here: http://help.sumologic.com/Send_Data/Sources/HTTP_Source
Save the collector url (created above) as a secret in Kubernetes.
kubectl create secret generic sumologic --from-literal=collector-url=<INSERT_HTTP_URL>
And finally, you need to deploy the container. I will presume you have your own CI/CD setup. See the sample Kubernetes DaemonSet in fluentd.daemonset.yaml
kubectl create -f fluentd.daemonset.yaml
The following options can be configured as environment variables on the DaemonSet
FLUSH_INTERVAL
- How frequently to push logs to SumoLogic (default5s
)NUM_THREADS
- Increase number of http threads to Sumo. May be required in heavy logging clusters (default1
)SOURCE_NAME
- Set the_sourceName
metadata field in SumoLogic. (Default"%{namespace}.%{pod}.%{container}"
)SOURCE_CATEGORY
- Set the__sourceCategory
metadata field in SumoLogic. (Default"%{namespace}/%{pod_name}"
)SOURCE_CATEGORY_REPLACE_DASH
- Used to replace-
with another character. (default/
).- For example a Pod called
travel-nginx-3629474229-dirmo
within namespaceapp
will show in SumoLogic with_sourceCategory=app/travel/nginx
- For example a Pod called
LOG_FORMAT
- Format to post logs into Sumo.json
ortext
(defaultjson
)- text - Logs will appear in SumoLogic in text format
- json - Logs will appear in SumoLogic in json format.
- json_merge - Same as json but if the container logs in json format to stdout it will merge in the container json log at the root level and remove the
log
field.
KUBERNETES_META
- Include or exclude Kubernetes metadata such as namespace and pod_name if using json log format. (defaulttrue
)EXCLUDE_PATH
- Files matching this pattern will be ignored by the in_tail plugin, and will not be sent to Kubernetes or Sumo Logic. This can be a comma seperated list as well. See in_tail doc for more information.- For example, setting EXCLUDE_PATH to the following would ignore all files matching /var/log/containers/*.log
...
env:
- name: EXCLUDE_PATH
value: "[\"/var/log/containers/*.log\"]"
EXCLUDE_NAMESPACE_REGEX
- A Regex pattern for namespaces. All matching namespaces will be excluded from Sumo Logic. The logs will still be sent to FluentD.EXCLUDE_POD_REGEX
- A Regex pattern for pods. All matching pods will be excluded from Sumo Logic. The logs will still be sent to FluentD.EXCLUDE_CONTAINER_REGEX
- A Regex pattern for containers. All matching containers will be excluded from Sumo Logic. The logs will still be sent to FluentD.EXCLUDE_HOST_REGEX
- A Regex pattern for hosts. All matching hosts will be excluded from Sumo Logic. The logs will still be sent to FluentD.
The following table show which environment variables affect fluent sources
Environment Variable | Containers | Docker | Kubernetes |
---|---|---|---|
EXCLUDE_CONTAINER_REGEX |
✔ | ✘ | ✘ |
EXCLUDE_HOST_REGEX |
✔ | ✘ | ✘ |
EXCLUDE_NAMESPACE_REGEX |
✔ | ✘ | ✔ |
EXCLUDE_PATH |
✔ | ✔ | ✔ |
EXCLUDE_POD_REGEX |
✔ | ✘ | ✘ |
The LOG_FORMAT
, SOURCE_CATEGORY
and SOURCE_NAME
can be overridden per pod using annotations. For example
apiVersion: v1
kind: ReplicationController
metadata:
name: nginx
spec:
replicas: 1
selector:
app: mywebsite
template:
metadata:
name: nginx
labels:
app: mywebsite
annotations:
sumologic.com/format: "text"
sumologic.com/sourceCategory: "mywebsite/nginx"
sumologic.com/sourceName: "mywebsite_nginx"
spec:
containers:
- name: nginx
image: nginx
ports:
- containerPort: 80
You can also use the sumologic.com/exclude
annotation to exclude data from Sumo Logic. This data is still sent to FluentD, but will not make it to Sumo.
apiVersion: v1
kind: ReplicationController
metadata:
name: nginx
spec:
replicas: 1
selector:
app: mywebsite
template:
metadata:
name: nginx
labels:
app: mywebsite
annotations:
sumologic.com/format: "text"
sumologic.com/sourceCategory: "mywebsite/nginx"
sumologic.com/sourceName: "mywebsite_nginx"
sumologic.com/exclude: "true"
spec:
containers:
- name: nginx
image: nginx
ports:
- containerPort: 80
Simple as that really, your logs should be getting streamed to SumoLogic in json or text format with the appropriate metadata. If using json
format you can auto extract fields, for example _sourceCategory=some/app | json auto