Missing collector for scheduled (success|failure) events #68
Comments
@keir-rex can you provide the following information?
These logs did exist at some point. It is very possible they were tweaked in a new release, or that the scheduler's underlying logging has changed, so this will help me figure out what is going on.
sumologic-k8s-api: I rebuilt your image to also hit
fluentd-kubernetes-sumologic: is basically vanilla
kubectl version:
@keir-rex thanks for the info. So this appears to be a change in 1.9.x. I have a 1.8 cluster and a 1.9 cluster, and the scheduler is not producing the same logs on both. I will try to track this down to the source and work on a remediation.
Cheers @frankreno, let me know if there's anything I can help with.
@keir-rex still no response from the folks on the scheduling team for k8s, so I do not yet have a good answer as to why this changed or how to remedy it. I found the code where the log used to be generated and see no changes that account for this, which means the change is not coming from the scheduler but from somewhere else. Will keep you updated. Long term, we are working on a new metrics collection strategy for Kubernetes that does not use heapster, which will allow us to collect from many more data sources and provide insights into this. Let's keep this issue open until we solve it one of those ways...
Sounds good @frankreno. I'll throw together something which does de-duping of events since we need that anyway. Could you comment on my second query on my initial post? Cheers
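For anyone following along, here is a minimal sketch of what such de-duping might look like when the events endpoint is polled repeatedly. It is not the code from this thread: the function name `dedupe_events` and the choice of `(uid, resourceVersion)` as the key are assumptions. The idea is that a repeated event keeps its `uid` while its `resourceVersion` changes on each update, so the pair identifies one specific revision.

```python
def dedupe_events(events, seen=None):
    """Return the events whose (uid, resourceVersion) pair is not in `seen`.

    `events` is a list of event dicts shaped like items of a v1 EventList.
    `seen` is a set that is updated in place so it can be reused across polls.
    Both the helper and the key choice are hypothetical, for illustration only.
    """
    seen = set() if seen is None else seen
    fresh = []
    for event in events:
        meta = event.get("metadata", {})
        key = (meta.get("uid"), meta.get("resourceVersion"))
        if key in seen:
            continue  # this exact revision was already forwarded
        seen.add(key)
        fresh.append(event)
    return fresh
```

Passing the same `seen` set on every poll means only new or updated events are forwarded. Keying on `uid` alone would instead forward each event once and ignore later count updates.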
@keir-rex that's right. I see [218, 42, 205, 363 and 374] as code, 'event' as a resource, and 'go' as resource_action. That said, I have to revisit these to make sure they follow proper naming conventions.
Primary Concern
I'd like some help to understand whether or not I've missed something when following the README and the guides on help.sumologic.com ... kubernetes.
I seem to have most dashboards working with the exception of scheduler related panels like
Kubernetes - Overview -> Pods Scheduled By Namespace
which is driven by the following query:
The problem is that the line this query relies on is not logged by the scheduler but emitted as an event. The only piece of the documentation I can see that would be able to push this to Sumo is the sumologic-k8s-api script, which is noticeably lacking any calls to
/api/v1/events
as well as the role for calling it.
I've tested a fix which would add these log lines and can submit it as a PR against sumologic-k8s-api, but I feel like I've missed something obvious.
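To make the idea concrete, here is a rough sketch (not the tested fix mentioned above) of picking the scheduling events out of an events list response and turning each into a single log line. It assumes the payload has the shape of a Kubernetes v1 EventList and that the scheduler's events carry the reasons `Scheduled` and `FailedScheduling`. The function names and the JSON output format are hypothetical.

```python
import json

# Assumption: these reasons mark scheduling success and failure events.
SCHEDULING_REASONS = {"Scheduled", "FailedScheduling"}


def scheduling_events(event_list):
    """Return only the scheduling-related items from an EventList dict."""
    return [
        item
        for item in event_list.get("items", [])
        if item.get("reason") in SCHEDULING_REASONS
    ]


def to_log_line(event):
    """Serialise one event as a single JSON line (hypothetical format)."""
    obj = event.get("involvedObject", {})
    return json.dumps(
        {
            "reason": event.get("reason"),
            "namespace": obj.get("namespace"),
            "name": obj.get("name"),
            "message": event.get("message"),
        },
        sort_keys=True,
    )
```

Fetching the payload itself is left out on purpose, since it depends on how the script authenticates against the API server and on the role it is granted.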
Secondary Concern
I see some of the panels are driven by queries which extract fields which don't fill me with confidence that I've got things configured correctly:
Kubernetes - Controller Manager -> Event Severity Trend
using the following query:
Which matches this log line:
Where resource_action and resource_code would match go and 439 respectively. Is this correct?
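As an illustration of the mapping being asked about, here is a small sketch that splits a source-location token such as `event.go:439` into the three fields. The dashboard's actual query is not shown here, so the regex and the function name are assumptions. It only shows why `go` and a number would end up in `resource_action` and `resource_code`: they appear to be the file extension and line number of the Go source file that wrote the log.

```python
import re

# Hypothetical pattern: <resource>.<resource_action>:<resource_code>
SOURCE_LOCATION = re.compile(
    r"^(?P<resource>\w+)\.(?P<resource_action>\w+):(?P<resource_code>\d+)$"
)


def split_source_location(token):
    """Split a token like 'event.go:439' into named fields, or return None."""
    match = SOURCE_LOCATION.match(token)
    return match.groupdict() if match else None


fields = split_source_location("event.go:439")
# fields["resource"] == "event"
# fields["resource_action"] == "go"
# fields["resource_code"] == "439"
```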