Better auto interval handling #15940

Merged: 1 commit, Jul 21, 2023

Contributor

@luk-kaminski luk-kaminski commented Jul 12, 2023

Description

Fixes #14906.
The current approach did not work well with the auto interval and long time ranges (like "All time"), especially on relatively fresh systems.
The largest interval was 1 month, while "All time" means around 50 years in GL, so on a fresh system all data used to land in 1 bucket.
The implementation now uses AutoDateHistogramAggregationBuilder. For the auto interval we no longer pick the interval ourselves: ES/OS selects it, and we only provide a hint for the expected number of buckets, derived from the scaling field.
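
For illustration, a minimal sketch of the kind of request body this leads to, written as a Python dict rather than with the Java builder. The `auto_date_histogram` aggregation and its `field` and `buckets` parameters are documented by ES/OS; the helper name, the base of 25 buckets, and the way the scaling value is turned into a bucket count are assumptions and are not taken from this PR's code.

```python
# Hypothetical default: 25 bars are mentioned later in this thread,
# but the actual constant in the code base may differ.
BASE_BUCKETS = 25


def auto_interval_aggregation(field: str, scaling: float = 1.0) -> dict:
    """Build an auto_date_histogram aggregation body.

    Assumption: a smaller scaling factor asks for more (finer) buckets.
    """
    if scaling <= 0:
        raise ValueError("scaling must be positive")
    buckets = max(1, round(BASE_BUCKETS / scaling))
    return {
        "auto_date_histogram": {
            "field": field,
            # Target bucket count only; ES/OS picks the actual interval
            # and may return fewer buckets than requested.
            "buckets": buckets,
        }
    }


# Example: scaling 0.5 asks for twice as many buckets as the default.
example = auto_interval_aggregation("timestamp", scaling=0.5)
```

The point of the sketch is that no interval is sent at all: only a target bucket count goes into the request.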
/nocl

Motivation and Context

See #14906.

How Has This Been Tested?

Manually and with new unit tests.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Refactoring (non-breaking change)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.

@luk-kaminski luk-kaminski marked this pull request as ready for review July 13, 2023 09:44
@luk-kaminski luk-kaminski requested review from dennisoelkers, janheise and todvora and removed request for dennisoelkers July 13, 2023 09:44
Contributor

@todvora todvora left a comment

The code looks OK to me. I tried the change locally and noticed some surprisingly skinny bars :-) Maybe we should think about the default count of the bars?

image

@dennisoelkers
Member

@todvora: The thing is that the auto interval calculates the width of bars differently. Previously, each bar was `time range / desired number of buckets`. Now, with the ES/OS auto interval, it takes `time range of data available / desired number of buckets`, so if you have large gaps in your data, buckets will get much smaller.
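
A small worked example of the two formulas, as a sketch only. The numbers (a 7-day search, data covering only the last 12 hours, 30 desired buckets) are made up for illustration, and the real ES/OS aggregation rounds to a fixed set of intervals instead of dividing exactly.

```python
from datetime import timedelta


def bar_width(span: timedelta, desired_buckets: int) -> timedelta:
    """Width of one bar when a span is split into equal buckets."""
    return span / desired_buckets


requested_range = timedelta(days=7)   # what the user searched for
data_span = timedelta(hours=12)       # what actually contains messages
desired_buckets = 30

# Previous behaviour: width derived from the requested time range.
old_width = bar_width(requested_range, desired_buckets)  # 5 h 36 min

# Auto interval: width derived from the span of the available data.
new_width = bar_width(data_span, desired_buckets)        # 24 min
```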

@luk-kaminski
Contributor Author

@todvora @dennisoelkers
I have noticed that so far we tended to have around 30-50 bars on a chart by default, with auto scale of x1.
After my change it is more like 100. I will decrease the constants so that the users will have a similar number of bars in the default scenario.

@dennisoelkers
Member

@luk-kaminski: This is the rationale for the current implementation: https://github.com/Graylog2/graylog-plugin-enterprise/pull/358

@luk-kaminski
Contributor Author

luk-kaminski commented Jul 14, 2023

@dennisoelkers - Thanks for that!
I will decrease the number of bars even more.

But the "data gap" problem that @todvora noticed bothers me quite a lot; it will be a surprising side effect of the change for some people.
The problem is with the "front gap" only. When you ask for data for the last day but only have data for the last hour, all 25 bars will be assigned to that hour, making the bars look thin.
A gap in the middle does not cause problems: there will be "zero" bars.
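
The difference between the two kinds of gap can be sketched with a simplified model that splits the span between the first and last message into equal buckets. This is an assumption made for illustration: the real aggregation picks from a fixed set of rounding intervals, and the helper name and sample timestamps below are invented.

```python
from datetime import datetime, timedelta


def bucket_counts(timestamps, buckets):
    """Split [earliest, latest] into equal-width buckets and count messages."""
    start, end = min(timestamps), max(timestamps)
    if start == end:
        raise ValueError("need at least two distinct timestamps")
    width = (end - start) / buckets
    counts = [0] * buckets
    for ts in timestamps:
        # The latest timestamp sits on the upper edge; clamp it into the last bucket.
        index = min(int((ts - start) / width), buckets - 1)
        counts[index] += 1
    return width, counts


now = datetime(2023, 7, 14, 12, 0)

# Front gap: "last day" requested, messages only in the last hour.
front = [now - timedelta(minutes=m) for m in range(0, 61, 2)]
front_width, front_counts = bucket_counts(front, 25)   # 2.4-minute bars, none empty

# Middle gap: messages in the first and the last hour of the day.
day_start = now - timedelta(days=1)
middle = [day_start + timedelta(minutes=m) for m in range(0, 60, 5)]
middle += [now - timedelta(minutes=m) for m in range(0, 60, 5)]
middle_width, middle_counts = bucket_counts(middle, 25)  # 57.6-minute bars, 23 empty
```

In this model the front gap shrinks every bar, while the middle gap keeps wide bars and simply leaves the ones in between at zero.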

@luk-kaminski
Contributor Author

luk-kaminski commented Jul 14, 2023

The discussion continues on Slack.

My suggestion is to experiment with minimum_interval: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-autodatehistogram-aggregation.html#_minimum_interval_parameter

The idea is: if someone searches for data from the last year, they won't need 25 bars representing 1 minute each when they only have data for the last 25 minutes. They will probably be fine with one bar for the last day. So maybe we should set minimum_interval based on the time range that was used?
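
A rough sketch of that idea, under stated assumptions. The parameter is documented as `minimum_interval` and accepts units such as `second`, `minute`, `hour` and `day`; only the "last year, one bar per day" pairing comes from the example above, while the other thresholds and both function names are hypothetical.

```python
from datetime import timedelta

# Hypothetical mapping from the searched time range to the smallest interval.
# Only the first entry follows the example in this comment; the rest are guesses.
_THRESHOLDS = [
    (timedelta(days=365), "day"),
    (timedelta(days=30), "hour"),
    (timedelta(days=1), "minute"),
]


def minimum_interval_for(requested_range: timedelta) -> str:
    """Pick a minimum_interval unit from the length of the searched range."""
    for threshold, unit in _THRESHOLDS:
        if requested_range >= threshold:
            return unit
    return "second"


def with_minimum_interval(aggregation: dict, requested_range: timedelta) -> dict:
    """Return a copy of an auto_date_histogram body with minimum_interval set."""
    body = dict(aggregation["auto_date_histogram"])
    body["minimum_interval"] = minimum_interval_for(requested_range)
    return {"auto_date_histogram": body}


base = {"auto_date_histogram": {"field": "timestamp", "buckets": 25}}
last_year = with_minimum_interval(base, timedelta(days=365))
```

With such a floor, a search over the last year on a system with 25 minutes of data would collapse into a single daily bar instead of 25 one-minute bars.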

The other option is... auto zooming in the FE. If you select "All time" but have data for the last hour, you will be zoomed to the last hour, with 25 bars, 2-3 minutes each.

Unit tests
Auto values only for all messages
@janheise janheise removed their request for review July 21, 2023 08:40
@dennisoelkers dennisoelkers merged commit eacf53f into master Jul 21, 2023
2 checks passed
@dennisoelkers dennisoelkers deleted the fix/auto_interval_handling branch July 21, 2023 10:10

Successfully merging this pull request may close these issues.

Message Count widget display is inconsistent for search range All Time and default interval Auto