Spikes in statistics averages when HA restarts #119738

Open
Dtrotmw opened this issue Jun 15, 2024 · 16 comments · May be fixed by #128796 or #124644

Comments

@Dtrotmw

Dtrotmw commented Jun 15, 2024

The problem

I have created a number of statistics entities with long time periods (days or weeks) which are getting corrupted when HA restarts. Looking at their graphs I see large spikes around the time of a restart. E.g.:
Screenshot (14)

The source sensor does not show any discontinuity at the restart:
Screenshot (15)

What version of Home Assistant Core has the issue?

core-2024.6.2

What was the last working version of Home Assistant Core?

No response

What type of installation are you running?

Home Assistant OS

Integration causing the issue

Statistics

Link to integration documentation on our website

No response

Diagnostics information

No response

Example YAML snippet

- platform: statistics
  name: wind_60hr
  unique_id: wind_60hr
  entity_id: sensor.wind_display
  state_characteristic: average_linear
  precision: 2
  max_age:
    hours: 60

Anything in the logs that might be useful for us?

No response

Additional information

Exactly this issue has been previously raised #89000 but went stale
I have reported details https://community.home-assistant.io/t/corruption-in-long-period-statistics-sensors-at-ha-re-starts/739113

@home-assistant

Hey there @ThomDietrich, mind taking a look at this issue as it has been labeled with an integration (statistics) you are listed as a code owner for? Thanks!

Code owner commands

Code owners of statistics can trigger bot actions by commenting:

  • @home-assistant close Closes the issue.
  • @home-assistant rename Awesome new title Renames the issue.
  • @home-assistant reopen Reopen the issue.
  • @home-assistant unassign statistics Removes the current integration label and assignees on the issue, add the integration domain after the command.
  • @home-assistant add-label needs-more-information Add a label (needs-more-information, problem in dependency, problem in custom component) to the issue.
  • @home-assistant remove-label needs-more-information Remove a label (needs-more-information, problem in dependency, problem in custom component) on the issue.

(message by CodeOwnersMention)


statistics documentation
statistics source
(message by IssueLinks)

@erkr

erkr commented Jul 13, 2024

I also see this sometimes after a restart, but not after every restart. I first noticed it after updating to 2024.7.
For me it can also happen when just reloading the statistics YAML in Dev Tools (as someone else reported in the stale issue).

Since a reload can trigger this, a missing source sensor is basically ruled out as the cause: the source remained present during the reload.

An example: the source sensor stayed near zero, but the daily-average statistics sensor spiked between +31k and -31k when I reloaded the statistics YAML config. Note that I reloaded 3-4 times in a row, but only one reload produced a spike:

image

Sensor config:

  - platform: statistics
    name: "PM2.5 daily average"
    unique_id: pm2_5_daily_average
    entity_id: sensor.apollo_air_1_c1c714_pm_2_5_m_weight_concentration
    state_characteristic: average_linear
    max_age:
      hours: 24

At the same time, a 15-minute average sensor for the same source sensor spiked as well, but with lower values, first negative and then positive:
image

Update: Someone in the forums suggested that the spike size seems to correlate with max_age. That is what I see as well.

@attila123456

Seems somewhat related so I'll add here: I see spikes all over the place, without HA restart, in different stats (linear average, minimum), where apparently 2 different values are recorded by statistics at the same timestamp:

image

@unfug-at-github

I have also seen those spikes and didn't see any reason for them in my source data. I'm pretty sure that this happens because an undefined value is incorrectly being interpreted as MAX_INT or something and then getting into the calculation when the system is restarted.

I moved away from the statistics / average_step because of this. Currently I am using filters instead, which at least don't give me any spikes.

Both options have the annoying habit of becoming "undefined" when the value is stable for a long time. Currently, I am countering that with a template that is switching between the computed average and the actual source value depending on how long the value remains stable.

This is pretty annoying because it means copying the same code again and again. It would be great if this functionality were part of the integration itself.

Sensor:

- platform: filter
  name: "inverter_grid_export_avg_30"
  entity_id: sensor.inverter_grid_export
  filters:
    - filter: time_simple_moving_average
      window_size: 00:00:30

Template:

- trigger:
    - platform: state
      entity_id:
        - sensor.inverter_grid_export_avg_30
- sensor:
    - name: "inverter_grid_export_avg"
      unit_of_measurement: 'W'
      state_class: measurement
      icon: mdi:transmission-tower-export
      availability: "{{ (states('sensor.inverter_grid_export_avg_30') | is_number) or (states('sensor.inverter_grid_export') | is_number) }}"
      state: >   
        {% set avg_30 = states('sensor.inverter_grid_export_avg_30') | float(0) %}
        {% set current = states('sensor.inverter_grid_export') | float(0) %}
        {% if states.sensor.inverter_grid_export is none %}
          {% set last_changed = 0 %}
        {% else %}
          {% set last_changed = (now() - states.sensor.inverter_grid_export.last_changed).total_seconds() | int(0) %}
        {% endif %}
        {% if (last_changed <= 30) %} 
          {{ avg_30 }}
        {% else %}
          {{ current }}
        {% endif %}

@Zamtakk

Zamtakk commented Aug 18, 2024

I am also facing this problem. I would love to not have to solve this in convoluted ways with filters and templates.
image

Any way to bring this thread to the attention of some devs?

@Zamtakk

Zamtakk commented Aug 19, 2024

I just noticed that only my statistics helpers whose values are measured in kW are affected. The statistics that use W, deg C or PPM are not affected by the restart.

@unfug-at-github

I have found a solution that will trigger a computation of the statistics values, even if the input value is stable. It was also mentioned in another thread here somewhere.

# adding this will make sure the value is changing and hence the statistics are recomputed
attributes:
  update_trigger: "{{ now().microsecond }}"

Adding such an attribute will change the state of a sensor whenever it is computed. Any statistic based on that sensor would then also be recomputed. I like this solution better than having to create a lot of otherwise useless sensors and it produces a lot less code.

This is definitely a workaround for the issue that statistics are not recomputed when the actual value is stable for longer times.

I'm still using statistics in a few places and I don't see the spikes anymore (min and max, not average). I'm not sure whether this is just a coincidence, since I made a couple of other changes as well.

@ThomDietrich
Contributor

Hey everyone,
I am the main developer of the statistics component, and while I would love to resolve this issue, I am not in a position to invest any significant work over the next couple of months, sadly. I would love to review a PR from one of you, though. FYI, I think the first change to implement is switching from the change event to the newly introduced update event, which was introduced because others and I asked for it.

Btw I was planning to release a "custom component" to accelerate the development of a "statistics beta" component.

@unfug-at-github

I have started to fix the issues and will soon create a merge request.

I have solved, or am on the way to solving, the following issues:

  • the time-based average functions (average step / linear) now include the values before and after the last change; this is particularly relevant when values stay stable for a longer time
  • many functions that currently require at least two values can now work with a single value as well (e.g. simple average)
  • the peaks will be a thing of the past (they happen because the value sequence is not correctly ordered by timestamp)
  • I will also try to add a refresh trigger, so that the average can be recomputed even if the inputs didn't change
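
To illustrate the first two bullets, here is a minimal sketch of a step average that keeps weighting the last value until "now". It is a simplified model, not the integration's code: the name `average_step`, the `now` parameter and the `(timestamp, value)` tuple layout are assumptions made for this example.

```python
from typing import Sequence, Tuple

# Hypothetical sample layout: (timestamp in seconds, value).
Sample = Tuple[float, float]


def average_step(samples: Sequence[Sample], now: float) -> float:
    """Time-weighted step average over samples sorted by timestamp.

    Each value is held until the next sample arrives; the last value
    is held until `now`, so a long stable tail still counts.
    """
    if not samples:
        raise ValueError("need at least one sample")
    first_ts = samples[0][0]
    if now <= first_ts:
        # No elapsed time to weight by, so return the latest value.
        return samples[-1][1]
    area = 0.0
    for (ts, value), (next_ts, _) in zip(samples, samples[1:]):
        area += value * (next_ts - ts)
    last_ts, last_value = samples[-1]
    area += last_value * (now - last_ts)  # stable tail after the last change
    return area / (now - first_ts)
```

With samples `[(0.0, 10.0), (10.0, 20.0)]` and `now=30.0`, the tail of 20 s at value 20 is included in the result, and a single sample such as `[(0.0, 5.0)]` simply yields 5.0 instead of being undefined.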

@Dtrotmw
Author

Dtrotmw commented Aug 24, 2024 via email

@unfug-at-github

I have provided a pull request that will fix the issues: #124644

@unfug-at-github

Thank you for your efforts – it would be so good to get back to using this integration. Whilst you are looking at it, is there any way to define a period for the statistic; by both its start (done), but also by its end or length in time (only sample size currently allowed)? Cheers, David Inwood

Hi David,
you can basically achieve that already by combining two statistics:

Let's say you want the average from -30 sec to -20 sec: you can compute both the average over the last 30 sec (av30) and over the last 20 sec (av20), and then combine them like this:

((av30 * 30) - (av20 * 20)) / (30-20).
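
A small sketch of that combination, assuming both averages are time-weighted and both windows end at the same instant. The function name `average_between` and its parameters are made up for this example.

```python
def average_between(av_long: float, len_long: float,
                    av_short: float, len_short: float) -> float:
    """Average over the part of the long window not covered by the short one.

    av * len is the area under the curve for each window, so the
    difference of the two areas is the area of the older slice.
    """
    if len_long <= len_short:
        raise ValueError("the long window must be longer than the short one")
    return (av_long * len_long - av_short * len_short) / (len_long - len_short)


# Signal was 4.0 from -30 s to -20 s, then 1.0 from -20 s to now:
# av30 = (4.0 * 10 + 1.0 * 20) / 30 = 2.0 and av20 = 1.0
print(average_between(2.0, 30.0, 1.0, 20.0))  # prints 4.0
```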

@Dtrotmw
Author

Dtrotmw commented Aug 26, 2024 via email

@unfug-at-github

The spikes were created by race conditions. The component has an internal queue that is supposed to be sorted by time. During the startup phase values are loaded from the database and put into the queue. If during this phase a regular update is coming in, this will mess up the order of the queue, which then leads to incorrect averages being computed.
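
A toy reproduction of the effect, as a rough sketch only: `average_linear` below is a simplified trapezoidal average written for this example, not the integration's actual code, and the sample data is made up.

```python
def average_linear(samples):
    """Trapezoidal time-weighted average; assumes samples are sorted by time."""
    area = 0.0
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        area += 0.5 * (v0 + v1) * (t1 - t0)
    return area / (samples[-1][0] - samples[0][0])


# (timestamp in seconds, value)
from_db = [(0.0, 1.0), (3600.0, 1.0), (7200.0, 1.0)]  # restored at startup
live = (7201.0, 5.0)                                   # regular update

ordered = from_db + [live]  # queue sorted by time, as intended
raced = [live] + from_db    # live update lands before the restored values

print(average_linear(ordered))  # close to 1
print(average_linear(raced))    # prints 14403.0
```

In this toy model the misplaced sample creates one interval that runs backwards across the whole window while the divisor shrinks to a single second, so a longer window gives a larger spike.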

Let's hope they will accept the proposed pull request and the issue will be gone.

@Zamtakk

Zamtakk commented Sep 12, 2024

I am really looking forward to this PR coming through. Thank you very much for fixing this issue, which has been bothering me for a while! Your work is appreciated!

@jorgemarmor

I'm also looking forward to this being improved. Right now these spikes are significantly corrupting my energy consumption measurements. Here is just one example:
Screenshot_2024-09-22-11-33-14-566_io homeassistant companion android
