Spikes in statistics averages when HA restarts #119738

Dtrotmw · 2024-06-15T13:01:29Z

The problem

I have created a number of statistics entities with long time periods (days or weeks) which are getting corrupted when HA restarts. Looking at their graphs I see large spikes around the time of a restart. E.g.:

The source sensor does not show any discontinuity at the restart:

What version of Home Assistant Core has the issue?

core-2024.6.2

What was the last working version of Home Assistant Core?

No response

What type of installation are you running?

Home Assistant OS

Integration causing the issue

Statistics

Link to integration documentation on our website

No response

Diagnostics information

No response

Example YAML snippet

  - platform: statistics
    name: wind_60hr
    unique_id: wind_60hr
    entity_id: sensor.wind_display
    state_characteristic: average_linear
    precision: 2
    max_age:
      hours: 60

Anything in the logs that might be useful for us?

No response

Additional information

Exactly this issue was previously raised in #89000 but went stale.
I have reported the details at https://community.home-assistant.io/t/corruption-in-long-period-statistics-sensors-at-ha-re-starts/739113

home-assistant · 2024-06-15T13:23:43Z

Hey there @ThomDietrich, mind taking a look at this issue as it has been labeled with an integration (statistics) you are listed as a code owner for? Thanks!

Code owner commands

Code owners of statistics can trigger bot actions by commenting:

@home-assistant close Closes the issue.
@home-assistant rename Awesome new title Renames the issue.
@home-assistant reopen Reopen the issue.
@home-assistant unassign statistics Removes the current integration label and assignees on the issue, add the integration domain after the command.
@home-assistant add-label needs-more-information Add a label (needs-more-information, problem in dependency, problem in custom component) to the issue.
@home-assistant remove-label needs-more-information Remove a label (needs-more-information, problem in dependency, problem in custom component) on the issue.

(message by CodeOwnersMention)

statistics documentation
statistics source
(message by IssueLinks)

erkr · 2024-07-13T11:43:18Z

I also see this sometimes after a restart, but not after every restart. The first time I noticed it was after my update to 2024.7.
For me it can also happen when just reloading the statistics YAML in Dev Tools (as reported by someone else in the stale issue).

The fact that a reload can trigger this basically rules out a missing source sensor as the cause: during the reload the source remained present.

As an example, the source sensor remained near zero, but the daily average statistics sensor spiked between +31k and -31k when I reloaded the statistics YAML config. Note that I reloaded 3-4 times in a row but only one reload caused a spike:

Sensor config:

  - platform: statistics
    name: "PM2.5 daily average"
    unique_id: pm2_5_daily_average
    entity_id: sensor.apollo_air_1_c1c714_pm_2_5_m_weight_concentration
    state_characteristic: average_linear
    max_age:
      hours: 24

At the same time, a 15-minute average sensor for the same source sensor spiked as well, but with lower values, first negative and then positive:

Update: Someone in the forums suggested that the spike size seems to correlate with max_age. That is what I see as well.

attila123456 · 2024-07-16T11:49:12Z

Seems somewhat related so I'll add here: I see spikes all over the place, without HA restart, in different stats (linear average, minimum), where apparently 2 different values are recorded by statistics at the same timestamp:

unfug-at-github · 2024-08-14T05:06:21Z

I have also seen those spikes and didn't see any reason for them in my source data. I'm pretty sure that this happens because an undefined value is incorrectly being interpreted as MAX_INT or something and then getting into the calculation when the system is restarted.

I moved away from the statistics / average_step because of this. Currently I am using filters instead, which at least don't give me any spikes.

Both options have the annoying habit of becoming "undefined" when the value is stable for a long time. Currently, I am countering that with a template that is switching between the computed average and the actual source value depending on how long the value remains stable.

This is pretty annoying because it means copying the same code again and again. It would be great if this functionality were part of the integration itself.

Sensor:

- platform: filter
  name: "inverter_grid_export_avg_30"
  entity_id: sensor.inverter_grid_export
  filters:
    - filter: time_simple_moving_average
      window_size: 00:00:30

Template:

- trigger:
    - platform: state
      entity_id:
        - sensor.inverter_grid_export_avg_30
- sensor:
    - name: "inverter_grid_export_avg"
      unit_of_measurement: 'W'
      state_class: measurement
      icon: mdi:transmission-tower-export
      availability: "{{ (states('sensor.inverter_grid_export_avg_30') | is_number) or (states('sensor.inverter_grid_export') | is_number) }}"
      state: >   
        {% set avg_30 = states('sensor.inverter_grid_export_avg_30') | float(0) %}
        {% set current = states('sensor.inverter_grid_export') | float(0) %}
        {% if states.sensor.inverter_grid_export is none %}
          {% set last_changed = 0 %}
        {% else %}
          {% set last_changed = (now() - states.sensor.inverter_grid_export.last_changed).total_seconds() | int(0) %}
        {% endif %}
        {% if (last_changed <= 30) %} 
          {{ avg_30 }}
        {% else %}
          {{ current }}
        {% endif %}

Zamtakk · 2024-08-18T11:16:28Z

I am also facing this problem. I would love to not have to solve this in convoluted ways with filters and templates.

Any way to bring this thread to the attention of some devs?

Zamtakk · 2024-08-19T12:31:40Z

I just noticed that it is only my statistic helpers that have a value measured in kW. The statistics that use W, deg C or PPM are not affected by the restart.

unfug-at-github · 2024-08-22T07:13:25Z

I have found a solution that will trigger a computation of the statistics values, even if the input value is stable. It was also mentioned in another thread here somewhere.

# adding this will make sure the value is changing and hence the statistics are recomputed
attributes:
  update_trigger: "{{ now().microsecond }}"

Adding such an attribute will change the state of a sensor whenever it is computed. Any statistic based on that sensor would then also be recomputed. I like this solution better than having to create a lot of otherwise useless sensors and it produces a lot less code.

This is definitely a workaround for the issue that statistics are not recomputed when the actual value is stable for longer times.

I'm still using statistics in a few places and I don't see the spikes anymore (min and max, not average). I'm not sure whether this is just a coincidence, since I made a couple of other changes as well.

ThomDietrich · 2024-08-22T07:36:42Z

Hey everyone,
I am the main developer of the statistics component, and while I would love to resolve this issue, I am sadly not in a position to invest any significant work over the next couple of months. I would love to review a PR by one of you guys though. Just FYI, I think the first change that should be implemented is to switch from the change event to the newly introduced update event. That event was indeed introduced because others and I asked for it.

Btw I was planning to release a "custom component" to accelerate the development of a "statistics beta" component.

unfug-at-github · 2024-08-24T09:46:35Z

I have started to fix the issues and will soon create a merge request.

I have solved or am on the way to solving the following issues:

* the time-based average functions now include the values before and after the last change (average step / linear); this is particularly relevant when values stay stable for a longer time
* many functions that currently require at least two values to work can now work with a single value as well (e.g. simple average)
* the peaks will be a thing of the past (they happen because the value sequence is not correctly ordered according to time stamps)
* I will also try to add a refresh trigger, so that the average can be recomputed even if the inputs didn't change
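To make the first two points in the list above concrete, here is a minimal Python sketch of a step average that counts the value held before the window starts and still works with a single sample. It is an illustration under assumptions, not the code from the integration or from the pull request: the function name `windowed_step_average`, its parameters, and the sample data are all hypothetical.

```python
def windowed_step_average(samples, window_start, now):
    """Step (zero-order hold) average over [window_start, now].

    `samples` is a list of (timestamp, value) pairs sorted by timestamp.
    A sample older than `window_start` still counts: its value is held
    until the next sample, so it covers the start of the window.
    Hypothetical helper, not part of Home Assistant.
    """
    area = 0.0
    covered = 0.0
    for i, (t, value) in enumerate(samples):
        # Each value is held until the next sample, or until `now` for the last one.
        held_until = samples[i + 1][0] if i + 1 < len(samples) else now
        seg_start = max(t, window_start)
        seg_end = min(held_until, now)
        if seg_end > seg_start:
            area += value * (seg_end - seg_start)
            covered += seg_end - seg_start
    return area / covered if covered > 0 else None


# A single stable value recorded before the window still yields a result.
single_value = windowed_step_average([(-10, 4.0)], window_start=0, now=60)

# The older value covers 0..30 and the newer value covers 30..60.
two_values = windowed_step_average([(-10, 4.0), (30, 8.0)], window_start=0, now=60)

print(single_value, two_values)
```

In this sketch a sensor that has not changed for longer than the window still produces an average, because the last known value is extended up to `now` instead of being dropped.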

Dtrotmw · 2024-08-24T12:28:01Z

Thank you for your efforts; it would be so good to get back to using this integration. Whilst you are looking at it, is there any way to define a period for the statistic; by both its start (done), but also by its end or length in time (only sample size currently allowed)?

Cheers, David Inwood

unfug-at-github · 2024-08-26T14:33:52Z

I have provided a pull request that will fix the issues: #124644

unfug-at-github · 2024-08-26T15:06:42Z

> Thank you for your efforts; it would be so good to get back to using this integration. Whilst you are looking at it, is there any way to define a period for the statistic; by both its start (done), but also by its end or length in time (only sample size currently allowed)? Cheers, David Inwood

Hi David,
you can basically achieve that already by combining two statistics:

Let's say you want the average from -30 sec to -20 sec, you can compute both the average for 30 sec (av30) and for 20 sec (av20) and then combine them like this:

((av30 * 30) - (av20 * 20)) / (30-20).
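The formula can be checked numerically. The Python sketch below is only an illustration: the sample data and the helpers `trailing_mean` and `window_mean` are made up, and it assumes one reading per second so that a plain mean stands in for the time-based average.

```python
# Illustrative check of ((av30 * 30) - (av20 * 20)) / (30 - 20).
# Assumption: one sample per second, so a plain mean equals the time average.

def trailing_mean(samples, seconds):
    """Mean of the most recent `seconds` samples (hypothetical helper)."""
    window = samples[-seconds:]
    return sum(window) / len(window)


def window_mean(samples, older, newer):
    """Average over the span from -older to -newer seconds, from two trailing means."""
    av_older = trailing_mean(samples, older)
    av_newer = trailing_mean(samples, newer)
    return (av_older * older - av_newer * newer) / (older - newer)


# Made-up data: 60 one-second readings.
samples = [float(i % 7) for i in range(60)]

combined = window_mean(samples, 30, 20)
direct = sum(samples[-30:-20]) / 10  # the same span averaged directly
print(combined, direct)
```

Both numbers should agree up to floating-point rounding, since multiplying each trailing average by its length recovers the sum over that span, and subtracting the two sums leaves only the readings between -30 s and -20 s.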

Dtrotmw · 2024-08-26T15:17:18Z

Thank you so much for replying. I feel such a fool! You’d never know I had a good degree, albeit 5 decades ago… And thank you again for addressing the problem I’ve been having with spikes etc for so long. It is much appreciated.

unfug-at-github · 2024-08-28T04:44:12Z

The spikes were created by race conditions. The component has an internal queue that is supposed to be sorted by time. During the startup phase, values are loaded from the database and put into the queue. If a regular update comes in during this phase, it messes up the order of the queue, which then leads to incorrect averages being computed.
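The effect of a mis-ordered queue can be reproduced with a small Python sketch. This is a simplified model written for illustration, not the integration's actual code: the `average_step` function, the `(seconds, value)` tuples, and the numbers are all assumptions.

```python
from collections import deque


def average_step(queue):
    """Time-weighted step average; silently assumes `queue` is ordered by timestamp."""
    items = list(queue)
    area = 0.0
    for (t_prev, v_prev), (t_next, _) in zip(items, items[1:]):
        area += v_prev * (t_next - t_prev)
    return area / (items[-1][0] - items[0][0])


history = [(0, 1.0), (50, 3.0), (99, 2.0)]  # (seconds, value) restored from the database
live_update = (100, 5.0)                     # regular update arriving during startup

# Intended order: restored history first, then the live update.
ordered = deque(history + [live_update])

# Race: the live update is queued before the history load has finished.
raced = deque([live_update] + history)

correct = average_step(ordered)         # 1.99
spiked = average_step(raced)            # 303.0
repaired = average_step(sorted(raced))  # 1.99 again once sorted by timestamp

print(correct, spiked, repaired)
```

In this model the misplaced entry contributes a term weighted by the whole time span (here -100 s) while the divisor shrinks to the gap between the first and last queue entries (here -1 s). That would be consistent with the observation earlier in the thread that the spike size grows with max_age, although the sketch does not prove that this is the exact mechanism in the integration.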

Let's hope they will accept the proposed pull request and the issue will be gone.

Zamtakk · 2024-09-12T12:25:38Z

I am really looking forward to this PR coming through. Thank you very much for fixing this issue, which has been bothering me for a while! Your work is appreciated!

jorgemarmor · 2024-09-22T09:37:46Z

I'm also looking forward to this being improved. Right now these spikes are significantly corrupting my measurements of energy consumption. See just one example:

mib1185 added the integration: statistics label Jun 15, 2024

This was referenced Aug 26, 2024

Fix statistics incl time based averages #124635

Closed

Fix time-based statistics (peaks, incorrect values) #124644

Draft

This was referenced Oct 2, 2024

Fix spikes caused by incorrect queue order in statistics integration #127268

Closed

Unrealistic Energy Spikes #124420

Open

unfug-at-github linked a pull request Oct 19, 2024 that will close this issue

Fix issues with statistics caused by race conditions #128796

Open

19 tasks
