Restarting cache nodes can exhaust disk space #672

bryanlb · 2023-09-11T16:23:57Z

Describe the bug

When a cache node is restarted without being destroyed and recreated, the previously cached data remains on disk. Nothing references these old chunks, but they still take up disk space, which can cause out-of-disk errors.

To Reproduce

Restart a running cache node, and after the reboot observe that KALDB_CACHE_DATA_DIR still contains chunks that were created before the reboot time.
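
One way to check for leftover chunks is to compare file modification times against a marker file whose timestamp represents the restart. This is a minimal sketch, not part of the project: the function name list_stale_chunks and the marker file are hypothetical, and it assumes a find that supports -mindepth (GNU and BSD find both do).

```shell
#!/bin/sh
# Hypothetical helper: print every entry under a data directory whose
# modification time is not newer than a marker file. The marker's mtime
# is assumed to represent the restart time of the cache node.
list_stale_chunks() {
  data_dir="$1"   # e.g. the value of KALDB_CACHE_DATA_DIR
  marker="$2"     # file touched at (or just after) restart

  # -mindepth 1 skips the data directory itself;
  # "! -newer" keeps entries modified at or before the marker.
  find "$data_dir" -mindepth 1 ! -newer "$marker" -print
}
```

For example, list_stale_chunks "$KALDB_CACHE_DATA_DIR" /tmp/restart_marker would list candidates, assuming /tmp/restart_marker was touched when the node came back up.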

Expected behavior

The simplest solution here would be to delete the existing chunks on boot. Note that we should consider what happens when a user attempts to map the data directory to a shared folder. We may consider storing all chunks in a sub-directory of the configured data directory, and then deleting/re-creating that specific folder.
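
The sub-directory idea could look roughly like the following at startup. This is only a sketch under assumptions: the sub-directory name "chunks" and the function reset_chunk_dir are hypothetical and are not existing settings or code in the project. The point is that only the dedicated sub-directory is removed, so other files in a shared or mapped data directory are left alone.

```shell
#!/bin/sh
# Hypothetical startup step: keep all chunks in a dedicated sub-directory
# of the configured data directory, and reset only that sub-directory.
reset_chunk_dir() {
  data_dir="$1"   # e.g. the value of KALDB_CACHE_DATA_DIR

  # Refuse an empty path so we never run "rm -rf /chunks" by accident.
  [ -n "$data_dir" ] || return 1

  chunk_dir="$data_dir/chunks"   # "chunks" is an assumed name

  # Remove stale chunks from a previous run, then recreate the folder.
  rm -rf -- "$chunk_dir"
  mkdir -p -- "$chunk_dir"
}
```

Called as reset_chunk_dir "$KALDB_CACHE_DATA_DIR" before the cache node starts, this would clear old chunks while leaving any unrelated files in the data directory untouched.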


bryanlb · 2024-09-09T15:28:11Z

This is still an issue; however, we're currently working around it by clearing the directory in our entrypoint.sh script with something like the following, where we use /astra_data as the data directory:

# Temporary fix for reboots leaving cache assets behind
echo "Clearing /astra_data directory"
rm -rfv /astra_data/*

bryanlb added the bug Something isn't working label Sep 11, 2023

bryanlb added this to the Improve support for longer retention milestone Sep 11, 2023

bryanlb removed this from the [24Q3] Improve support for longer retention milestone Oct 30, 2023

github-actions bot added the Stale label Sep 8, 2024

slackhq deleted a comment from github-actions bot Sep 9, 2024

bryanlb removed the Stale label Sep 9, 2024

bryanlb added the operations label Sep 30, 2024
