Restarting cache nodes can exhaust disk space #672

Open
bryanlb opened this issue Sep 11, 2023 · 1 comment
Labels
bug, operations

Comments

@bryanlb
Contributor

bryanlb commented Sep 11, 2023

Describe the bug

If a cache node is restarted without being destroyed and recreated, the previously cached data can remain on disk. Nothing references these old chunks any longer, but they still take up disk space, which can lead to out-of-disk errors.

To Reproduce

Restart a running cache node. After the reboot, observe that KALDB_CACHE_DATA_DIR still contains chunks that were created before the reboot time.
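One rough way to spot leftover chunks is to compare entries in the data directory against a reference timestamp. The sketch below is only an illustration: the function name `list_stale_chunks` and the idea of a marker file touched at a known time are assumptions, not something the project provides, and modification time only approximates when a chunk was created.

```shell
# Hypothetical helper: list top-level entries in a data directory whose
# modification time is not newer than a reference file.
# Usage: list_stale_chunks <data_dir> <reference_file>
list_stale_chunks() {
    data_dir="$1"
    reference="$2"
    # -mindepth/-maxdepth restrict the listing to direct children of data_dir.
    # "! -newer" keeps entries modified at or before the reference file.
    find "$data_dir" -mindepth 1 -maxdepth 1 ! -newer "$reference"
}
```

For example, with a marker file touched shortly before the restart, `list_stale_chunks "$KALDB_CACHE_DATA_DIR" /tmp/restart_marker` would print the entries that predate it (the marker path here is made up for illustration).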

Expected behavior

The simplest solution here would be to delete any existing chunks on boot. Note that we should consider what happens when a user maps the data directory to a shared folder, since deleting everything in it could remove unrelated files. We may consider storing all chunks in a sub-directory of the configured data directory, and then deleting and re-creating only that sub-directory.
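As a minimal sketch of the sub-directory idea, assuming a sub-directory named `chunks` (a made-up name; the project does not define one) and a hypothetical helper `reset_chunk_dir`, the boot-time cleanup could look something like this:

```shell
# Hypothetical sketch: keep chunks in a dedicated sub-directory and reset
# only that sub-directory on boot. "chunks" and reset_chunk_dir are
# illustrative names, not part of the project.
reset_chunk_dir() {
    data_dir="$1"
    # Refuse to run on an empty or missing path so we never delete the wrong directory.
    if [ -z "$data_dir" ] || [ ! -d "$data_dir" ]; then
        echo "data directory '$data_dir' does not exist" >&2
        return 1
    fi
    chunk_dir="$data_dir/chunks"
    # Remove only the chunk sub-directory; anything else in a shared folder is left alone.
    rm -rf -- "$chunk_dir"
    mkdir -p -- "$chunk_dir"
}

# Example call before starting the cache node:
# reset_chunk_dir "$KALDB_CACHE_DATA_DIR"
```

Scoping the deletion to one sub-directory means a data directory mapped to a shared folder keeps its other contents, at the cost of changing where chunks are stored.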

@bryanlb bryanlb added the bug label Sep 11, 2023
@github-actions github-actions bot added the Stale label Sep 8, 2024
@slackhq slackhq deleted a comment from github-actions bot Sep 9, 2024
@bryanlb bryanlb removed the Stale label Sep 9, 2024
@bryanlb
Contributor Author

bryanlb commented Sep 9, 2024

This is still an issue; however, we're currently working around it by clearing the directory in our entrypoint.sh script with something like the following, where /astra_data is the data directory:

```shell
# Temporary fix for reboots leaving cache assets behind
echo "Clearing /astra_data directory"
rm -rfv /astra_data/*
```
