Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

config: move harbor backup schedule and make it configurable #2310

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions config/sc-config.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -403,6 +403,7 @@ harbor:
backup:
enabled: true
retentionDays: 7
schedule: "30 0 * * *"
ephemeralBackupStore:
enabled: false
storageSize: 10Gi
Expand Down
7 changes: 7 additions & 0 deletions config/schemas/config.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -1693,6 +1693,13 @@ properties:
`RetentionDays` defines how old a backup should be before deleting it.
default: 7
type: number
schedule:
title: Schedule for backup job
description: |-
`schedule` defines when the backup job for harbor will run.
This should be set to run shortly after velero backups in wc, in order to ensure that images needed for velero backups are backed up in harbor.
Comment on lines +1699 to +1700
Copy link
Contributor

@simonklb simonklb Oct 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
`schedule` defines when the backup job for harbor will run.
This should be set to run shortly after velero backups in wc, in order to ensure that images needed for velero backups are backed up in harbor.
`schedule` defines when the backup job for Harbor will run.
This should be set to run shortly after Velero backups in the workload cluster, in order to ensure that images needed for Velero backups are backed up in Harbor.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this not also depend on how long time the Velero backup takes to complete?

It would be nice if they were triggered in sequence rather than on two different schedules. Issue worthy?

Copy link
Contributor Author

@viktor-f viktor-f Oct 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it does definitely depend on that. This mostly reflects that most velero backups should be done within 30 mins, it seemed like a sane default.
It would be very nice if that was triggered in sequence, but I'm not sure that it's worth making some controller that would be able to fix that. Especially since this is spread out across two different kubernetes clusters, harbor backups in sc and velero backups in wc

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we are so hesitant to create new issues to prevent flooding the backlog (which I totally understand) could we find some way of still keep "nice to haves" around? I think even if this isn't something we want to add to the current backlog this is something we want to solve in the future, i.e. that when you create a backup of your cluster all images that are currently in use should also be backed up.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i can bring that up to see what we can do 👍

default: "30 0 * * *"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be nice to have a pattern for this in the schema

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did manage to find this pattern that seems to work in all general cases, but there might be some edge cases where it fails. But it is very complicated.

^((?<![\d\-\*])((\*\/)?([0-5]?[0-9])((\,|\-|\/)([0-5]?[0-9]))*|\*)[^\S\r\n]+((\*\/)?((2[0-3]|1[0-9]|[0-9]|00))((\,|\-|\/)(2[0-3]|1[0-9]|[0-9]|00))*|\*)[^\S\r\n]+((\*\/)?([1-9]|[12][0-9]|3[01])((\,|\-|\/)([1-9]|[12][0-9]|3[01]))*|\*)[^\S\r\n]+((\*\/)?([1-9]|1[0-2])((\,|\-|\/)([1-9]|1[0-2]))*|\*|(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec))[^\S\r\n]+((\*\/)?[0-6]((\,|\-|\/)[0-6])*|\*|00|(sun|mon|tue|wed|thu|fri|sat)))$|^@(annually|yearly|monthly|weekly|daily|hourly|reboot)$

Is this something we want, or is it just confusing to add this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Kubernetes spec just has this:

        "schedule": {
          "description": "The schedule in Cron format, see https://en.wikipedia.org/wiki/Cron.",
          "type": "string"
        },

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we really want something this seems less complicated:
/(@(annually|yearly|monthly|weekly|daily|hourly|reboot))|(@every (\d+(ns|us|µs|ms|s|m|h))+)|((((\d+,)+\d+|(\d+(\/|-)\d+)|\d+|\*) ?){5,7})/
Source: https://stackoverflow.com/questions/14203122/create-a-regular-expression-for-cron-statement

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm it does indeed look a bit confusing, maybe just adding a link on formatting in the description is good enough

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That link is also what I looked at. The particular one that you showed does at least seem to miss the possibility of adding mon-sun for the week days.

I think we should either add the complicated one or just link to the wikipedia page like anders suggested.

type: string
type: object
core:
title: Core Config
Expand Down
2 changes: 1 addition & 1 deletion helmfile.d/charts/harbor/harbor-backup/values.yaml
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
pgHostname: harbor-database
dbPassword: ""
retentionDays: 7
schedule: "0 0 * * *"
schedule: "30 0 * * *"
s3:
enabled: false
accessKey: ""
Expand Down
1 change: 1 addition & 0 deletions helmfile.d/values/harbor/harbor-backup.yaml.gotmpl
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,7 @@

dbPassword: {{ .Values.harbor.internal.databasePassword }}
retentionDays: {{ .Values.harbor.backup.retentionDays }}
schedule: {{ .Values.harbor.backup.schedule }}
{{ if eq .Values.objectStorage.type "s3" -}}
s3:
enabled: true
Expand Down