config: move harbor backup schedule and make it configurable #2310

Open
wants to merge 1 commit into
base: main

viktor-f
Contributor

Warning

This is a public repository, ensure not to disclose:

  • personal data beyond what is necessary for interacting with this pull request, nor
  • business confidential information, such as customer names.

What kind of PR is this?

Required: Mark one of the following that is applicable:

  • kind/feature
  • kind/improvement
  • kind/deprecation
  • kind/documentation
  • kind/clean-up
  • kind/bug
  • kind/other

Optional: Mark one or more of the following that are applicable:

Important

Breaking changes should be marked kind/admin-change or kind/dev-change depending on type
Critical security fixes should be marked with kind/security

  • kind/admin-change
  • kind/dev-change
  • kind/security
  • kind/adr

What does this PR do / why do we need this PR?

Moves the default Harbor backup schedule so that it runs after the default Velero schedule. This should ensure that images needed by resources in the Velero backup are present in the Harbor backup.

Also makes the schedule configurable.

Information to reviewers

Checklist

  • Proper commit message prefix on all commits
  • Change checks:
    • The change is transparent
    • The change is disruptive
    • The change requires no migration steps
    • The change requires migration steps
    • The change upgrades CRDs
    • The change updates the config and the schema
  • Documentation checks:
  • Metrics checks:
    • The metrics are still exposed and present in Grafana after the change
    • The metrics names didn't change (Grafana dashboards and Prometheus alerts are not affected)
    • The metrics names did change (Grafana dashboards and Prometheus alerts were fixed)
  • Logs checks:
    • The logs do not show any errors after the change
  • Pod Security Policy checks:
    • Any changed pod is covered by Pod Security Admission
    • Any changed pod is covered by Gatekeeper Pod Security Policies
    • The change does not cause any pods to be blocked by Pod Security Admission or Policies
  • Network Policy checks:
    • Any changed pod is covered by Network Policies
    • The change does not cause any dropped packets in the NetworkPolicy Dashboard
  • Audit checks:
    • The change does not cause any unnecessary Kubernetes audit events
    • The change requires changes to Kubernetes audit policy
  • Falco checks:
    • The change does not cause any alerts to be generated by Falco
  • Bug checks:
    • The bug fix is covered by regression tests

Default harbor backup schedule moved to after default velero schedule.
This should ensure that images needed for resources in velero backup are
present in harbor backup.
@viktor-f added the kind/improvement label Oct 14, 2024
@viktor-f self-assigned this Oct 14, 2024
Comment on lines +1699 to +1700
`schedule` defines when the backup job for harbor will run.
This should be set to run shortly after velero backups in wc, in order to ensure that images needed for velero backups are backed up in harbor.
Contributor

@simonklb Oct 14, 2024

Suggested change
- `schedule` defines when the backup job for harbor will run.
- This should be set to run shortly after velero backups in wc, in order to ensure that images needed for velero backups are backed up in harbor.
+ `schedule` defines when the backup job for Harbor will run.
+ This should be set to run shortly after Velero backups in the workload cluster, in order to ensure that images needed for Velero backups are backed up in Harbor.

Contributor

Does this not also depend on how long the Velero backup takes to complete?

It would be nice if they were triggered in sequence rather than on two different schedules. Issue worthy?

Contributor Author

@viktor-f Oct 14, 2024

Yes, it definitely depends on that. The default mostly reflects that most Velero backups should finish within 30 minutes, so it seemed like a sane choice.
It would be very nice if they were triggered in sequence, but I'm not sure it's worth building a controller to do that, especially since this is spread across two different Kubernetes clusters: Harbor backups in sc and Velero backups in wc.
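The timing argument above can be illustrated with a small sketch that computes the gap between two simple daily schedules. The Harbor value is the default from this PR; the Velero value `0 0 * * *` is an assumption inferred from the "within 30 mins" remark and is not stated anywhere in this thread. The helper names are hypothetical and only the plain `M H * * *` form is handled.

```python
def daily_minute_of_day(schedule: str) -> int:
    """Return minutes after midnight for a simple daily cron schedule.

    Only the 'M H * * *' form is supported; anything else raises ValueError.
    """
    fields = schedule.split()
    if len(fields) != 5 or fields[2:] != ["*", "*", "*"]:
        raise ValueError(f"not a simple daily schedule: {schedule!r}")
    minute, hour = int(fields[0]), int(fields[1])
    if not (0 <= minute <= 59 and 0 <= hour <= 23):
        raise ValueError(f"minute or hour out of range: {schedule!r}")
    return hour * 60 + minute


def gap_minutes(velero_schedule: str, harbor_schedule: str) -> int:
    """Minutes from the Velero start to the next Harbor start (wraps at midnight)."""
    velero = daily_minute_of_day(velero_schedule)
    harbor = daily_minute_of_day(harbor_schedule)
    return (harbor - velero) % (24 * 60)


VELERO_SCHEDULE = "0 0 * * *"   # assumption: not stated in this PR
HARBOR_SCHEDULE = "30 0 * * *"  # default introduced by this PR

print(gap_minutes(VELERO_SCHEDULE, HARBOR_SCHEDULE))  # 30
```

If a Velero backup regularly takes longer than this gap, the Harbor schedule would need to be moved later, which is what the configurable value allows.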

Contributor

Since we are so hesitant to create new issues to avoid flooding the backlog (which I totally understand), could we find some way of still keeping "nice to haves" around? Even if this isn't something we want to add to the current backlog, I think it is something we want to solve in the future, i.e. when you create a backup of your cluster, all images that are currently in use should also be backed up.

Contributor Author

I can bring that up to see what we can do 👍

description: |-
  `schedule` defines when the backup job for harbor will run.
  This should be set to run shortly after velero backups in wc, in order to ensure that images needed for velero backups are backed up in harbor.
default: "30 0 * * *"
Contributor

It could be nice to have a pattern for this in the schema

Contributor Author

I did manage to find the pattern below. It seems to work in the general cases, though there might be edge cases where it fails, and it is very complicated.

^((?<![\d\-\*])((\*\/)?([0-5]?[0-9])((\,|\-|\/)([0-5]?[0-9]))*|\*)[^\S\r\n]+((\*\/)?((2[0-3]|1[0-9]|[0-9]|00))((\,|\-|\/)(2[0-3]|1[0-9]|[0-9]|00))*|\*)[^\S\r\n]+((\*\/)?([1-9]|[12][0-9]|3[01])((\,|\-|\/)([1-9]|[12][0-9]|3[01]))*|\*)[^\S\r\n]+((\*\/)?([1-9]|1[0-2])((\,|\-|\/)([1-9]|1[0-2]))*|\*|(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec))[^\S\r\n]+((\*\/)?[0-6]((\,|\-|\/)[0-6])*|\*|00|(sun|mon|tue|wed|thu|fri|sat)))$|^@(annually|yearly|monthly|weekly|daily|hourly|reboot)$

Is this something we want, or is it just confusing to add this?

Contributor

The Kubernetes spec just has this:

        "schedule": {
          "description": "The schedule in Cron format, see https://en.wikipedia.org/wiki/Cron.",
          "type": "string"
        },

Contributor

If we really want something this seems less complicated:
/(@(annually|yearly|monthly|weekly|daily|hourly|reboot))|(@every (\d+(ns|us|µs|ms|s|m|h))+)|((((\d+,)+\d+|(\d+(\/|-)\d+)|\d+|\*) ?){5,7})/
Source: https://stackoverflow.com/questions/14203122/create-a-regular-expression-for-cron-statement

Contributor

Hmm it does indeed look a bit confusing, maybe just adding a link on formatting in the description is good enough

Contributor Author

That link is also what I looked at. The particular pattern you showed does at least seem to miss the possibility of using mon-sun for the weekdays.

I think we should either add the complicated one or just link to the Wikipedia page like Anders suggested.
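The weekday-name gap mentioned above can be checked with a short sketch. It copies the simpler Stack Overflow pattern from this thread (without the surrounding `/` delimiters) into Python's `re` module and uses `fullmatch`, which is an assumption about how a schema validator would be expected to apply it; the `matches` helper is hypothetical.

```python
import re

# The simpler pattern quoted in this thread, split across lines for readability.
SIMPLE_CRON = re.compile(
    r"(@(annually|yearly|monthly|weekly|daily|hourly|reboot))"
    r"|(@every (\d+(ns|us|µs|ms|s|m|h))+)"
    r"|((((\d+,)+\d+|(\d+(\/|-)\d+)|\d+|\*) ?){5,7})"
)


def matches(schedule: str) -> bool:
    """True if the whole schedule string is accepted by the simpler pattern."""
    return SIMPLE_CRON.fullmatch(schedule) is not None


for schedule in ("30 0 * * *", "@daily", "0 0 * * mon-fri"):
    print(f"{schedule!r}: {matches(schedule)}")
# '30 0 * * *': True
# '@daily': True
# '0 0 * * mon-fri': False
```

Note that a JSON Schema `pattern` is not implicitly anchored, so either pattern would need explicit `^` and `$` to reject strings that merely contain a valid schedule.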

Labels
kind/improvement Improvement of existing features, e.g. code cleanup or optimizations.
3 participants