Networker Backup Software Support as a Cloud Provider module #617

Open
wants to merge 9 commits into base: master


tomglx

@tomglx tomglx commented Jul 1, 2022

Hi,
we wanted to benefit from the strengths of barman over pg_basebackup while still being able to use our networker backup
system as the central point for storing all of our backups. Although networker provides some support for PostgreSQL, it
relies on pg_basebackup and therefore lacks much of the convenience of barman.

At this point, barman has no plugin interface for connecting third-party backup software products. Oracle's Recovery Manager,
for example, does. The cloud provider interface has a similar intention, but expects something that behaves more like an external
file server. Lacking an alternative, we used this interface to connect to networker while keeping the necessary changes to the
main barman code base as small as possible.

Networker has no REST or streaming interface. Therefore the module uses a local staging area on the PostgreSQL server to
store the backup data before saving it with regular filesystem backups. In addition, networker needs root privileges
when performing automatic recoveries. We try to work around this restriction by using sudo.
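
The staging and sudo handling described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function names are hypothetical, and the `recover` flags shown (`-s` for the server, `-a` for automatic mode, `-d` for the destination directory) are assumptions that should be checked against the NetWorker command reference for the installed version.

```python
import os
import shutil
from pathlib import Path


def stage_object(staging_root, key, source_path):
    """Copy one backup file into the staging area under its object key."""
    target = Path(staging_root) / key
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(source_path, target)
    return target


def build_recover_command(server, destination, path, euid=None):
    """Build an argv list for a non-interactive NetWorker recover.

    Hypothetical helper; the flags are assumptions (see the note above).
    """
    if euid is None:
        euid = os.geteuid()
    argv = ["recover", "-s", server, "-a", "-d", str(destination), str(path)]
    # Prefix with sudo only when the caller is not already root.
    # "sudo -n" fails instead of prompting for a password.
    if euid != 0:
        argv = ["sudo", "-n"] + argv
    return argv
```

Returning an argv list rather than a shell string keeps file names with spaces intact when the list is later passed to `subprocess.run`.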

We've omitted specific unit tests, because not everyone has a networker server on hand. We've also created
a wrapper script for integrating backups with networker scheduling and another for easier and faster database restoration. These
aren't part of this request, but can be added later.

We're open to your comments and input.

Regards,
Thomas

root and others added 8 commits June 28, 2022 14:40
…ellEMC networker as a backup target.

Although networker is on-premise software, there is currently no other api to add an interface to a
third party backup software to barman.
update manual pages for Networker Storage Cloud Extension
add sudo as prefix in calls for Networker recover operations as non-root user
add delete_objects support for Networker backups
specification
add documentation about networker archival handling and differences
Contributor

@mikewallace1979 mikewallace1979 left a comment


Thanks for making the PR and taking the time to include updates to the documentation. Having discussed this internally there are a few reasons why we're hesitant to merge this in its current form:

Firstly, as you mention in relation to unit tests, testing is difficult without a NetWorker instance. As well as unit tests, Barman also has a suite of integration tests (unfortunately not currently open source) which verify end-to-end functionality against each cloud provider (we use Azurite and Minio for Azure and S3 while the Google backend runs against a real GCS instance), and it's not clear how we would test the NetWorker cloud provider as part of these.

Secondly there's the issue of support and maintenance. The Barman developers would need to handle any support requests related to the NetWorker backend - this is not something we could confidently commit to, mainly due to the lack of availability of a NetWorker instance for reproducing issues but also due to a lack of familiarity with NetWorker. Maintenance would also be difficult as we would not be able to test future Barman versions against NetWorker.

Thirdly the use of the barman cloud streaming backup to write the backup to tar files on the local disk is at odds with the design goals of barman cloud. While it's a clever approach in that it allows NetWorker to be hooked in while minimising the changes required to existing Barman code, CloudBackupUploader and associated plumbing is intended to stream backups directly to cloud storage without requiring additional space to stage the backup files (beyond the buffered chunks which are flushed during the upload process). Given Barman already contains backup methods for copying PostgreSQL data into other directories (either on the same or another server) this change would bring in some elements of duplicated functionality.

The typical way to integrate something like NetWorker would be to use barman backup with a post backup hook script - such a script could then execute save to write the contents of the backup directory to NetWorker. Having said that, this PR clearly adds some practical functionality beyond the hook script approach by allowing backups to be compressed into a batch of tar files and uploaded to NetWorker without needing to keep any data on the local disk once the NetWorker upload is complete.
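
As a rough sketch of the hook script approach, the function below builds a NetWorker `save` call from the environment variables Barman passes to backup hook scripts (`BARMAN_PHASE`, `BARMAN_STATUS`, `BARMAN_ERROR`, `BARMAN_SERVER`, `BARMAN_BACKUP_ID`, `BARMAN_BACKUP_DIR`). The function name, the save set naming scheme and the `save` flags (`-s` for the server, `-N` for the save set name) are assumptions for illustration only.

```python
def build_save_command(env, nw_server):
    """Return the argv for saving a finished backup directory, or None.

    `env` is a mapping such as os.environ; `nw_server` is the NetWorker
    server name. Hypothetical helper, not part of Barman.
    """
    # Only act in the post-backup phase.
    if env.get("BARMAN_PHASE") != "post":
        return None
    # Skip backups that failed or reported an error.
    if env.get("BARMAN_STATUS") == "FAILED" or env.get("BARMAN_ERROR"):
        return None
    # Hypothetical naming scheme so the save set can be found again later.
    name = "barman/{}/{}".format(env["BARMAN_SERVER"], env["BARMAN_BACKUP_ID"])
    # Assumed NetWorker flags: -s <server>, -N <save set name>.
    return ["save", "-s", nw_server, "-N", name, env["BARMAN_BACKUP_DIR"]]
```

A small script that calls `build_save_command(os.environ, ...)` and runs the result with `subprocess.call` could then be registered through the `post_backup_script` option in the server configuration.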

One possible way forward is to rework the PR so that it does not deal with NetWorker at all. This could be achieved by adding a local cloud provider (--cloud-provider=local?) which writes the backup to a local path - this could then be called in a wrapper script which uses the NetWorker CLI to save the files. If it were implemented like this then no NetWorker-specific code would be added to Barman and the support/test/maintenance issues would go away - we would simply need to verify that barman-cloud can write the backup to the local path and that it can recover from a backup if it is available at that path.
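
To make the idea of a local cloud provider more concrete, here is a minimal standalone sketch of a directory-backed object store. `LocalPathStore` and its method names are hypothetical and only loosely modelled on the kind of operations a cloud backend needs (upload, download, list, delete); it does not implement Barman's actual cloud interface.

```python
import shutil
from pathlib import Path


class LocalPathStore:
    """Hypothetical stand-in for a bucket, backed by a local directory."""

    def __init__(self, root):
        self.root = Path(root).resolve()

    def _path(self, key):
        path = (self.root / key).resolve()
        # Refuse keys that would escape the root directory.
        if self.root not in path.parents:
            raise ValueError("key outside of store root: {!r}".format(key))
        return path

    def upload_fileobj(self, fileobj, key):
        """Write a binary file object to the store under `key`."""
        path = self._path(key)
        path.parent.mkdir(parents=True, exist_ok=True)
        with open(path, "wb") as out:
            shutil.copyfileobj(fileobj, out)

    def download_file(self, key, dest_path):
        """Copy the object stored under `key` to `dest_path`."""
        shutil.copyfile(self._path(key), dest_path)

    def list_keys(self, prefix=""):
        """Return the sorted keys of all stored files starting with `prefix`."""
        if not self.root.is_dir():
            return []
        keys = (p.relative_to(self.root).as_posix()
                for p in self.root.rglob("*") if p.is_file())
        return sorted(k for k in keys if k.startswith(prefix))

    def delete_objects(self, keys):
        """Remove the files stored under the given keys."""
        for key in keys:
            self._path(key).unlink()
```

With something along these lines, verifying the provider would only require checking that files written to the path can be listed and read back, which needs no NetWorker instance.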

@tomglx
Author

tomglx commented Aug 2, 2022

Thanks very much for your efforts. I anticipated all of your objections. I've tried to work within the existing possibilities of the cloud interface as best I could. Most of the problems for me lie in the recovery process, which includes listing the available backups.
barman requests the backup.info file of every backup, meaning that all of these files have to be recovered. This makes listing the single most expensive operation when working with an external backup solution. So how would you handle this with your local approach when someone runs the barman-cloud-backup-list command?
Today we already use a wrapper script to create a networker barman full backup, because we use the networker agent and scheduling to start the process. So starting the networker save command after barman isn't that difficult, but providing the necessary barman backup key is. Without it, recovery attempts will fail. Additionally, we create a single networker backup per barman file, so that we can recover the backup.info file separately and don't have to recover the entire database backup. Saving an entire local output directory just won't cut it: networker's automatic batch recovery is unable to restore only a single file from a backup. This is only possible in dialog mode. Because of all this, our approach is to emulate single-file access as closely as possible.
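
To illustrate why per-file access matters for listing, the sketch below picks out only the backup.info objects from a list of stored keys, so that just those small files need to be recovered. It assumes a key layout of the form `<server_name>/base/<backup_id>/backup.info`; the function name is hypothetical and the layout should be treated as an assumption.

```python
def backup_info_keys(keys, server_name):
    """Return {backup_id: key} for the backup.info object of each backup.

    Assumes keys look like "<server_name>/base/<backup_id>/backup.info".
    """
    prefix = "{}/base/".format(server_name)
    suffix = "/backup.info"
    found = {}
    for key in keys:
        if not (key.startswith(prefix) and key.endswith(suffix)):
            continue
        backup_id = key[len(prefix):-len(suffix)]
        # Ignore keys that are nested deeper than one backup directory.
        if backup_id and "/" not in backup_id:
            found[backup_id] = key
    return found
```

If every key is stored as its own save set, a listing only has to recover the objects returned here instead of the data archives.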
So creating a local-file cloud provider is only half of the solution. The key would be good pre- and post-command handling in barman itself, not in a wrapper script around barman, including placeholders for all of the available barman parameters, such as key, tags, filename and so on.
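
A minimal sketch of what such placeholder handling might look like, assuming a `${name}` template syntax. Neither the syntax nor the function exists in barman; both are hypothetical and only illustrate the request.

```python
import shlex
from string import Template


def expand_hook_command(template, params):
    """Fill ${name} placeholders in a command template with quoted values.

    Hypothetical helper: each value is shell-quoted so that keys or file
    names containing spaces cannot break the resulting command line.
    """
    quoted = {name: shlex.quote(str(value)) for name, value in params.items()}
    # substitute() raises KeyError if the template uses an unknown placeholder.
    return Template(template).substitute(quoted)
```

For example, `expand_hook_command("save -N ${key} ${filename}", {"key": "pg1/base/1/backup.info", "filename": "/stage/my file"})` yields a command line in which the file name is quoted as a single argument.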
Our entire motivation for building such a solution with barman is to ease recovery: preparing all the necessary files and starting it, especially when tablespaces are in use. Networker provides support for postgres backups itself, but when recovering you're pretty much on your own.
