mbacovsky edited this page Jul 23, 2018 · 3 revisions


Backup

Note: We assume all foreman-maintain commands are run as the root user.

The backup utility in foreman-maintain offers three backup strategies:

  • Offline - The whole Katello instance is shut down for the entire backup.
  • Online - Backing up the repositories can take a long time, so you can perform the backup while the instance stays online. For this procedure to succeed, you must not change or update the repositories database until the backup is complete. Avoid publishing, adding, or deleting content views; promoting content view versions; adding, changing, or deleting sync plans; and adding, deleting, or syncing repositories during this time.
  • Snapshot - The Katello instance is shut down only for the time needed to create and mount logical volume snapshots. The backup is taken from the snapshots after the instance is started again, which minimizes the maintenance window required for the backup.

Incremental backup

Incremental backups store only the changes made since a previous backup.

First take a full backup. This works with all strategies; offline is used in the examples:

# foreman-maintain backup offline /tmp/backups

(This will create a new directory, /tmp/backups/katello-backup-YYYY-MM-DD-hh-mm-ss)

Take the first incremental backup. Like the full backup, this creates a new directory under /tmp/backups to hold the new backup:

# foreman-maintain backup offline --incremental /tmp/backups/FULL_BACKUP_DIR /tmp/backups

Take the second incremental backup. Again, this creates a new directory under /tmp/backups to hold it:

# foreman-maintain backup offline --incremental /tmp/backups/FIRST_INCREMENTAL_BACKUP_DIR /tmp/backups

Branching/rebasing incremental backups

If you want a new incremental backup to be based on the full backup, for example to reduce the number of backups needed for a restore, point the command at the full backup directory. The new backup is then incremental relative to the full backup, not to the second incremental backup:

# foreman-maintain backup offline --incremental /tmp/backups/FULL_BACKUP_DIR /tmp/backups

An example with a full backup on Sunday and incremental backups on all other days of the week:

#!/bin/bash -e
# foreman-maintain needs /sbin and /usr/sbin in PATH; cron often omits them.
export PATH=/sbin:/bin:/usr/sbin:/usr/bin
DESTINATION=/var/backup
if [[ $(date +%w) == 0 ]]; then
  # Sunday: take a full backup.
  foreman-maintain backup offline "$DESTINATION"
else
  # Other days: increment on top of the most recently modified backup directory.
  LAST=$(ls -td -- "$DESTINATION"/*/ | head -n 1)
  foreman-maintain backup offline --incremental "$LAST" "$DESTINATION"
fi
exit 0

Note that foreman-maintain requires the /sbin and /usr/sbin directories to be in PATH, which is not always the case under cron.

Skip Pulp repositories

There may be situations in which you want a backup of a system without its repository content. You can skip backing up the Pulp content with the following option:

# foreman-maintain backup <online|offline|snapshot> --skip-pulp-content /tmp/backup

--skip-pulp-content skips backing up /var/lib/pulp. This option is intended for debugging, or for cases where you plan to copy /var/lib/pulp by other means, such as rsync or shared storage. You will not have a complete backup if you use this option.
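
If the Pulp content is skipped, it has to be copied separately. The following is a minimal sketch of the rsync approach mentioned above, not part of foreman-maintain. The function name copy_pulp_content, the DRY_RUN variable, and the destination backuphost:/srv/pulp-copy/ are assumptions made for illustration only.

```shell
#!/bin/sh
# Sketch only: copy /var/lib/pulp separately after a --skip-pulp-content backup.
# DRY_RUN and the destination below are hypothetical, not foreman-maintain features.
copy_pulp_content() {
  # $1: rsync destination (a local path or host:path)
  if [ "${DRY_RUN:-1}" = 1 ]; then
    # Default: only print the command that would be run.
    echo "rsync -a /var/lib/pulp/ $1"
  else
    # -a (archive) preserves permissions, ownership and timestamps.
    rsync -a /var/lib/pulp/ "$1"
  fi
}

copy_pulp_content "backuphost:/srv/pulp-copy/"
```

With DRY_RUN unset the function only prints the rsync command; set DRY_RUN=0 to actually copy. As with the online backup, the copy is only consistent if the Pulp content does not change while it runs.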

Backup with remote databases

For local databases (PgSQL, MongoDB) we prefer to archive the whole database data directory while the service is down. This is not possible when the databases are located on remote servers, so a dump of the database is stored instead. The dump is performed while all other Katello-related services are down and the server is in maintenance mode, so data consistency is preserved.

With the snapshot strategy the dumps have to be performed while the services are down, which is when the snapshots are normally created, so they prolong the downtime required for the backup.

Steps and skipping parts of backup

In general, every foreman-maintain command consists of one or more steps. The steps matching certain criteria form the execution scenario for the command. You can explicitly exclude steps from the scenario with the --whitelist <comma-separated list of step labels> option. For example, to run an online backup without metadata:

# foreman-maintain backup online --whitelist backup-metadata  -y /tmp/some/dir
Starting backup: 2018-05-03 03:20:35 +0000
Running preparation steps required to run the next scenarios
================================================================================
Make sure Foreman DB is up: 
/ Checking connection to the Foreman DB                               [OK]      
--------------------------------------------------------------------------------
Make sure Mongo DB is up: 
- Checking connection to the Mongo DB                                 [OK]      
--------------------------------------------------------------------------------
Checks whether the tools for Mongo DB are installed:                  [OK]
--------------------------------------------------------------------------------
Make sure Candlepin DB is up: 
\ Checking connection to the Candlepin DB                             [OK]      
--------------------------------------------------------------------------------


Running Backup
================================================================================
Data consistency warning: 
*** WARNING: The online backup is intended for making a copy of the data
*** for debugging purposes only. The backup routine can not ensure 100% consistency while the
*** backup is taking place as there is a chance there may be data mismatch between
*** Mongo and Postgres databases while the services are live. If you wish to utilize the online backup
*** for production use you need to ensure that there are no modifications occurring during
*** your backup run.

                                                                      [OK]      
--------------------------------------------------------------------------------
Prepare backup Directory: 
Creating backup folder /tmp/some/dir/katello-backup-2018-05-03-03-20-35
                                                                      [OK]
--------------------------------------------------------------------------------
Check if the directory exists and is writable:                        [OK]
--------------------------------------------------------------------------------
Generate metadata:                                                    [SKIPPED]
--------------------------------------------------------------------------------
Backup config files: 
\ Collecting config files to backup                                   [OK]      
--------------------------------------------------------------------------------
Backup Pulp data: 
- Collecting Pulp data                                                [OK]      
--------------------------------------------------------------------------------
Backup Mongo online: 
| Getting dump of Mongo DB                                            [OK]      
--------------------------------------------------------------------------------
Backup Postgres global objects online:                                [OK]
--------------------------------------------------------------------------------
Backup Candlepin database online: 
\ Getting Candlepin DB dump                                           [OK]      
--------------------------------------------------------------------------------
Backup Foreman database online: 
- Getting Foreman DB dump                                             [OK]      
--------------------------------------------------------------------------------
Compress backup data to save space:                                   [OK]
--------------------------------------------------------------------------------

Done with backup: 2018-05-03 03:20:42 +0000
**** BACKUP Complete, contents can be found in: /tmp/some/dir/katello-backup-2018-05-03-03-20-35 ****

You can see the metadata step was skipped.

To see a list of available step labels use:

# foreman-maintain advanced procedure run -h
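
When more than one step should be skipped, the labels have to be joined with commas. A small sketch of that, assuming a hypothetical helper named join_labels; the two labels are the ones that appear on this page, and the command is only printed, not executed:

```shell
#!/bin/sh
# Sketch only: build the comma-separated value for --whitelist.
# join_labels is a hypothetical helper, not part of foreman-maintain.
join_labels() {
  # In a subshell, "$*" joins the arguments with the first character of IFS.
  ( IFS=,; printf '%s\n' "$*" )
}

labels=$(join_labels backup-metadata backup-config-files)
# Print the resulting command instead of running it.
echo "foreman-maintain backup online --whitelist $labels -y /tmp/some/dir"
```

Running IFS=, inside a subshell keeps the change from leaking into the rest of the script.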

In some cases it makes sense to run a single step, e.g. to collect config files for debugging purposes:

# foreman-maintain advanced procedure run backup-config-files --backup-dir /tmp/backup
Running ForemanMaintain::Scenario
================================================================================
Backup config files: 
\ Collecting config files to backup                                   [OK]      
--------------------------------------------------------------------------------

Directory names

By default the backup is stored in a subdirectory named <katello|foreman|satellite|capsule>-backup-YYYY-MM-DD-hh-mm-ss inside the directory given on the command line.
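
Because the timestamp in the name is zero-padded and runs from year to second, sorting the names also sorts the backups chronologically, as long as they all share the same prefix. A minimal sketch that uses this to find the newest backup directory, for example as the base of an incremental backup. The function name latest_backup_dir and the /var/backup path are assumptions for illustration:

```shell
#!/bin/sh
# Sketch only: print the newest timestamped backup directory under $1.
# Assumes all backups in $1 share one prefix (e.g. katello-backup-...).
latest_backup_dir() {
  newest=
  for d in "$1"/*-backup-[0-9][0-9][0-9][0-9]-*/; do
    [ -d "$d" ] || continue   # the pattern stays literal when nothing matches
    newest=${d%/}             # glob results are sorted, so the last match is the newest
  done
  if [ -n "$newest" ]; then
    printf '%s\n' "$newest"
  fi
}

# Prints nothing when no backup exists yet.
latest_backup_dir /var/backup
```

Unlike sorting by modification time, this does not depend on the directories keeping their original timestamps after being copied or touched.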

If you need to set the directory name yourself, use the --preserve-directory switch and the backup will be stored directly in the directory you provide on the command line. Be aware that the postgres user needs write access to that directory if you have a local PgSQL database.

# foreman-maintain backup online -y --preserve-directory /tmp/my_backup_dir
Starting backup: 2018-05-03 04:03:22 +0000
Running preparation steps required to run the next scenarios
================================================================================
... LIST SHORTENED ...

Done with backup: 2018-05-03 04:03:41 +0000
**** BACKUP Complete, contents can be found in: /tmp/my_backup_dir ****

Note that no subdirectory was created.

Also note that with --preserve-directory no data are removed from the directory when the backup fails.
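
Since leftovers of a failed run stay in place, it can be useful to check that the target directory is empty before reusing it. A minimal sketch of such a check, assuming a hypothetical helper is_empty_dir; it is a precaution added here, not something foreman-maintain does itself, and it does not check the postgres write access mentioned above:

```shell
#!/bin/sh
# Sketch only: succeed when $1 is an existing directory with no entries.
# is_empty_dir is a hypothetical helper, not part of foreman-maintain.
is_empty_dir() {
  [ -d "$1" ] && [ -z "$(ls -A "$1")" ]
}

target=/tmp/my_backup_dir
if is_empty_dir "$target"; then
  # Print the command instead of running it.
  echo "foreman-maintain backup online -y --preserve-directory $target"
else
  echo "not using $target: it is missing or not empty" >&2
fi
```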

Offline backup

Offline backup is a safe way to back up your server for later restoration. All services are down during the backup and the instance is in maintenance mode, so it is not accessible from outside. The instance is returned to normal operation after the backup finishes.

Possible options are:

# foreman-maintain backup offline -h
Usage:
    foreman-maintain backup offline [OPTIONS] BACKUP_DIR

Parameters:
    BACKUP_DIR                    Path to backup dir

Options:
    -y, --assumeyes               Automatically answer yes for all questions
    -w, --whitelist whitelist     Comma-separated list of labels of steps to be skipped
    -f, --force                   Force steps that would be skipped as they were already run
    -s, --skip-pulp-content       Do not backup Pulp content
    -p, --preserve-directory      Do not create a time-stamped subdirectory
    -t, --split-pulp-tar SPLIT_SIZE Split pulp data into files of a specified size, i.e. (100M, 50G). See '--tape-length' in 'info tar' for all sizes
    -i, --incremental PREVIOUS_BACKUP_DIR Backup changes since previous backup
    --features FEATURES           Foreman Proxy features to include in the backup. Valid features are tftp, dns, dhcp, openscap, and all. (comma-separated list)
    --include-db-dumps            Also dump full database schema before offline backup
    -h, --help                    print help

Because the archives of Pulp data can be large, it may be useful to split them into volumes using the --split-pulp-tar option.

There is also the --include-db-dumps option, which takes extra dumps of the databases and stores them with the backup. Note that the dumps are taken with all the services running, so data integrity across the databases is not ensured. If you want to rely on the dumps during restore, make sure all services except the local databases are down before running the backup.

Online backup

The online backup is intended for making a copy of the data for debugging purposes only. The backup routine cannot guarantee consistency while the backup is in progress, because data may become mismatched between the Mongo and Postgres databases while the services are live.

If you wish to utilize the online backup for production use you need to ensure that there are no modifications occurring during your backup run.

During this backup a dump of the underlying databases (PgSQL, Mongo) is taken. While the Pulp repositories are being archived, the data is checked for changes. If something changed, the backup is re-run automatically, so make sure no repository syncs, promotions, or other changes to the Pulp content are performed during the backup.

Available options can be shown with the --help (-h) switch.

# foreman-maintain backup online -h                                                            
Usage:                                                                                                                                                                                               
    foreman-maintain backup online [OPTIONS] BACKUP_DIR                                                                                                                                              
                                                                                                                                                                                                     
Parameters:                                                                                                                                                                                          
    BACKUP_DIR                    Path to backup dir                                                                                                                                                 
                                                                                                                                                                                                     
Options:
    -y, --assumeyes               Automatically answer yes for all questions
    -w, --whitelist whitelist     Comma-separated list of labels of steps to be skipped
    -f, --force                   Force steps that would be skipped as they were already run
    -s, --skip-pulp-content       Do not backup Pulp content
    -p, --preserve-directory      Do not create a time-stamped subdirectory
    -t, --split-pulp-tar SPLIT_SIZE Split pulp data into files of a specified size, i.e. (100M, 50G). See '--tape-length' in 'info tar' for all sizes
    -i, --incremental PREVIOUS_BACKUP_DIR Backup changes since previous backup
    --features FEATURES           Foreman Proxy features to include in the backup. Valid features are tftp, dns, dhcp, openscap, and all. (comma-separated list)
    -h, --help                    print help

Sample run of online backup:

# foreman-maintain backup online -y /tmp/some/dir        
Starting backup: 2018-04-30 20:20:45 +0000
Running preparation steps required to run the next scenarios
================================================================================
Make sure Foreman DB is up: 
/ Checking connection to the Foreman DB                               [OK]      
--------------------------------------------------------------------------------
Make sure Mongo DB is up: 
- Checking connection to the Mongo DB                                 [OK]      
--------------------------------------------------------------------------------
Checks whether the tools for Mongo DB are installed:                  [OK]
--------------------------------------------------------------------------------
Make sure Candlepin DB is up: 
| Checking connection to the Candlepin DB                             [OK]      
--------------------------------------------------------------------------------


Running Backup
================================================================================
Data consistency warning: 
*** WARNING: The online backup is intended for making a copy of the data
*** for debugging purposes only. The backup routine can not ensure 100% consistency while the
*** backup is taking place as there is a chance there may be data mismatch between
*** Mongo and Postgres databases while the services are live. If you wish to utilize the online backup
*** for production use you need to ensure that there are no modifications occurring during
*** your backup run.

                                                                      [OK]      
--------------------------------------------------------------------------------
Prepare backup Directory: 
Creating backup folder /tmp/some/dir/katello-backup-2018-04-30-20-20-45
                                                                      [OK]
--------------------------------------------------------------------------------
Check if the directory exists and is writable:                        [OK]
--------------------------------------------------------------------------------
Generate metadata: 
- Saving metadata to metadata.yml                                     [OK]      
--------------------------------------------------------------------------------
Backup config files: 
- Collecting config files to backup                                   [OK]      
--------------------------------------------------------------------------------
Backup Pulp data: 
\ Collecting Pulp data                                                [OK]      
--------------------------------------------------------------------------------
Backup Mongo online: 
/ Getting dump of Mongo DB                                            [OK]      
--------------------------------------------------------------------------------
Backup Postgres global objects online:                                [OK]
--------------------------------------------------------------------------------
Backup Candlepin database online: 
/ Getting Candlepin DB dump                                           [OK]      
--------------------------------------------------------------------------------
Backup Foreman database online: 
/ Getting Foreman DB dump                                             [OK]      
--------------------------------------------------------------------------------
Compress backup data to save space:                                   [OK]
--------------------------------------------------------------------------------

Done with backup: 2018-04-30 20:21:09 +0000
**** BACKUP Complete, contents can be found in: /tmp/some/dir/katello-backup-2018-04-30-20-20-45 ****

Backup from snapshots

This kind of backup is similar to the offline backup, but it minimizes the downtime by taking the backup from disk snapshots. The instance needs to be down only while the snapshots are taken; the rest of the backup can be performed with the services up and accessible.

It is recommended to mount the snapshots on a different logical volume than the one holding the databases. Otherwise the snapshot size will be at least the size of the actual database. This is checked during the backup and a warning is raised.

Possible options are:

# foreman-maintain backup snapshot -h
Usage:
    foreman-maintain backup snapshot [OPTIONS] BACKUP_DIR

Parameters:
    BACKUP_DIR                    Path to backup dir

Options:
    -y, --assumeyes               Automatically answer yes for all questions
    -w, --whitelist whitelist     Comma-separated list of labels of steps to be skipped
    -f, --force                   Force steps that would be skipped as they were already run
    -s, --skip-pulp-content       Do not backup Pulp content
    -p, --preserve-directory      Do not create a time-stamped subdirectory
    -t, --split-pulp-tar SPLIT_SIZE Split pulp data into files of a specified size, i.e. (100M, 50G). See '--tape-length' in 'info tar' for all sizes
    -i, --incremental PREVIOUS_BACKUP_DIR Backup changes since previous backup
    --features FEATURES           Foreman Proxy features to include in the backup. Valid features are tftp, dns, dhcp, openscap, and all. (comma-separated list)
    -d, --snapshot-mount-dir SNAPSHOT_MOUNT_DIR Override default directory ('/var/snap/') where the snapshots will be mounted (default: "/var/snap/")
    -b, --snapshot-block-size SNAPSHOT_BLOCK_SIZE Override default block size (2G) (default: "2G")
    -h, --help                    print help

Backup security

A backup can contain sensitive information such as hostnames and SSH keys. It is recommended that you store backups in a secure location or encrypt them.
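
One possible way to follow this advice is sketched below, using standard tools rather than anything built into foreman-maintain. The function names restrict_backup_permissions and encrypt_backup, and the paths in the usage comments, are assumptions for illustration. The sketch assumes GnuPG is installed and that symmetric (passphrase-based) encryption is acceptable for your environment.

```shell
#!/bin/sh
# Sketch only: two hypothetical helpers for protecting a finished backup.

# Remove all group and other access from the backup directory tree.
restrict_backup_permissions() {
  chmod -R go-rwx "$1"
}

# Pack backup directory $1 into a passphrase-encrypted archive at $2.
# gpg asks for the passphrase interactively.
encrypt_backup() {
  tar -czf - -C "$(dirname "$1")" "$(basename "$1")" |
    gpg --symmetric --cipher-algo AES256 --output "$2"
}

# Example usage (hypothetical paths):
#   restrict_backup_permissions /var/backup/katello-backup-2018-05-03-03-20-35
#   encrypt_backup /var/backup/katello-backup-2018-05-03-03-20-35 /root/backup.tar.gz.gpg
```

Keep the unencrypted directory if you plan to use it as the base for an incremental backup.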

Note: This documentation is based on the Katello docs.