Google Cloud Platform Snapshot Backup System
The Google Cloud Platform (GCP) provides a Compute Engine service for creating and maintaining robust, scalable, and high-performance virtual machine instances. One of its features generates a delta-based snapshot of any instance disk, even while the instance it is attached to is actively running. GCP stores multiple copies of each snapshot redundantly across multiple locations, with automatic checksums to ensure the integrity of the data, which makes snapshots useful for periodic backups. However, because each snapshot must be triggered manually, the feature on its own falls short of a proper backup system.
The GSnapUp utility provides the missing scheduling layer for the GCP
snapshot functionality, rounding out the feature set into a solid backup
system. Setup is as simple as installing the utility on a system in (or with
access to) the GCP, configuring the instances and disks to maintain, and
configuring a system CRON task to keep its heart beating.
Prior to using the GSnapUp utility, the system it runs on requires the GCP SDK
to be installed and initialized, as well as PHP 5.6+ to be available.
After installing the gcloud utility, set the GCP project (example):
$ gcloud config set project project01
Next, provide authorization with either a user account or a service account.
This step can be difficult; make sure the account you associate has the
necessary permissions to access the project.
Run the login sub-command with a valid user account (example):
$ gcloud auth login [email protected]
To use a service account instead, you need one created and associated with the necessary projects. From the GCP dashboard, access IAM & Admin > Service accounts, then create a new service account or use an existing one with its associated key file. If you don't have a copy of the key file, create a new one and download the provided file. Run the activate-service-account sub-command with the service account and key file (example):
$ gcloud auth activate-service-account [email protected] --key-file=./project-name-769c5768547b.json
The GSnapUp utility provides several commands for creating and maintaining the
configuration file used to interact with the GCP instance disks, as well as
for running the snapshots on those disks.
Command | Description |
---|---|
init | Initialize configuration |
instance:add | Add instance to configuration |
instance:available | List available GCloud instances |
instance:disable | Disable instance in configuration |
instance:enable | Enable instance in configuration |
instance:list | List configured instances |
instance:remove | Remove instance from configuration |
instance:update | Update instance in configuration |
scheduled | Run scheduled GCloud snapshot backups |
A detailed list of all the commands can be obtained by running:
$ gsnapup list
Help for each command may be obtained by using the help command:
$ gsnapup help init
The configuration consists of the main, instances, and disks sections, each
containing a series of keys. The main section (being the root of the
configuration) contains an instances key housing a list of all the configured
GCP instances. Each of the instances in the configuration contains a disks key
housing a list of all the configured disks for that GCP instance.
Each instance configured in the instances list uses a token key, which is used
to reference that instance when dealing with GSnapUp. The disks configured in
the disks list also use a token key, the same way the instances list does.
These token keys make it friendlier to interact with this utility, providing a
saner naming convention than what is common with GCP instance and disk naming.
Example:
{
    ...
    "instances": {
        "instanceToken": {
            ...
            "disks": {
                "diskToken": {
                    ...
                },
                ...
            }
        },
        ...
    }
}
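The nesting above can be walked with a few lines of code. The following is a minimal Python sketch, not part of GSnapUp itself: the `list_disks` helper and the embedded configuration text are hypothetical and exist only to illustrate how instance tokens and disk tokens relate to each other.

```python
import json

# Hypothetical configuration that follows the layout described above.
CONFIG_TEXT = """
{
    "enabled": true,
    "instances": {
        "customerA": {
            "instanceName": "cust01-lg49js3g",
            "zone": "us-east3-b",
            "disks": {
                "os": {"deviceName": "vol-80868051015--dev-sda-281ff84e"},
                "data": {"deviceName": "vol-80868051015--dev-sdc-c39caab0"}
            }
        }
    }
}
"""


def list_disks(config):
    """Return (instance token, disk token, device name) for every configured disk."""
    rows = []
    for instance_token, instance in config.get("instances", {}).items():
        for disk_token, disk in instance.get("disks", {}).items():
            rows.append((instance_token, disk_token, disk.get("deviceName")))
    return rows


config = json.loads(CONFIG_TEXT)
for row in list_disks(config):
    print(row)
```

The tokens (`customerA`, `os`, `data`) are the dictionary keys, while the GCP names live inside each sub-section.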
Most of the non-section keys can be used in any of the sections (providing a
cascading-style configuration), which allows for fine-grained control of the
utility. The values set in a disk sub-section override the values set in its
parent instance sub-section, which in turn override the values set in the main
section. This style of configuration provides the most flexible implementation
and allows for a variety of use cases.
The simplest example is the enabled key. If the main section's enabled value
is set to 'true', an entire instance may be disabled by setting that
instance's enabled value to 'false', or a single disk may be disabled by
setting that disk's enabled value to 'false'. Additionally, if the main
section's enabled is set to 'true' and an instance with several disks has its
enabled set to 'false', any of its disk sub-sections could have their enabled
set to 'true' and that disk would be considered enabled.
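The cascading lookup described above can be sketched in Python as follows. This is an illustration of the override order only, not the utility's actual code; the `resolve` function and the sample dictionaries are hypothetical.

```python
def resolve(key, main, instance, disk, default=None):
    """Return the most specific value for key: disk first, then instance, then main."""
    for section in (disk, instance, main):
        if key in section:
            return section[key]
    return default


# Main enables everything, the instance disables itself,
# and one of its disks switches itself back on.
main = {"enabled": True, "cron": "0 5 * * *"}
instance = {"enabled": False}
disk_a = {}
disk_b = {"enabled": True}

print(resolve("enabled", main, instance, disk_a))  # False (inherited from the instance)
print(resolve("enabled", main, instance, disk_b))  # True (set on the disk itself)
print(resolve("cron", main, instance, disk_b))     # 0 5 * * * (inherited from main)
```

Checking for the key's presence rather than its truthiness matters here: a value of `false` is a deliberate setting and must not fall through to the parent section.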
main
Sections | Main | Instances | Disks |
---|---|---|---|
Used in | No | No | No |
This is the root section of the configuration; all other sections and keys reside inside it.
instances
Sections | Main | Instances | Disks |
---|---|---|---|
Used in | Yes | No | No |
This section exists directly inside the main section and may contain one or
more instance sub-sections containing settings representing a GCP instance.
disks
Sections | Main | Instances | Disks |
---|---|---|---|
Used in | No | Yes | No |
This section exists directly inside an instance sub-section and may contain
one or more disk sub-sections containing settings representing a GCP disk.
The keys that aren't specifically for sections or sub-sections (as stated previously) can be used in any section, unless stated otherwise. Each key is detailed further below; for quick reference, the following table lists the key names and the sections they are used in:
Key | Main | Instances | Disks |
---|---|---|---|
cron | Yes | Yes | Yes |
datePattern | Yes | Yes | Yes |
deviceName | No | No | Yes |
enabled | Yes | Yes | Yes |
instanceName | No | Yes | No |
snapshotPattern | Yes | Yes | Yes |
timePattern | Yes | Yes | Yes |
timezone | Yes | Yes | Yes |
zone | No | Yes | No |
The keys used in the configuration file provide the flexibility of the utility. Below are the specifics on what they are used for and how to set them.
cron
Sections | Main | Instances | Disks |
---|---|---|---|
Used in | Yes | Yes | Yes |
Provide a cron expression for the scheduled command to apply when determining
whether a disk should be snapshotted. The expression consists of
space-separated values representing minute, hour, day of month, month, and
day of week, as follows:
┌────────── minute (0 - 59)
│ ┌──────── hour (0 - 23)
│ │ ┌────── day of month (1 - 31)
│ │ │ ┌──── month (1 - 12)
│ │ │ │ ┌── day of week (0 - 6) (Sunday to Saturday)
* * * * *
Asterisks are used to represent "any" value, dashes (ex: 1-4) are used to define ranges, commas (ex: 3,6,9) are used to separate items in a list, and slashes (ex: */3) are used to define steps. More information on using CRON can be found on its Wiki page.
Example:
"cron": "0 */6 * * *"
datePattern
Sections | Main | Instances | Disks |
---|---|---|---|
Used in | Yes | Yes | Yes |
The format of the date pattern to use in the snapshotPattern as %date%. A
full list of characters to use can be found in the PHP date function
documentation.
Example:
"datePattern": "Y-m-d"
deviceName
Sections | Main | Instances | Disks |
---|---|---|---|
Used in | No | No | Yes |
Provides the name of the disk as it is specified on GCP.
Example:
"deviceName": "vol-60108108585--dev-sda-f8482e1f"
enabled
Sections | Main | Instances | Disks |
---|---|---|---|
Used in | Yes | Yes | Yes |
Sets whether disk snapshots are taken when calling the scheduled command.
Example:
"enabled": false
instanceName
Sections | Main | Instances | Disks |
---|---|---|---|
Used in | No | Yes | No |
Provides the name of the instance as it is specified on GCP.
Example:
"instanceName": "cust01-lg49js3g"
snapshotPattern
Sections | Main | Instances | Disks |
---|---|---|---|
Used in | Yes | Yes | Yes |
Pattern to use when naming snapshots. This value may consist of arbitrary text and one or more of the following placeholders:
Placeholder | Key Value |
---|---|
%vm% | instanceToken |
%disk% | diskToken |
%date% | datePattern |
%time% | timePattern |
Each placeholder is replaced with the value of its associated key, with the
exception of the %date% and %time% placeholders, which are processed first.
Example:
"snapshotPattern": "%vm%-%disk%-%date%-%time%"
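The placeholder expansion can be sketched in Python as follows. This is only an illustration under stated assumptions: GSnapUp itself relies on PHP's date function, so the `PHP_TO_STRFTIME` table below is a hypothetical stand-in covering just the characters used in this document's examples, and `php_date` and `snapshot_name` are made-up helper names.

```python
from datetime import datetime

# Small, hand-picked subset of PHP date() characters mapped to strftime directives.
PHP_TO_STRFTIME = {"Y": "%Y", "m": "%m", "d": "%d", "H": "%H", "i": "%M", "s": "%S"}


def php_date(pattern, moment):
    """Format moment using only the PHP date() characters mapped above."""
    directives = (PHP_TO_STRFTIME.get(ch, ch.replace("%", "%%")) for ch in pattern)
    return moment.strftime("".join(directives))


def snapshot_name(pattern, instance_token, disk_token, date_pattern, time_pattern, moment):
    """Expand the placeholders of a snapshotPattern into a snapshot name."""
    # The date and time placeholders are substituted first, then the tokens.
    name = pattern.replace("%date%", php_date(date_pattern, moment))
    name = name.replace("%time%", php_date(time_pattern, moment))
    return name.replace("%vm%", instance_token).replace("%disk%", disk_token)


moment = datetime(2024, 1, 15, 6, 0, 0)
print(snapshot_name("%vm%-%disk%-%date%-%time%", "customerA", "data",
                    "Y-m-d", "H-i-s", moment))
# customerA-data-2024-01-15-06-00-00
```

Including both the date and the time in the pattern keeps names unique when a disk is snapshotted more than once per day.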
timePattern
Sections | Main | Instances | Disks |
---|---|---|---|
Used in | Yes | Yes | Yes |
The format of the time pattern to use in the snapshotPattern as %time%. A
full list of characters to use can be found in the PHP date function
documentation.
Example:
"timePattern": "H-i-s"
timezone
Sections | Main | Instances | Disks |
---|---|---|---|
Used in | Yes | Yes | Yes |
The timezone the instance should be considered to be in when running the
scheduled command. The value is used when comparing the current time against
the cron expression for each disk. A list of valid values can be found in the
PHP valid timezones documentation.
Example:
"timezone": "America\/Los_Angeles"
zone
Sections | Main | Instances | Disks |
---|---|---|---|
Used in | No | Yes | No |
Provides the zone of the instance as it is specified on GCP.
Example:
"zone": "us-east3-a"
Example of a complete configuration:
{
    "enabled": true,
    "timezone": "America\/New_York",
    "cron": "0 5 * * *",
    "datePattern": "Y-m-d",
    "timePattern": "H-i-s",
    "snapshotPattern": "%vm%-%disk%-%date%",
    "instances": {
        "internalA": {
            "instanceName": "int01-f3j9kn41",
            "zone": "us-east3-a",
            "disks": {
                "os": {
                    "deviceName": "vol-60108108585--dev-sda-f8482e1f"
                }
            }
        },
        "customerA": {
            "instanceName": "cust01-lg49js3g",
            "zone": "us-east3-b",
            "snapshotPattern": "%vm%-%disk%-disk-%date%-%time%",
            "disks": {
                "os": {
                    "enabled": false,
                    "deviceName": "vol-80868051015--dev-sda-281ff84e",
                    "cron": "30 1 * * *"
                },
                "data": {
                    "deviceName": "vol-80868051015--dev-sdc-c39caab0",
                    "cron": "0 6,18 * * *"
                }
            }
        },
        "customerB": {
            "instanceName": "cust02-0b527a33",
            "zone": "us-west1-d",
            "enabled": false,
            "timezone": "America\/Los_Angeles",
            "cron": "0 *\/4 * * *",
            "disks": {
                "os": {
                    "deviceName": "vol-91927545157--dev-sda-5e423448",
                    "cron": "0 4,16 * * *",
                    "snapshotPattern": "%vm%-%disk%-disk-%date%-%time%"
                },
                "data1": {
                    "deviceName": "vol-91927545157--dev-sdb-8f824fe1",
                    "snapshotPattern": "%vm%-%disk%-disk-%date%-%time%"
                },
                "data2": {
                    "enabled": true,
                    "deviceName": "vol-91927545157--dev-sdc-e2926e47",
                    "snapshotPattern": "%vm%-%disk%-disk-%date%-%time%"
                }
            }
        }
    }
}
Scripting gcloud commands
Google Cloud Platform gcloud reference