Commit d4476fa

docs: add backup concept, guide, and configuration (#98)
1 parent d1001c8 commit d4476fa

4 files changed

+76
-1
lines changed


docs/docs/concepts/overview.md

Lines changed: 7 additions & 1 deletion
@@ -457,7 +457,13 @@ After passing the validation, Replay will clear task instances of the requested
457457
up and run. Replay will frequently check the status of each task from the scheduler (every 5 minutes) to track if
458458
each task is still in progress, failed, or succeeded.
459459
460-
Optimus also provides Replay Dry Run to simulate all the impacted tasks without actually re-running the tasks.
460+
Optimus also provides Backup, which duplicates a resource so that a copy is available before running Replay. Users specify
461+
the datastore and the resource to back up, and can choose to also back up the downstream resources
462+
within the same project. The location of the backup result and its expiry can be set in the project
463+
configuration.
464+
465+
Both Replay and Backup offer a Dry Run that lists all the impacted tasks or resources without actually re-running
466+
the tasks or backing up the resources.
461467
462468
## Monitoring & Alerting
463469

docs/docs/getting-started/configuration.md

Lines changed: 2 additions & 0 deletions
@@ -19,6 +19,8 @@ datastore:
1919
type: bigquery
2020
# path where resource spec for BQ are stored
2121
path: "bq"
22+
# backup configurations of a datastore
23+
backup: {}
2224

2325
# project variables usable in specifications
2426
config:

docs/docs/guides/backup.md

Lines changed: 66 additions & 0 deletions
@@ -0,0 +1,66 @@
1+
---
2+
id: backup
3+
title: Backup Resources
4+
---
5+
6+
Backup is a common prerequisite step before re-running or modifying a resource. Currently, Optimus supports
7+
backup for BigQuery tables and provides dependency resolution, so downstream tables can also be backed up
8+
as long as they are registered in Optimus and belong to the same project.
9+
10+
## Configuring backup details
11+
12+
Several settings control how the backup result is stored in your project. Here are the
13+
available configurations for the BigQuery datastore.
14+
15+
Configuration key | Description | Default |
16+
------------------|------------------------------------------|----------------|
17+
ttl | Time to live of the backup result, as a duration | 720h |
18+
prefix | Prefix of the result table name | backup |
19+
dataset | Where the table result should be located | optimus_backup |
20+
21+
These values can be set in the project [configuration](../getting-started/configuration.md).
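
As a rough sketch, the three keys from the table could be placed under the `backup` entry of the BigQuery datastore in the project configuration. The values below are the documented defaults. The nesting and the list form of `datastore` are assumptions, since the configuration excerpt in this change does not preserve indentation:

```yaml
datastore:
  # assumed layout: one entry per datastore
  - type: bigquery
    # path where resource spec for BQ are stored
    path: "bq"
    # backup configurations of a datastore
    backup:
      # how long a backup result is kept before it expires (default shown)
      ttl: 720h
      # prefix of the result table name (default shown)
      prefix: backup
      # dataset where the result table is located (default shown)
      dataset: optimus_backup
```

Leaving `backup` as `{}` should keep the defaults listed in the table.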
22+
23+
24+
## Run a backup
25+
26+
To start a backup, run the following command:
27+
28+
```shell
29+
$ optimus backup resource --project sample-project --namespace sample-namespace
30+
```
31+
32+
After you run the command, Optimus prompts you with a few questions to answer.
33+
34+
```
35+
$ optimus backup resource --project sample-project --namespace sample-namespace
36+
? Select supported datastore? bigquery
37+
? Why is this backup needed? backfill due to business logic change
38+
? Backup downstream? Yes
39+
```
40+
41+
You will be shown a list of resources that will be backed up, including the downstream resources (if you chose to do so).
42+
If the list is as expected, confirm to proceed and wait until the backup is finished.
43+
44+
Once the backup is finished, the list of backup results and their locations will be shown.
45+
46+
47+
## Get list of backups
48+
49+
Recent backups of a project can be listed using this subcommand:
50+
51+
```shell
52+
$ optimus backup list --project sample-project
53+
```
54+
55+
For each recent backup, the backup ID, the resource, the creation time, and the description or purpose of the backup are
56+
shown. The backup ID is used as a postfix in the backup result name, so you can find those results in the datastore
57+
(for example BigQuery) using the backup ID. Keep in mind, however, that these backup results have an expiry time set.
58+
59+
## Run a backup dry run
60+
61+
A dry run is also available to list all the resources that would be backed up without actually backing them up. Example of dry
62+
run usage:
63+
64+
```shell
65+
$ optimus backup resource --project sample-project --namespace sample-namespace --dry-run
66+
```

docs/sidebars.js

Lines changed: 1 addition & 0 deletions
@@ -37,6 +37,7 @@ module.exports = {
3737
"guides/organising-specifications",
3838
"guides/optimus-serve",
3939
"guides/task-bq2bq",
40+
"guides/backup",
4041
"guides/replay"
4142
],
4243
},
