Commit
Yongqiang YANG committed Dec 25, 2024
1 parent 4e3e7f4, commit f779cc6
Showing 11 changed files with 995 additions and 3 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,99 @@ | ||
--- | ||
{ | ||
"title": "Backup", | ||
"language": "en" | ||
} | ||
--- | ||
|
||
<!-- | ||
Licensed to the Apache Software Foundation (ASF) under one | ||
or more contributor license agreements. See the NOTICE file | ||
distributed with this work for additional information | ||
regarding copyright ownership. The ASF licenses this file | ||
to you under the Apache License, Version 2.0 (the | ||
"License"); you may not use this file except in compliance | ||
with the License. You may obtain a copy of the License at | ||
http://www.apache.org/licenses/LICENSE-2.0 | ||
Unless required by applicable law or agreed to in writing, | ||
software distributed under the License is distributed on an | ||
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
KIND, either express or implied. See the License for the | ||
specific language governing permissions and limitations | ||
under the License. | ||
--> | ||
|
||
Doris can back up current data as files to a remote storage system. You can later restore that data from the remote storage system to any Doris cluster with the restore command. This makes periodic snapshot backups possible, and the same mechanism can be used to migrate data between clusters. | ||
|
||
This feature requires Doris version 0.8.2 or later. | ||
|
||
## Permission Requirements | ||
|
||
1. Currently, only users with ADMIN privileges can perform backup and restore operations. | ||
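As a sketch of how an account could be given the required privilege, the statements below grant `ADMIN_PRIV` and then list the account's privileges. The user name `backup_admin` is hypothetical, and the `*.*.*` privilege level assumes a Doris version with multi-catalog support; older versions may expect `*.*` instead.

```sql
-- Hypothetical account name; ADMIN_PRIV is granted at the global level.
GRANT ADMIN_PRIV ON *.*.* TO 'backup_admin'@'%';

-- Check which privileges the account now holds.
SHOW GRANTS FOR 'backup_admin'@'%';
```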
|
||
|
||
## 1. Create a repository | ||
|
||
You can create a repository by following [Preparing Backup Storage](create-repository.md). | ||
|
||
## 2. Back up tables or a database | ||
|
||
### Option 1: Back up the table example_tbl in example_db | ||
|
||
```sql | ||
BACKUP SNAPSHOT example_db.snapshot_label1 | ||
TO example_repo | ||
ON (example_tbl) | ||
PROPERTIES ("type" = "full"); | ||
``` | ||
|
||
### Option 2: Back up partitions p1 and p2 of the table example_tbl, plus the whole table example_tbl2, in example_db | ||
|
||
```sql | ||
BACKUP SNAPSHOT example_db.snapshot_label2 | ||
TO example_repo | ||
ON | ||
( | ||
example_tbl PARTITION (p1,p2), | ||
example_tbl2 | ||
); | ||
``` | ||
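Two further variants may be useful, sketched here under the assumption that your Doris version accepts a `BACKUP` statement without an `ON` clause and with an `EXCLUDE` clause. The labels `snapshot_label3` and `snapshot_label4` are illustrative names.

```sql
-- Back up every table in example_db by omitting the ON clause.
BACKUP SNAPSHOT example_db.snapshot_label3
TO example_repo;

-- Back up example_db except for example_tbl.
BACKUP SNAPSHOT example_db.snapshot_label4
TO example_repo
EXCLUDE (example_tbl);
```

Submit these one at a time and wait for each job to finish, since a database is expected to run only one backup or restore job at once.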
|
||
## 3. View the execution of the most recent backup job | ||
|
||
```sql | ||
mysql> SHOW BACKUP\G | ||
*************************** 1. row *************************** | ||
JobId: 17891847 | ||
SnapshotName: snapshot_label1 | ||
DbName: example_db | ||
State: FINISHED | ||
BackupObjs: [default_cluster:example_db.example_tbl] | ||
CreateTime: 2022-04-08 15:52:29 | ||
SnapshotFinishedTime: 2022-04-08 15:52:32 | ||
UploadFinishedTime: 2022-04-08 15:52:38 | ||
FinishedTime: 2022-04-08 15:52:44 | ||
UnfinishedTasks: | ||
Progress: | ||
TaskErrMsg: | ||
Status: [OK] | ||
Timeout: 86400 | ||
1 row in set (0.01 sec) | ||
``` | ||
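If you are not connected to the database that owns the job, or a job needs to be stopped, the following statements are a minimal sketch. They reuse `example_db` from the examples above.

```sql
-- Show the most recent backup job of a specific database.
SHOW BACKUP FROM example_db;

-- Cancel the backup job that is currently running in example_db.
CANCEL BACKUP FROM example_db;
```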
|
||
## 4. View existing backups in remote repositories | ||
|
||
```sql | ||
mysql> SHOW SNAPSHOT ON example_repo WHERE SNAPSHOT = "snapshot_label1"; | ||
+-----------------+---------------------+--------+ | ||
| Snapshot | Timestamp | Status | | ||
+-----------------+---------------------+--------+ | ||
| snapshot_label1 | 2022-04-08-15-52-29 | OK | | ||
+-----------------+---------------------+--------+ | ||
1 row in set (0.15 sec) | ||
``` | ||
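To inspect a single backup in more detail, the timestamp shown above can be added to the filter. This is a sketch that reuses the label and timestamp from the sample output; substitute the values returned by your own repository.

```sql
-- Narrow the listing to one backup by label and timestamp.
SHOW SNAPSHOT ON example_repo
WHERE SNAPSHOT = "snapshot_label1" AND TIMESTAMP = "2022-04-08-15-52-29";
```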
|
||
## More Help | ||
|
||
For more detailed syntax and best practices used by BACKUP, please refer to the [BACKUP](../../sql-manual/sql-statements/data-modification/backup-and-restore/BACKUP.md) command manual. |
docs/admin-manual/data-admin/backup-restore/create-repository.md (167 changes: 167 additions & 0 deletions)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,167 @@ | ||
--- | ||
{ | ||
"title": "Preparing Backup Storage", | ||
"language": "en" | ||
} | ||
--- | ||
|
||
<!-- | ||
Licensed to the Apache Software Foundation (ASF) under one | ||
or more contributor license agreements. See the NOTICE file | ||
distributed with this work for additional information | ||
regarding copyright ownership. The ASF licenses this file | ||
to you under the Apache License, Version 2.0 (the | ||
"License"); you may not use this file except in compliance | ||
with the License. You may obtain a copy of the License at | ||
http://www.apache.org/licenses/LICENSE-2.0 | ||
Unless required by applicable law or agreed to in writing, | ||
software distributed under the License is distributed on an | ||
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
KIND, either express or implied. See the License for the | ||
specific language governing permissions and limitations | ||
under the License. | ||
--> | ||
|
||
## Overview | ||
|
||
In Doris, a **repository** is a remote storage location used for backing up and restoring data. Repositories support various storage systems, including **S3**, **Azure**, **GCP**, **OSS**, **COS**, **MinIO**, **HDFS**, and other S3-compatible storage. This guide walks through creating a repository for backup and restore operations in Doris. | ||
|
||
## Permission Requirements | ||
|
||
- Only users with **ADMIN** privileges are allowed to create repositories for backup and restore operations. | ||
|
||
## Supported Storage Systems | ||
|
||
- **S3** | ||
- **Azure** | ||
- **GCP** | ||
- **OSS** | ||
- **COS** | ||
- **MinIO** | ||
- **HDFS** | ||
- Other S3-compatible storage systems | ||
|
||
## Creating a Repository for S3 | ||
|
||
<!-- | ||
suites/backup_restore/test_create_and_drop_repository.groovy | ||
--> | ||
|
||
To create a repository for S3 storage, use the following SQL command: | ||
|
||
```sql | ||
CREATE REPOSITORY `s3_repo` | ||
WITH S3 | ||
ON LOCATION "s3://bucket_name/s3_repo" | ||
PROPERTIES | ||
( | ||
"s3.endpoint" = "s3.us-east-1.amazonaws.com", | ||
"s3.region" = "us-east-1", | ||
"s3.access_key" = "ak", | ||
"s3.secret_key" = "sk" | ||
); | ||
``` | ||
|
||
- Replace bucket_name with the name of your S3 bucket. | ||
- Provide the appropriate endpoint, access key, secret key, and region for your S3 setup. | ||
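After creating a repository, it can help to confirm that Doris registered it. The sketch below lists the repositories and shows how one could be removed again; it reuses the `s3_repo` name from the example above.

```sql
-- List all repositories known to this cluster.
SHOW REPOSITORIES;

-- Remove the repository definition when it is no longer needed.
-- This drops only the mapping in Doris; files already stored
-- at the remote location are not deleted.
DROP REPOSITORY `s3_repo`;
```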
|
||
## Creating a Repository for Azure | ||
|
||
To create a repository for Azure storage, use the following SQL command: | ||
|
||
```sql | ||
CREATE REPOSITORY `azure_repo` | ||
WITH S3 | ||
ON LOCATION "s3://bucket_name/azure_repo" | ||
PROPERTIES | ||
( | ||
"s3.endpoint" = "selectdbcloudtestwestus3.blob.core.windows.net", | ||
"s3.region" = "dummy_region", | ||
"s3.access_key" = "ak", | ||
"s3.secret_key" = "sk", | ||
"provider" = "AZURE" | ||
); | ||
``` | ||
|
||
- Replace bucket_name with the name of your Azure container. | ||
- Provide your Azure storage account and key for authentication. The storage account name also forms the first part of the endpoint (`<account>.blob.core.windows.net`). | ||
|
||
## Creating a Repository for GCP | ||
|
||
To create a repository for Google Cloud Platform (GCP) storage, use the following SQL command: | ||
|
||
```sql | ||
CREATE REPOSITORY `gcp_repo` | ||
WITH S3 | ||
ON LOCATION "s3://bucket_name/backup/gcp_repo" | ||
PROPERTIES | ||
( | ||
"s3.endpoint" = "storage.googleapis.com", | ||
"s3.region" = "US-WEST2", | ||
"s3.access_key" = "ak", | ||
"s3.secret_key" = "sk" | ||
); | ||
``` | ||
|
||
- Replace bucket_name with the name of your GCP bucket. | ||
- Provide your GCP endpoint, region, access key, and secret key. | ||
|
||
## Creating a Repository for OSS (Alibaba Cloud Object Storage Service) | ||
|
||
To create a repository for OSS, use the following SQL command: | ||
|
||
```sql | ||
CREATE REPOSITORY `oss_repo` | ||
WITH S3 | ||
ON LOCATION "s3://bucket_name/oss_repo" | ||
PROPERTIES | ||
( | ||
"s3.endpoint" = "oss.aliyuncs.com", | ||
"s3.region" = "cn-hangzhou", | ||
"s3.access_key" = "ak", | ||
"s3.secret_key" = "sk" | ||
); | ||
``` | ||
- Replace bucket_name with the name of your OSS bucket. | ||
- Provide your OSS access key, secret key, and endpoint. | ||
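COS is listed among the supported storage systems but has no example of its own. A minimal sketch, assuming it follows the same S3-compatible pattern as the OSS example; the repository name, endpoint, and region are illustrative values for a bucket in the `ap-guangzhou` region and should be replaced with your own.

```sql
-- Assumed values: repository name, endpoint, and region are examples only.
CREATE REPOSITORY `cos_repo`
WITH S3
ON LOCATION "s3://bucket_name/cos_repo"
PROPERTIES
(
    "s3.endpoint" = "cos.ap-guangzhou.myqcloud.com",
    "s3.region" = "ap-guangzhou",
    "s3.access_key" = "ak",
    "s3.secret_key" = "sk"
);
```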
|
||
## Creating a Repository for MinIO | ||
|
||
To create a repository for MinIO storage, use the following SQL command: | ||
|
||
```sql | ||
CREATE REPOSITORY `minio_repo` | ||
WITH S3 | ||
ON LOCATION "s3://bucket_name/minio_repo" | ||
PROPERTIES | ||
( | ||
"s3.endpoint" = "yourminio.com", | ||
"s3.region" = "dummy-region", | ||
"s3.access_key" = "ak", | ||
"s3.secret_key" = "sk", | ||
"use_path_style" = "true" | ||
); | ||
``` | ||
|
||
- Replace bucket_name with the name of your MinIO bucket. | ||
- Provide your MinIO access key, secret key, and endpoint. | ||
|
||
## Creating a Repository for HDFS | ||
|
||
To create a repository for HDFS storage, use the following SQL command: | ||
|
||
```sql | ||
CREATE REPOSITORY `hdfs_repo` | ||
WITH hdfs | ||
ON LOCATION "/prefix_path/hdfs_repo" | ||
PROPERTIES | ||
( | ||
"fs.defaultFS" = "hdfs://127.0.0.1:9000", | ||
"hadoop.username" = "doris-test" | ||
); | ||
``` | ||
|
||
- Replace prefix_path with the actual path on HDFS. | ||
- Provide your HDFS NameNode address (`fs.defaultFS`) and user name (`hadoop.username`). | ||
|
||
For more detailed usage instructions and examples, refer to the [CREATE REPOSITORY](../../sql-manual/sql-statements/data-modification/backup-and-restore/CREATE-REPOSITORY) documentation. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,90 @@ | ||
--- | ||
{ | ||
"title": "Backup and Restore Overview", | ||
"language": "en" | ||
} | ||
--- | ||
|
||
<!-- | ||
Licensed to the Apache Software Foundation (ASF) under one | ||
or more contributor license agreements. See the NOTICE file | ||
distributed with this work for additional information | ||
regarding copyright ownership. The ASF licenses this file | ||
to you under the Apache License, Version 2.0 (the | ||
"License"); you may not use this file except in compliance | ||
with the License. You may obtain a copy of the License at | ||
http://www.apache.org/licenses/LICENSE-2.0 | ||
Unless required by applicable law or agreed to in writing, | ||
software distributed under the License is distributed on an | ||
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
KIND, either express or implied. See the License for the | ||
specific language governing permissions and limitations | ||
under the License. | ||
--> | ||
|
||
## Introduction | ||
|
||
Apache Doris provides robust support for backup and restore operations. These features enable users to back up data from tables or entire databases to remote storage systems and restore it as needed. The system supports snapshot-based backups, which capture the state of the data at a particular point in time, and these snapshots can be stored in remote repositories like HDFS, S3, and MinIO. | ||
|
||
Backup and restore operations are crucial for disaster recovery, data migration between clusters, and ensuring data integrity over time. | ||
|
||
## Requirements | ||
|
||
- **ADMIN Privileges**: Only users with **ADMIN** privileges are authorized to perform backup and restore operations. This ensures secure handling of sensitive data and prevents unauthorized access to backup processes. | ||
|
||
- Doris version 0.8.2 or higher. | ||
|
||
## Key Concepts | ||
|
||
1. **Snapshot**: | ||
A snapshot is a point-in-time capture of the data in a table or partition. It is an efficient operation, as it only creates hard links to the existing data files. | ||
|
||
2. **Repository**: | ||
A remote storage location where the backup files are stored. Supported repositories include HDFS, S3, MinIO and other object storages. | ||
|
||
3. **Backup Operation**: | ||
A backup operation involves creating a snapshot of a table or partition, uploading the snapshot files to a remote repository, and storing the metadata related to the backup. | ||
|
||
4. **Restore Operation**: | ||
A restore operation involves downloading the backup from a remote repository and restoring it to a Doris cluster. | ||
|
||
## Key Features | ||
|
||
1. **Backup Data**: | ||
Doris allows you to back up data from a table, partition, or an entire database by creating snapshots. The data is backed up in file format and stored on remote storage systems like HDFS, S3, or other compatible systems via the broker process. | ||
|
||
2. **Restore Data**: | ||
You can restore the backup data from a remote repository to any Doris cluster. This includes full database restores, full table restores, and partition-level restores, allowing for flexibility in recovering data. | ||
|
||
3. **Snapshot Management**: | ||
Data is backed up in the form of snapshots. These snapshots are uploaded to remote storage systems and can be later restored as needed. The restore process involves downloading snapshot files and mapping them to local metadata to make them effective. | ||
|
||
4. **Data Migration**: | ||
In addition to backup and restore, this functionality enables data migration between different Doris clusters. You can back up data to a remote storage system and restore it to another Doris cluster, helping in cluster migration scenarios. | ||
|
||
5. **Replication Control**: | ||
When restoring data, you can specify the number of replicas for the restored data to ensure redundancy and fault tolerance. | ||
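The restore, migration, and replication features above can be sketched together as follows. This is an illustrative sequence run on the target cluster, not a definitive procedure: the repository name, bucket, credentials, database, table, label, and timestamp are all placeholder values, and the timestamp must be the one reported by `SHOW SNAPSHOT` for your backup.

```sql
-- Register the same remote location on the target cluster.
-- READ ONLY keeps this cluster from writing new backups into it.
CREATE READ ONLY REPOSITORY `example_repo`
WITH S3
ON LOCATION "s3://bucket_name/s3_repo"
PROPERTIES
(
    "s3.endpoint" = "s3.us-east-1.amazonaws.com",
    "s3.region" = "us-east-1",
    "s3.access_key" = "ak",
    "s3.secret_key" = "sk"
);

-- Find the label and timestamp of the backup to restore.
SHOW SNAPSHOT ON example_repo;

-- Restore one table and set the replica count of the restored data.
RESTORE SNAPSHOT example_db.snapshot_label1
FROM example_repo
ON (example_tbl)
PROPERTIES
(
    "backup_timestamp" = "2022-04-08-15-52-29",
    "replication_num" = "1"
);

-- Follow the progress of the restore job.
SHOW RESTORE FROM example_db;
```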
|
||
## Unsupported Features | ||
|
||
While Doris provides powerful backup and restore capabilities, there are some limitations and unsupported features in certain scenarios: | ||
|
||
1. **Async Materialized View (MTMV) Not Supported**: | ||
Doris currently does not support backing up or restoring tables that are associated with **Async Materialized Views (MTMV)**. If such views are involved, the backup or restore operations may not work as expected, and users may encounter issues related to consistency or data integrity during the process. | ||
|
||
2. **Tables with Storage Policy Not Supported**: | ||
Tables that have a **storage policy** defined (e.g., tables configured with custom storage settings) are **not supported** for backup and restore operations. These tables may encounter issues during backup or restore, as their storage configurations may conflict with the snapshot process. | ||
|
||
3. **Incremental Backup**: | ||
At present, Doris only supports full backups. Incremental backups (where only the changed data since the last backup is stored) are not yet supported, although this may be included in future versions. | ||
|
||
4. **Colocate With Property**: | ||
During a backup or restore operation, Doris will not preserve the `colocate_with` property of tables. This may require reconfiguring the colocated tables after restoring them. | ||
|
||
5. **Dynamic Partition Support**: | ||
While dynamic partitioning is supported in Doris, the dynamic partition attribute will be disabled during backup. When restoring data, this attribute needs to be manually enabled using the `ALTER TABLE` command. | ||
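For the last two items, the properties can be set again after a restore. A minimal sketch, assuming a restored table `example_db.example_tbl` and a colocation group named `example_group`; both names are hypothetical.

```sql
-- Re-enable dynamic partitioning on the restored table.
ALTER TABLE example_db.example_tbl SET ("dynamic_partition.enable" = "true");

-- Put the restored table back into its colocation group.
ALTER TABLE example_db.example_tbl SET ("colocate_with" = "example_group");
```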
|
||
For detailed usage instructions, please refer to the backup and restore user guides. | ||
|