Merge branch 'main' into data-mover-ms-smoking-test
Lyndon-Li committed Aug 13, 2024
2 parents 4dea3a4 + 07c03a8 commit 3c0948c
Showing 33 changed files with 1,269 additions and 609 deletions.
1 change: 1 addition & 0 deletions changelogs/unreleased/8085-Lyndon-Li
@@ -0,0 +1 @@
According to design #7576, after node-agent restarts, if a DU/DD is in InProgress status, re-capture the data mover ms pod and continue the execution
1 change: 1 addition & 0 deletions changelogs/unreleased/8093-Lyndon-Li
@@ -0,0 +1 @@
Fix issue #7620, add backup repository configuration implementation and support cacheLimit configuration for Kopia repo
1 change: 1 addition & 0 deletions changelogs/unreleased/8096-Lyndon-Li
@@ -0,0 +1 @@
Fix issue #8072, add the warning messages for restic deprecation
7 changes: 7 additions & 0 deletions config/crd/v1/bases/velero.io_backuprepositories.yaml
@@ -54,6 +54,13 @@ spec:
description: MaintenanceFrequency is how often maintenance should
be run.
type: string
repositoryConfig:
additionalProperties:
type: string
description: RepositoryConfig is for repository-specific configuration
fields.
nullable: true
type: object
repositoryType:
description: RepositoryType indicates the type of the backend repository
enum:
2 changes: 1 addition & 1 deletion config/crd/v1/crds/crds.go

Large diffs are not rendered by default.

53 changes: 10 additions & 43 deletions design/backup-repo-config.md
@@ -86,68 +86,35 @@ For any reason, if the configMap doesn't effect, nothing is specified to the bac
The BackupRepository configMap supports backup repository type specific configurations, even though users can only specify one configMap.
So in the configMap struct, multiple entries are supported, indexed by the backup repository type. During the backup repository creation, the configMap is searched by the repository type.

Below are the structs for the configMap:
``` golang
type RepoConfig struct {
CacheLimitMB int `json:"cacheLimitMB,omitempty"`
EnableCompression int `json:"enableCompression,omitempty"`
}

type RepoConfigs struct {
Configs map[string]RepoConfig `json:"configs"`
}
```

### Configurations

With the above mechanisms, any kind of configuration could be added. The configurations defined at present are listed below:
```cacheLimitMB```: specifies the size limit (in MB) for the local data cache. The more data is cached locally, the less data needs to be downloaded from the backup storage, so performance may improve. In practice, users can specify any size that is smaller than the free disk space so that the disk does not run out of space. This parameter applies to each repository connection, that is, users can change it before connecting to the repository. If a backup repository doesn't use a local cache, this parameter is ignored. The Kopia repository supports this parameter.
```enableCompression```: enables or disables compression for a backup repository. Most backup repositories support data compression; if a backup repository does not support it, this parameter is ignored. Most backup repositories also support enabling or disabling compression dynamically, so this parameter is applied whenever a write connection to the backup repository is created; if dynamic changes are not supported, this parameter is honored only when the backup repository is initialized. The Kopia repository supports this parameter and allows it to be modified dynamically.

### Sample
Below is an example of the BackupRepository configMap with the configurations:
json format:
```json
{
"configs": {
"repo-type-1": {
"cacheLimitMB": 2048,
"enableCompression": true
},
"repo-type-2": {
"cacheLimitMB": 1024,
"enableCompression": false
}
}
}
```
yaml format:
Below is an example of the BackupRepository configMap with the configurations:
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: <config-name>
namespace: velero
data:
configs: |
<repository-type-1>: |
{
"repo-type-1": {
"cacheLimitMB": 2048,
"enableCompression": true
},
"repo-type-2": {
"cacheLimitMB": 1024,
"enableCompression": false
}
}
"cacheLimitMB": 2048,
"enableCompression": true
}
<repository-type-2>: |
{
"cacheLimitMB": 1,
"enableCompression": false
}
```

To create the configMap, save something like the above sample to a file and then run one of the commands below:
```
kubectl create cm <config-name> -n velero --from-file=<json file name>
```
Or
```
kubectl apply -f <yaml file name>
```
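
As a rough illustration of how a configMap laid out per repository type might be consumed, the sketch below looks up one entry by repository type and decodes its JSON value. It assumes each key under `data` is a repository type whose value is a JSON object with the `cacheLimitMB` and `enableCompression` keys described above. The `repoConfig` type and the `lookupRepoConfig` function are hypothetical names for this example only and are not taken from the Velero code base.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// repoConfig is a hypothetical struct mirroring the two documented keys.
type repoConfig struct {
	CacheLimitMB      int  `json:"cacheLimitMB,omitempty"`
	EnableCompression bool `json:"enableCompression,omitempty"`
}

// lookupRepoConfig (hypothetical helper) searches the configMap data by
// repository type and decodes the matching JSON value.
// The bool result reports whether an entry for repoType exists.
func lookupRepoConfig(data map[string]string, repoType string) (repoConfig, bool, error) {
	raw, ok := data[repoType]
	if !ok {
		// No entry for this repository type: nothing is specified.
		return repoConfig{}, false, nil
	}

	var cfg repoConfig
	if err := json.Unmarshal([]byte(raw), &cfg); err != nil {
		return repoConfig{}, true, fmt.Errorf("decoding config for %q: %w", repoType, err)
	}
	return cfg, true, nil
}

func main() {
	// Stand-in for the ConfigMap's data field; the keys are placeholder repository types.
	data := map[string]string{
		"repo-type-1": `{"cacheLimitMB": 2048, "enableCompression": true}`,
		"repo-type-2": `{"cacheLimitMB": 1, "enableCompression": false}`,
	}

	cfg, found, err := lookupRepoConfig(data, "repo-type-1")
	if err != nil {
		panic(err)
	}
	fmt.Println(found, cfg.CacheLimitMB, cfg.EnableCompression)
}
```

Returning a separate "found" flag keeps a missing entry distinct from a malformed one, so a caller could fall back to repository defaults in the first case and report an error in the second.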

5 changes: 5 additions & 0 deletions pkg/apis/velero/v1/backup_repository_types.go
@@ -41,6 +41,11 @@ type BackupRepositorySpec struct {

// MaintenanceFrequency is how often maintenance should be run.
MaintenanceFrequency metav1.Duration `json:"maintenanceFrequency"`

// RepositoryConfig is for repository-specific configuration fields.
// +optional
// +nullable
RepositoryConfig map[string]string `json:"repositoryConfig,omitempty"`
}

// BackupRepositoryPhase represents the lifecycle phase of a BackupRepository.
9 changes: 8 additions & 1 deletion pkg/apis/velero/v1/zz_generated.deepcopy.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

19 changes: 19 additions & 0 deletions pkg/builder/data_download_builder.go
@@ -19,6 +19,7 @@ package builder
import (
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

"github.com/vmware-tanzu/velero/pkg/apis/velero/shared"
velerov2alpha1api "github.com/vmware-tanzu/velero/pkg/apis/velero/v2alpha1"
)

@@ -122,3 +123,21 @@ func (d *DataDownloadBuilder) StartTimestamp(startTime *metav1.Time) *DataDownlo
d.object.Status.StartTimestamp = startTime
return d
}

// CompletionTimestamp sets the DataDownload's CompletionTimestamp.
func (d *DataDownloadBuilder) CompletionTimestamp(completionTimestamp *metav1.Time) *DataDownloadBuilder {
d.object.Status.CompletionTimestamp = completionTimestamp
return d
}

// Progress sets the DataDownload's Progress.
func (d *DataDownloadBuilder) Progress(progress shared.DataMoveOperationProgress) *DataDownloadBuilder {
d.object.Status.Progress = progress
return d
}

// Node sets the DataDownload's Node.
func (d *DataDownloadBuilder) Node(node string) *DataDownloadBuilder {
d.object.Status.Node = node
return d
}
7 changes: 7 additions & 0 deletions pkg/builder/data_upload_builder.go
@@ -133,7 +133,14 @@ func (d *DataUploadBuilder) Labels(labels map[string]string) *DataUploadBuilder
return d
}

// Progress sets the DataUpload's Progress.
func (d *DataUploadBuilder) Progress(progress shared.DataMoveOperationProgress) *DataUploadBuilder {
d.object.Status.Progress = progress
return d
}

// Node sets the DataUpload's Node.
func (d *DataUploadBuilder) Node(node string) *DataUploadBuilder {
d.object.Status.Node = node
return d
}
4 changes: 3 additions & 1 deletion pkg/cmd/cli/install/install.go
@@ -364,8 +364,10 @@ func (o *Options) Validate(c *cobra.Command, args []string, f client.Factory) er
return err
}

if err := uploader.ValidateUploaderType(o.UploaderType); err != nil {
if msg, err := uploader.ValidateUploaderType(o.UploaderType); err != nil {
return err
} else if msg != "" {
fmt.Printf("⚠️ %s\n", msg)
}

// If we're only installing CRDs, we can skip the rest of the validation.
41 changes: 14 additions & 27 deletions pkg/cmd/cli/nodeagent/server.go
@@ -293,17 +293,29 @@ func (s *nodeAgentServer) run() {
loadAffinity = s.dataPathConfigs.LoadAffinity[0]
}
dataUploadReconciler := controller.NewDataUploadReconciler(s.mgr.GetClient(), s.mgr, s.kubeClient, s.csiSnapshotClient.SnapshotV1(), s.dataPathMgr, loadAffinity, repoEnsurer, clock.RealClock{}, credentialGetter, s.nodeName, s.fileSystem, s.config.dataMoverPrepareTimeout, s.logger, s.metrics)
s.attemptDataUploadResume(dataUploadReconciler)
if err = dataUploadReconciler.SetupWithManager(s.mgr); err != nil {
s.logger.WithError(err).Fatal("Unable to create the data upload controller")
}

dataDownloadReconciler := controller.NewDataDownloadReconciler(s.mgr.GetClient(), s.mgr, s.kubeClient, s.dataPathMgr, repoEnsurer, credentialGetter, s.nodeName, s.config.dataMoverPrepareTimeout, s.logger, s.metrics)
s.attemptDataDownloadResume(dataDownloadReconciler)
if err = dataDownloadReconciler.SetupWithManager(s.mgr); err != nil {
s.logger.WithError(err).Fatal("Unable to create the data download controller")
}

go func() {
s.mgr.GetCache().WaitForCacheSync(s.ctx)

if err := dataUploadReconciler.AttemptDataUploadResume(s.ctx, s.mgr.GetClient(), s.logger.WithField("node", s.nodeName), s.namespace); err != nil {
s.logger.WithError(errors.WithStack(err)).Error("failed to attempt data upload resume")
}

if err := dataDownloadReconciler.AttemptDataDownloadResume(s.ctx, s.mgr.GetClient(), s.logger.WithField("node", s.nodeName), s.namespace); err != nil {
s.logger.WithError(errors.WithStack(err)).Error("failed to attempt data download resume")
}

s.logger.Info("Attempt complete to resume dataUploads and dataDownloads")
}()

s.logger.Info("Controllers starting...")

if err := s.mgr.Start(ctrl.SetupSignalHandler()); err != nil {
@@ -373,31 +385,6 @@ func (s *nodeAgentServer) markInProgressCRsFailed() {
s.markInProgressPVRsFailed(client)
}

func (s *nodeAgentServer) attemptDataUploadResume(r *controller.DataUploadReconciler) {
// the function is called before starting the controller manager, the embedded client isn't ready to use, so create a new one here
client, err := ctrlclient.New(s.mgr.GetConfig(), ctrlclient.Options{Scheme: s.mgr.GetScheme()})
if err != nil {
s.logger.WithError(errors.WithStack(err)).Error("failed to create client")
return
}
if err := r.AttemptDataUploadResume(s.ctx, client, s.logger.WithField("node", s.nodeName), s.namespace); err != nil {
s.logger.WithError(errors.WithStack(err)).Error("failed to attempt data upload resume")
}
}

func (s *nodeAgentServer) attemptDataDownloadResume(r *controller.DataDownloadReconciler) {
// the function is called before starting the controller manager, the embedded client isn't ready to use, so create a new one here
client, err := ctrlclient.New(s.mgr.GetConfig(), ctrlclient.Options{Scheme: s.mgr.GetScheme()})
if err != nil {
s.logger.WithError(errors.WithStack(err)).Error("failed to create client")
return
}

if err := r.AttemptDataDownloadResume(s.ctx, client, s.logger.WithField("node", s.nodeName), s.namespace); err != nil {
s.logger.WithError(errors.WithStack(err)).Error("failed to attempt data download resume")
}
}

func (s *nodeAgentServer) markInProgressPVBsFailed(client ctrlclient.Client) {
pvbs := &velerov1api.PodVolumeBackupList{}
if err := client.List(s.ctx, pvbs, &ctrlclient.ListOptions{Namespace: s.namespace}); err != nil {
29 changes: 23 additions & 6 deletions pkg/cmd/server/server.go
@@ -140,6 +140,7 @@ type serverConfig struct {
disableInformerCache bool
scheduleSkipImmediately bool
maintenanceCfg repository.MaintenanceConfig
backukpRepoConfig string
}

func NewCommand(f client.Factory) *cobra.Command {
@@ -253,6 +254,8 @@ func NewCommand(f client.Factory) *cobra.Command {
command.Flags().StringVar(&config.maintenanceCfg.CPULimit, "maintenance-job-cpu-limit", config.maintenanceCfg.CPULimit, "CPU limit for maintenance job. Default is no limit.")
command.Flags().StringVar(&config.maintenanceCfg.MemLimit, "maintenance-job-mem-limit", config.maintenanceCfg.MemLimit, "Memory limit for maintenance job. Default is no limit.")

command.Flags().StringVar(&config.backukpRepoConfig, "backup-repository-config", config.backukpRepoConfig, "The name of configMap containing backup repository configurations.")

// maintenance job log setting inherited from velero server
config.maintenanceCfg.FormatFlag = config.formatFlag
config.maintenanceCfg.LogLevelFlag = logLevelFlag
@@ -288,8 +291,10 @@ type server struct {
}

func newServer(f client.Factory, config serverConfig, logger *logrus.Logger) (*server, error) {
if err := uploader.ValidateUploaderType(config.uploaderType); err != nil {
if msg, err := uploader.ValidateUploaderType(config.uploaderType); err != nil {
return nil, err
} else if msg != "" {
logger.Warn(msg)
}

if config.clientQPS < 0.0 {
@@ -876,7 +881,7 @@ func (s *server) runControllers(defaultVolumeSnapshotLocations map[string]string
}

if _, ok := enabledRuntimeControllers[controller.BackupRepo]; ok {
if err := controller.NewBackupRepoReconciler(s.namespace, s.logger, s.mgr.GetClient(), s.config.repoMaintenanceFrequency, s.repoManager).SetupWithManager(s.mgr); err != nil {
if err := controller.NewBackupRepoReconciler(s.namespace, s.logger, s.mgr.GetClient(), s.config.repoMaintenanceFrequency, s.config.backukpRepoConfig, s.repoManager).SetupWithManager(s.mgr); err != nil {
s.logger.Fatal(err, "unable to create controller", "controller", controller.BackupRepo)
}
}
@@ -1148,9 +1153,15 @@ func markDataUploadsCancel(ctx context.Context, client ctrlclient.Client, backup
du.Status.Phase == velerov2alpha1api.DataUploadPhaseNew ||
du.Status.Phase == "" {
err := controller.UpdateDataUploadWithRetry(ctx, client, types.NamespacedName{Namespace: du.Namespace, Name: du.Name}, log.WithField("dataupload", du.Name),
func(dataUpload *velerov2alpha1api.DataUpload) {
func(dataUpload *velerov2alpha1api.DataUpload) bool {
if dataUpload.Spec.Cancel {
return false
}

dataUpload.Spec.Cancel = true
dataUpload.Status.Message = fmt.Sprintf("found a dataupload with status %q during the velero server starting, mark it as cancel", du.Status.Phase)
dataUpload.Status.Message = fmt.Sprintf("Dataupload is in status %q during the velero server starting, mark it as cancel", du.Status.Phase)

return true
})

if err != nil {
@@ -1183,9 +1194,15 @@ func markDataDownloadsCancel(ctx context.Context, client ctrlclient.Client, rest
dd.Status.Phase == velerov2alpha1api.DataDownloadPhaseNew ||
dd.Status.Phase == "" {
err := controller.UpdateDataDownloadWithRetry(ctx, client, types.NamespacedName{Namespace: dd.Namespace, Name: dd.Name}, log.WithField("datadownload", dd.Name),
func(dataDownload *velerov2alpha1api.DataDownload) {
func(dataDownload *velerov2alpha1api.DataDownload) bool {
if dataDownload.Spec.Cancel {
return false
}

dataDownload.Spec.Cancel = true
dataDownload.Status.Message = fmt.Sprintf("found a datadownload with status %q during the velero server starting, mark it as cancel", dd.Status.Phase)
dataDownload.Status.Message = fmt.Sprintf("Datadownload is in status %q during the velero server starting, mark it as cancel", dd.Status.Phase)

return true
})

if err != nil {
7 changes: 7 additions & 0 deletions pkg/cmd/server/server_test.go
@@ -203,6 +203,13 @@ func Test_newServer(t *testing.T) {
}, logger)
assert.Error(t, err)

// invalid clientQPS Restic uploader
_, err = newServer(factory, serverConfig{
uploaderType: uploader.ResticType,
clientQPS: -1,
}, logger)
assert.Error(t, err)

// invalid clientBurst
factory.On("SetClientQPS", mock.Anything).Return()
_, err = newServer(factory, serverConfig{