Skip to content

Draft fix for the OADP 4289 - to be tested before proposed upstream#331

Closed
mpryc wants to merge 43 commits intoopenshift:oadp-1.4from
mpryc:fix_for_oadp_4289
Closed

Draft fix for the OADP 4289 - to be tested before proposed upstream#331
mpryc wants to merge 43 commits intoopenshift:oadp-1.4from
mpryc:fix_for_oadp_4289

Conversation

@mpryc
Copy link
Copy Markdown

@mpryc mpryc commented Jul 25, 2024

@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 25, 2024
@openshift-ci
Copy link
Copy Markdown

openshift-ci Bot commented Jul 25, 2024

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@mpryc
Copy link
Copy Markdown
Author

mpryc commented Jul 25, 2024

@sseago @shubham-pampattiwar @weshayutin if you could comment if it's proper fix.

@openshift-ci
Copy link
Copy Markdown

openshift-ci Bot commented Jul 25, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: mpryc

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@mpryc mpryc force-pushed the fix_for_oadp_4289 branch from c756d33 to def0083 Compare July 25, 2024 19:00
Comment thread pkg/restore/restore.go Outdated
Comment thread pkg/restore/restore.go Outdated
Comment thread pkg/restore/restore.go Outdated
Comment thread pkg/restore/restore.go Outdated
Comment thread pkg/restore/restore.go Outdated
@mpryc
Copy link
Copy Markdown
Author

mpryc commented Aug 1, 2024

@sseago I've revisited that block of code and reworked it to be much simpler without so many if-else statements. Also checking if the error is type of conflict is unnecessary as the function RetryOnConflict won't retry on other errors.

If you could comment on current implementation.

@mpryc mpryc force-pushed the fix_for_oadp_4289 branch from 3234e26 to 25a069f Compare August 1, 2024 07:45
Copy link
Copy Markdown

@sseago sseago left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The updated workflow is a lot clearer -- only one patch attempt each time the func is called. The only problems I see now are 1) handling of the obj we're adding managed fields for prior to generating patch, and 2) we shouldn't error our for NotFound errors, to replicate current behavior (where the NotFound logic was added a few releases ago to handle an edge case that a user ran into). Comments inline.

Comment thread pkg/restore/restore.go
Comment thread pkg/restore/restore.go
Comment thread pkg/restore/restore.go
Comment thread pkg/restore/restore.go Outdated

} else {
ctx.log.Infof("the managed fields for %s is patched", kube.NamespaceAndName(obj))
withoutManagedFields = createdObj.DeepCopy()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

withManagedFields = createdObj
(I don't think we need deepcopy here, since this is the last use of createdObj)

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The createdObj is used later, so I wonder if we need to do some assignment so createdObj becomes updated with managed fields in a situation we get one fresh from cluster ?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh. Actually, in that case, a simpler change. Ignore the suggestion to create withManagedFields and instead, above we can set createdObj to the result of resourceClient.Get and omit the else clause completely. We'll do the deepCopy outside of the if block.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be fixed.

Comment thread pkg/restore/restore.go
Comment thread pkg/restore/restore.go Outdated
Comment thread pkg/restore/restore.go

if err != nil {
ctx.log.Errorf("error patching managed fields %s: %v", kube.NamespaceAndName(obj), err)
errs.Add(namespace, err)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need the log.Errorf here, since we already logged the error inside the retry func. Also, we need to add the if block around errs.Add and return here, since we don't error out if it's a NotFound error:

	if !apierrors.IsNotFound(err) {
		errs.Add(namespace, err)
		return warnings, errs, itemExists
	}

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For that I think we need to also include:

	if err != nil && !apierrors.IsNotFound(err) {
		errs.Add(namespace, err)
		return warnings, errs, itemExists
	}

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be fixed now.

allenxu404 and others added 20 commits August 1, 2024 11:45
Signed-off-by: allenxu404 <qix2@vmware.com>
Signed-off-by: lou <alex1988@outlook.com>
Co-authored-by: lou <alex1988@outlook.com>
Signed-off-by: Lyndon-Li <lyonghui@vmware.com>
1. Skip deleting the restore files from storage if the backup/BSL is not found
2. Allow deleting the restore files from storage even though the BSL is readonly

Signed-off-by: Wenkai Yin(尹文开) <yinw@vmware.com>
Add PSA audit and warn labels.

Signed-off-by: Xun Jiang <jxun@vmware.com>
Signed-off-by: Lyndon-Li <lyonghui@vmware.com>
Signed-off-by: allenxu404 <qix2@vmware.com>
Signed-off-by: Lyndon-Li <lyonghui@vmware.com>
Signed-off-by: Lyndon-Li <lyonghui@vmware.com>
Signed-off-by: Lyndon-Li <lyonghui@vmware.com>
Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
(cherry picked from commit 8e01d1b)

Signed-off-by: David Zaninovic <dzaninovic@catalogicsoftware.com>
Replace the base image with paketobuildpacks image

Fixes velero-io#6851

Signed-off-by: Wenkai Yin(尹文开) <yinw@vmware.com>
Signed-off-by: Lyndon-Li <lyonghui@vmware.com>
Signed-off-by: allenxu404 <qix2@vmware.com>
Signed-off-by: Lyndon-Li <lyonghui@vmware.com>
Use informer cache with dynamic client for Get calls on restore
When enabled, also make the Get call before create.

Add server and install parameter to allow disabling this feature,
but enable by default

Signed-off-by: Scott Seago <sseago@redhat.com>
When creating resources with generateName, apimachinery
does not guarantee uniqueness when it appends the random
suffix to the generateName stub, so if it fails with
already exists error, we need to retry.

Signed-off-by: Scott Seago <sseago@redhat.com>
Signed-off-by: Sebastian Glab <sglab@catalogicsoftware.com>
Signed-off-by: allenxu404 <qix2@vmware.com>
sseago and others added 4 commits August 1, 2024 11:45
Signed-off-by: Mateus Oliveira <msouzaol@redhat.com>
…penshift#330)

* oadp-1.4: OADP-3227: Mark InProgress backup/restore as failed upon requeuing (openshift#315)

* Mark InProgress backup/restore as failed upon requeuing

Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>

remove uuid, return err to requeue instead of requeue: true

Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>

* cleanup to minimize diff from upstream

Signed-off-by: Scott Seago <sseago@redhat.com>

* error message update

Signed-off-by: Scott Seago <sseago@redhat.com>

* requeue on finalize status update.

Unlike the InProgress transition, there's no need to fail here,
since the Finalize steps can be repeated.

* Only run patch once for all backup finalizer return scenarios

Signed-off-by: Scott Seago <sseago@redhat.com>

---------

Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
Signed-off-by: Scott Seago <sseago@redhat.com>
Co-authored-by: Scott Seago <sseago@redhat.com>

* oadp-1.4: OADP-3227: Reconcile To Fail: Add backup/restore trackers (openshift#324)

* OADP-4265: Reconcile To Fail: Add backup/restore trackers

Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>

* Apply suggestions from code review: backupTracker

* Address restoreTracker feedback

Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>

* s/delete from/add to/ in the comment

* unit test fix

Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>

* backup_controller unit test

Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>

* restore_controller unit test

Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>

* `make update`

Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>

* mock patch to fail failure due to connection refused

Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>

---------

Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>

* regenerate mocks

Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>

---------

Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
Signed-off-by: Scott Seago <sseago@redhat.com>
Co-authored-by: Scott Seago <sseago@redhat.com>
Comment thread pkg/restore/restore.go
ctx.log.Infof("the managed fields for %s is patched", kube.NamespaceAndName(obj))
withoutManagedFields = createdObj.DeepCopy()
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add withoutManagedFields = createdObj.DeepCopy() here, right before calling setManagedFields.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that logic is much cleaner. thanks.

Comment thread pkg/restore/restore.go Outdated
* fix: ARM images

Signed-off-by: Mateus Oliveira <msouzaol@redhat.com>

* fixup! fix: ARM images

Signed-off-by: Mateus Oliveira <msouzaol@redhat.com>

---------

Signed-off-by: Mateus Oliveira <msouzaol@redhat.com>
(cherry picked from commit 41ad312)
@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 6, 2024
mpryc and others added 6 commits August 13, 2024 15:22
…openshift#334)

Introduction of downstream only option to override Kopia default:
 - hashing algorithm
 - splitting algorithm
 - encryption algorithm

With introduction of 3 environment variables it is possible to override
Kopia algorithms used by Velero:

KOPIA_HASHING_ALGORITHM
KOPIA_SPLITTER_ALGORITHM
KOPIA_ENCRYPTION_ALGORITHM

If the env algorithms are not set or they are not within
Kopia SupportedAlgorithms, the default algorithm will be used.
This behavior is consistent with current behavior without this
change.

Signed-off-by: Michal Pryc <mpryc@redhat.com>
Signed-off-by: Michal Pryc <mpryc@redhat.com>
Signed-off-by: Michal Pryc <mpryc@redhat.com>
Signed-off-by: Michal Pryc <mpryc@redhat.com>
Additional changes to the logic for restoring managed fields.

Signed-off-by: Michal Pryc <mpryc@redhat.com>
@mpryc mpryc force-pushed the fix_for_oadp_4289 branch from 25a069f to 74a4f2f Compare August 19, 2024 11:29
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 19, 2024
@mpryc mpryc requested a review from sseago August 19, 2024 20:33
@openshift-ci
Copy link
Copy Markdown

openshift-ci Bot commented Sep 10, 2024

@mpryc: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/lint 74a4f2f link true /test lint

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 10, 2024
@openshift-merge-robot
Copy link
Copy Markdown

PR needs rebase.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@openshift-bot
Copy link
Copy Markdown

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci openshift-ci Bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 10, 2024
@openshift-bot
Copy link
Copy Markdown

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci openshift-ci Bot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jan 9, 2025
@openshift-bot
Copy link
Copy Markdown

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

@openshift-ci openshift-ci Bot closed this Feb 9, 2025
@openshift-ci
Copy link
Copy Markdown

openshift-ci Bot commented Feb 9, 2025

@openshift-bot: Closed this PR.

Details

In response to this:

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD.

Projects

None yet

Development

Successfully merging this pull request may close these issues.