Fixes issue when multiple VRGs conflict with the PVCs being protected #1535

Draft · asn1809 wants to merge 117 commits into main
Conversation

@asn1809 (Contributor) commented Aug 30, 2024

Issue:
When multiple VRGs are created in the same namespace, PVCs that carry the same labels will be protected by both VRGs, which can lead to errors during restore.

The idea in this PR (sketched below) is to:

  1. Acquire a lock on the namespace before processing a VRG when the VRG is in a non-admin namespace and multiple VRGs are present in that namespace.
  2. Set an Error condition on the VRG when the lock is already held by another VRG.
  3. Have DRPC, which monitors errors on the VRG, surface the error to the user.
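A minimal sketch of how the reconciler could use such a lock (illustrative only; setVRGError and the exact requeue handling are hypothetical placeholders, not the PR's final code):

    // Inside the VRG reconciler, before protecting PVCs:
    if len(vrgList.Items) > 1 {
        // More than one VRG in this namespace: serialize processing.
        if !r.locks.TryToAcquireLock(vrg.Namespace) {
            // Another VRG holds the lock; record the conflict on the
            // VRG status (surfaced by DRPC) and requeue.
            setVRGError(vrg, "namespace is locked by another VRG")
            return ctrl.Result{Requeue: true}, nil
        }
        defer r.locks.Release(vrg.Namespace)
    }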

Signed-off-by: Annaraya Narasagond <[email protected]>
@asn1809 (Contributor, Author) commented Sep 4, 2024

Flowchart of the solution proposed:
[flowchart: MultipleDRPC]

@ShyamsundarR (Member) commented

@asn1809 Please do not refer to downstream tickets/trackers/links; access may be restricted or not generally available.

Also, using the "Fixes" keyword in the commit message would cause github to assume it is fixing an issue in this repository or organization, so avoiding that keyword at the bottom helps.

The commit message also needs some massaging.

@asn1809 changed the title from "Fixes https://issues.redhat.com/browse/OCSBZM-4691" to "Fixes issue when multiple VRGs conflict with the PVCs being protected" on Sep 5, 2024
@asn1809 (Contributor, Author) left a comment

Comments after discussion with @raghavendra-talur need to be addressed.

}

// NewNamespaceLock returns new NamespaceLock
func NewNamespaceLock() *NamespaceLock {
@asn1809 (Contributor, Author):

@raghavendra-talur : Pass ns string and use for struct init.

@asn1809 (Contributor, Author):

Just gave a thought on what we discussed: in volumereplicationgroup_controller, SetupWithManager initializes the locks via r.locks = rmnutil.NewNamespaceLock(). The namespace cannot be passed there, because that call happens well before Reconcile is invoked. So instead of a struct holding a single namespace string, a set of (locked) namespaces would be better.
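One way to realize the set-based idea (a sketch under that assumption, not the merged code):

    import "sync"

    // NamespaceLock serializes VRG processing per namespace. The set is
    // created once in SetupWithManager, before any Reconcile runs.
    type NamespaceLock struct {
        mu     sync.Mutex          // guards the set below
        locked map[string]struct{} // namespaces currently being processed
    }

    // NewNamespaceLock returns an empty NamespaceLock.
    func NewNamespaceLock() *NamespaceLock {
        return &NamespaceLock{locked: make(map[string]struct{})}
    }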

A member replied:

In that case, it could be

type NamespaceLock struct {
	mux map[string]*sync.Mutex // pointers: a sync.Mutex must not be copied out of a map
}

// TryToAcquireLock tries to acquire the lock for processing VRG in a namespace having
// multiple VRGs and returns true if successful.
// If processing has already begun in the namespace, returns false.
func (nl *NamespaceLock) TryToAcquireLock(namespace string) bool {
@asn1809 (Contributor, Author):

@raghavendra-talur : Namespace arg is not required, as it would already be acquired.

// If processing has already begun in the namespace, returns false.
func (nl *NamespaceLock) TryToAcquireLock(namespace string) bool {
nl.mux.Lock()
defer nl.mux.Unlock()
@asn1809 (Contributor, Author):

@raghavendra-talur : This should be removed.

nl.mux.Lock()
defer nl.mux.Unlock()

if nl.namespace == namespace {
@asn1809 (Contributor, Author):

@raghavendra-talur : Handle this according to struct.


// Release removes lock on the namespace
func (nl *NamespaceLock) Release(namespace string) {
nl.mux.Lock()
@asn1809 (Contributor, Author):

@raghavendra-talur : Remove lock line.


var _ = Describe("Testing Locks", func() {
nsLock := util.NewNamespaceLock()
Expect(nsLock.TryToAcquireLock("test")).To(BeTrue())
@asn1809 (Contributor, Author):

@raghavendra-talur : May have to use go func or anonymous functions.
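A contention test along those lines could use a goroutine for the second acquirer (a sketch, assuming the set-based, non-blocking lock and the Ginkgo/Gomega style of the existing test):

    var _ = Describe("Testing Locks", func() {
        It("denies a second acquirer until the lock is released", func() {
            nsLock := util.NewNamespaceLock()
            Expect(nsLock.TryToAcquireLock("test")).To(BeTrue())

            // A concurrent acquire from another goroutine must fail.
            acquired := make(chan bool)
            go func() {
                acquired <- nsLock.TryToAcquireLock("test")
            }()
            Expect(<-acquired).To(BeFalse())

            // After release, the namespace can be locked again.
            nsLock.Release("test")
            Expect(nsLock.TryToAcquireLock("test")).To(BeTrue())
        })
    })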

@@ -115,6 +116,8 @@ func (r *VolumeReplicationGroupReconciler) SetupWithManager(
r.Log.Info("Kube object protection disabled; don't watch kube objects requests")
}

r.locks = rmnutil.NewNamespaceLock()
@asn1809 (Contributor, Author):

Pass the vrg ns as arg.


// if the number of vrgs in the ns is more than 1, lock is needed.
if len(vrgList.Items) > 1 {
if isLockAcquired := v.reconciler.locks.TryToAcquireLock(ns); !isLockAcquired {
@asn1809 (Contributor, Author):

Since TryToAcquireLock is a blocking call, negating its response will not be helpful. Think of an alternative.
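A non-blocking variant that reports contention through its return value could look like this (a sketch building on the set-based struct above, not the final code):

    // TryToAcquireLock returns immediately: true if the caller now owns
    // the namespace, false if another VRG reconcile already holds it.
    func (nl *NamespaceLock) TryToAcquireLock(namespace string) bool {
        nl.mu.Lock()
        defer nl.mu.Unlock()

        if _, held := nl.locked[namespace]; held {
            return false
        }

        nl.locked[namespace] = struct{}{}

        return true
    }

    // Release drops ownership of the namespace.
    func (nl *NamespaceLock) Release(namespace string) {
        nl.mu.Lock()
        defer nl.mu.Unlock()
        delete(nl.locked, namespace)
    }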

// Acquiring lock failed, VRG reconcile should be requeued
return err
}
defer v.reconciler.locks.Release(ns)
@asn1809 (Contributor, Author):

The release of the lock should be done once the PVC is verified for the ownership label.

@@ -437,6 +440,12 @@ func (r *VolumeReplicationGroupReconciler) Reconcile(ctx context.Context, req ct
"Please install velero/oadp and restart the operator", v.instance.Namespace, v.instance.Name)
}

err = v.vrgParallelProcessingCheck(adminNamespaceVRG)

if err != nil {
@asn1809 (Contributor, Author) commented Sep 5, 2024:

The err will never be non-nil here.

ELENAGER and others added 21 commits October 8, 2024 23:10
Previously we used the VolumeReplication name and namespace for updating PVC conditions, due to the fact that we had a one-to-one relationship between VolumeReplication and PVC. Now that we are going to use consistency groups, this handling will not be right anymore, because several PVCs can be related to one VolumeReplication. So we need to use the PVC name and namespace for updating PVC conditions.

Signed-off-by: Elena Gershkovich <[email protected]>
Using one item per line to make future changes easier.

Signed-off-by: Nir Soffer <[email protected]>
Current python on Fedora 39 and macOS is 3.12, and the 3.13 development version has been available on github actions for a while. Add the versions so we can ensure compatibility with the current and next python versions.

We keep python 3.9 for compatibility with systems using old python, like RHEL 9. At some point we will want to drop the old version and consume newer features in current python, but we don't have anything requiring this yet, so let's try to stay compatible.

Signed-off-by: Nir Soffer <[email protected]>
For ramen, the capabilities field for the hub and cluster bundles currently states "Basic Install". Update it to "Seamless Upgrades".

Fixes: [Bug-2303823](https://bugzilla.redhat.com/show_bug.cgi?id=2303823)
Signed-off-by: Abhijeet Shakya <[email protected]>
The changeset includes a new DeleteNamespaceManifestWork() func, which first checks if the mw.Spec has the delete option or if it already has a DeletionTimestamp. Accordingly, it proceeds to delete the namespace manifestwork.
It also updates the namespace manifestwork with the deleteOption and propagationPolicy of type orphan whenever the createOrUpdateNamespaceManifest() func is called.

Fixes: [Bug 2059669](https://bugzilla.redhat.com/show_bug.cgi?id=2059669)
Signed-off-by: Abhijeet Shakya <[email protected]>
Signed-off-by: Elena Gershkovich <[email protected]>
Newer controller-runtime version 0.19.0 checks for unique controller names. In our tests we register the same controller (the drcluster controller) several times: in suite_test, in drcluster_mmode_test, and in drcluster_drcconfig_tests. As a temporary solution I added a flag for skipping the unique controller name validation. Another solution could be adding a name as a parameter to the SetupWithManager function.

Signed-off-by: Elena Gershkovich <[email protected]>
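For illustration, the flag can be set through the manager options in tests (a sketch assuming controller-runtime v0.19, where SkipNameValidation was introduced; not the exact patch):

    import (
        "k8s.io/utils/ptr"
        ctrl "sigs.k8s.io/controller-runtime"
        "sigs.k8s.io/controller-runtime/pkg/config"
    )

    mgr, err := ctrl.NewManager(cfg, ctrl.Options{
        Controller: config.Controller{
            // The test suites register the drcluster controller several
            // times; skip the unique-name check so that this is allowed.
            SkipNameValidation: ptr.To(true),
        },
    })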
The issue where the "velero.io/exclude-from-backup" label was not applied
originates from our deployment workflow. Initially, we deploy the workload,
followed by enabling DR. During the first deployment, the Placement Operator
creates the "PlacementDecision" with default labels. However, when "Ramen" tries
to add the "velero.io/exclude-from-backup" label during DR setup, it skips because
the "PlacementDecision" already exists. Consequently, "Velero" backs up the
"PlacementDecision". And during hub recovery, it is restored without its status,
leading to the unintended deletion of the workload. This situation only occurs
when the current state wasn't updated before hub recovery was applied.

The fix in this PR does not address the scenario where the workload is deployed,
a hub backup is taken, DR is enabled, and then the hub is recovered before another
backup is created.

Fixes bug: 2308801

Signed-off-by: Benamar Mekhissi <[email protected]>
Fixing this error seen in the CI:

    drenv.commands.Error: Command failed:
       command: ('addons/argocd/test', 'rdr-hub', 'rdr-dr1', 'rdr-dr2')
       exitcode: 1
       error:
          Traceback (most recent call last):
            ...
          drenv.commands.Error: Command failed:
             command: ('kubectl', 'delete', '--context', 'rdr-dr2', 'namespace', 'argocd-test', '--wait=false')
             exitcode: 1
             error:
                Error from server (NotFound): namespaces "argocd-test" not found

Signed-off-by: Nir Soffer <[email protected]>
This commit allows PVs to be resized by enabling the
'AllowVolumeExpansion' flag in the StorageClass.
When this flag is set to true, users can dynamically
adjust the size of PVCs by specifying the desired
capacity in the PVC specification.

Signed-off-by: rakeshgm <[email protected]>
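For illustration, a StorageClass with expansion enabled could be built like this in Go (a sketch; the name and provisioner below are hypothetical):

    import (
        storagev1 "k8s.io/api/storage/v1"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        "k8s.io/utils/ptr"
    )

    sc := &storagev1.StorageClass{
        ObjectMeta:  metav1.ObjectMeta{Name: "example-rbd"}, // hypothetical name
        Provisioner: "rbd.csi.ceph.com",                     // hypothetical provisioner
        // Permit increasing PVC spec capacity after creation.
        AllowVolumeExpansion: ptr.To(true),
    }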
We updated the controller SDK to the 0.19 version, moving envtest to the same version for required testing.

This does not fix any test or such, it is just a hygiene change.

Signed-off-by: Shyamsundar Ranganathan <[email protected]>
Developers can enable it by copying the file from the hack directory
instead of copying and pasting the code from docs/devel-quick-start.md
and making the hook executable.

Signed-off-by: Nir Soffer <[email protected]>
k8s.io/component-base should have been a direct dependency.
Fixes the go mod error.

Signed-off-by: Raghavendra Talur <[email protected]>
- Integrated the latest CRD from external-snapshotter.
- Updated dependencies and relevant configuration files.
- Changed VolumeSnapshotRefList in VolumeGroupSnapshotStatus to PVCVolumeSnapshotRefList,
  which allows mapping a VolumeSnapshot to the application PVC.

Signed-off-by: Benamar Mekhissi <[email protected]>
This is mainly to ensure that the VRG on the primary and secondary are in sync with regard to labels and annotations.

Signed-off-by: Benamar Mekhissi <[email protected]>
nirs and others added 30 commits October 8, 2024 23:10
This was forgotten when we replaced the minikube volumesnapshot addons.

Signed-off-by: Nir Soffer <[email protected]>
The comment is still unclear, but at least easier to read.

Signed-off-by: Nir Soffer <[email protected]>
- Replace "VR under deletion" with "deleted VR", matching the
  terminology used in the code.
- Replace "detected as missing or deleted" with "is missing" when we
  know the VR does not exist.

Signed-off-by: Nir Soffer <[email protected]>
Moved from validateVRStatus() to make room for checking the VR Validated status. This also makes the function easier to understand, keeping the same level of abstraction and getting rid of the uninteresting details.

Signed-off-by: Nir Soffer <[email protected]>
Rename msg to errorMsg and document that this is an error message used when we could not get the condition value because it is missing, stale, or unknown.

Signed-off-by: Nir Soffer <[email protected]>
Log after important changes to the system in the delete VR flow, to make it easier to understand what the system is doing and how ramen changed the system.

New logs:
- delete the VR resource
- remove annotations from PV

Improve logs:
- remove annotations, labels, and finalizers from PVC

Signed-off-by: Nir Soffer <[email protected]>
When deleting a primary VRG, we wait until the VR Completed condition is met. However, if a VR precondition failed, for example using a drpolicy without flattening enabled when the PVC needs flattening, the VR will never complete and the vrg and drpc deletion will never complete.

Since csi-addons 0.10.0 we have a new Validated VR condition, set to true if preconditions are met, and false if not. The VR can be deleted safely in this state, since mirroring was not enabled.

This change modifies deleted VRG processing to check the new VR Validated status. If the condition exists and its status is false, validateVRStatus() returns true, signaling that the VR is in the desired state, and ramen completes the delete flow.

If the VR does not report the Validated condition (e.g. an old csi-addons version) or the condition status is true (mirroring in progress), we continue in the normal flow. The VR will be deleted only when the Completed condition status is true.

Tested with discovered deployment and vm using a pvc created from a
volume snapshot.

Signed-off-by: Nir Soffer <[email protected]>
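A Validated-condition check along those lines could look like this (a sketch using the standard apimachinery condition helper; the function and variable names are illustrative, not the exact patch):

    import (
        "k8s.io/apimachinery/pkg/api/meta"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    )

    // validationFailed returns true when the VR reported Validated=False,
    // meaning preconditions failed and mirroring was never enabled, so
    // the VR can be deleted without waiting for the Completed condition.
    func validationFailed(conditions []metav1.Condition, generation int64) bool {
        c := meta.FindStatusCondition(conditions, "Validated")
        return c != nil &&
            c.Status == metav1.ConditionFalse &&
            c.ObservedGeneration == generation // ignore stale conditions
    }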
To refresh the cache we need to check out the ramen source and run drenv cache with the environment files. Using a workflow for this makes the job easy to implement and manage without accessing the runner directly.

The job can also run manually from github UI. This is likely to work for
people with write access.

Signed-off-by: Nir Soffer <[email protected]>
Resolved an issue causing sporadic failures in the Ramen/VolSync related unit test due to delayed creation of the PVC resource.

Signed-off-by: Benamar Mekhissi <[email protected]>
This is useful when you want to avoid drenv start getting stuck, and the
environment does not provide a way to time out.

Example run:

    % drenv start --timeout 60 envs/vm.yaml
    2024-09-21 04:27:43,555 INFO    [vm] Starting environment
    2024-09-21 04:27:43,581 INFO    [cluster] Starting lima cluster
    2024-09-21 04:28:43,785 ERROR   [cluster] did not receive an event with the "running" status
    2024-09-21 04:28:43,790 ERROR   Command failed
    Traceback (most recent call last):
      ...
    drenv.commands.Error: Command failed:
       command: ('limactl', '--log-format=json', 'start', '--timeout=60s', 'cluster')
       exitcode: 1
       error:

For lima provider, we pass the timeout to limactl. For minikube we use
commands.watch() timeout. External provider does not use the timeout
since we don't start the cluster.

Signed-off-by: Nir Soffer <[email protected]>
On github actions we cannot start a vm with more than one cpu, and our test cluster is very small, so it should work with 1g of ram. This makes the test cluster consume fewer resources if a developer leaves it running for a long time.

Using 1 cpu conflicts with kubeadm preflight checks:

    [init] Using Kubernetes version: v1.31.0
    ...
    error execution phase preflight: [preflight] Some fatal errors occurred:
        [ERROR NumCPU]: the number of available CPUs 1 is less than the required 2
    [preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`

Since we know what we are doing, we suppress this preflight error.

Running top shows that the cluster consumes only 9% cpu and 33.5% of the
available memory.

    top - 04:51:02 up 10 min,  1 user,  load average: 0.60, 0.29, 0.16
    Tasks: 123 total,   1 running, 122 sleeping,   0 stopped,   0 zombie
    %Cpu(s):   6.2/3.1     9[||||||                                                        ]
    MiB Mem : 33.5/1959.9   [|||||||||||||||||||||                                         ]
    MiB Swap:  0.0/0.0      [                                                              ]

        PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
       4124 root      20   0 1449480 244352  68224 S   2.3  12.2   0:18.47 kube-apiserver
       4120 root      20   0 1309260 102456  60800 S   0.7   5.1   0:06.17 kube-controller
       4273 root      20   0 1902692  92040  59904 S   2.0   4.6   0:12.47 kubelet
       4159 root      20   0 1288880  60156  44288 S   0.3   3.0   0:02.20 kube-scheduler
       3718 root      20   0 1267660  57880  31020 S   0.3   2.9   0:09.28 containerd
       4453 root      20   0 1289120  53540  42880 S   0.0   2.7   0:00.15 kube-proxy
       5100 65532     20   0 1284608  49656  37504 S   0.0   2.5   0:01.14 coredns
       5000 65532     20   0 1284608  49528  37376 S   0.0   2.5   0:00.86 coredns
       4166 root      20   0   11.2g  47360  21504 S   1.3   2.4   0:08.69 etcd

We could make the cluster even smaller, but the kubeadm preflight check requires 1700 MiB, and we don't have memory issues on github or on developers' machines.

Signed-off-by: Nir Soffer <[email protected]>
We need rosetta only for complex setups when a component does not provide arm64 images. For testing drenv we have an empty cluster with busybox, and our busybox image is multi-arch.

Disabling rosetta may speed up the cluster, and this may be significant on github actions since the runners are about 6.5 times slower compared to an M1 mac.

Signed-off-by: Nir Soffer <[email protected]>
This is useful for running the tests on a github runner, and also supports one option for macOS.

Signed-off-by: Nir Soffer <[email protected]>
Same optimization used in minikube. Minikube uses 2 replicas only when using an HA configuration.

Signed-off-by: Nir Soffer <[email protected]>
It takes about 30 seconds until coredns is ready, but we don't depend on it in the current code. By removing this wait we can start deploying 30 seconds earlier. This reduces the time for starting a cluster from 155 seconds to 125 seconds, and the regional-dr environment from 450 to 420 seconds.

Example run with vm environment:

    % drenv start envs/vm.yaml
    2024-09-22 18:39:55,726 INFO    [vm] Starting environment
    2024-09-22 18:39:55,743 INFO    [cluster] Starting lima cluster
    2024-09-22 18:41:57,818 INFO    [cluster] Cluster started in 122.07 seconds
    2024-09-22 18:41:57,819 INFO    [cluster/0] Running addons/example/start
    2024-09-22 18:42:18,966 INFO    [cluster/0] addons/example/start completed in 21.15 seconds
    2024-09-22 18:42:18,966 INFO    [cluster/0] Running addons/example/test
    2024-09-22 18:42:19,120 INFO    [cluster/0] addons/example/test completed in 0.15 seconds
    2024-09-22 18:42:19,121 INFO    [vm] Environment started in 143.40 seconds

    % drenv stop envs/vm.yaml
    2024-09-22 18:42:44,244 INFO    [vm] Stopping environment
    2024-09-22 18:42:44,317 INFO    [cluster] Stopping lima cluster
    2024-09-22 18:42:44,578 WARNING [cluster] [hostagent] dhcp: unhandled message type: RELEASE
    2024-09-22 18:42:49,441 INFO    [cluster] Cluster stopped in 5.13 seconds
    2024-09-22 18:42:49,441 INFO    [vm] Environment stopped in 5.20 seconds

    % drenv start envs/vm.yaml
    2024-09-22 18:42:53,132 INFO    [vm] Starting environment
    2024-09-22 18:42:53,156 INFO    [cluster] Starting lima cluster
    2024-09-22 18:43:34,436 INFO    [cluster] Cluster started in 41.28 seconds
    2024-09-22 18:43:34,437 INFO    [cluster] Looking up failed deployments
    2024-09-22 18:43:34,842 INFO    [cluster/0] Running addons/example/start
    2024-09-22 18:43:35,208 INFO    [cluster/0] addons/example/start completed in 0.37 seconds
    2024-09-22 18:43:35,208 INFO    [cluster/0] Running addons/example/test
    2024-09-22 18:43:35,371 INFO    [cluster/0] addons/example/test completed in 0.16 seconds
    2024-09-22 18:43:35,372 INFO    [vm] Environment started in 42.24 seconds

Signed-off-by: Nir Soffer <[email protected]>
- Use `/readyz` endpoint for waiting until kubernetes cluster is ready
  instead of `kubectl version`. This matches the way we wait for the
  cluster in drenv.

- When waiting for coredns deployment, use `kubectl rollout status`,
  matching other code in drenv and addons.

- Nicer probe description, visible in drenv when using --verbose.

Signed-off-by: Nir Soffer <[email protected]>
We can ignore all ports for all protocols for all addresses using a
simpler rule.

With this change we see this log in --verbose mode:

    2024-09-23 01:12:18,130 DEBUG   [cluster] [hostagent] TCP (except for SSH) and UDP port forwarding is disabled

And no "Not forwarding port ..." message. Previously we had lot of these
messages[1]. This requires lima commit[2].  pull current master if you
run older version.

[1] lima-vm/lima#2577
[2] lima-vm/lima@9a09350

Signed-off-by: Nir Soffer <[email protected]>
And remove an unneeded export - it was used to run kubectl commands before we set up root/.kube/config, but this step does not run any kubectl commands.

Signed-off-by: Nir Soffer <[email protected]>
Provision scripts using `mode: system` run as root and do not need to use sudo. It looks like the scripts were copied from code not running as root.

Signed-off-by: Nir Soffer <[email protected]>
This is fixed now in lima[1] so we don't need to keep the fix in drenv.
Developers should pull latest lima from git to use this fix.

[1] lima-vm/lima#2632

Signed-off-by: Nir Soffer <[email protected]>
Added a check to stop relocation reconciliation if one of the clusters is
unreachable. This prevents potential misclassification after hub recovery,
which could lead to undesired results and inconsistencies.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2304182

Signed-off-by: Benamar Mekhissi <[email protected]>
Signed-off-by: Annaraya Narasagond <[email protected]>