Proposal: Testing consistency group based replication and DR protection #1508
@batrick and @idryomov, I would like your opinion on the above proposal to test a consistency group snapshot. Also, pointers to any existing tests in Ceph that could serve as an example would help in developing along the same lines. (Also tagging @BenamarMk @youhangwang @ELENAGER @keslerzhu for input.)
@ShyamsundarR A simple application could be a pod in a deployment that attaches multiple PVCs. All these PVCs would be DR protected using a consistency group. This application keeps appending the latest date to these volumes, for example.
There are some scenarios for the file on the remote (secondary) cluster in this test:
Does this meet our test requirements?
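A minimal sketch of such a date-appender follows (in Python rather than as a container image; the mount directories and the `dates.log` file name are assumptions, not part of the proposal):

```python
import datetime
from pathlib import Path

def append_date(mounts):
    """Append one identical timestamp line to dates.log on every PVC mount.

    `mounts` is a list of directories where the PVCs are mounted
    (e.g. one mount per DR-protected volume). Writing the same line to
    every volume lets the secondary cluster check that a consistency
    group snapshot captured all volumes at the same point in time.
    """
    stamp = datetime.datetime.now(datetime.timezone.utc).isoformat()
    for mount in mounts:
        with open(Path(mount) / "dates.log", "a") as f:
            f.write(stamp + "\n")
    return stamp
```

In a real deployment this loop would run periodically inside the pod; on the secondary cluster, comparing the tails of the `dates.log` files across volumes reveals whether the group snapshot was consistent.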
@youhangwang All PVCs are mounted on a single node, so the Ceph kernel drivers that would ensure the CG on the client side are not distributed. If we distributed this across nodes, then writing the same data (the date, in this case) across these nodes would need some coordination. Although drenv has a single node, it is useful to create the app in such a manner that it can be run across nodes.
This looks good to me! For testing the test itself, I could provide a rigged RBD build with the consistency logic inside of
I'm not aware of anything like this on the RBD side.
@ShyamsundarR I'm not sure I understand this comment. Can you elaborate or perhaps just rephrase? The distributed lock that is part of the consistency logic is per-image, so even if everything is on a single node, it still plays a role.
Ack, understood. I would still like to develop a cross-node RBD image (or CephFS subdirectory) use case and test for systems that have more than one worker node.
We may potentially use this to test VolumeGroupSnapshot with RBD, to ensure the CG part of it works. Tagging @nixpanic for thoughts on using such an app to test the snapshot API. @idryomov, I am writing the above to state that we may not need the "one-off" build (at present, at least) with the modified API to start testing the CG snaps. Mirroring does change things, as we need to test the CG snaps on the remote cluster and not the local cluster, but we can start here.
We will create a StatefulSet where the pod with index 0 (the log pod) logs, to disk, the counter values it issues to the other replicas. These counter values are persisted on disk by the non-0 index replicas and acknowledged back to the log pod. On acknowledgement from a replica, the log pod updates that replica's persisted counter value on disk.
All PVCs of this StatefulSet would be DR protected using a consistency group.
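The log-pod/replica protocol above could be sketched as follows. This is a single-process Python illustration, not the actual workload; the `log.json` and `counter.json` file layouts are assumptions standing in for the on-disk formats outlined below:

```python
import json
from pathlib import Path

class Replica:
    """Non-0 replica: persists the latest counter it receives, then acks."""
    def __init__(self, datadir):
        self.counter_file = Path(datadir) / "counter.json"

    def receive(self, value):
        # Persist before acknowledging, so an ack implies durability.
        self.counter_file.write_text(json.dumps({"counter": value}))
        return value  # acknowledgement

class LogPod:
    """Replica 0: logs issued counters, then records each acked value."""
    def __init__(self, datadir, replicas):
        self.log_file = Path(datadir) / "log.json"
        self.replicas = replicas
        self.counter = 0
        self.issued = {i: 0 for i in range(len(replicas))}
        self.acked = {i: 0 for i in range(len(replicas))}

    def _write_log(self):
        self.log_file.write_text(
            json.dumps({"issued": self.issued, "acked": self.acked}))

    def tick(self):
        self.counter += 1
        for i, replica in enumerate(self.replicas):
            # 1. Log the issued value to disk before sending it out.
            self.issued[i] = self.counter
            self._write_log()
            # 2. The replica persists the value and acknowledges it.
            acked = replica.receive(self.counter)
            # 3. Record the acknowledged (now durable) value on disk.
            self.acked[i] = acked
            self._write_log()
        return self.counter
```

Because a value is logged as issued before the replica sees it, and recorded as acked only after the replica has persisted it, any crash- or snapshot-consistent state must satisfy `acked[i] <= replica_i_persisted <= issued[i]` for every replica.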
How does this test CGs:
Log pod (replica 0)
Algo:
disk format/file:
<log.yaml>
Replica pod (replica 1..M)
Algo:
disk format/file:
<counter.yaml>
Log pod Init container:
Replica pod Init container:
k8s workload type:
Tests:
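After failover, the test can verify the cross-volume invariant implied by the protocol: a counter is logged as issued before any replica sees it, and recorded as acked only after the replica has persisted it, so each replica's on-disk counter must lie between the acked and issued values in the log pod's volume. A hedged sketch of such a check (the `log.json`/`counter.json` file names and JSON layout are assumptions):

```python
import json
from pathlib import Path

def check_consistency(log_dir, replica_dirs):
    """Verify the CG invariant across the log pod's and replicas' volumes.

    For every replica i, the value persisted on its volume must satisfy
    acked[i] <= persisted <= issued[i], where issued/acked come from the
    log pod's volume. A violation means the group snapshot did not
    capture all volumes at a single consistent point in time.
    """
    log = json.loads((Path(log_dir) / "log.json").read_text())
    for i, rdir in enumerate(replica_dirs):
        persisted = json.loads(
            (Path(rdir) / "counter.json").read_text())["counter"]
        if not log["acked"][str(i)] <= persisted <= log["issued"][str(i)]:
            return False  # volumes are from different points in time
    return True
```

Running this against the PVCs on the secondary cluster after a failover (or against volumes restored from a VolumeGroupSnapshot) distinguishes a consistent group capture from per-volume snapshots taken at different times.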