DataUpload Fails on Kubernetes 1.29 due to changed VSC SourceVolumeMode #8259
Comments
Ref docs regarding this new change: https://kubernetes.io/blog/2024/04/30/prevent-unauthorized-volume-mode-conversion-ga/
Good catch! Created another issue, #8267, to collect the error messages from the backup/source VS/VSC to help with troubleshooting.
@msfrucht @shubham-pampattiwar The fix is adding the source volume mode in the PVC VSC, but the reported error is for the VolumeSnapshotContent created in the source namespace.
VolumeSnapshotContent objects are not namespaced. The logging only indicates the VolumeSnapshot's namespace. This is similar to the relationship between a PVC and a PV. During backup, the VolumeSnapshot is moved to the Velero install namespace, while the VolumeSnapshotContent is simply copied under a new name, with no namespace. That copy fails to pick up the SourceVolumeMode.
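To make the copy step concrete, here is a minimal sketch of a renamed copy that carries the volume mode over. The types `Content` and `ContentSpec` and the helper `copyForBackup` are hypothetical, simplified stand-ins written for illustration; they are not the external-snapshotter API types or Velero's actual code, and the driver name in `main` is made up.

```go
package main

import "fmt"

// VolumeMode mirrors the string-based volume mode used by Kubernetes.
type VolumeMode string

const (
	Filesystem VolumeMode = "Filesystem"
	Block      VolumeMode = "Block"
)

// ContentSpec is a hypothetical, trimmed-down stand-in for the
// VolumeSnapshotContent spec; only fields relevant to this issue are kept.
type ContentSpec struct {
	Driver           string
	SnapshotHandle   string
	DeletionPolicy   string
	SourceVolumeMode *VolumeMode // optional field, hence a pointer
}

// Content is cluster-scoped, so it has a name but no namespace.
type Content struct {
	Name string
	Spec ContentSpec
}

// copyForBackup builds the renamed copy and preserves SourceVolumeMode.
func copyForBackup(src Content, newName string) Content {
	dst := Content{
		Name: newName,
		Spec: ContentSpec{
			Driver:         src.Spec.Driver,
			SnapshotHandle: src.Spec.SnapshotHandle,
			DeletionPolicy: src.Spec.DeletionPolicy,
		},
	}
	if src.Spec.SourceVolumeMode != nil {
		mode := *src.Spec.SourceVolumeMode // copy the value, not the pointer
		dst.Spec.SourceVolumeMode = &mode
	}
	return dst
}

func main() {
	mode := Block
	src := Content{
		Name: "snapcontent-source",
		Spec: ContentSpec{
			Driver:           "example.csi.driver", // made-up driver name
			SnapshotHandle:   "handle-1",
			DeletionPolicy:   "Retain",
			SourceVolumeMode: &mode,
		},
	}
	dst := copyForBackup(src, "snapcontent-backup")
	fmt.Println(dst.Name, *dst.Spec.SourceVolumeMode)
}
```

Copying the value rather than the pointer keeps the two objects independent, so a later change to one cannot silently alter the other.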
This should not block v1.15 rc.
I will need to check which component is responsible for setting SourceVolumeMode: the external-snapshotter or the CSI driver. Either way, the missing SourceVolumeMode in the copy would still have caused a failure when copying the VolumeSnapshotContent if the field had been set on the original.
It is the responsibility of the external-snapshotter. https://github.com/kubernetes/enhancements/tree/master/keps/sig-storage/3141-prevent-volume-mode-conversion
That appears to be working correctly. The SubjectAdmissionReview shows that the ServiceAccount requesting the change is the ceph-csi account system:serviceaccount:rook-ceph:rook-csi-rbd-provisioner-sa, which belongs to the CSI driver provided as part of a Rook.io install. I will need to open an issue with ceph-csi.
Ceph-csi has already updated to external-snapshotter v8.0.0 in development (ceph/ceph-csi@c48f5bf). There are no existing releases with external-snapshotter v8.0.0.
@msfrucht |
Below is the While for
What steps did you take and what happened:
Performed a DataUpload using Velero 1.14.1 on Kubernetes 1.29/OpenShift 4.16.
The DataUpload fails with the following error from the node-agent:
2024-10-03T09:21:23Z ERROR Reconciler error {"controller": "dataupload", "controllerGroup": "velero.io", "controllerKind": "DataUpload", "DataUpload": {"name":"be9184c2-b547-46f6-a4c0-c6a20d96e7e0-1","namespace":"ibm-backup-restore"}, "namespace": "ibm-backup-restore", "name": "be9184c2-b547-46f6-a4c0-c6a20d96e7e0-1", "reconcileID": "ea068f8e-f12b-4e14-8e69-ee44e8d72e19", "error": "error to delete volume snapshot content: error to assure VolumeSnapshotContent is deleted, snapcontent-f3d87ab0-5db8-49d9-bd91-f86c089be222: error to get VolumeSnapshotContent snapcontent-f3d87ab0-5db8-49d9-bd91-f86c089be222: client rate limiter Wait returned an error: context deadline exceeded", "errorVerbose": "client rate limiter Wait returned an error: context deadline exceeded\nerror to get VolumeSnapshotContent snapcontent-f3d87ab0-5db8-49d9-bd91-f86c089be222\ngithub.com/vmware-tanzu/velero/pkg/util/csi.EnsureDeleteVSC.func1\n\t/go/src/github.com/vmware-tanzu/velero/pkg/util/csi/volume_snapshot.go:229
The actual failure shows itself in the CSI driver logs for Ceph RBD and the snapshot-controller webhook pod.
During creation of the backup VSC, the field Spec.SourceVolumeMode is not copied, which results in the failure. Newer versions of the snapshot-controller verify the SourceVolumeMode field against previous versions of the object.
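The kind of check described here, comparing the field on an update against the previous version of the object, can be sketched as follows. This is an illustration only: `checkSourceVolumeModeUnchanged` is a hypothetical helper, not the snapshot-controller's actual code, and how the real webhook treats an unset value is an assumption (this sketch rejects any difference, including set versus unset).

```go
package main

import (
	"errors"
	"fmt"
)

// VolumeMode mirrors the string-based volume mode used by Kubernetes.
type VolumeMode string

// checkSourceVolumeModeUnchanged is a hypothetical helper that rejects an
// update whose SourceVolumeMode differs from the previous object version.
// Assumption: set-versus-unset counts as a change.
func checkSourceVolumeModeUnchanged(oldMode, newMode *VolumeMode) error {
	switch {
	case oldMode == nil && newMode == nil:
		return nil // unset before and after: nothing changed
	case oldMode == nil || newMode == nil:
		return errors.New("sourceVolumeMode was set or cleared after creation")
	case *oldMode != *newMode:
		return fmt.Errorf("sourceVolumeMode changed from %q to %q", *oldMode, *newMode)
	}
	return nil
}

func main() {
	block := VolumeMode("Block")

	// A copy that dropped the field is rejected when compared with the
	// version that still carries it.
	fmt.Println(checkSourceVolumeModeUnchanged(&block, nil))

	// An update that keeps the field intact passes.
	same := VolumeMode("Block")
	fmt.Println(checkSourceVolumeModeUnchanged(&block, &same))
}
```

Under this assumption, an object created without the field and later updated to include it (or the reverse) is rejected, which matches the failure pattern described above.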
What did you expect to happen:
DataUpload to succeed.
If you are using velero v1.7.0+:
The Velero Backup object has been deleted. I only have access to the velero and node-agent logs. The node-agent logs were the only ones of value for this issue.
node-agent-logs.zip
Anything else you would like to add:
Webhook logs.
webhook-logs.zip
Environment:
Velero version (velero version): Velero 1.14
Velero features (velero client config get features): EnableCSI
Kubernetes version (kubectl version): 1.29
OS (/etc/os-release): Red Hat Core