Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mark dataupload/datadownload in cancel when velero pod restart #6461

Merged
merged 2 commits into from
Jul 11, 2023

Conversation

qiuming-best
Copy link
Contributor

@qiuming-best qiuming-best commented Jul 5, 2023

Thank you for contributing to Velero!

Please add a summary of your change

Does your change fix a particular issue?

Fixes #(issue)
#6127
#6126

Please indicate you've done the following:

  • Accepted the DCO. Commits without the DCO will delay acceptance.
  • Created a changelog file or added /kind changelog-not-required as a comment on this pull request.
  • Updated the corresponding documentation in site/content/docs/main.

@qiuming-best qiuming-best added the kind/changelog-not-required PR does not require a user changelog. Often for docs, website, or build changes label Jul 5, 2023
@qiuming-best qiuming-best changed the title Mark dataupload datadownload status failed when velero pod restart Mark dataupload datadownload in progress status failed when velero pod restart Jul 5, 2023
@qiuming-best qiuming-best changed the title Mark dataupload datadownload in progress status failed when velero pod restart Mark dataupload datadownload in progress or canceling status failed when velero pod restart Jul 5, 2023
@codecov-commenter
Copy link

codecov-commenter commented Jul 5, 2023

Codecov Report

Merging #6461 (480fe44) into main (e54a8af) will decrease coverage by 0.29%.
The diff coverage is 33.01%.

@@            Coverage Diff             @@
##             main    #6461      +/-   ##
==========================================
- Coverage   60.18%   59.90%   -0.29%     
==========================================
  Files         229      229              
  Lines       24215    24397     +182     
==========================================
+ Hits        14575    14616      +41     
- Misses       8632     8767     +135     
- Partials     1008     1014       +6     
Impacted Files Coverage Δ
pkg/cmd/cli/nodeagent/server.go 9.45% <0.00%> (-1.84%) ⬇️
pkg/cmd/server/server.go 21.87% <42.85%> (+0.99%) ⬆️
pkg/controller/data_download_controller.go 71.06% <43.33%> (-6.11%) ⬇️
pkg/controller/data_upload_controller.go 62.02% <43.85%> (-4.03%) ⬇️

... and 2 files with indirect coverage changes

@qiuming-best qiuming-best marked this pull request as draft July 6, 2023 03:31
@qiuming-best qiuming-best force-pushed the mark-crs-failed branch 2 times, most recently from d75a582 to ed195fd Compare July 6, 2023 09:08
pkg/cmd/cli/nodeagent/server.go Outdated Show resolved Hide resolved
pkg/cmd/cli/nodeagent/server.go Outdated Show resolved Hide resolved
@qiuming-best qiuming-best changed the title Mark dataupload datadownload in progress or canceling status failed when velero pod restart Mark dataupload/datadownload in cancel when velero pod restart Jul 6, 2023
@qiuming-best qiuming-best force-pushed the mark-crs-failed branch 4 times, most recently from 210240b to 20651fe Compare July 6, 2023 15:09
pkg/controller/data_download_controller.go Outdated Show resolved Hide resolved
pkg/controller/data_upload_controller.go Outdated Show resolved Hide resolved
@qiuming-best qiuming-best force-pushed the mark-crs-failed branch 5 times, most recently from e1d96dc to 138ae93 Compare July 7, 2023 03:08
@qiuming-best qiuming-best force-pushed the mark-crs-failed branch 2 times, most recently from 13a13fa to 49f9843 Compare July 7, 2023 03:19
pkg/controller/data_download_controller.go Outdated Show resolved Hide resolved
pkg/controller/data_upload_controller.go Outdated Show resolved Hide resolved
@qiuming-best qiuming-best force-pushed the mark-crs-failed branch 3 times, most recently from 0c822f1 to 3e41c99 Compare July 7, 2023 14:59
@qiuming-best qiuming-best marked this pull request as ready for review July 7, 2023 23:17
@github-actions github-actions bot requested a review from Lyndon-Li July 7, 2023 23:17
pkg/controller/data_download_controller.go Show resolved Hide resolved
pkg/controller/data_download_controller.go Show resolved Hide resolved
pkg/controller/data_upload_controller.go Show resolved Hide resolved
pkg/controller/data_upload_controller.go Show resolved Hide resolved
pkg/cmd/cli/nodeagent/server.go Show resolved Hide resolved
pkg/cmd/cli/nodeagent/server.go Show resolved Hide resolved
pkg/cmd/server/server.go Show resolved Hide resolved
pkg/cmd/server/server.go Show resolved Hide resolved
Lyndon-Li
Lyndon-Li previously approved these changes Jul 10, 2023
@blackpiglet
Copy link
Contributor

Just a quick question, why do both node-agent and the Velero server need the DataUpload and DataDownload cancel logic? Looks like there is some overlap.

@qiuming-best
Copy link
Contributor Author

qiuming-best commented Jul 11, 2023

@blackpiglet > Just a quick question, why do both node-agent and the Velero server need the DataUpload and DataDownload cancel logic? Looks like there is some overlap.

As our DataUpload and DataDownload are different from other CRDs in that they are Backup and Restore CRDs related, their restart handling logic are little more complicated.
This means :

  • if Backup and Restore CRDs failed, the related DataUpload and DataDownload must be also terminated(As per our design doc, it would be Canceled). (Velero server restart)
  • if DataUpload and DataDownload are terminated(As per our design doc, it would be Canceled), Backup and Restore CRDs should be partially failed. (node-agent restart)

So we need to handle both node-agent and Velero server restart.

@Lyndon-Li Lyndon-Li merged commit 49e8058 into vmware-tanzu:main Jul 11, 2023
22 checks passed
@qiuming-best qiuming-best deleted the mark-crs-failed branch July 12, 2023 01:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
has-unit-tests kind/changelog-not-required PR does not require a user changelog. Often for docs, website, or build changes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants