Skip to content

Conversation

o-fedorov
Copy link
Contributor

Related issue: #43

Description

  1. If GracefulMasterTakeover fails, reset read_only for master.
  2. Introduce PostUnsuccessfulGracefulTakeoverProcesses to allow running more sophisticated recovery.
  3. Add a related system test.

o-fedorov added 2 commits July 2, 2024 14:08
1. Revert `read_only` for master in case of an error.
2. Introduce `PostUnsuccessfulGracefulTakeoverProcesses`.
@o-fedorov
Copy link
Contributor Author

@kamil-holubicki, could I please ask you to take a look at this PR?

Copy link

@jim3ccc jim3ccc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@o-fedorov
Copy link
Contributor Author

Hello @kamil-holubicki. Hope you are doing well. Could you please share any feedback regarding this PR?

@o-fedorov
Copy link
Contributor Author

@egegunes, @fabio-silva, @kamil-holubicki, @igroene, could I please ask you for a feedback regarding this PR?

@hulyav
Copy link

hulyav commented Jul 17, 2024

Unfortunately, we do not prioritize feature requests or code submissions with our limited bandwidth for Orchestrator as our focus is on more widely impacting issues that affect the majority of the Percona user base. I understand there are discussions underway between our sales team and your leadership that might allow us to one-day take this on.

@o-fedorov
Copy link
Contributor Author

Hi @hulyav. Thank you for your response. I’d like to highlight that my team has already fixed this particular bug on our side, and merging the fix to the upstream would benefit the community, since there can be other users that face this issue. Also, it would be nice to merge this pull request sooner rather than later to prevent the codebases divergence.

@o-fedorov
Copy link
Contributor Author

Hello @hulyav and @kamil-holubicki . It is already 4 month since this PR was created. I have a feeling that it eventually will not be accepted. To get rid of the uncertainty about this change, should I close this PR and the corresponding issue as rejected?

promotedMasterCoordinates = &designatedInstance.SelfBinlogCoordinates

log.Infof("GracefulMasterTakeover: attempting recovery")
recoveryAttempted, topologyRecovery, err := ForceExecuteRecovery(analysisEntry, &designatedInstance.Key, false)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @o-fedorov , I'm not sure about this error propagation and handling. In the original flow, if ForceExecuteRecovery returned err, but also returned recoveryAttempted and topologyRecovery, GracefulMasterTakeover continued.
Now it makes master R/W (which is fine), but then returns.

Copy link
Contributor Author

@o-fedorov o-fedorov Dec 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, @kamil-holubicki , I forgot to clear the error at the end of the function. Updated now.

Note that in the original code, err is ignored below the refactored part, and is eventually overridden.

@hulyav
Copy link

hulyav commented May 13, 2025

Thanks for submitting the patch — we appreciate the effort. While we do maintain a fork of Orchestrator for use in our own products, we aren’t the upstream maintainers and generally don’t make changes unless it addresses an issue we’re actively working on or a customer has commissioned the fix. That said, you're welcome to keep the PR open or close it at your discretion — totally your call.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants