Allow darknode migration across cloud provider regions #75

Open
bakebrain opened this issue Aug 3, 2020 · 2 comments
Labels
feature New feature or request help wanted Extra attention is needed

Comments

@bakebrain

Summary

Allow darknodes to be migrated between regions of the same cloud provider. This is related to #22, but the node stays with its currently selected provider.

Basic example (aws)

darknode migrate YOUR-DARKNODE-NAME --source-aws-region sa-east-1 --target-aws-region eu-west-1
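
To make the proposed interface concrete, here is a minimal Python sketch of how the flags in the example above could be parsed and sanity-checked. It is only an illustration of the proposal and not how the darknode CLI is implemented; the `migrate` subcommand and both flags exist only in this request, and `build_parser` and `parse_migrate` are hypothetical names.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser mirroring the command proposed in this issue.
    parser = argparse.ArgumentParser(prog="darknode")
    sub = parser.add_subparsers(dest="command", required=True)
    migrate = sub.add_parser("migrate", help="move a darknode to another region")
    migrate.add_argument("name", help="name of the darknode to migrate")
    migrate.add_argument("--source-aws-region", required=True)
    migrate.add_argument("--target-aws-region", required=True)
    return parser


def parse_migrate(argv):
    args = build_parser().parse_args(argv)
    # Migrating a node onto its own region would be a no-op, so reject it early.
    if args.source_aws_region == args.target_aws_region:
        raise ValueError("source and target regions must differ")
    return args
```

Rejecting identical source and target regions up front is a design choice of this sketch; the request itself does not say how that case should be handled.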

Motivation

  • allow an "internal" migration and avoid the lengthy deregister/destroy process, saving time, money and fees
  • react to issues within the current region, e.g. longer downtimes or changes in instance type availability
  • potentially save cost, since pricing differs between regions
@loongy
Contributor

loongy commented Aug 20, 2020

While this would be a great feature to have, it is very difficult to achieve without putting the node's operation at risk.

Option 1:

a. set up new node, but keep it off (easy)
b. keep new node up-to-date with all of the internal state of the old node (very hard)
c. turn off old node and then turn on new node without losing track of messages in-flight (potentially impossible)

Without achieving these three steps reliably, the new node would be at risk of broadcasting information that was at odds with the old node. In the worst case, for example, it might see itself proposing two different blocks in the same height/round, and this would result in slashing.
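
A toy Python sketch of why the internal state matters. It is not the darknode's consensus code; `SignGuard` and its method are hypothetical names, and the only assumption is that a node remembers which block it signed at each height/round.

```python
class SignGuard:
    """Remembers which block was signed at each (height, round)."""

    def __init__(self):
        self._signed = {}

    def may_sign(self, height, round_, block_hash):
        key = (height, round_)
        previous = self._signed.get(key)
        if previous is None:
            # First signature at this height/round: record it and allow it.
            self._signed[key] = block_hash
            return True
        # Re-signing the same block is harmless; a different one is equivocation.
        return previous == block_hash


# Old and new node with separate state: both sign, with conflicting blocks.
old_node, new_node = SignGuard(), SignGuard()
conflict = old_node.may_sign(10, 0, "block-A") and new_node.may_sign(10, 0, "block-B")

# One shared, up-to-date record: the second, conflicting signature is refused.
shared = SignGuard()
shared.may_sign(10, 0, "block-A")
refused = not shared.may_sign(10, 0, "block-B")
```

Step (b) amounts to keeping the new node's record identical to the old node's at all times, which is why it is so hard to do while the old node is still running.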

Option 2:

a. set up new node, but keep it off (easy)
b. turn off old node (easy)
c. download state and upload to new node (easy, but slow)
d. turn on new node (easy)

This solves the problem in (1) by effectively cloning the state from the old node. This has to be done while the old node is off (and before the new node is on), otherwise the cloned state can become stale and we have the same problem as (1) again. But step (2.c) is slow and could result in extended downtime for the node, especially if there is an intermittent connectivity issue between the node operator's workspace and the node. Storage space can be several GBs, so downloading and then uploading the entire backup will take time. Then, the new node will be (at best) a few minutes behind the rest of the network and will need to re-synchronise with it.
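
The ordering constraint in Option 2 can be sketched as a small guard that refuses to run a step out of sequence. This is a hypothetical illustration in Python; `Step` and `Migration` are invented names, and the actual work of each step is left out.

```python
from enum import Enum


class Step(Enum):
    SETUP_NEW = 1   # 2.a: set up new node, keep it off
    STOP_OLD = 2    # 2.b: turn off old node
    COPY_STATE = 3  # 2.c: download state, upload to new node
    START_NEW = 4   # 2.d: turn on new node


ORDER = list(Step)  # Enum members iterate in definition order


class Migration:
    def __init__(self):
        self.done = []

    def run(self, step):
        if len(self.done) == len(ORDER):
            raise RuntimeError("migration already finished")
        expected = ORDER[len(self.done)]
        if step is not expected:
            # e.g. copying state before the old node is off would clone stale state
            raise RuntimeError(f"expected {expected.name}, got {step.name}")
        self.done.append(step)
```

Calling `run(Step.COPY_STATE)` before `run(Step.STOP_OLD)` raises, which is the situation that would bring back the problem from Option 1.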

It is entirely possible that this takes too long, and the node begins being slashed (and is forcibly deregistered) for not being online.
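
For a rough sense of scale, the downtime in Option 2 can be estimated from the size of the state and the operator's bandwidth. The Python sketch below is back-of-the-envelope arithmetic only; the function name and every number in the example (5 GB of state, 50 Mbit/s down, 10 Mbit/s up, 5 minutes to re-synchronise) are assumptions, not measurements.

```python
def estimated_downtime_seconds(state_gb, download_mbps, upload_mbps, resync_seconds=0.0):
    # 1 GB = 8000 megabits, using decimal units.
    state_megabits = state_gb * 8 * 1000
    download = state_megabits / download_mbps  # old node -> operator's workspace
    upload = state_megabits / upload_mbps      # operator's workspace -> new node
    return download + upload + resync_seconds


# Assumed figures: 5 GB state, 50 Mbit/s down, 10 Mbit/s up, 300 s resync.
total = estimated_downtime_seconds(5, 50, 10, resync_seconds=300)
minutes = total / 60  # 800 s + 4000 s + 300 s = 5100 s, i.e. 85 minutes
```

With these assumed figures the node would be offline for well over an hour, and any connection drop during the transfer adds to that. How much downtime the network tolerates before slashing is not stated in this thread.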

@loongy added the feature, wontfix and help wanted labels, then removed the wontfix label, Aug 20, 2020
@loongy
Contributor

loongy commented Aug 20, 2020

Decided not to label as wontfix, and instead label as help wanted. Open to suggestions here, but unless something compelling is suggested that minimises risk to the node operator, I am not sure migrations like this will actually be possible.
