Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add removal/startup of nodes during runtime #6

Open
7 tasks
p-offtermatt opened this issue Jun 16, 2023 · 0 comments
Open
7 tasks

Add removal/startup of nodes during runtime #6

p-offtermatt opened this issue Jun 16, 2023 · 0 comments
Labels
prio:mighthave For low priority issues that might be nice to have, but are not particularly important.

Comments

@p-offtermatt
Copy link
Member

p-offtermatt commented Jun 16, 2023

Is your feature request related to a problem? Please describe.
When an application crashes or is disconnected, CometMock should not crash.
It should also allow disconnected apps to reconnect, and new apps to connect during runtime
Optimally, validators that belong to apps which are currently disconnected have their signing_status set to false,
i.e. those validators do not sign. Similarly, CometMock should detect if two applications with the same private key are connected and produce double-sign evidence if that happens.

This feature allows test runs that kill and restart applications to run smoothly. It also seamlessly allows downtime the way tests already do it (although CometMock has its own native way to cause downtime as well).

Proposed solution

  • Calls to the ABCI server are made with a timeout
  • Create a new type that encapsulates the client and associates it to a validator
  • When an app times out, the validator associated with that app is set to not sign
  • Periodically retry apps
  • When an app reconnects, start signing with its validator again

Additional context
The logic was started to be implemented in this branch main...ph/dockerfile, but there is an issue with the way the CometBFT client calls the ABCI server, see this Slack messahe or the code here - the problem is that the context is hardcoded to background.

The solution is to duplicate that code from CometBFT and weave the contexts through.

Sync logic

This issue needs to add in particular support for syncing apps that do not have the current height.

@p-offtermatt p-offtermatt added the prio:mighthave For low priority issues that might be nice to have, but are not particularly important. label Jun 16, 2023
@p-offtermatt p-offtermatt changed the title Add crash recovery Add removal/startup of nodes during runtime Oct 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
prio:mighthave For low priority issues that might be nice to have, but are not particularly important.
Projects
None yet
Development

No branches or pull requests

1 participant