Stress test the network reliability #1436

Closed
2 tasks
locallycompact opened this issue May 16, 2024 · 1 comment
Labels: 💭 idea (An idea or feature request)

Comments

locallycompact commented May 16, 2024

Why

We have experienced many situations in which the head cannot make progress. These problems are hard to reproduce, and in each case we have spent a lot of time coordinating to resolve them.

Records of these issues are here:

#1374
#1415

One possible solution was a manual snapshot recovery as outlined here:

#1416

This is unsatisfying, as we would prefer the nodes to be self-healing rather than require manual intervention.

What

How

  1. Create a test that stress tests the network layer in the case of three or more intermittently failing peers. A failing peer is a peer that fails to send, receive or persist network messages.
  2. (Optional) Extract the network layer into its own package to remove coupling.
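
As a rough illustration of step 1, here is a minimal simulation sketch in Python. It is not the project's test suite or network layer: `Peer`, `run_round`, `stress_test`, and the `failure_rate` parameter are all hypothetical names, and the model is deliberately simplified. Each transmission can fail at send, receive, or persist time, and a sender simply retransmits the next message a receiver has not yet persisted (the simulation reads the receiver's log directly instead of exchanging acknowledgements).

```python
import random
from typing import Optional


class Peer:
    """Hypothetical peer: keeps an ordered outbox and a per-sender delivery log."""

    def __init__(self, name: str, names: list[str]) -> None:
        self.name = name
        self.outbox: list[str] = []
        self.delivered: dict[str, list[str]] = {n: [] for n in names if n != name}

    def broadcast(self, payload: str) -> None:
        self.outbox.append(payload)


def run_round(peers: list[Peer], rng: random.Random, failure_rate: float) -> None:
    """Each sender retransmits the next missing message to every other peer."""
    for sender in peers:
        for receiver in peers:
            if sender is receiver:
                continue
            log = receiver.delivered[sender.name]
            if len(log) == len(sender.outbox):
                continue  # receiver is already up to date with this sender
            # Assumption: send, receive and persist fail independently.
            failed = any([rng.random() < failure_rate for _ in range(3)])
            if not failed:
                log.append(sender.outbox[len(log)])


def converged(peers: list[Peer]) -> bool:
    """True when every peer has every other peer's messages, in order."""
    return all(
        r.delivered[s.name] == s.outbox
        for s in peers
        for r in peers
        if s is not r
    )


def stress_test(
    n_peers: int = 3,
    n_messages: int = 20,
    failure_rate: float = 0.3,
    seed: int = 0,
    max_rounds: int = 10_000,
) -> Optional[int]:
    """Return the round in which all peers converged, or None if they never did."""
    rng = random.Random(seed)  # seeded so a failing run can be replayed
    names = [f"peer-{i}" for i in range(n_peers)]
    peers = [Peer(n, names) for n in names]
    for p in peers:
        for i in range(n_messages):
            p.broadcast(f"{p.name}/msg-{i}")
    for round_no in range(1, max_rounds + 1):
        run_round(peers, rng, failure_rate)
        if converged(peers):
            return round_no
    return None


if __name__ == "__main__":
    print("converged in round:", stress_test(n_peers=3, failure_rate=0.3))
```

The property checked is the one this issue cares about: with three or more intermittently failing peers, every peer should eventually hold every message in order without manual intervention. Seeding the random generator is a design choice to make hard-to-reproduce failures replayable.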

ch1bo changed the title from "Ensure the head can not get stuck when multiple peers go offline." to "Stress test the network reliability" on May 27, 2024
ch1bo added the 💭 idea (An idea or feature request) label on Jul 10, 2024
noonio commented Sep 4, 2024

This is effectively resolved by #1552. We will continue to iterate on what we have, but we've taken a great first step!

noonio closed this as completed on Sep 4, 2024