Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add peer ban list to Fluffy #2809

Open
kdeme opened this issue Oct 31, 2024 · 4 comments
Open

Add peer ban list to Fluffy #2809

kdeme opened this issue Oct 31, 2024 · 4 comments
Assignees
Labels
Fluffy optimization Security Security vulnerability or security related

Comments

@kdeme
Copy link
Contributor

kdeme commented Oct 31, 2024

Nodes that offer invalid data will currently be allowed to keep doing so wasting bandwidth and cpu of the receiving node.

Any offer that has a properly encoded content key and that falls in the data radius of the receiving node will be accepted.
Data will be send and once received it will be validated. If the data fails validation, the peer currently is not punished and thus allowed to keep offering this or other invalid data.

Similary a node can provide invalid data on a FindContent. Or send no data at all or very slow until uTP transfers time out.

We should add a temporary ban list for nodes that do this.

There might also be nodes that end up in the DHT of a specific subnetwork for some reason, but don't actually support the subnetwork. Might require a temporary (per network) ban list for these to avoid trying to ping over and over a node that is not supported.

@kdeme
Copy link
Contributor Author

kdeme commented Nov 6, 2024

We currently see a lot of following errors:

WRN 2024-11-06 11:11:29.853+01:00 Offer failed due to accept request failure  topics="portal_wire" error="No message data, peer might not support this talk protocol" contentKeys=@[0x00458ff92094a00fac6e393a435d58242c4971e9711adf1cd16215fc97856e4a1e] node=b7*406c26:164.92.247.243:5008

This is definitely a good candidate as reason for subnetwork specific temporary bans. As it really indicate a node that does not support a subnetwork.
In theory these should get filtered out of the subnetwork DHT, but if other clients don't do that properly or if the node does reply to ping/pongs on that subnetwork, this will not occur.

@bhartnett bhartnett self-assigned this Nov 26, 2024
@bhartnett
Copy link
Contributor

bhartnett commented Nov 26, 2024

@kdeme I remember you mentioned that there is an existing ban list implementation in nimbus-eth2 somewhere but I can't seem to find the link (if you shared it previously). Can you share it here when you get a chance?

@bhartnett
Copy link
Contributor

How long should we ban nodes that are misbehaving?

So we should ban nodes for a period of time when they:

  • Send offers that fail to validate (conteny key or value)
  • Return responses to findContent that fails to validate
  • Fails to respond or sends data too slow and hits timeout in response to pings, lookups and offers.
  • Responds with empty or bad data to pings, lookups and offers.

Does this cover it or are there other scenarios that need to be covered?

@bhartnett bhartnett changed the title Add peer ban list to fluffy Add peer ban list to Fluffy Nov 26, 2024
@kdeme
Copy link
Contributor Author

kdeme commented Nov 26, 2024

How long should we ban nodes that are misbehaving?

I think we might have different timeouts depending on the reason of the ban. But nodes that are simply incompatible should be banned for several hours if not a day. Nodes that have many timeouts perhaps less, as maybe the node/machine was just "busy" for a bit.

Does this cover it or are there other scenarios that need to be covered?

That's a good start, we will probably notice more as we go through the error statements in the code.
One that I know of that is important is when a node responds with an empty talkresp response message, as this means that node does not support that specific subprotocol, so it should not be contacted for that specific subprotocol. This also means that the ban list should be subnetwork/subprotocol specific.

@kdeme I remember you mentioned that there is an existing ban list implementation in nimbus-eth2 somewhere but I can't seem to find the link (if you shared it previously). Can you share it here when you get a chance?

eth2_network.nim has this seenTable which gets checked before dialing a peer: https://github.com/status-im/nimbus-eth2/blob/stable/beacon_chain/networking/eth2_network.nim#L1358

Nodes stay in that table for different times depending on the failure: https://github.com/status-im/nimbus-eth2/blob/stable/beacon_chain/networking/eth2_network.nim#L235-L251

Our situation is of course a bit different as we don't keep long-lived connections to other peers.
But I think something similar can be used, but then the difference would be where do we check the list. Probably something like:

  • When selecting nodes to send FindNodes, FindContent or Offer message requests to.
  • When adding or re-validating a node in our routing_table. Not fully sure how it would fit in here exactly, I would need to have a better look at this.

This would also be something that could potentially be enabled on the level of discv5.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Fluffy optimization Security Security vulnerability or security related
Projects
None yet
Development

No branches or pull requests

2 participants