NAT traversal: QUIC Hole Punching #1015

Closed
aarshkshah1992 opened this issue Nov 5, 2020 · 5 comments

@aarshkshah1992
Contributor

  • We now have full-fledged support for QUIC, which is a UDP-based protocol.
  • We also have a variant of the “STUN” protocol in the (Identify + AutoNAT) implementation which can inform peers about their publicly dialable addresses with some confidence.
  • We need to introduce hole punching for the QUIC transport by getting two peers to attempt to simultaneously connect to each other on their advertised public addresses to punch a hole in their NAT for the other peer.
  • We can use Circuit Relays for this co-ordination as detailed here.
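
To make the "simultaneously connect" step concrete, here is a minimal sketch using plain UDP sockets rather than QUIC or go-libp2p. It is an illustration only: both sockets live on loopback, so there is no real NAT involved, and the names `punch` and `simultaneousPunch` are made up for this example. The point is just the ordering: each side sends to the other's advertised address first (which is what would create the NAT mapping on a real network) and then waits for the other side's packet on the same socket.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// punch sends a probe to the remote address and then waits for the remote
// peer's probe on the same socket. On a real network the outgoing packet is
// what would open the NAT mapping; on loopback there is no NAT to open.
func punch(conn *net.UDPConn, remote *net.UDPAddr, msg string) (string, error) {
	if _, err := conn.WriteToUDP([]byte(msg), remote); err != nil {
		return "", err
	}
	if err := conn.SetReadDeadline(time.Now().Add(2 * time.Second)); err != nil {
		return "", err
	}
	buf := make([]byte, 64)
	n, _, err := conn.ReadFromUDP(buf)
	if err != nil {
		return "", err
	}
	return string(buf[:n]), nil
}

// simultaneousPunch binds two loopback sockets standing in for Alice and Bob
// and has both punch toward each other concurrently.
func simultaneousPunch() (string, string, error) {
	alice, err := net.ListenUDP("udp4", &net.UDPAddr{IP: net.IPv4(127, 0, 0, 1)})
	if err != nil {
		return "", "", err
	}
	defer alice.Close()
	bob, err := net.ListenUDP("udp4", &net.UDPAddr{IP: net.IPv4(127, 0, 0, 1)})
	if err != nil {
		return "", "", err
	}
	defer bob.Close()

	type result struct {
		got string
		err error
	}
	bobDone := make(chan result, 1)
	go func() {
		got, err := punch(bob, alice.LocalAddr().(*net.UDPAddr), "hello from bob")
		bobDone <- result{got, err}
	}()
	aliceGot, aliceErr := punch(alice, bob.LocalAddr().(*net.UDPAddr), "hello from alice")
	bobRes := <-bobDone
	if aliceErr != nil {
		return "", "", aliceErr
	}
	if bobRes.err != nil {
		return "", "", bobRes.err
	}
	return aliceGot, bobRes.got, nil
}

func main() {
	a, b, err := simultaneousPunch()
	if err != nil {
		fmt.Println("punch failed:", err)
		return
	}
	fmt.Println("alice received:", a)
	fmt.Println("bob received:", b)
}
```

In a real deployment the two peers would learn each other's addresses and agree on timing over the relayed connection; that coordination is deliberately left out here.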
@aschmahmann
Collaborator

@aarshkshah1992 what is the advantage of doing a connection upgrade over a relay vs a specific coordination protocol?

Perhaps I'm missing something but here's an example of how I'm looking at it. If we have three peers Alice, Bob, Relay where Alice is trying to connect to Bob using Relay.

Connection Upgrade:

  • Strategy
    • Alice asks Relay to connect her to Bob
    • Alice + Bob coordinate hole punching and a separate direct connection using some new protocol (e.g. /p2p/holepunch/1.0.0)
    • Alice is now connected to Bob + Relay
  • Advantages
    • We have a relay transport
    • It allows us to upgrade the holepunch protocol on Alice + Bob without touching the Relay
    • Allows fallback to full communication over Relay if holepunching fails
  • Disadvantages
    • Requires us to figure out how to limit relay abilities to prevent abuse (e.g. R is only going to let A send X bytes/second) where X is normally really small
    • Likely requires us to figure out how R should tell A its terms + conditions, such as
      • that it's not a full relay and that it should not be trying to use it as a full relay
      • that it will only serve connections that it feels responsible for (e.g. R might decide it'll help anyone in the world connect to B, but not waste bandwidth letting other people connect to A)
    • Need to figure out how to get go-libp2p to deal with multiple connections to the same PeerID

Separate Protocol:

  • Strategy (rough draft)
    • Alice asks Relay to do a holepunch with Bob via some new protocol (e.g. /p2p/holepunch/1.0.0)
    • Relay either responds "I don't know/am not connected to Bob", or "ok"
    • Bob tries to directly dial Alice, and if that fails Bob asks Relay to orchestrate a holepunch with Alice
  • Advantages
    • New protocol that only orchestrates NAT traversal means no one should be attempting to use it for communication
      • Perhaps the protocol ends up with user-data such that it technically "could" be used for communication, but that abuse seems pretty easy to prevent
    • No need to do any libp2p plumbing related to getting Alice + Bob to talk to each other + upgrade, just for the low level UDP holepunching itself
  • Disadvantages
    • Makes the new protocol a little more complicated/state-machine like, or requires multiple new protocols
    • Alice and/or Bob have to find a fallback relay to talk to in the event holepunching is unsuccessful
      • Although at least here they know they can't use Relay, whereas above they might be confused unless we upgrade the circuit-relay protocol
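
A very rough sketch of the relay side of the separate-protocol strategy, just to show how small the relay's job could be. Everything here is an assumption for illustration: the `Relay` and `Response` types, the idea that the relay hands back the target's observed public address, and the use of plain strings for peer IDs and addresses. No wire format is implied.

```go
package main

import "fmt"

// Response is the relay's answer to a hole punch request. (Hypothetical type.)
type Response struct {
	OK         bool
	TargetAddr string // target's observed public address, set only when OK
}

// Relay tracks the observed public address of every peer connected to it.
type Relay struct {
	observed map[string]string // peer ID -> observed address
}

// RequestHolePunch handles "from asks to hole punch with to". The relay
// answers "ok" only if it is connected to both peers; otherwise it answers
// "I don't know / am not connected".
func (r *Relay) RequestHolePunch(from, to string) Response {
	if _, ok := r.observed[from]; !ok {
		return Response{}
	}
	addr, ok := r.observed[to]
	if !ok {
		return Response{}
	}
	return Response{OK: true, TargetAddr: addr}
}

func main() {
	r := &Relay{observed: map[string]string{
		"alice": "198.51.100.7:4001",
		"bob":   "203.0.113.9:4001",
	}}
	fmt.Printf("%+v\n", r.RequestHolePunch("alice", "bob"))
	fmt.Printf("%+v\n", r.RequestHolePunch("alice", "carol"))
}
```

Since the only payload is an address, there is not much room to abuse this for general communication.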

@aarshkshah1992
Contributor Author

@aschmahmann I am not sure what you mean.

  • The circuit upgrade over Relay will use a new protocol but the bytes for that protocol will be relayed over the Relay server.
  • In the existing solution, Alice will first try to dial Bob directly using the addresses it sees in the Identify protocol and then use the Relay to co-ordinate hole punching if that dial fails.

> Separate Protocol:
>
> Strategy (rough draft)
> Alice asks Relay to do a holepunch with Bob via some new protocol (e.g. /p2p/holepunch/1.0.0)
> Relay either responds "I don't know/am not connected to Bob", or "ok"
> Bob tries to directly dial Alice, and if that fails Bob asks Relay to orchestrate a holepunch with Alice

Even this needs a Relay that Bob would be connected to, right? Remember, both Alice and Bob could be behind a NAT, which means they need to be connected to a common publicly reachable server to co-ordinate the hole punch. What we are saying here is that since we already have the circuit relay infra and code, why not use that to co-ordinate the hole punch, albeit over a protocol "layered" on top of the Circuit Relay.

@aschmahmann
Collaborator

aschmahmann commented Nov 5, 2020

> Even this needs a Relay that Bob would be connected to, right? Remember, both Alice and Bob could be behind a NAT, which means they need to be connected to a common publicly reachable server to co-ordinate the hole punch. What we are saying here is that since we already have the circuit relay infra and code, why not use that to co-ordinate the hole punch, albeit over a protocol "layered" on top of the Circuit Relay.

By reusing the existing circuit relay code, we are in a position where we need to resolve some issues with circuit relays before doing anything involving hole punching, at least if we want to deploy this to people and make, say, every DHT server node a holepunching relay. Issues include:

  • Limit bandwidth per user through relay
  • R needs to tell users "I'm not a full relay, you have reduced bandwidth"
  • Allow go-libp2p to connect to Bob twice, once over circuit-relay and another time directly

@aarshkshah1992
Contributor Author

aarshkshah1992 commented Nov 5, 2020

Discussed offline with @aschmahmann :

  • Yes, we will have to restrict bandwidth but that problem is orthogonal to NAT traversal. We should have already done that.
  • But I discussed this with @jacobheun today and we decided that we will add the bandwidth limiting once we have the hole punching in place, so that Relays are used ONLY for co-ordinating hole punching and not for data transfer.
  • Using Circuit Relays to co-ordinate that hole punch is really the quickest path to get to QUIC hole punching (which is our main goal right now), given that we have a lot of the infra and code in place already. It shouldn't be hard to change the co-ordination protocol once we have the hole punching delivered.
  • I agree, we will have to make a change to go-libp2p to allow multiple connections between peers temporarily (shouldn't be that hard) for this approach to work. We have an issue for it at NAT traversal: Swarm should allow creating a new connection between peers even if one already exists #1014.

@aarshkshah1992
Contributor Author

This is now being tracked as part of #1039.
