[P2P] Router bootstrapping #859

Open
13 tasks
bryanchriswhite opened this issue Jun 26, 2023 · 2 comments · May be fixed by #694
Assignees
Labels
p2p P2P specific changes

Comments

@bryanchriswhite
Contributor

bryanchriswhite commented Jun 26, 2023

Objective

Clarify bootstrapping requirements and constraints well enough to align on "correct" behavior and to realize "low-hanging" optimization opportunities.

Origin Document

Questions surfaced while working on #732 & #694.

Goals

  • Clarify bootstrapping "success"/"failure" conditions
  • Reduce time to bootstrap (or fail) in router implementations
  • Consider how bootstrapping status is signaled to other modules (esp. if we plan on removing the FSM)
  • Account for the TTL-based nature of the libp2p peerstore

Legend

flowchart

a[State description]
subgraph next[Next state description]
	nest[Nested state description]
end
other[Other state]

cond{Condition}
act([Action])

a --> next
next --> cond
cond --"condition value"--> other


cond --"alternative value"--> act
act --"result"--> other

Flowchart

flowchart

start[Node Startup]
subgraph persStart[Persistence Module Start]
    hasState{Does this node already\nhave some state?}
    gen(["Genesis hydration\n(initial staked actor\nidentities\nadded to state)"])
end

hasState --"NO"--> gen


subgraph p2pStart[P2P Module Start]
    subgraph l[Libp2p Host Setup]
        ll([Libp2p host\nlistening])
    end
    subgraph sar[Staked Actor Router Setup]
        sarHandle([Staked actor router\nprotocol handler\nregistered])
    end
    subgraph usar[Unstaked actor Router Setup]
        usarDisc([Unstaked actor DHT peer\ndiscovery start])
        usarHandle([Unstaked actor router\nprotocol handler\nregistered])
        usarGossip([Unstaked actor router\nGossipsub setup])
    end
end

l --> sar
sar --> usar

bs[P2P Bootstrapping Start]

isBSNode{Is this node\nconfigured as a\nbootstrap node?}

start --> persStart
persStart --> p2pStart
p2pStart --> bs
bs --> isBSNode
isBSNode --"NO"--> bsProg
isBSNode --"YES"--> bsBSNode

firstBS --> bsReachable
bsReachable --"YES"--> rpc

subgraph bsProg[RPC bootstrapping]
    firstBS[Considering first configured bootstrap node]
    rpc([Get staked peers from\n`rpcPeerstoreProvider`\nusing bootstrap node])
    isStaked{Is this node a\nstaked actor?}

    firstPeer[Considering first peer]
    nextPeer[Considering next peer]
    peerReachable{Is the current\npeer healthy?}
    morePeers{Are there more peers?}

    bsReachable{"is the current\nbootstrap node healthy?"}
    
    addStaked([Add peer to\nstaked actor router])
    addUnstaked([Add peer to\nunstaked actor router])
    addLibp2p([Add peer to libp2p host])
    con([Attempt to connect])

    minbs{are >= 3 peers\nconnected?}
    bsRetry(["Retry"])
    bsAttempts{"Max attempts\nreached for this\npeer?"}

    moreBS{Are there more\nconfigured\nbootstrap nodes?}
    nextBS[Considering next\nbootstrap node]
end

minbs --"NO"--> bsFail
minbs --"YES"--> bsDone

bsReachable --"NO"--> nextBS
rpc --> firstPeer


firstPeer --> peerReachable
peerReachable --"YES"--> isStaked
isStaked --"YES"--> addStaked
isStaked --"NO"--> addUnstaked
addStaked --> addUnstaked
addUnstaked --> addLibp2p
peerReachable --"NO"--> nextPeer
nextPeer --> peerReachable
addLibp2p --> con
con --"success"--> morePeers
morePeers --"NO"--> moreBS
morePeers --"YES"--> nextPeer
con --"error"--> bsAttempts

nextBS --> bsReachable


bsFail --> nodeFail
bsAttempts --"NO"--> bsRetry
bsAttempts --"YES"--> nextPeer

bsRetry --> con

moreBS --"NO"--> minbs
moreBS --"YES"--> nextBS

subgraph bsBSNode[Bootstrap node setup]
    gps["Get staked peers from\n`persistencePeerstoreProvider`\n(last known state; possibly genesis\nor a snapshot)"]
    addAllStaked([Add peers to\nstaked actor router])
    addAllUnstaked([Add peers to\nunstaked actor router])
    addAllLibp2p([Add peers to libp2p\nhost peerstore])
    isStaked2{Is this node a\nstaked actor?}
end

gps --> isStaked2
isStaked2 --"YES"--> addAllStaked
isStaked2 --"NO"--> addAllUnstaked
addAllStaked --> addAllUnstaked
addAllUnstaked --> addAllLibp2p
addAllLibp2p --> bsDone

bsFail["P2P Bootstrapping Failure"]
nodeFail["Node Startup Failure"]

bsDone["P2P Bootstrapping Success"]

fsm["State Machine Transition\n(`P2P_IsBootstrapped`)"]
rest[...]
bsDone --> fsm
fsm --> rest
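
The "RPC bootstrapping" branch of the flowchart can be read as the following loop. This is a rough sketch under stated assumptions, not the router implementation: `bootstrapDeps` and its function fields are hypothetical placeholders for the health check, the `rpcPeerstoreProvider` lookup, the router/host additions, and the connection attempt. The threshold of 3 comes from the flowchart, whereas `maxDialAttempts` is an invented value. Skipping peers that are already connected is also an addition the flowchart does not show.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	minConnectedPeers = 3 // from the flowchart: "are >= 3 peers connected?"
	maxDialAttempts   = 2 // hypothetical; the flowchart does not give a value
)

var errBootstrapFailed = errors.New("p2p bootstrapping failed: too few connected peers")

// bootstrapDeps bundles hypothetical stand-ins for the flowchart's actions.
type bootstrapDeps struct {
	isHealthy   func(addr string) bool              // health check for a bootstrap node or peer
	stakedPeers func(bootstrapNode string) []string // stands in for the rpcPeerstoreProvider lookup
	addPeer     func(peer string)                   // add to the router(s) and the libp2p host
	connect     func(peer string) error             // a single connection attempt
}

// bootstrap walks every configured bootstrap node, tries each peer it
// reports, and succeeds only if enough peers end up connected.
func bootstrap(bootstrapNodes []string, d bootstrapDeps) (int, error) {
	connected := make(map[string]bool)
	for _, bsNode := range bootstrapNodes {
		if !d.isHealthy(bsNode) {
			continue // consider the next bootstrap node
		}
		for _, p := range d.stakedPeers(bsNode) {
			if connected[p] || !d.isHealthy(p) {
				continue // consider the next peer
			}
			d.addPeer(p)
			for attempt := 0; attempt < maxDialAttempts; attempt++ {
				if err := d.connect(p); err == nil {
					connected[p] = true
					break
				}
			}
		}
	}
	if len(connected) < minConnectedPeers {
		return len(connected), errBootstrapFailed
	}
	return len(connected), nil
}

func main() {
	deps := bootstrapDeps{
		isHealthy:   func(addr string) bool { return addr != "bs-1" },
		stakedPeers: func(string) []string { return []string{"p1", "p2", "p3", "p4"} },
		addPeer:     func(string) {},
		connect: func(p string) error {
			if p == "p4" {
				return errors.New("dial timeout")
			}
			return nil
		},
	}
	n, err := bootstrap([]string{"bs-1", "bs-2"}, deps)
	fmt.Printf("connected=%d err=%v\n", n, err)
}
```

Written this way, the time to fail is bounded by the number of bootstrap nodes, the peers each one reports, and `maxDialAttempts`, which is one way to frame the "reduce time to bootstrap (or fail)" goal.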

Deliverable

  • Determine success/failure bootstrapping condition(s) (e.g. when a quorum of known bootstrap nodes is (un)reachable)
  • Update P2P docs to describe bootstrapping success and failure scenarios
  • Update router bootstrapping implementations accordingly
  • Support simultaneous dialing of bootstrap nodes with some "max concurrency" (see: [P2P] chore: concurrent bootstrapping #694)
  • Design bootstrap status signaling mechanism / convention
  • Ensure libp2p peerstore network addresses expire & renew appropriately
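
As an illustration of the "max concurrency" deliverable, the sketch below dials a list of addresses with a bounded number of simultaneous attempts. It uses a buffered channel as a semaphore, which is a general Go idiom and not necessarily the approach taken in #694. `dialAll`, the `dial` callback, and `errUnreachable` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errUnreachable = errors.New("peer unreachable")

// dialAll dials every address with at most maxConcurrency dials in flight
// and returns the addresses that were reached. The order of the result is
// not guaranteed when maxConcurrency > 1.
func dialAll(addrs []string, maxConcurrency int, dial func(addr string) error) []string {
	if maxConcurrency < 1 {
		maxConcurrency = 1
	}
	sem := make(chan struct{}, maxConcurrency) // counting semaphore
	var (
		wg      sync.WaitGroup
		mu      sync.Mutex
		reached []string
	)
	for _, addr := range addrs {
		wg.Add(1)
		sem <- struct{}{} // blocks while maxConcurrency dials are in flight
		go func(addr string) {
			defer wg.Done()
			defer func() { <-sem }() // free the slot when this dial finishes
			if err := dial(addr); err == nil {
				mu.Lock()
				reached = append(reached, addr)
				mu.Unlock()
			}
		}(addr)
	}
	wg.Wait()
	return reached
}

func main() {
	dial := func(addr string) error {
		if addr == "bs-2" {
			return errUnreachable
		}
		return nil
	}
	reached := dialAll([]string{"bs-1", "bs-2", "bs-3", "bs-4"}, 2, dial)
	fmt.Printf("reached %d of 4 bootstrap nodes\n", len(reached))
}
```

A success or failure condition such as a quorum can then be expressed as a comparison between `len(reached)` and whatever threshold the team agrees on.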

Non-goals / Non-deliverables

  • Remove or replace the state machine module

General issue deliverables

  • Update the appropriate CHANGELOG(s)
  • Update any relevant local/global README(s)
  • Update relevant source code tree explanations
  • Add or update any relevant or supporting mermaid diagrams

Testing Methodology

  • All tests: make test_all
  • LocalNet: verify a LocalNet is still functioning correctly by following the instructions at docs/development/README.md
  • k8s LocalNet: verify a k8s LocalNet is still functioning correctly by following the instructions here

Creator: @bryanchriswhite
Co-Owners:

@bryanchriswhite bryanchriswhite added p2p P2P specific changes triage It requires some decision-making at team level (it can't be worked on as it stands) labels Jun 26, 2023
@bryanchriswhite bryanchriswhite self-assigned this Jun 26, 2023
@bryanchriswhite bryanchriswhite linked a pull request Jun 26, 2023 that will close this issue
18 tasks
@bryanchriswhite bryanchriswhite moved this to Rescope in V1 Dashboard Jun 26, 2023
@Olshansk
Member

@bryanchriswhite I have not done a deep dive into it, but I am aware that libp2p has its own opinion, approach, and tooling for bootstrapping (e.g. [1]). Questions are:

  1. Are you aware of it and/or have you looked into it?
  2. Is it one of the potential options we are considering, or should consider?

[1] https://discuss.libp2p.io/t/how-to-create-bootstrap-node-correctly-always-searching-for-other-peers/1389

@bryanchriswhite bryanchriswhite moved this from Rescope to Backlog in V1 Dashboard Jul 10, 2023
@bryanchriswhite bryanchriswhite moved this from Backlog to In Progress in V1 Dashboard Aug 9, 2023
@bryanchriswhite bryanchriswhite removed the triage It requires some decision-making at team level (it can't be worked on as it stands) label Aug 9, 2023
@bryanchriswhite
Contributor Author

bryanchriswhite commented Aug 9, 2023

@bryanchriswhite I have not done a deep dive into it, but I am aware that libp2p has its own opinion, approach, and tooling for bootstrapping (e.g. [1]). Questions are:

  1. Are you aware of it and/or have you looked into it?
  2. Is it one of the potential options we are considering, or should consider?

[1] https://discuss.libp2p.io/t/how-to-create-bootstrap-node-correctly-always-searching-for-other-peers/1389

TL;DR: at the moment, everything is expressed in terms of pokt addresses at the highest level, which adds an otherwise unnecessary layer of complexity.

@Olshansk, I am aware of it, and we are using the go-libp2p-kad-dht package to facilitate unstaked actor (aka background) router bootstrapping. However, until we go libp2p-native with respect to at least peer IDs, we have to ensure that both routers can map a given pokt address to its corresponding public key.

FWIW, my experience has also been that significant changes were made to that library relatively recently, which render many of the discussions and examples I've encountered irrelevant, including conversations with ChatGPT. 😕 That said, I think we're pretty well sorted on that front (see: kad_discovery_baseline_test.go).
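
To illustrate the address-to-public-key mapping mentioned above, here is a minimal sketch of a lookup index that both routers could share. All names (`pubKeyIndex`, `addressFromPubKey`) are hypothetical. The address derivation (hex of the first 20 bytes of the SHA-256 hash of the public key) is an assumption made purely for illustration and may not match the actual pokt address scheme.

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// addressFromPubKey derives an illustrative address from a public key.
// ASSUMPTION: hex(first 20 bytes of SHA-256(pubkey)); the real scheme may differ.
func addressFromPubKey(pub ed25519.PublicKey) string {
	sum := sha256.Sum256(pub)
	return hex.EncodeToString(sum[:20])
}

// pubKeyIndex is a hypothetical lookup table from address to public key.
type pubKeyIndex struct {
	byAddr map[string]ed25519.PublicKey
}

func newPubKeyIndex() *pubKeyIndex {
	return &pubKeyIndex{byAddr: make(map[string]ed25519.PublicKey)}
}

// Add registers pub and returns the address it is indexed under.
func (idx *pubKeyIndex) Add(pub ed25519.PublicKey) string {
	addr := addressFromPubKey(pub)
	idx.byAddr[addr] = pub
	return addr
}

// Lookup returns the public key for addr, if one has been registered.
func (idx *pubKeyIndex) Lookup(addr string) (ed25519.PublicKey, bool) {
	pub, ok := idx.byAddr[addr]
	return pub, ok
}

func main() {
	pub, _, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		panic(err)
	}
	idx := newPubKeyIndex()
	addr := idx.Add(pub)

	found, ok := idx.Lookup(addr)
	fmt.Println(ok && found.Equal(pub)) // true: the address resolves back to the key
}
```

Because a hash-derived address cannot be inverted, the index has to be populated from a source that already knows the public keys; that requirement is the extra layer a peer-ID-native design would avoid.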
