fix(bitswap): wantlist overflow handling #629

Merged: 17 commits, Jul 30, 2024

Conversation

gammazero
Contributor

@gammazero gammazero commented Jun 23, 2024

Handle incoming wants that cannot be added to the peer ledger without exceeding the peer want limit. These are handled by trying to make room for them by canceling existing wants for which there is no block. If this does not make sufficient room, then any lower priority wants that have blocks are canceled. This fix prevents the wantlist from filling up with CIDs that the server does not have.

Priority is also considered when truncating a wantlist that exceeds the size limit. Considering priority in wantlist truncation and overflow handling ensures that higher priority tasks are handled when the wantlist size needs to be limited. This will be more important in the future if/when priority is determined by a block's DAG path (higher priority when closer to root).
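
As a rough illustration of the first step described above (canceling existing wants for which there is no block), here is a minimal sketch. The `want` type, the `haveBlock` map, and `freeSlotsForOverflow` are hypothetical names invented for this example and are not the engine's actual types or API.

```go
package main

// want is a hypothetical, simplified wantlist entry.
type want struct {
	cid      string
	priority int32
}

// freeSlotsForOverflow cancels existing wants whose block is not held
// locally, stopping once `needed` slots have been freed. It returns the
// wants that remain and the CIDs that were canceled. haveBlock is an
// assumed lookup of which CIDs the server can serve.
func freeSlotsForOverflow(existing []want, needed int, haveBlock map[string]bool) (kept []want, canceled []string) {
	for _, w := range existing {
		if len(canceled) < needed && !haveBlock[w.cid] {
			canceled = append(canceled, w.cid)
			continue
		}
		kept = append(kept, w)
	}
	return kept, canceled
}
```

If fewer than `needed` slots are freed this way, the remaining overflow would go through the second step, where lower priority wants that do have blocks are considered for replacement.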

Fixes #527

@gammazero gammazero requested a review from a team as a code owner June 23, 2024 20:23

codecov bot commented Jun 23, 2024

Codecov Report

Attention: Patch coverage is 96.29630% with 4 lines in your changes missing coverage. Please review.

Project coverage is 59.92%. Comparing base (96ceee6) to head (d24abc7).

@@            Coverage Diff             @@
##             main     #629      +/-   ##
==========================================
+ Coverage   59.82%   59.92%   +0.10%     
==========================================
  Files         238      238              
  Lines       30004    30049      +45     
==========================================
+ Hits        17949    18008      +59     
+ Misses      10437    10425      -12     
+ Partials     1618     1616       -2     
Files Coverage Δ
bitswap/server/internal/decision/peer_ledger.go 92.79% <100.00%> (-1.39%) ⬇️
bitswap/server/internal/decision/engine.go 91.84% <95.83%> (+0.93%) ⬆️

... and 15 files with indirect coverage changes

@lidel lidel changed the title Fix wantlist overflow handling to select newer entries. fix: wantlist overflow handling to select newer entries Jun 24, 2024
@Wondertan
Member

This option will need a minor change

gammazero and others added 2 commits July 2, 2024 17:06
wantlist overflow handling now cancels existing entries to make room for newer requests. This fix prevents the wantlist from filling up with CIDs that the server does not have.

Fixes #527
@gammazero gammazero force-pushed the fix/wantlist-overflow-handling branch from 713faee to 9c35f18 Compare July 3, 2024 18:19
@gammazero gammazero marked this pull request as ready for review July 4, 2024 01:14
@gammazero gammazero marked this pull request as draft July 4, 2024 01:37
Handle incoming wants that could not be added to the peer ledger without exceeding the peer want limit. These are handled by trying to make room for them by canceling existing wants for which there is no block. If this does not make sufficient room, then any lower priority wants that have blocks are canceled.
@gammazero gammazero marked this pull request as ready for review July 4, 2024 06:31
@gammazero gammazero requested a review from aschmahmann July 4, 2024 06:36
Contributor

@aschmahmann aschmahmann left a comment

Thanks for pushing on this 🙏

Comment on lines 926 to 932
// Do not take more wants that can be handled.
if len(wants) > int(e.maxQueuedWantlistEntriesPerPeer) {
	// Keep the highest priority wants.
	slices.SortFunc(wants, func(a, b bsmsg.Entry) int {
		return cmp.Compare(b.Entry.Priority, a.Entry.Priority)
	})
	wants = wants[:int(e.maxQueuedWantlistEntriesPerPeer)]
Contributor

This might be fine, however do we want to do this here given that there could be duplicates once we actually look at the wantlist?

  • Pro: We shed ram usage earlier
  • Con: There might be a better set of wants that we can honor if we wait

Contributor Author

@gammazero gammazero Jul 5, 2024

I think it makes sense to truncate the list, but sorting can be avoided.

Truncation makes sense:
The incoming wants are already unique, so even if there are no existing wants, or all possible wants have duplicates, there is still no way any more than the limit can be used. If all those not added to the message queue are included in the overflow then they will end up getting dropped anyway because there will not be enough existing wants available to replace. Dropping them early does two things:

  • Prevents trying to add an excessive number of wants to the peer ledger.
  • Prevents a large number of wants from being sorted and incurring block size lookups in handleOverflow.

Truncation without sorting:
This will potentially lose higher priority wants, but avoids sorting an incoming wantlist of unknown size. In the usual case sorting is not needed anyway, so not sorting will avoid a performance hit.
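
To make the trade-off concrete, here is a small sketch of both variants. The `entry` type and the two function names are hypothetical and only stand in for the real message entry type; `slices.SortFunc` and `cmp.Compare` are the standard library calls already used in the diff.

```go
package main

import (
	"cmp"
	"slices"
)

// entry is a hypothetical, simplified stand-in for a wantlist message entry.
type entry struct {
	cid      string
	priority int32
}

// truncateByPriority sorts the slice in place from highest to lowest
// priority and keeps the first `limit` entries. Cost is O(n log n) in the
// size of the incoming wantlist.
func truncateByPriority(wants []entry, limit int) []entry {
	if len(wants) <= limit {
		return wants
	}
	slices.SortFunc(wants, func(a, b entry) int {
		return cmp.Compare(b.priority, a.priority) // descending
	})
	return wants[:limit]
}

// truncateInOrder keeps the first `limit` entries in arrival order. It is
// O(1), but may drop higher priority wants that appear late in the list.
func truncateInOrder(wants []entry, limit int) []entry {
	if len(wants) <= limit {
		return wants
	}
	return wants[:limit]
}
```

With wants of priority 1, 9, 5 and a limit of 2, the in-order variant keeps 1 and 9, while the sorted variant keeps 9 and 5.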

> There might be a better set of wants that we can honor if we wait

True, but then it is necessary to examine all the wants (a possibly excessive amount) to see if they can be added directly to the message queue, or need to be handled as overflow.

for _, entry := range existingWants {
	queuedWantKs.Add(entry.Cid)
}
queuedBlockSizes, err := e.bsm.getBlockSizes(ctx, queuedWantKs.Keys())
Contributor

I'm not sure I get what the idea is here and if this is necessary / if we can make this much cheaper

  • Is this meant as "is there a DONT_HAVE response queued up that we should replace".
    • While I get this it might also be overkill, and it might be fine to respect the user priority in responding with DONT_HAVEs, HAVEs, and blocks in the same way.
  • Is this meant as "I previously sent a DONT_HAVE and now this is sitting on my list as a subscription".
    • As discussed this definitely seems like something we should want to knock off our list if out of space

In either case it seems like we could add some extra data to the in-memory structs here rather than going to the blockstore to see if we have the data (and being at the mercy of whatever caching, bloom filters, etc. are used there)

Contributor Author

@gammazero gammazero Jul 8, 2024

The idea here is that there is a DONT_HAVE message queued for the peer, but it has not been sent yet and is blocking new messages from being queued for the peer. So, cancel the unsent DONT_HAVE and try to enqueue something possibly more important. Either a delayed HAVE message will replace the pending DONT_HAVE, or the peer can ask again later. This should keep messages moving, even if there is some backup sending DONT_HAVE messages to peers.

This also handles the case where a DONT_HAVE message has been sent, but is not removed from the queue. Once a message is sent, the want is removed from the message queue and peer ledger only when blocks have been sent or when block presence has been sent. If a DONT_HAVE was sent the want remains on the queue and peer ledger as a place-holder should a block arrive later, and this is stopping new wants from being accepted. This is what the 5th bullet in #527 is referring to by:

> This is because the bitswap server never cleanup entries after sending DONT_HAVE

So, in short, it handles both cases.

> it seems like we could add some extra data to the in-memory structs here rather than going to the blockstore.

Yes, the wants for which a block is found can be recorded in the peer ledger so that these can be ignored in overflow handling. However, that would need to be done in every call to engine.MessageReceived, and seems less preferable than doing something more expensive only during the exceptional case.

The task queue does already have this info, but using it would require locking the task queue and the peer tracker for each overflow want CID to look at. Alternatively, it would require a new task queue API to get a list of wants with HaveBlock set to true for a given peer. This last option might be less expensive than looking at the blockstore, but I was not comfortable with that amount of new plumbing for handling this exceptional bitswap case. WDYT?
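
For reference, the in-memory alternative discussed here (recording block presence next to each want so overflow handling does not need the blockstore) might look roughly like the sketch below. Everything in it is hypothetical: `peerLedger`, `ledgerEntry`, `add`, and `blocklessWants` are invented for illustration and do not match the real peer ledger, and peers and CIDs are plain strings to keep it self-contained.

```go
package main

// ledgerEntry is a hypothetical per-want record that also remembers whether
// a block was found when the want was received.
type ledgerEntry struct {
	priority  int32
	haveBlock bool
}

// peerLedger is a hypothetical, simplified ledger: peer ID -> CID -> entry.
// It is not safe for concurrent use.
type peerLedger struct {
	wants map[string]map[string]ledgerEntry
}

func newPeerLedger() *peerLedger {
	return &peerLedger{wants: make(map[string]map[string]ledgerEntry)}
}

// add records a want. The cost discussed above is that haveBlock must be
// supplied on every received message, not only when overflow occurs.
func (l *peerLedger) add(peer, cid string, priority int32, haveBlock bool) {
	m, ok := l.wants[peer]
	if !ok {
		m = make(map[string]ledgerEntry)
		l.wants[peer] = m
	}
	m[cid] = ledgerEntry{priority: priority, haveBlock: haveBlock}
}

// blocklessWants returns the CIDs a peer wants for which no block was
// recorded, in unspecified order. These are the cheapest candidates to
// cancel during overflow handling.
func (l *peerLedger) blocklessWants(peer string) []string {
	var cids []string
	for cid, e := range l.wants[peer] {
		if !e.haveBlock {
			cids = append(cids, cid)
		}
	}
	return cids
}
```

The sketch does not handle a block arriving after the want was recorded, which is one more piece of bookkeeping this approach would need.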

return wants
}

// Remove entries for blocks that are not present to make room for overflow.
Contributor

Related to the above: is this about blocks that aren't present, or about subscriptions?

Note: the reason I'm pushing on the difference is that from my perspective subscriptions are much more expensive by virtue of occupying memory for an indefinite amount of time rather than a transient "while I'm sending out a response". Not sure if that's enough to justify different lists, but it's how I'm thinking in my review here (but lmk if you disagree or think I'm missing the point).

Contributor Author

@gammazero gammazero Jul 8, 2024

From the engine perspective, I do not think there is any need to distinguish between subscription and request-response, since I think that only determines how long a peer is in the task queue/ledger.

Overall, it probably does make more sense to only do this overflow handling for subscriptions. I was thinking/hoping this would handle itself by subscriptions being the ones primarily affected in the first place and needing to do overflow handling. I think some real-world use is necessary to determine this. I will add logging that can be used to determine when overflow handling is happening.

bitswap/server/internal/decision/engine.go (outdated, resolved)
Comment on lines 855 to 856
// Not enough dont-haves removed. Replace existing entries, that are a
// lower priority, with overflow entries.
Contributor

In practice I expect this never to happen given that IIUC the boxo client (which is the most widely used one) just decreases the priority over time

Given that this is the case it'd be good to:

  1. Make this short-circuit fairly cheaply when, looking at the overflow and existing lists (which have already been sorted earlier), we see that the lowest priority in the existing wants is higher than the highest in the overflow list (it doesn't have to be an explicit check, but anything that allows this to be pretty cheap rather than linear)
  2. Have it still work (and be tested) when people choose different priorities
  3. Not be particularly expensive even in pathological cases

Contributor Author

@gammazero gammazero Jul 5, 2024

I also think it will probably be unlikely to happen given that the client decreases priority over time. The thinking here was looking more toward a future where priority may be set by path distance, where items closer to a DAG root have a higher priority. In that case it seems more likely that, as new wants are requested, some will have a high priority because they are root or closer-to-root items.

  1. Revised logic to break out as soon as lowest usable priority is hit on incoming wantlist (now ordered from most to least important).
  2. Still works.
  3. In the normal case, most of the work was already done in sorting the lists, so this compares list items until the priority in the ascending-sorted list is higher than the priority in the descending-sorted list. The pathological case is where all new incoming wants are at a higher priority. This can be detected, but there is no better way to handle it, since canceling the individual overflows still needs to be done. Clearing the peer's wantlist does the same thing, just for all CIDs, so that is not better.
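
The early-exit comparison described in points 1 and 3 can be sketched as follows. This is an illustration under assumptions, not the merged code: `want` and `replaceLowerPriority` are hypothetical names, and the function expects `existing` sorted from lowest to highest priority and `overflow` sorted from highest to lowest.

```go
package main

// want is a hypothetical, simplified wantlist entry.
type want struct {
	cid      string
	priority int32
}

// replaceLowerPriority pairs the least important existing want with the most
// important overflow want and replaces while the overflow want has strictly
// higher priority. Because the lists are sorted in opposite directions, the
// first pair that fails the comparison means no later pair can succeed, so
// the loop stops there.
func replaceLowerPriority(existing, overflow []want) (canceled []string, accepted []want) {
	for i, o := range overflow {
		if i >= len(existing) || o.priority <= existing[i].priority {
			break
		}
		canceled = append(canceled, existing[i].cid)
		accepted = append(accepted, o)
	}
	return canceled, accepted
}
```

When every existing want already outranks the overflow, the loop exits on the first comparison, which covers the common case of a client that decreases priority over time. When every overflow want outranks the existing ones, the loop runs for the length of the shorter list.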

bitswap/server/internal/decision/engine.go (outdated, resolved)
Comment on lines 817 to 823
// Sort wl and overflow from least to most important.
slices.SortFunc(existingWants, func(a, b wl.Entry) int {
	return cmp.Compare(a.Priority, b.Priority)
})
slices.SortFunc(overflow, func(a, b bsmsg.Entry) int {
	return cmp.Compare(a.Entry.Priority, b.Entry.Priority)
})
Contributor

Perhaps premature optimization (depends on how large people set their limits to and how frequently they're hit). Seems like we could spend a lot of time sorting here.

For the overflow list this is probably fine (shouldn't be that big anyway and it's related to message size), but if nodes spend significant time near the limit they'll be doing:

  1. Copy the wantlist map into a list
  2. Sort the list

For basically every message that comes in.

Can the PeerRequestQueue help us out here since it's already storing a prioritized queue of what needs to be done? It might not due to how the locking/concurrency works but could save a bunch of pain.

Contributor Author

@gammazero gammazero Jul 8, 2024

The PeerRequestQueue cannot really help out, since that maintains a heap so getting an ordered list would require iterating the heap. I do not see a much less expensive alternative, other than just clearing the peer's want list at some point, like if overflow is happening too frequently.

Maybe if overflow happens 5 times in a row for a particular peer, then clear that peer's message queue? WDYT?

bitswap/server/internal/decision/engine_test.go (outdated, resolved)
bitswap/server/internal/decision/engine_test.go (outdated, resolved)
@gammazero gammazero requested a review from aschmahmann July 8, 2024 17:57
lidel added a commit to ipfs/kubo that referenced this pull request Jul 26, 2024
switching early to
ipfs/boxo#629
to see if ci passes
@lidel lidel mentioned this pull request Jul 26, 2024
4 tasks
@lidel lidel changed the title fix: wantlist overflow handling to select newer entries fix(bitswap): wantlist overflow handling Jul 26, 2024
Member

@lidel lidel left a comment

Smoke tested in ipfs/kubo#10462 and lgtm.

Release / test plan: I'll apply cosmetics below and tag boxo release on Monday and bubble up to kubo 0.30.0-rc1 (ipfs/kubo#10436), then deploy rc1 to collab cluster and see if it impacts performance of its nodes.

CHANGELOG.md (outdated, resolved)
Member

👍

bitswap/server/internal/decision/engine_test.go (outdated, resolved)
bitswap/server/internal/decision/engine.go (resolved)
@lidel lidel merged commit 42c0c86 into main Jul 30, 2024
15 checks passed
@lidel lidel deleted the fix/wantlist-overflow-handling branch July 30, 2024 13:09
wenyue pushed a commit to wenyue/boxo that referenced this pull request Oct 17, 2024
Development

Successfully merging this pull request may close these issues.

bitswap/server: wantlist overflows fails in a toxic maner preventing any data transfer
4 participants