Skip to content

Conversation

krishna-samy
Copy link
Contributor

Issue:

  • There is a CPU overhead in BGPd at high scale due to bgp_attr_intern() -> attrhash_key_make() -> jhash_* when processing UPDATEs carrying whole bunch of prefixes within a single NLRI that share identical attributes. Interning was repeated per prefix, redoing hashing for the same attr received in the update msg.

Root-cause:

  • Within a single UPDATE's NLRI section, the same parsed attributes were redundantly interned for each prefix. This will be repeatedly invoking attribute hashing across the NLRI loop.

Fix:

  • Added a per-UPDATE reuse context passed into bgp_update(): struct bgp_attr_reuse_ctx
    • If the attributes (after policy) haven't changed from what was parsed and we're still in the same NLRI section, reuse the already-interned attributes (just bump refs) instead of re-hashing/re-intern.
    • If not, intern once and remember it in the per-UPDATE context so that following prefixes with the same unchanged attributes can reuse it.
  • bgp_nlri_parse_ip() creates a stack-local ctx and passes &ctx into bgp_update() for the UPDATE's NLRI section; all other calls pass NULL.
  • attr_unintern_safe_ctx(ctx, &attr) to invalidate the cache before unintern(in bgp_update).

Overall CPU time reduction is ~10% at high scale.

* Increment the refcnt for intern'd attr and its intern'd subs
* If any intern'd sub-object in 'struct attr' is added/deleted, update this function
*/
void bgp_attr_ref(struct attr *attr)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, have some reservations about this. this is a ton of exposure of internal details that will have to be maintained forever - just to be used in exactly one place in bgp_route.c (right?)
could we find a way to do this through the attr_intern() code path - like maybe an adaptation or variant of that to take a reuse context? that might allow all the maintenance of the attr struct internals to remain inside attr_intern and _unintern?

@ton31337
Copy link
Member

I'm sort of afraid that with this solution we gonna have another kind of issues (timing related) or use-after-free when these interned pointers get staled/dangled.

Issue:
 - there is a CPU overhead in BGPd due to bgp_attr_intern() ->
   attrhash_key_make() -> jhash_* when processing UPDATEs carrying whole
   bunch of prefixes within a single NLRI that share identical
   attributes. Interning was repeated per prefix, redoing hashing for
   the same attr received in the update msg.

Root-cause:
 - Within a single UPDATE's NLRI section, the same parsed attributes
   were redundantly interned for each prefix. This will be repeatedly
   invoking attribute hashing across the NLRI loop.

Fix:
 - Added a per-UPDATE reuse context passed into bgp_update(): struct bgp_attr_reuse_ctx
   - If the attributes (after policy) haven't changed from what was parsed
     and we're still in the same NLRI section, reuse the already-interned
     attributes (just bump refs) instead of re-hashing/re-intern.
   - If not, intern once and remember it in the per-UPDATE context so that
     following prefixes with the same unchanged attributes can reuse it.
 - bgp_nlri_parse_ip() creates a stack-local ctx and passes &ctx
   into bgp_update() for the UPDATE's NLRI section; all other calls pass NULL.
 - attr_unintern_safe_ctx(ctx, &attr) to invalidate the cache before unintern(in bgp_update).

Overall CPU time reduction is ~10% at high scale.

Signed-off-by: Krishnasamy <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants