Use foldhash in ExtractMap<K, V> within model/cache instead of std's hasher #3113

Draft: wants to merge 183 commits into base: next


@xacrimon (Contributor) commented Feb 14, 2025

This PR replaces the std hasher with the foldhash variant optimized for hash-based data structures. The std hasher makes little sense here: the keys are not user-controlled and follow a known pattern, which rules out HashDoS-style attacks. foldhash is significantly faster than std for our keys, and because its hasher factory type is smaller, it also shrinks every type containing `ExtractMap`s. `Guild`, for example, goes from 708 to 648 bytes.
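
To illustrate why the hasher factory affects the size of every map, here is a minimal sketch using only the standard library. It does not use foldhash itself: `BuildHasherDefault<DefaultHasher>` is a zero-sized stand-in for "a smaller factory", and the type aliases and the `bytes_saved_per_map` helper are hypothetical names for this example.

```rust
use std::collections::hash_map::{DefaultHasher, RandomState};
use std::collections::HashMap;
use std::hash::BuildHasherDefault;
use std::mem::size_of;

/// Map with std's default factory, which stores its random SipHash keys inline.
pub type StdMap = HashMap<u64, u64, RandomState>;

/// Map with a zero-sized factory (a stand-in, NOT foldhash's actual type).
pub type CompactMap = HashMap<u64, u64, BuildHasherDefault<DefaultHasher>>;

/// Bytes saved per map value purely by shrinking the hasher factory.
pub fn bytes_saved_per_map() -> usize {
    size_of::<StdMap>() - size_of::<CompactMap>()
}
```

Because the factory is stored inline in every map, any struct holding several maps pays for it several times, which is the effect described for `Guild` above.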

GnomedDev and others added 30 commits February 13, 2025 18:32
…erenity-rs#2646)

This avoids having to allocate to store fixed length (replaced with normal
array) or fixed capacity (replaced with `ArrayVec`) collections as vectors for
the purposes of putting them through the `Request` plumbing.

Slight behavioral change - before, setting `params` to `Some(vec![])`
would still append a question mark to the end of the url. Now, we check
if the params array `is_empty` instead of `is_some`, so the question
mark won't be appended if the params list is empty.
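
The behavioral change can be sketched as follows. This is an illustration only: `build_url` is a hypothetical helper, not the actual `Request` plumbing, and it skips percent-encoding for brevity.

```rust
/// Builds a URL, appending `?key=value&...` only when `params` is non-empty.
/// Hypothetical helper; real code would also percent-encode keys and values.
pub fn build_url(base: &str, params: &[(&str, &str)]) -> String {
    let mut url = String::from(base);
    // Checking `is_empty` (rather than "was a list provided at all") is what
    // prevents a dangling `?` when the list has no entries.
    if !params.is_empty() {
        url.push('?');
        let query: Vec<String> = params
            .iter()
            .map(|(key, value)| format!("{}={}", key, value))
            .collect();
        url.push_str(&query.join("&"));
    }
    url
}
```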

Co-authored-by: Michael Krasnitski <[email protected]>
These are unnecessary. Accepting `impl Into<Arc<T>>` allows passing either `T` or `Arc<T>`.
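
A minimal sketch of the pattern, using hypothetical `Settings` and `Holder` types: the standard library provides `From<T> for Arc<T>`, and every type converts into itself, so a single constructor covers both call styles.

```rust
use std::sync::Arc;

/// Hypothetical shared value for this example.
pub struct Settings {
    pub name: String,
}

/// Hypothetical owner that stores the value behind an `Arc`.
pub struct Holder {
    settings: Arc<Settings>,
}

impl Holder {
    /// Accepts either an owned `Settings` or an existing `Arc<Settings>`.
    pub fn new(settings: impl Into<Arc<Settings>>) -> Self {
        Self { settings: settings.into() }
    }

    pub fn name(&self) -> &str {
        &self.settings.name
    }
}
```

Passing an existing `Arc` only bumps the reference count; passing an owned value allocates a new `Arc` for it.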
This trades a heap allocation (for messages sent along with thread
creation) for a drop in `Message`'s inline size from 1176 bytes to 760
bytes.
…l models (serenity-rs#2656)

This shrinks type sizes by a lot; however, it makes the user experience slightly
different:

- `FixedString` must be converted to String with `.into()` or `.into_string()`
  before it can be pushed to, but dereferences to `&str` as is.
- `FixedArray` must be converted to `Vec` with `.into()` or `.into_vec()`
  before it can be pushed to, but dereferences to `&[T]` as is.

The crate of these types is currently a Git dependency, but this is fine for
the `next` branch. It needs some basic testing, which Serenity is perfect for,
before a release will be made to crates.io.
…enity-rs#2668)

This commit:

- switches from `u64` to `i64` in `CreateCommandOption::min_int_value` and
`CreateCommandOption::max_int_value` to accommodate negative integers in
Discord's integer range (between -2^53 and 2^53). Values outside this
range will cause Discord's API to return an error.
- switches from `i32` to `i64` in `CreateCommandOption::add_int_choice` and
`CreateCommandOption::add_int_choice_localized` to accommodate Discord's
complete integer range (between -2^53 and 2^53). Values outside this
range will cause Discord's API to return an error.
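
For illustration, a client-side range check might look like the sketch below. The names are hypothetical, and treating both bounds as inclusive is an assumption; the commit only says "between -2^53 and 2^53".

```rust
/// 2^53, the magnitude limit named in the commit message.
pub const DISCORD_INT_LIMIT: i64 = 1 << 53;

/// Returns true when `value` lies within the range Discord accepts.
/// Assumption: both bounds are inclusive.
pub fn in_discord_int_range(value: i64) -> bool {
    (-DISCORD_INT_LIMIT..=DISCORD_INT_LIMIT).contains(&value)
}
```

Neither `u64` (no negative values) nor `i32` (tops out near 2^31) can express every value in this range, which is why both signatures move to `i64`.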
This cache was just duplicating information already present in `Guild::members`
and therefore should be removed.

This saves around 700 MB for my bot (pre-`FixedString`).

This has to refactor `utils::content_safe` to always take a `Guild` instead
of `Cache`, but in practice it was mostly pulling from the guild cache anyway
and this means it is more likely to respect nicknames and other information,
while losing the ability to clean mentions from DMs, which do not matter.
`Embed::fields` previously had to stay as a `Vec` due to `CreateEmbed` wrapping
around it, but by implementing `Serialize` manually we can overwrite the
`Embed::fields` with a normal `Vec`, for a small performance hit on
serialization while saving some space for all stored `Embed`s.
Simply missed these when finding and replacing.
This uses the `bool_to_bitflags` macro to remove boolean (and optional boolean)
fields from structs and pack them into a bitflags invocation, so a struct with
many bools will only use one or two bytes, instead of a byte per bool as is.

This requires getters and setters for the boolean fields, which changes the
user experience and is hard to document. That is a significant downside, but
the change is worthwhile and its savings will only grow over time.
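
The idea can be sketched by hand as below. This is not the output of the `bool_to_bitflags` macro; the struct and field names are hypothetical and only show the packing and the getter/setter shape.

```rust
use std::mem::size_of;

/// Baseline: each `bool` field occupies its own byte.
pub struct PlainFlags {
    pub pinned: bool,
    pub tts: bool,
    pub mention_everyone: bool,
}

/// Packed: all flags share one byte and are reached through accessors.
#[derive(Default)]
pub struct PackedFlags {
    bits: u8,
}

impl PackedFlags {
    const PINNED: u8 = 1 << 0;
    const TTS: u8 = 1 << 1;
    const MENTION_EVERYONE: u8 = 1 << 2;

    fn get(&self, mask: u8) -> bool {
        (self.bits & mask) != 0
    }

    fn set(&mut self, mask: u8, value: bool) {
        if value {
            self.bits |= mask;
        } else {
            self.bits &= !mask;
        }
    }

    pub fn pinned(&self) -> bool { self.get(Self::PINNED) }
    pub fn set_pinned(&mut self, value: bool) { self.set(Self::PINNED, value) }
    pub fn tts(&self) -> bool { self.get(Self::TTS) }
    pub fn set_tts(&mut self, value: bool) { self.set(Self::TTS, value) }
    pub fn mention_everyone(&self) -> bool { self.get(Self::MENTION_EVERYONE) }
    pub fn set_mention_everyone(&mut self, value: bool) { self.set(Self::MENTION_EVERYONE, value) }
}

/// Bytes saved by packing the three flags into one byte.
pub fn bytes_saved() -> usize {
    size_of::<PlainFlags>() - size_of::<PackedFlags>()
}
```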
…rs#2681)

This swaps fields that store `Option<Int>` for `Option<NonMaxInt>` where the
maximum value would be ludicrous. Since `nonmax` uses `NonZero` internally,
this gives us niche optimisations, so model sizes can drop some more.
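
One way to build such a type on top of `NonZero` is to store the value XOR-ed with the maximum, so that only the maximum maps to the forbidden zero. The sketch below is hand-written for illustration; `NonMaxU32Sketch` is a hypothetical name, and whether `nonmax` uses exactly this encoding is an assumption.

```rust
use std::mem::size_of;
use std::num::NonZeroU32;

/// Hypothetical stand-in for a "non-max" integer.
#[repr(transparent)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct NonMaxU32Sketch(NonZeroU32);

impl NonMaxU32Sketch {
    /// Returns `None` only for `u32::MAX`, the one value that cannot be stored.
    pub fn new(value: u32) -> Option<Self> {
        // `value ^ MAX` is zero exactly when `value == MAX`.
        NonZeroU32::new(value ^ u32::MAX).map(Self)
    }

    /// Undoes the XOR to recover the original value.
    pub fn get(self) -> u32 {
        self.0.get() ^ u32::MAX
    }
}

/// Sizes of `Option<u32>` and `Option<NonMaxU32Sketch>`, in that order.
pub fn option_sizes() -> (usize, usize) {
    (size_of::<Option<u32>>(), size_of::<Option<NonMaxU32Sketch>>())
}
```

The `Option` around the wrapper reuses the all-zero bit pattern for `None`, so it takes no more space than a bare `u32`, whereas `Option<u32>` needs a separate discriminant.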

I have had to include a workaround for [serenity-rs#17] in `optional_string` by making my
own `TryFrom<u64>`, so that should be removable once that issue is fixed.

[serenity-rs#17]: LPGhatguy/nonmax#17
A couple of clippy bugs have been fixed and I have shrunk model
sizes enough to make `clippy::large_enum_variant` go away.
A discord bot library should not be using the tools reserved for low
level OS interaction/data structure libraries.
Discord seems to internally default Ids to 0, which is a bug whenever
exposed, but this makes ID parsing more resilient. I also took the
liberty to remove the `From<NonZero*>` implementations, to prevent future
headaches, as it was impossible to not break public API as we exposed
`NonZero` in `*Id::parse`.
…nity-rs#2694)

This:
1. shrinks the size of Request, when copied around, as it doesn't have
to store the max capacity at all times
2. shrinks llvm-lines (compile time metric) for my bot in debug from
`1,153,519` to `1,131,480` as no monomorphisation has to be performed
for `MAX_PARAMS`.
Follow-up to serenity-rs#2694.

When `Request::params` was made into an ArrayVec, the `Option` around it
was removed in order to avoid having to add a turbofish on `None` to
specify the value of `MAX_PARAMS`. Also, `Request::new` also needed to
be changed so that the value of `MAX_PARAMS` could be inferred. Now that
the field is a slice again, we can wrap it in `Option` again (at no cost
to size, thanks to niche opts).

We ensure we never store the redundant `Some(&[])` by checking for an
empty slice and storing `None` instead. This way, we ensure we never
read an empty slice out of the `Some` variant.
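
A minimal sketch of that normalization, with a hypothetical `RequestSketch` type standing in for the real `Request`:

```rust
use std::mem::size_of;

/// Hypothetical stand-in for a request that carries optional query params.
pub struct RequestSketch<'a> {
    params: Option<&'a [(&'a str, &'a str)]>,
}

impl<'a> RequestSketch<'a> {
    /// Stores `None` for an empty slice so `Some` always holds at least one item.
    pub fn new(params: &'a [(&'a str, &'a str)]) -> Self {
        let params = if params.is_empty() { None } else { Some(params) };
        Self { params }
    }

    pub fn params(&self) -> Option<&'a [(&'a str, &'a str)]> {
        self.params
    }
}

/// True when wrapping a slice reference in `Option` costs no extra space.
pub fn option_is_free() -> bool {
    size_of::<Option<&[u8]>>() == size_of::<&[u8]>()
}
```

A slice reference can never be null, so `Option` uses the null pointer as its `None` representation; that is the niche optimization the commit relies on.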
The instrument macros generate 2% of Serenity's release-mode llvm-lines and,
being proc-macros, also hurt compile time, so this makes them opt-in. This
commit also fixes the issues that the instrument macro was hiding, such as
results that never errored and missing documentation.
This signature is hard to use as `None` cannot infer the type of the
generic. I also replaced `Option<u8>` with `Option<NonMaxU8>` as it's
more efficient and will make the user think of the maximum value.
…nity-rs#2698)

This removes inefficient `IntoIterator` generics and instead takes what is
actually required. I also reworked `reorder_channels` so it can keep the
generic, as it genuinely only needs iterator semantics.
Previously, someone assumed that `Ratelimiter` was going to be cloned, so
put a ton of `Arc`s everywhere. This was unneeded, and before dashmap,
so the buckets were also stored massively inefficiently. This fixes all
that.

I had to shuffle around the `Ratelimit` methods a little bit to return
their sleep time instead of sleeping themselves, so I didn't have to
hold a dashmap lock over an `.await`.
This removes multiple error variants and overall cleans up the codebase
by moving overflow checks into two `ModelError` variants.
Shrinks `size_of::<Error>` from 136 bytes to 64 bytes, while removing unused
variants. This will improve performance for any method returning
`Result<T>` where `T` is less than the size of `Error` as both `Result`'s
`Ok` and `Err` have to be allocated stack space.
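
The effect on `Result` can be checked with a small sketch. The two error types are hypothetical placeholders sized to match the figures in the commit message; they are not Serenity's actual `Error`.

```rust
use std::mem::size_of;

/// Placeholder with the old error size (136 bytes).
pub struct LargeError {
    _payload: [u8; 136],
}

/// Placeholder with the new error size (64 bytes).
pub struct SmallError {
    _payload: [u8; 64],
}

/// Sizes of `Result<u8, LargeError>` and `Result<u8, SmallError>`, in that order.
pub fn result_sizes() -> (usize, usize) {
    (
        size_of::<Result<u8, LargeError>>(),
        size_of::<Result<u8, SmallError>>(),
    )
}
```

Even though the `Ok` payload is a single byte, each `Result` must be at least as large as its error type, so shrinking the error shrinks every such return value.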
The compiler knows best, as inlining is quite complicated. This should
significantly help compile times.
mkrasnitski and others added 15 commits February 13, 2025 20:44
…rs#3075)

Similar to message URLs, Discord also provides URLs for guild channels.

Additionally, the function is added as an alternative for parsing a `Channel` from a string. 
Private channels are not affected by this change.
The `Deserialize` implementation neglects to add the `Bot ` prefix to
the string when it is deserialised.

This adds `TryFrom` implementations for `&str` and `String` and tells
serde to deserialise `Token` using the `TryFrom<&str>` implementation,
which will prepend the `Bot ` prefix.
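
The shape of the fix can be sketched with the standard library alone. `TokenSketch` and `EmptyToken` are hypothetical names, the empty-input check is a placeholder for whatever validation the real `Token` performs, and the serde wiring is omitted.

```rust
use std::convert::TryFrom;

/// Hypothetical stand-in for the real `Token` type.
#[derive(Debug, PartialEq, Eq)]
pub struct TokenSketch(String);

/// Placeholder error; the real validation rules are not shown here.
#[derive(Debug, PartialEq, Eq)]
pub struct EmptyToken;

impl TryFrom<&str> for TokenSketch {
    type Error = EmptyToken;

    fn try_from(raw: &str) -> Result<Self, Self::Error> {
        let raw = raw.trim();
        if raw.is_empty() {
            return Err(EmptyToken);
        }
        // Every construction path goes through here, so the prefix is never missed.
        Ok(Self(format!("Bot {}", raw)))
    }
}

impl TryFrom<String> for TokenSketch {
    type Error = EmptyToken;

    fn try_from(raw: String) -> Result<Self, Self::Error> {
        Self::try_from(raw.as_str())
    }
}

impl TokenSketch {
    pub fn as_str(&self) -> &str {
        &self.0
    }
}
```

Routing deserialization through the same `TryFrom<&str>` implementation means there is only one place where the prefix is added.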

Fixes serenity-rs#3085
This commit refactors how the gateway connection being closed gets handled,
and also reworks how resuming is performed. If a resume fails, or if the
session id is invalid/doesn't exist, the shard will fall back to restart
+ reidentify after a 1 second delay. This behavior was only present in
some circumstances before.

Also, cleaned up the loop in `ShardRunner::run` by adding a
`ShardAction::Dispatch` variant, since event dispatch was already
mutually exclusive with heartbeating, identifying, and restarting. The
overall effect is less interleaving of control flow.

Plus, removed the `Shard::{reconnect, reset}` functions as they were
unused.

A notable change is that 4006 is no longer considered a valid close code
as it is undocumented, and neither is 1000, which tungstenite assigns as
`Normal` or "clean". We should stick to the [table of close
codes](https://discord.com/developers/docs/topics/opcodes-and-status-codes#gateway-gateway-close-event-codes)
provided by Discord.
A regression introduced by serenity-rs#3099 was that successful resumes will break
out of the loop inside `ShardRunner::run`, but they shouldn't (or
rather, didn't before). Therefore, only break out of the loop if the
resume failed and we had to fall back to reidentifying.
@github-actions bot added the `model`, `cache`, `http`, and `gateway` labels Feb 14, 2025
…3111)

As the title notes, this commit replaces fxhash with foldhash as used in the
cache. dashmap, due to its sharding, has to share entropy with what's handed
down to internal maps. Since `hashbrown` and by extension `std` use various
sections of the high bit range for special grouping & sorting, dashmap is left
with the only option to shard on low bits.

This, however, presents problems, because fxhash outputs hashes of very bad
quality, with only the high bits having any real entropy. This was probably a
solid choice back in 2018 when we lacked other good fast alternatives. But
since then `ahash` matured and we've had significant research and development
in "good enough" hashing for datastructures with short keys, [the most recent
step forward coming from a rather well known face][foldhash]. This improves
shard selection quite a bit and reduces contention significantly. Using fxhash
in a dashmap specific benchmark causes contention to go up by 3-8x when keys
are k-sortable with time (Discord snowflakes) on an M1 Pro.

[foldhash]: https://github.com/orlp/foldhash
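
The low-bit problem can be reproduced with a simplified model. The sketch below hand-writes the single-word Fx step (rotate, XOR, multiply) and a naive "shard = low bits of the hash" rule; it is not dashmap's actual shard selection, and the keys are idealized snowflakes whose low 22 bits (worker, process, and sequence fields) are all zero.

```rust
use std::collections::HashSet;

/// Multiplier used by the Fx hash family on 64-bit targets.
const FX_SEED: u64 = 0x51_7c_c1_b7_27_22_0a_95;

/// Fx hash of a single `u64`, starting from an initial state of zero.
pub fn fx_hash_u64(key: u64) -> u64 {
    (0u64.rotate_left(5) ^ key).wrapping_mul(FX_SEED)
}

/// Simplified shard choice: the low bits of the hash.
/// `shard_count` must be a power of two.
pub fn low_bit_shard(hash: u64, shard_count: u64) -> u64 {
    hash & (shard_count - 1)
}

/// Counts how many distinct shards a set of keys lands on.
pub fn distinct_shards(keys: impl Iterator<Item = u64>, shard_count: u64) -> usize {
    keys.map(|key| low_bit_shard(fx_hash_u64(key), shard_count))
        .collect::<HashSet<u64>>()
        .len()
}
```

A multiplication can only carry information upward, so the low bits of the product depend solely on the low bits of the key. With timestamp-only keys such as `(0..1000u64).map(|ms| ms << 22)`, every key lands on the same shard out of 16 in this model, while small sequential integers spread across all 16.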
@mkrasnitski
Collaborator

Writing out the explicit random state parameter each time is a bit noisy. Could you add a typedef to src/internal/prelude.rs that assigns it once and then the whole lib can use it?
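
Such an alias could look roughly like the sketch below. The alias names are hypothetical, and `BuildHasherDefault<DefaultHasher>` is only a standard-library stand-in for whichever hasher factory the library settles on.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::BuildHasherDefault;

/// Single place where the hasher factory is chosen (stand-in type).
pub type LibBuildHasher = BuildHasherDefault<DefaultHasher>;

/// Map alias the rest of the library would use instead of spelling out the factory.
pub type LibHashMap<K, V> = HashMap<K, V, LibBuildHasher>;

/// Example call site: no hasher parameter appears anywhere.
pub fn count_words<'a>(words: &[&'a str]) -> LibHashMap<&'a str, usize> {
    let mut counts = LibHashMap::default();
    for &word in words {
        *counts.entry(word).or_insert(0) += 1;
    }
    counts
}
```

Swapping the hasher later then means editing one line rather than every map declaration.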

@xacrimon
Contributor Author

@mkrasnitski gnome suggested I move the hasher wrapper in cache into model and use that instead, going to try that.

@arqunis arqunis force-pushed the next branch 2 times, most recently from 57c79ff to 9a811a7 Compare March 11, 2025 22:17