This repository has been archived by the owner on Jun 24, 2024. It is now read-only.

Update to llm-samplers v0.0.7 #440

Merged 2 commits on Nov 9, 2023

Conversation

KerfuffleV2
Contributor

See KerfuffleV2/llm-samplers#9 for more information about the changes.

Notable features are adding Top-A and Min-P samplers.
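To illustrate what the new Min-P sampler does (this is a hypothetical sketch of the general Min-P idea, not the actual llm-samplers implementation or API): tokens are kept only if their probability is at least `min_p` times the probability of the most likely token.

```rust
// Hypothetical sketch of Min-P filtering; `min_p_filter` is an
// illustrative helper, not part of llm-samplers.
fn min_p_filter(probs: &[f32], min_p: f32) -> Vec<usize> {
    // The threshold is relative to the most probable token.
    let max_p = probs.iter().cloned().fold(f32::MIN, f32::max);
    let threshold = min_p * max_p;
    probs
        .iter()
        .enumerate()
        .filter(|(_, &p)| p >= threshold)
        .map(|(i, _)| i)
        .collect()
}

fn main() {
    let probs = [0.5, 0.3, 0.15, 0.05];
    // With min_p = 0.2 the cutoff is 0.2 * 0.5 = 0.1,
    // so the last token (p = 0.05) is dropped.
    let kept = min_p_filter(&probs, 0.2);
    assert_eq!(kept, vec![0, 1, 2]);
    println!("{:?}", kept);
}
```

Top-A works similarly but scales the threshold by the square of the top probability, so it prunes more aggressively when the model is confident.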

@KerfuffleV2 KerfuffleV2 marked this pull request as ready for review November 6, 2023 23:55
@KerfuffleV2
Contributor Author

This shouldn't be merged until I release 0.0.7 (likely in the next couple of days), but I think it's ready for review in case there are any changes needed. After the release, I'll update Cargo.toml to use that instead of pointing at the repo.

I added a new way to build the logits that prunes them (like Top-K) and they start out sorted which can be a big performance win. For example, doing logits::try_from_iter_top_k(blah, 1000) only takes the top 1,000. The remainder are not likely to ever be selected by sampling. Let me know if you want me to add a commandline option or something to enable that. There's some discussion here: KerfuffleV2/llm-samplers#9 (comment)
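The pruning idea described above can be sketched in plain Rust (a hypothetical illustration of the technique, not the actual `Logits::try_from_iter_top_k` implementation): keep only the k largest logits, already sorted descending, so downstream samplers operate on a small slice instead of the full vocabulary.

```rust
// Hypothetical sketch of top-k logit pruning; `top_k_prune` is an
// illustrative helper, not part of llm-samplers.
fn top_k_prune(mut logits: Vec<(u32, f32)>, k: usize) -> Vec<(u32, f32)> {
    // Sort by logit value, descending; NaN logits are assumed absent.
    logits.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    // Discard everything past the top k entries.
    logits.truncate(k);
    logits
}

fn main() {
    let logits: Vec<(u32, f32)> = vec![(0, 0.1), (1, 3.2), (2, -1.0), (3, 2.5)];
    let pruned = top_k_prune(logits, 2);
    // Only the two highest-scoring token ids remain, in sorted order.
    assert_eq!(pruned, vec![(1, 3.2), (3, 2.5)]);
    println!("{:?}", pruned);
}
```

The payoff is that later samplers no longer scan the entire vocabulary, and because the survivors are already sorted, samplers that need sorted input can skip their own sort.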

@KerfuffleV2
Contributor Author

@philpax Could you please take a look at this one? (I don't seem to have access to request a review.)

@philpax philpax self-requested a review November 9, 2023 21:25
Collaborator

@philpax philpax left a comment

Seems pretty reasonable - nice work on removing the type parameters from Sampler! Is there anything else you'd like to do, or should I merge it?

@KerfuffleV2
Contributor Author

Thanks for checking. It should be all set to merge as far as I know, as long as you're satisfied with the changes. Note that it passes the various tests, but actual usage wasn't extensively tested, so I'd recommend running it on a model and making sure you get reasonable results. I don't have a lot of old GGML-format models lying around.

@philpax
Collaborator

philpax commented Nov 9, 2023

Tested and it seems to work. Thanks for your work!

@philpax philpax merged commit 23e4b46 into rustformers:main Nov 9, 2023
14 checks passed
@KerfuffleV2
Contributor Author

Not a problem. If you ever find that sampling performance has a measurable impact, you can try the Logits::try_from_iter_top_k method. If you set k to 2,000 or so, it's extremely unlikely to affect results and can increase performance a lot, especially for models with a very large vocabulary (there are a few with 250K+ tokens).
