mcts #45

Open
citizenhicks opened this issue Oct 8, 2024 · 2 comments
Comments

@citizenhicks
Contributor

apply mcts to the low entropy / high varentropy branching case.

@citizenhicks
Contributor Author

citizenhicks commented Oct 8, 2024

@xjdr-alt
so right now, being gpu poor, mcts runs very slowly - as expected. instead i've implemented sparkling beam search. here is a comparison of beam search vs the current implementation:

colour code:
white: 'adaptive sampling' (the middle region)
blue: 'branching' (low entropy, high varentropy)
red: 'resampling' (high entropy, high varentropy)
yellow (none present): high entropy, low varentropy - ask clarifying questions
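
for reference, a minimal sketch of how a next-token distribution could be mapped onto these colours. it computes entropy and varentropy (the variance of the surprisal) from raw logits. the threshold values and the names `ENT_THRESHOLD`, `VARENT_THRESHOLD`, and `classify` are assumptions made for illustration, not values taken from the repo, and anything that is not blue, red, or yellow is simply treated as white here.

```python
import math

# Hypothetical thresholds in nats; the real cut-offs are not given in this thread.
ENT_THRESHOLD = 1.5
VARENT_THRESHOLD = 1.5


def entropy_varentropy(logits):
    """Return (entropy, varentropy) in nats for softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    log_z = math.log(sum(exps))
    log_probs = [x - m - log_z for x in logits]
    probs = [math.exp(lp) for lp in log_probs]
    entropy = -sum(p * lp for p, lp in zip(probs, log_probs))
    # Varentropy: variance of the surprisal (-log p) under the distribution.
    varentropy = sum(
        p * (-lp - entropy) ** 2 for p, lp in zip(probs, log_probs)
    )
    return entropy, varentropy


def classify(logits, ent_thr=ENT_THRESHOLD, var_thr=VARENT_THRESHOLD):
    """Map a logit vector onto the colour code used in this thread."""
    ent, var = entropy_varentropy(logits)
    high_ent = ent > ent_thr
    high_var = var > var_thr
    if high_var and not high_ent:
        return "branching"  # blue
    if high_var and high_ent:
        return "resampling"  # red
    if high_ent:
        return "clarify"  # yellow: ask clarifying questions
    return "adaptive sampling"  # white: everything else in this sketch


if __name__ == "__main__":
    # One dominant token plus a long thin tail: low entropy, high varentropy.
    tail = [math.log(1e-4)] * 1000
    print(classify([math.log(0.9)] + tail))
```

a single threshold per axis is the simplest possible split; a real implementation would likely use separate low and high cut-offs so that the middle region is distinct from the low / low corner.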

current sampling (original): note that the name is wrong
Screenshot 2024-10-08 at 7 57 06 PM

scenario 1: static beam search on blue only: note that the name is wrong
Screenshot 2024-10-08 at 7 51 49 PM

i think the difference between the two is marginal, though i like the beam search output a little better.
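
to make "static beam search" concrete, here is a minimal sketch of a fixed-width beam search that could be started from a token flagged as blue. it is not the code behind the screenshots: `beam_search`, `toy_next_logprobs`, and `TOY_MODEL` are hypothetical names, and the toy table stands in for a real model's next-token logprobs.

```python
import math

# Toy stand-in for a language model: prefix -> next-token probabilities.
TOY_MODEL = {
    (): {"a": 0.6, "b": 0.4},
    ("a",): {"x": 0.55, "y": 0.45},
    ("b",): {"z": 0.9, "w": 0.1},
}


def toy_next_logprobs(seq):
    """Return {token: logprob} for the next token after `seq` (toy data)."""
    return {tok: math.log(p) for tok, p in TOY_MODEL[tuple(seq)].items()}


def beam_search(next_logprobs, prefix, beam_width, depth):
    """Expand `prefix` for `depth` steps with a fixed (static) beam width.

    Returns (cumulative_logprob, sequence) for the best surviving beam.
    """
    beams = [(0.0, list(prefix))]
    for _ in range(depth):
        candidates = []
        for score, seq in beams:
            for token, lp in next_logprobs(seq).items():
                candidates.append((score + lp, seq + [token]))
        # Keep only the `beam_width` highest-scoring continuations.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:beam_width]
    return beams[0]


if __name__ == "__main__":
    # Width 1 is greedy decoding; width 2 can recover a better two-token path.
    print(beam_search(toy_next_logprobs, [], beam_width=1, depth=2))
    print(beam_search(toy_next_logprobs, [], beam_width=2, depth=2))
```

in the toy table the greedy first choice leads to a weaker continuation, so widening the beam changes the result. on a real model the gain depends on how often that happens at branching tokens, which would fit the marginal difference seen above.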

scenario 2: apply adaptive beam search to red (high entropy / high varentropy): note that the name is right
Screenshot 2024-10-08 at 8 11 38 PM

scenario 3: apply static beam search to blue and adaptive beam search to red: note that the name is right
Screenshot 2024-10-08 at 8 21 08 PM

i like scenario 3 the most.
i'll keep experimenting later.

@HenkPoley
HenkPoley commented Oct 10, 2024

How about also using the base model, since it usually has better logprobs than what instruction tuning leaves the model with?

#61
