In chat mode, add a "generate more" button when reaching the max token limit #156
Labels: enhancement, good first issue, help wanted
Is your feature request related to a problem? Please describe.
In chat mode, the model sometimes reaches the max token count (512/512) but hasn't finished answering yet.
Setting "num predict chat" to -1 doesn't work with the model I'm using, and increasing the limit to 1024-2048 tokens just pushes the issue a little further out.
Describe the solution you'd like
It would be nice if there were a way to complete/extend the answer with additional tokens (another batch of 512, for instance).
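A minimal sketch of how such a "generate more" loop could work, assuming the backend reports whether generation stopped because the token limit was hit (the `generate` function and its parameters here are hypothetical stand-ins for the real completion call, with a fake fixed answer in place of a model):

```python
# Pretend model output: a 1300-token answer, longer than one 512-token batch.
FULL_ANSWER = ["tok%d" % i for i in range(1300)]

def generate(context_len, max_tokens=512):
    """Stub for a completion call: returns the next batch of tokens and a
    flag saying whether generation stopped only because the limit was hit."""
    batch = FULL_ANSWER[context_len:context_len + max_tokens]
    truncated = len(batch) == max_tokens and context_len + len(batch) < len(FULL_ANSWER)
    return batch, truncated

def answer_with_continue(batch_size=512):
    """Keep requesting batches until the model finishes on its own.
    A "generate more" button would trigger one extra loop iteration
    instead of looping automatically."""
    tokens = []
    truncated = True
    while truncated:
        batch, truncated = generate(len(tokens), max_tokens=batch_size)
        tokens.extend(batch)
    return tokens
```

The key point is that the UI already has the partial answer in context, so "generate more" only needs to re-issue the request with the existing conversation plus the truncated output, asking for another batch.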
Describe alternatives you've considered
n/a
Additional context
n/a