
Interface lag after a number of messages in chat #261

Open
bars0um opened this issue Jun 17, 2024 · 1 comment
Labels
bug (Something isn't working), enhancement (New feature or request)

Comments


bars0um commented Jun 17, 2024

Describe the bug
The interface lags once a conversation gets a little long (around 10–20 messages with full code text, roughly 5,000 tokens per message). When this happens, the chat box slowly catches up to my typing, and it can take several seconds for what I type to start showing up. The problem goes away if I start a new chat, so I suspect the chat history is the cause.

To Reproduce
Use with a local LLM running on oobabooga

Expected behavior
No change in experience as the history grows, at least not in the interface. I can understand that the LLM's response may lag or even break if token limits are reached, but the response itself isn't the issue: as soon as the lag clears and my text is in the box, I can hit enter and get the same token rate as when it isn't lagging.

Screenshots
N/A

API Provider
oobabooga

Chat or Auto Complete?
Chat

Model Name
any

Desktop (please complete the following information):

  • OS: macOS

ag3cko commented Jun 26, 2024

Can confirm this on Mac as well, using Twinny chat with a locally networked Ubuntu system serving llama3 via ollama.
Text input starts lagging once the conversation reaches roughly 400 tokens and ~2,000 characters.
The "Code Helper (Renderer)" process spikes to ~80%+ CPU until the typed text has caught up, then drops back to around ~40% at idle.
The lag doesn't recover unless a new chat is started, and it instantly kicks in again if the old chat is loaded from history.
No local ollama is installed, FWIW.
A temporary workaround is to instruct the bot to print a warning when the token/character count reaches the threshold (probably different for each system), then switch to a new chat.
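The symptoms described above (lag that grows with history length, renderer CPU spiking on every keystroke) would be consistent with the webview re-rendering the entire message list on each input event. A minimal sketch of one possible mitigation, caching each message's rendered output by id so typing doesn't redo the expensive work for old messages. All names here are hypothetical, not Twinny's actual code:

```typescript
// Hypothetical sketch: memoize per-message rendering so only new
// messages pay the rendering cost, regardless of history length.
type Message = { id: string; text: string };

const renderCache = new Map<string, string>();
let renderCount = 0; // counts cache misses, for illustration only

function renderMessage(msg: Message): string {
  const cached = renderCache.get(msg.id);
  if (cached !== undefined) return cached;
  renderCount++; // expensive markdown/syntax-highlight work would go here
  const html = `<div class="msg">${msg.text}</div>`;
  renderCache.set(msg.id, html);
  return html;
}

function renderHistory(history: Message[]): string {
  return history.map(renderMessage).join("\n");
}
```

With this pattern, re-rendering the same 20-message history on every keystroke touches the cache 20 times but does the expensive per-message work zero times; in a React-based webview the equivalent would be wrapping the message component in `React.memo`.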

@rjmacarthy rjmacarthy added bug Something isn't working enhancement New feature or request labels Jun 27, 2024
Projects
None yet
Development

No branches or pull requests

3 participants