Replies: 1 comment
I also added it, but there was no improvement.
Hi,
I am running a local LLM on CPU only (super slow, I know, but that does not matter for my use case).
When I run inference directly, it just waits (300 s for a '2+2=?' prompt), but when I proxy the request through LiteLLM, the request gets killed after 60 s and then, I believe, retried.
I have
What am I missing?
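
For reference, a minimal `config.yaml` sketch that raises the proxy-side timeout would look roughly like this. The model name, `api_base`, and the keys `timeout`, `request_timeout`, and `num_retries` are assumptions based on my reading of the LiteLLM docs and may differ by version, so please check them against the docs:

```yaml
model_list:
  - model_name: local-model                 # hypothetical alias that clients request from the proxy
    litellm_params:
      model: openai/local-model             # assumes the local server exposes an OpenAI-compatible API
      api_base: http://localhost:8080/v1    # hypothetical address of the local backend
      timeout: 600                          # per-deployment timeout in seconds (assumed key)

litellm_settings:
  request_timeout: 600                      # proxy-wide request timeout in seconds (assumed key)
  num_retries: 0                            # avoid re-sending the slow request after a timeout (assumed key)
```

If the 60 s cut-off persists even with a higher proxy timeout, the client sitting in front of the proxy may be enforcing its own timeout and retrying, which would match the kill-and-retry pattern described above.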