Unable to set OllamaLLM post url #231

Open
prbarcelon opened this issue Apr 4, 2024 · 0 comments

Comments

@prbarcelon

Hi, I wanted to use Ollama as my local LLM, but I'm hosting it in a different Docker container from my app.

When I try to connect to Ollama from my app, I get the following error, which is expected since the containers are separate and localhost inside my app container doesn't reach the Ollama container:
LLM error: HTTPConnectionPool(host='localhost', port=11434): Max retries exceeded with url: /api/chat (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f622c1ead90>: Failed to establish a new connection: [Errno 111] Connection refused'))
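For context, here's a minimal sketch of the networking problem (assuming my Ollama service is named "ollama" in docker-compose; the payload follows Ollama's /api/chat schema):

import requests

payload = {
    "model": "mistral",
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": False,
}

# Fails from inside the app container: "localhost" is the app container itself,
# and nothing there listens on port 11434.
# requests.post("http://localhost:11434/api/chat", json=payload)

# Works: Docker's embedded DNS resolves the compose service name "ollama".
response = requests.post("http://ollama:11434/api/chat", json=payload)
print(response.json()["message"]["content"])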

Here is my code:

encoder = HuggingFaceEncoder()
_llm = OllamaLLM(llm_name="mistral")
rl = RouteLayer(encoder=encoder, routes=routes, llm=_llm)

Ideally, I would be able to instantiate OllamaLLM and set the base URL, something like the following (assuming my container is called "ollama"):

_llm = OllamaLLM(llm_name="mistral", base_url="http://ollama:11434")

However, OllamaLLM hardcodes the URL, which makes sense given that Ollama is meant to run locally.

...
response = requests.post("http://localhost:11434/api/chat", json=payload)
output = response.json()["message"]["content"]
...

(https://github.com/aurelio-labs/semantic-router/blob/main/semantic_router/llms/ollama.py#L52)
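As a hacky interim workaround, I can redirect the hardcoded URL by wrapping requests.post before building the route layer. This is only a sketch, and it patches requests.post process-wide, so it's a stopgap rather than a fix ("ollama" is again just my compose service name):

import requests

_original_post = requests.post

def _redirecting_post(url, *args, **kwargs):
    # Rewrite the hardcoded localhost URL to point at the Ollama container.
    url = url.replace("http://localhost:11434", "http://ollama:11434")
    return _original_post(url, *args, **kwargs)

# ollama.py calls requests.post directly (per the snippet above), so replacing the
# module-level function also redirects its hardcoded URL.
requests.post = _redirecting_post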


I think a simple fix would be to add a base_url argument, as in the following diff:

diff --git a/semantic_router/llms/ollama.py b/semantic_router/llms/ollama.py
index df35ac0..3a09244 100644
--- a/semantic_router/llms/ollama.py
+++ b/semantic_router/llms/ollama.py
@@ -13,4 +13,5 @@ class OllamaLLM(BaseLLM):
     max_tokens: Optional[int]
     stream: Optional[bool]
+    base_url: Optional[str]

     def __init__(
@@ -21,4 +22,5 @@ class OllamaLLM(BaseLLM):
         max_tokens: Optional[int] = 200,
         stream: bool = False,
+        base_url: str = "http://localhost:11434",
     ):
         super().__init__(name=name)
@@ -27,4 +29,5 @@ class OllamaLLM(BaseLLM):
         self.max_tokens = max_tokens
         self.stream = stream
+        self.base_url = base_url

     def __call__(
@@ -35,4 +38,5 @@ class OllamaLLM(BaseLLM):
         max_tokens: Optional[int] = None,
         stream: Optional[bool] = None,
+        base_url: Optional[str] = None,
     ) -> str:
         # Use instance defaults if not overridden
@@ -41,4 +45,5 @@ class OllamaLLM(BaseLLM):
         max_tokens = max_tokens if max_tokens is not None else self.max_tokens
         stream = stream if stream is not None else self.stream
+        base_url = base_url if base_url is not None else self.base_url

         try:
@@ -50,5 +55,5 @@ class OllamaLLM(BaseLLM):
                 "stream": stream,
             }
-            response = requests.post("http://localhost:11434/api/chat", json=payload)
+            response = requests.post(f"{base_url}/api/chat", json=payload)
             output = response.json()["message"]["content"]
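
With that change applied, usage would look like the following. The commented-out per-call override is hypothetical and just follows the same pattern the diff uses for temperature/max_tokens:

_llm = OllamaLLM(llm_name="mistral", base_url="http://ollama:11434")
rl = RouteLayer(encoder=encoder, routes=routes, llm=_llm)

# A one-off request could also target a different host, mirroring the other
# per-call overrides in __call__:
# _llm(messages, base_url="http://other-host:11434")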

Here's a draft PR on my fork: https://github.com/prbarcelon/semantic-router/pull/1/files

What are the team's thoughts? Thank you!
