llm-uservice: Support air gapped environment #1085
Conversation
Let's wait for opea-project/GenAIComps#1758 and opea-project/GenAIComps#1743 to land first.
eero-t left a comment
A few typo fixes + text suggestions
Force-pushed from bfa4919 to 93b6aea
Not sure if this flag helps with the air-gapped mode? HF_HUB_OFFLINE=1
If I understood correctly, it would help by blocking engines that use the HF download facilities from trying to access online resources. Does that mean it only has an impact on TEI / TGI, but not on vLLM, Ollama etc.?
HF_HUB_OFFLINE only prevents the Hugging Face modules from downloading model data from online resources. If the data does not already exist in the local cache dir, it will still cause problems. The only way to use it is when the data already exists in the local dir, and even then the underlying Hugging Face modules will sometimes still try to access online resources, which causes the failure. Just like what we did in #1072. However, for llm-uservice, we don't need it.
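For illustration only (not part of this chart), a minimal Python sketch of the behavior described above, assuming `huggingface_hub` is installed and using an example `repo_id`: with HF_HUB_OFFLINE=1 the library resolves files only from the local cache and raises instead of contacting the Hub.

```python
# Sketch: HF_HUB_OFFLINE=1 disables network lookups in huggingface_hub,
# so downloads succeed only if the files are already in the local cache.
import os

os.environ["HF_HUB_OFFLINE"] = "1"  # must be set before huggingface_hub is imported

from huggingface_hub import hf_hub_download
from huggingface_hub.utils import LocalEntryNotFoundError

try:
    # Example repo_id; succeeds only if this file is already in the local HF cache dir.
    path = hf_hub_download(repo_id="Intel/neural-chat-7b-v3-3", filename="config.json")
    print("Resolved from local cache:", path)
except LocalEntryNotFoundError as err:
    # Raised when the file is not cached and outgoing traffic has been disabled.
    print("Not cached and offline mode is on:", err)
```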
Signed-off-by: Lianhao Lu <[email protected]>
Following GenAIExamples, change the default model of vllm as a temporary workaround for GenAIComps issue #1719. Signed-off-by: Lianhao Lu <[email protected]>
for more information, see https://pre-commit.ci
Force-pushed from 64246ee to 2251d0c
Description
Issues
n/a
Type of change
List the type of change like below. Please delete options that are not relevant.
Dependencies
List the newly introduced 3rd party dependencies, if any exist.
Tests
Describe the tests that you ran to verify your changes.