
feat: Add support for Llama 3.2-Vision models #2376

Merged: 10 commits into xorbitsai:main on Nov 5, 2024

Conversation

vikrantrathore
Contributor

This pull request introduces support for the Llama 3.2-Vision collection of multimodal large language models (LLMs) within Xinference. These models bring the capability to process both text and image inputs, expanding the potential for diverse applications.

Key Changes:

  • Expanded Model Support: Adds Llama 3.2-Vision and Llama 3.2-Vision-Instruct models to the list of supported models, accessible through both the transformers and vllm engines.
  • vLLM Engine Enhancement: Updates the vLLM engine to accommodate the specific requirements of the Llama 3.2-Vision models.
  • Documentation Updates: Improves the documentation to include details about the newly supported models, guiding users on their effective utilization.
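
For context, launching and querying one of these models from the Python client could look like the following once this lands. This is a minimal sketch: the endpoint address, the registered model name, the 11B size, and the messages-based chat payload are assumptions based on Xinference's conventions, not confirmed by this PR.

```python
from xinference.client import Client

# Connect to a running Xinference endpoint (address is an assumption).
client = Client("http://localhost:9997")

# Launch the instruct variant on the vLLM engine; the registered model name
# and available sizes would come from the llm_family.json entries in this PR.
model_uid = client.launch_model(
    model_name="llama-3.2-vision-instruct",
    model_engine="vllm",
    model_size_in_billions=11,
)

model = client.get_model(model_uid)
response = model.chat(
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image."},
                {"type": "image_url", "image_url": {"url": "https://example.com/demo.png"}},
            ],
        }
    ]
)
print(response["choices"][0]["message"]["content"])
```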

This commit adds support for the Llama 3.2-Vision collection of multimodal LLMs for both the transformers and vllm engines.

- Updated `llm_family.json` and `llm_family_modelscope.json` to include Llama 3.2-Vision and Llama 3.2-Vision-Instruct model information.
- Modified `vllm` engine's `core.py` to handle these models.
- Enhanced documentation with model reference files to reflect the newly supported built-in models.
@vikrantrathore
Contributor Author

@qinxuye Are there any issues blocking the merge of these changes? The related issue #2372 was automatically closed by the GitHub bot due to inactivity.

@qinxuye
Contributor

qinxuye commented Oct 10, 2024

> @qinxuye Are there any issues blocking the merge of these changes? The related issue #2372 was automatically closed by the GitHub bot due to inactivity.

CI has not passed yet; please fix it first.

@vikrantrathore
Contributor Author

> CI has not passed yet; please fix it first.

Following is the CI error:

xinference/__init__.py:37: in <module>
    _install()
xinference/__init__.py:34: in _install
    install_model()
xinference/model/__init__.py:25: in _install
    llm_install()
xinference/model/llm/__init__.py:209: in _install
    for json_obj in json.load(codecs.open(json_path, "r", encoding="utf-8")):
/usr/share/miniconda/envs/test/lib/python3.9/json/__init__.py:293: in load
    return loads(fp.read(),
/usr/share/miniconda/envs/test/lib/python3.9/json/__init__.py:346: in loads
    return _default_decoder.decode(s)
/usr/share/miniconda/envs/test/lib/python3.9/json/decoder.py:337: in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
/usr/share/miniconda/envs/test/lib/python3.9/json/decoder.py:353: in raw_decode
    obj, end = self.scan_once(s, idx)
E   json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 1342 column 7 (char 44041)

This seems to originate from `_install` in xinference/model/llm/__init__.py:209, which is not linked to the code changes in my pull request. The same thing happened earlier when I added the Gemma model: CI failed on everything except lint.
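
For anyone hitting the same failure, the registry files can be checked locally with the same call the loader uses; a minimal sketch (file paths assume the repo layout shown in the traceback):

```python
import codecs
import json

# Mirror the load in xinference/model/llm/__init__.py so any error reports
# the exact line and column of the offending token.
for json_path in (
    "xinference/model/llm/llm_family.json",
    "xinference/model/llm/llm_family_modelscope.json",
):
    try:
        json.load(codecs.open(json_path, "r", encoding="utf-8"))
        print(f"{json_path}: OK")
    except json.JSONDecodeError as exc:
        print(f"{json_path}: {exc}")
```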

@qinxuye
Contributor

qinxuye commented Oct 10, 2024

> This seems to originate from `_install` in xinference/model/llm/__init__.py:209, which is not linked to the code changes in my pull request. [...]

This should be related to model_config.json; the JSON file cannot be parsed correctly.

@vikrantrathore
Contributor Author

I am running a production instance locally from my branch of Xinference; it works without errors and loads the Llama 3.2 models correctly. I needed some changes to install vLLM 0.6.2, since it requires fastapi>=0.114.1 while Xinference pins fastapi to 0.110.3 or lower. I am on Ubuntu 22.04 with Python 3.11.9, using the uv package manager.

@qinxuye
Contributor

qinxuye commented Oct 10, 2024

> I needed some changes to install vLLM 0.6.2, since it requires fastapi>=0.114.1 while Xinference pins fastapi to 0.110.3 or lower.

I think the fastapi limitation can be removed now.

- Updated `llm_family.json` and `llm_family_modelscope.json` to remove trailing commas in the Llama-3.2 model configuration.
@vikrantrathore
Contributor Author

> I needed some changes to install vLLM 0.6.2, since it requires fastapi>=0.114.1 while Xinference pins fastapi to 0.110.3 or lower.
>
> I think the fastapi limitation can be removed now.

OK, I will do that and commit again. I just fixed the trailing-comma errors in the JSON files. My JSON validator reported the files as fine, but trailing commas are valid in Python dictionaries, not in JSON.
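
To illustrate the pitfall: the same literal parses as Python but fails JSON's stricter grammar, with exactly the error class CI reported:

```python
import ast
import json

text = '{"model_name": "llama-3.2-vision",}'

# Fine as a Python literal: trailing commas are allowed in dicts and lists.
ast.literal_eval(text)

# Invalid as JSON: the grammar forbids a trailing comma before the brace.
try:
    json.loads(text)
except json.JSONDecodeError as exc:
    print(exc)  # Expecting property name enclosed in double quotes: ...
```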

- Updated `setup.cfg` to require `fastapi>=0.114.1` to support the installation of `vllm>=0.6.2`, which depends on the updated FastAPI version.
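
A quick way to confirm the relaxed pin resolves cleanly in a given environment; a sketch, assuming the third-party `packaging` library is available:

```python
from importlib.metadata import version
from packaging.version import Version

# vllm>=0.6.2 requires fastapi>=0.114.1, which the old pin excluded.
assert Version(version("fastapi")) >= Version("0.114.1"), version("fastapi")
assert Version(version("vllm")) >= Version("0.6.2"), version("vllm")
print("fastapi", version("fastapi"), "/ vllm", version("vllm"))
```
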
@vikrantrathore
Contributor Author

@qinxuye All the CI jobs passed except the self-hosted GPU job; the error is in the ChatTTS module and is not connected to the changes in this PR. So I believe you should be able to merge this PR, unless you want to fix the ChatTTS-related errors first, which may have been introduced by another merged PR.

2024-10-10 08:50:48,819 xinference.core.model 3824773 ERROR    [request bf128510-86e4-11ef-8c2e-0a990c82cb6e] Leave speech, error: 'Chat' object has no attribute 'speaker', elapsed time: 0 s
Traceback (most recent call last):
  File "/root/xorbitsai/actions-runner/_work/inference/inference/xinference/core/utils.py", line 69, in wrapped
    ret = await func(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/xorbitsai/actions-runner/_work/inference/inference/xinference/core/model.py", line 711, in speech
    return await self._call_wrapper_binary(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/xorbitsai/actions-runner/_work/inference/inference/xinference/core/model.py", line 410, in _call_wrapper_binary
    return await self._call_wrapper("binary", fn, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/xorbitsai/actions-runner/_work/inference/inference/xinference/core/model.py", line 120, in _async_wrapper
    return await fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/xorbitsai/actions-runner/_work/inference/inference/xinference/core/model.py", line 424, in _call_wrapper
    ret = await asyncio.to_thread(fn, *args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/inference_test/lib/python3.11/asyncio/threads.py", line 25, in to_thread
    return await loop.run_in_executor(None, func_call)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/inference_test/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/xorbitsai/actions-runner/_work/inference/inference/xinference/model/audio/chattts.py", line 93, in speech
    rnd_spk_emb = self._model.sample_random_speaker()
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/inference_test/lib/python3.11/site-packages/ChatTTS/core.py", line 160, in sample_random_speaker
    return self.speaker.sample_random()
           ^^^^^^^^^^^^
AttributeError: 'Chat' object has no attribute 'speaker'
2024-10-10 08:50:48,821 xinference.core.model 3824773 DEBUG    After request speech, current serve request count: 0 for the model ChatTTS-1-0
2024-10-10 08:50:48,825 xinference.api.restful_api 3824795 ERROR    [address=localhost:40523, pid=3824773] 'Chat' object has no attribute 'speaker'
Traceback (most recent call last):
  File "/root/xorbitsai/actions-runner/_work/inference/inference/xinference/api/restful_api.py", line 1465, in create_speech
    out = await model.speech(
          ^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/inference_test/lib/python3.11/site-packages/xoscar/backends/context.py", line 231, in send
    return self._process_result_message(result)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/inference_test/lib/python3.11/site-packages/xoscar/backends/context.py", line 102, in _process_result_message
    raise message.as_instanceof_cause()
  File "/root/miniconda3/envs/inference_test/lib/python3.11/site-packages/xoscar/backends/pool.py", line 656, in send
    result = await self._run_coro(message.message_id, coro)
    ^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/inference_test/lib/python3.11/site-packages/xoscar/backends/pool.py", line 367, in _run_coro
    return await coro
  File "/root/miniconda3/envs/inference_test/lib/python3.11/site-packages/xoscar/api.py", line 384, in __on_receive__
    return await super().__on_receive__(message)  # type: ignore
    ^^^^^^^^^^^^^^^^^
  File "xoscar/core.pyx", line 558, in __on_receive__
    raise ex
  File "xoscar/core.pyx", line 520, in xoscar.core._BaseActor.__on_receive__
    async with self._lock:
    ^^^^^^^^^^^^^^^^^
  File "xoscar/core.pyx", line 521, in xoscar.core._BaseActor.__on_receive__
    with debug_async_timeout('actor_lock_timeout',
    ^^^^^^^^^^^^^^^^^
  File "xoscar/core.pyx", line 526, in xoscar.core._BaseActor.__on_receive__
    result = await result
    ^^^^^^^^^^^^^^^^^
  File "/root/xorbitsai/actions-runner/_work/inference/inference/xinference/core/model.py", line 96, in wrapped_func
    ret = await fn(self, *args, **kwargs)
    ^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/inference_test/lib/python3.11/site-packages/xoscar/api.py", line 462, in _wrapper
    r = await func(self, *args, **kwargs)
    ^^^^^^^^^^^^^^^^^
  File "/root/xorbitsai/actions-runner/_work/inference/inference/xinference/core/utils.py", line 69, in wrapped
    ret = await func(*args, **kwargs)
    ^^^^^^^^^^^^^^^^^
  File "/root/xorbitsai/actions-runner/_work/inference/inference/xinference/core/model.py", line 711, in speech
    return await self._call_wrapper_binary(
    ^^^^^^^^^^^^^^^^^
  File "/root/xorbitsai/actions-runner/_work/inference/inference/xinference/core/model.py", line 410, in _call_wrapper_binary
    return await self._call_wrapper("binary", fn, *args, **kwargs)
    ^^^^^^^^^^^^^^^^^
  File "/root/xorbitsai/actions-runner/_work/inference/inference/xinference/core/model.py", line 120, in _async_wrapper
    return await fn(*args, **kwargs)
    ^^^^^^^^^^^^^^^^^
  File "/root/xorbitsai/actions-runner/_work/inference/inference/xinference/core/model.py", line 424, in _call_wrapper
    ret = await asyncio.to_thread(fn, *args, **kwargs)
    ^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/inference_test/lib/python3.11/asyncio/threads.py", line 25, in to_thread
    return await loop.run_in_executor(None, func_call)
      ^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/inference_test/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
    ^^^^^^^^^^^^^^^^^
  File "/root/xorbitsai/actions-runner/_work/inference/inference/xinference/model/audio/chattts.py", line 93, in speech
    rnd_spk_emb = self._model.sample_random_speaker()
    ^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/inference_test/lib/python3.11/site-packages/ChatTTS/core.py", line 160, in sample_random_speaker
    return self.speaker.sample_random()
    ^^^^^^^^^^^^^^^^^
AttributeError: [address=localhost:40523, pid=3824773] 'Chat' object has no attribute 'speaker'
=============================== warnings summary ===============================
../../../../../miniconda3/envs/inference_test/lib/python3.11/site-packages/coverage/inorout.py:460
  /root/miniconda3/envs/inference_test/lib/python3.11/site-packages/coverage/inorout.py:460: CoverageWarning: --include is ignored because --source is set (include-ignored)
    self.warn("--include is ignored because --source is set", slug="include-ignored")

xinference/model/audio/tests/test_chattts.py::test_chattts
  /root/miniconda3/envs/inference_test/lib/python3.11/site-packages/passlib/utils/__init__.py:854: DeprecationWarning: 'crypt' is deprecated and slated for removal in Python 3.13
    from crypt import crypt as _crypt

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html

---------- coverage: platform linux, python 3.11.5-final-0 -----------
Coverage XML written to file coverage.xml

=========================== short test summary info ============================
FAILED xinference/model/audio/tests/test_chattts.py::test_chattts - RuntimeError: Failed to speech the text, detail: [address=localhost:40523, pid=3824773] 'Chat' object has no attribute 'speaker'
======================== 1 failed, 2 warnings in 29.18s ========================
Error: Process completed with exit code 1.

@qinxuye
Contributor

qinxuye commented Oct 11, 2024

> All the CI jobs passed except the self-hosted GPU job; the error is in the ChatTTS module and is not connected to the changes in this PR. [...]

This is a known issue; we can ignore it for now. I will review this PR ASAP.

@amumu96
Contributor

amumu96 commented Oct 12, 2024

Does Llama 3.2-Vision-Instruct work well?

@XprobeBot XprobeBot modified the milestones: v0.15, v0.16 Oct 30, 2024
Merged with upstream changes and made modifications to VLLM_SUPPORTED_VISION_MODEL_LIST
Added space before VLLMModel class for flake8 rule
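
For reviewers, the kind of change this commit describes might look like the following in the vLLM engine's `core.py`. This is a sketch only; the guard condition, version threshold, and exact model names are assumptions, not the merged diff:

```python
from packaging import version

try:
    import vllm
    VLLM_INSTALLED = True
    VLLM_VERSION = version.parse(vllm.__version__)
except ImportError:
    VLLM_INSTALLED = False
    VLLM_VERSION = None

VLLM_SUPPORTED_VISION_MODEL_LIST = []

# Gate the new entries on a vLLM version that supports Llama 3.2-Vision.
if VLLM_INSTALLED and VLLM_VERSION >= version.parse("0.6.2"):
    VLLM_SUPPORTED_VISION_MODEL_LIST.append("llama-3.2-vision")
    VLLM_SUPPORTED_VISION_MODEL_LIST.append("llama-3.2-vision-instruct")
```
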
@vikrantrathore
Contributor Author

@qinxuye Any updates on this PR for the Llama 3.2-Vision models?
I am currently running from my own branch; once you merge, I can use the same code base as main.

Updated the model_id in modelscope model link for Llama-3.2-90B-Vision
Contributor

@qinxuye qinxuye left a comment


LGTM

@qinxuye qinxuye merged commit ee98bc4 into xorbitsai:main Nov 5, 2024
12 of 13 checks passed