feat: Add support for Llama 3.2-Vision models #2376

vikrantrathore · 2024-09-27T03:56:18Z

This pull request introduces support for the Llama 3.2-Vision collection of multimodal large language models (LLMs) within Xinference. These models bring the capability to process both text and image inputs, expanding the potential for diverse applications.

Key Changes:

Expanded Model Support: Adds Llama 3.2-Vision and Llama 3.2-Vision-Instruct models to the list of supported models, accessible through both the transformers and vllm engines.
Vllm Engine Enhancement: Updates the vllm engine to accommodate the specific requirements of the Llama 3.2-Vision models.
Documentation Updates: Improves the documentation to include details about the newly supported models, guiding users on their effective utilization.

This pull request adds support for the Llama 3.2-Vision collection of multimodal LLMs for both the transformers and vllm engines.

Updated llm_family.json and llm_family_modelscope.json to include Llama 3.2-Vision and Llama 3.2-Vision-Instruct model information.
Modified vllm engine's core.py to handle these models.
Enhanced documentation with model reference files to reflect the newly supported built-in models.

This commit adds support for the Llama 3.2-Vision collection of multimodal LLMs for both the transformers and vllm engines. - Updated `llm_family.json` and `llm_family_modelscope.json` to include Llama 3.2-Vision and Llama 3.2-Vision-Instruct model information. - Modified `vllm` engine's `core.py` to handle these models. - Enhanced documentation with model reference files to reflect the newly supported built-in models.

…t code style.

vikrantrathore · 2024-10-10T05:33:46Z

@qinxuye Any issues in merging this changes, the related issue #2372 was automatically closed given no activity by github bot?

qinxuye · 2024-10-10T06:18:21Z

@qinxuye Any issues in merging this changes, the related issue #2372 was automatically closed given no activity by github bot?

Ci has not pass yet, please fix it first.

vikrantrathore · 2024-10-10T06:47:58Z

@qinxuye Any issues in merging this changes, the related issue #2372 was automatically closed given no activity by github bot?

Ci has not pass yet, please fix it first.

Following is the CI error:

xinference/__init__.py:37: in <module>
    _install()
xinference/__init__.py:34: in _install
    install_model()
xinference/model/__init__.py:25: in _install
    llm_install()
xinference/model/llm/__init__.py:209: in _install
    for json_obj in json.load(codecs.open(json_path, "r", encoding="utf-8")):
/usr/share/miniconda/envs/test/lib/python3.9/json/__init__.py:293: in load
    return loads(fp.read(),
/usr/share/miniconda/envs/test/lib/python3.9/json/__init__.py:346: in loads
    return _default_decoder.decode(s)
/usr/share/miniconda/envs/test/lib/python3.9/json/decoder.py:337: in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
/usr/share/miniconda/envs/test/lib/python3.9/json/decoder.py:353: in raw_decode
    obj, end = self.scan_once(s, idx)
E   json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 1342 column 7 (char 44041)

This seems to originate from xinference/model/llm/__init__.py:209: in _install not linked to the code changes made by my pull request. Earlier also when I added Gemma model CI failed except the lint.

qinxuye · 2024-10-10T07:20:27Z

@qinxuye Any issues in merging this changes, the related issue #2372 was automatically closed given no activity by github bot?

Ci has not pass yet, please fix it first.

Following is the CI error:
xinference/__init__.py:37: in <module>
    _install()
xinference/__init__.py:34: in _install
    install_model()
xinference/model/__init__.py:25: in _install
    llm_install()
xinference/model/llm/__init__.py:209: in _install
    for json_obj in json.load(codecs.open(json_path, "r", encoding="utf-8")):
/usr/share/miniconda/envs/test/lib/python3.9/json/__init__.py:293: in load
    return loads(fp.read(),
/usr/share/miniconda/envs/test/lib/python3.9/json/__init__.py:346: in loads
    return _default_decoder.decode(s)
/usr/share/miniconda/envs/test/lib/python3.9/json/decoder.py:337: in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
/usr/share/miniconda/envs/test/lib/python3.9/json/decoder.py:353: in raw_decode
    obj, end = self.scan_once(s, idx)
E   json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 1342 column 7 (char 44041)
This seems to originate from xinference/model/llm/__init__.py:209: in _install not linked to the code changes made by my pull request. Earlier also when I added Gemma model CI failed except the lint.

This should be related to model_config.json, the json file cannot be read normally.

vikrantrathore · 2024-10-10T07:53:59Z

I am running locally a production instance using my branch of xinference and it works without any errors and also it can load the Llama-3.2 models correctly. I need to make some changes to install vLLM 0.6.2 as it requires fastapi>=0.114.1 but xinference requires a dependency of fastapi==0.110.3 or smaller. I am using ubuntu 22.04 with python 3.11.9 using uv package manager.

qinxuye · 2024-10-10T08:33:39Z

I am running locally a production instance using my branch of xinference and it works without any errors and also it can load the Llama-3.2 models correctly. I need to make some changes to install vLLM 0.6.2 as it requires fastapi>=0.114.1 but xinference requires a dependency of fastapi==0.110.3 or smaller. I am using ubuntu 22.04 with python 3.11.9 using uv package manager.

I think the limitation of fastapi can be removed now IMO.

- Updated `llm_family.json` and `llm_family_modelscope.json` to remove trailing commas in the Llama-3.2 model configuration.

vikrantrathore · 2024-10-10T08:39:12Z

I am running locally a production instance using my branch of xinference and it works without any errors and also it can load the Llama-3.2 models correctly. I need to make some changes to install vLLM 0.6.2 as it requires fastapi>=0.114.1 but xinference requires a dependency of fastapi==0.110.3 or smaller. I am using ubuntu 22.04 with python 3.11.9 using uv package manager.

I think the limitation of fastapi can be removed now IMO.

Ok will do that and commit again. I just fixed the trailing ',' error from the json files. JSON validator worked fine on it but trailing commas are ok in Python dictionaries not in JSON.

- Updated `setup.cfg` to require `fastapi>=0.114.1` to support the installation of `vllm>=0.6.2`, which depends on the updated FastAPI version.

vikrantrathore · 2024-10-11T02:25:09Z

@qinxuye All the CI jobs passed except the self_hosted GPU, the error is linked to ChatTTS module not connected to changes in this PR. So I believe you should be able to merge this PR unless you want to fix the ChatTTS related errors which might have been introduced by some other merged PR.

2024-10-10 08:50:48,819 xinference.core.model 3824773 ERROR    [request bf128510-86e4-11ef-8c2e-0a990c82cb6e] Leave speech, error: 'Chat' object has no attribute 'speaker', elapsed time: 0 s
Traceback (most recent call last):
  File "/root/xorbitsai/actions-runner/_work/inference/inference/xinference/core/utils.py", line 69, in wrapped
    ret = await func(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/xorbitsai/actions-runner/_work/inference/inference/xinference/core/model.py", line 711, in speech
    return await self._call_wrapper_binary(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/xorbitsai/actions-runner/_work/inference/inference/xinference/core/model.py", line 410, in _call_wrapper_binary
    return await self._call_wrapper("binary", fn, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/xorbitsai/actions-runner/_work/inference/inference/xinference/core/model.py", line 120, in _async_wrapper
    return await fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/xorbitsai/actions-runner/_work/inference/inference/xinference/core/model.py", line 424, in _call_wrapper
    ret = await asyncio.to_thread(fn, *args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/inference_test/lib/python3.11/asyncio/threads.py", line 25, in to_thread
    return await loop.run_in_executor(None, func_call)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/inference_test/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/xorbitsai/actions-runner/_work/inference/inference/xinference/model/audio/chattts.py", line 93, in speech
    rnd_spk_emb = self._model.sample_random_speaker()
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/inference_test/lib/python3.11/site-packages/ChatTTS/core.py", line 160, in sample_random_speaker
    return self.speaker.sample_random()
           ^^^^^^^^^^^^
AttributeError: 'Chat' object has no attribute 'speaker'
2024-10-10 08:50:48,821 xinference.core.model 3824773 DEBUG    After request speech, current serve request count: 0 for the model ChatTTS-1-0
2024-10-10 08:50:48,825 xinference.api.restful_api 3824795 ERROR    [address=localhost:40523, pid=3824773] 'Chat' object has no attribute 'speaker'
Traceback (most recent call last):
  File "/root/xorbitsai/actions-runner/_work/inference/inference/xinference/api/restful_api.py", line 1465, in create_speech
    out = await model.speech(
          ^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/inference_test/lib/python3.11/site-packages/xoscar/backends/context.py", line 231, in send
    return self._process_result_message(result)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/inference_test/lib/python3.11/site-packages/xoscar/backends/context.py", line 102, in _process_result_message
    raise message.as_instanceof_cause()
  File "/root/miniconda3/envs/inference_test/lib/python3.11/site-packages/xoscar/backends/pool.py", line 656, in send
    result = await self._run_coro(message.message_id, coro)
    ^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/inference_test/lib/python3.11/site-packages/xoscar/backends/pool.py", line 367, in _run_coro
    return await coro
  File "/root/miniconda3/envs/inference_test/lib/python3.11/site-packages/xoscar/api.py", line 384, in __on_receive__
    return await super().__on_receive__(message)  # type: ignore
    ^^^^^^^^^^^^^^^^^
  File "xoscar/core.pyx", line 558, in __on_receive__
    raise ex
  File "xoscar/core.pyx", line 520, in xoscar.core._BaseActor.__on_receive__
    async with self._lock:
    ^^^^^^^^^^^^^^^^^
  File "xoscar/core.pyx", line 521, in xoscar.core._BaseActor.__on_receive__
    with debug_async_timeout('actor_lock_timeout',
    ^^^^^^^^^^^^^^^^^
  File "xoscar/core.pyx", line 526, in xoscar.core._BaseActor.__on_receive__
    result = await result
    ^^^^^^^^^^^^^^^^^
  File "/root/xorbitsai/actions-runner/_work/inference/inference/xinference/core/model.py", line 96, in wrapped_func
    ret = await fn(self, *args, **kwargs)
    ^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/inference_test/lib/python3.11/site-packages/xoscar/api.py", line 462, in _wrapper
    r = await func(self, *args, **kwargs)
    ^^^^^^^^^^^^^^^^^
  File "/root/xorbitsai/actions-runner/_work/inference/inference/xinference/core/utils.py", line 69, in wrapped
    ret = await func(*args, **kwargs)
    ^^^^^^^^^^^^^^^^^
  File "/root/xorbitsai/actions-runner/_work/inference/inference/xinference/core/model.py", line 711, in speech
    return await self._call_wrapper_binary(
    ^^^^^^^^^^^^^^^^^
  File "/root/xorbitsai/actions-runner/_work/inference/inference/xinference/core/model.py", line 410, in _call_wrapper_binary
    return await self._call_wrapper("binary", fn, *args, **kwargs)
    ^^^^^^^^^^^^^^^^^
  File "/root/xorbitsai/actions-runner/_work/inference/inference/xinference/core/model.py", line 120, in _async_wrapper
    return await fn(*args, **kwargs)
    ^^^^^^^^^^^^^^^^^
  File "/root/xorbitsai/actions-runner/_work/inference/inference/xinference/core/model.py", line 424, in _call_wrapper
    ret = await asyncio.to_thread(fn, *args, **kwargs)
    ^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/inference_test/lib/python3.11/asyncio/threads.py", line 25, in to_thread
    return await loop.run_in_executor(None, func_call)
      ^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/inference_test/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
    ^^^^^^^^^^^^^^^^^
  File "/root/xorbitsai/actions-runner/_work/inference/inference/xinference/model/audio/chattts.py", line 93, in speech
    rnd_spk_emb = self._model.sample_random_speaker()
    ^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/inference_test/lib/python3.11/site-packages/ChatTTS/core.py", line 160, in sample_random_speaker
    return self.speaker.sample_random()
    ^^^^^^^^^^^^^^^^^
AttributeError: [address=localhost:40523, pid=3824773] 'Chat' object has no attribute 'speaker'
=============================== warnings summary ===============================
../../../../../miniconda3/envs/inference_test/lib/python3.11/site-packages/coverage/inorout.py:460
  /root/miniconda3/envs/inference_test/lib/python3.11/site-packages/coverage/inorout.py:460: CoverageWarning: --include is ignored because --source is set (include-ignored)
    self.warn("--include is ignored because --source is set", slug="include-ignored")

xinference/model/audio/tests/test_chattts.py::test_chattts
  /root/miniconda3/envs/inference_test/lib/python3.11/site-packages/passlib/utils/__init__.py:854: DeprecationWarning: 'crypt' is deprecated and slated for removal in Python 3.13
    from crypt import crypt as _crypt

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html

---------- coverage: platform linux, python 3.11.5-final-0 -----------
Coverage XML written to file coverage.xml

=========================== short test summary info ============================
FAILED xinference/model/audio/tests/test_chattts.py::test_chattts - RuntimeError: Failed to speech the text, detail: [address=localhost:40523, pid=3824773] 'Chat' object has no attribute 'speaker'
======================== 1 failed, 2 warnings in 29.18s ========================
Error: Process completed with exit code 1.

qinxuye · 2024-10-11T04:34:07Z

@qinxuye All the CI jobs passed except the self_hosted GPU, the error is linked to ChatTTS module not connected to changes in this PR. So I believe you should be able to merge this PR unless you want to fix the ChatTTS related errors which might have been introduced by some other merged PR.

2024-10-10 08:50:48,819 xinference.core.model 3824773 ERROR    [request bf128510-86e4-11ef-8c2e-0a990c82cb6e] Leave speech, error: 'Chat' object has no attribute 'speaker', elapsed time: 0 s
Traceback (most recent call last):
  File "/root/xorbitsai/actions-runner/_work/inference/inference/xinference/core/utils.py", line 69, in wrapped
    ret = await func(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/xorbitsai/actions-runner/_work/inference/inference/xinference/core/model.py", line 711, in speech
    return await self._call_wrapper_binary(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/xorbitsai/actions-runner/_work/inference/inference/xinference/core/model.py", line 410, in _call_wrapper_binary
    return await self._call_wrapper("binary", fn, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/xorbitsai/actions-runner/_work/inference/inference/xinference/core/model.py", line 120, in _async_wrapper
    return await fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/xorbitsai/actions-runner/_work/inference/inference/xinference/core/model.py", line 424, in _call_wrapper
    ret = await asyncio.to_thread(fn, *args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/inference_test/lib/python3.11/asyncio/threads.py", line 25, in to_thread
    return await loop.run_in_executor(None, func_call)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/inference_test/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/xorbitsai/actions-runner/_work/inference/inference/xinference/model/audio/chattts.py", line 93, in speech
    rnd_spk_emb = self._model.sample_random_speaker()
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/inference_test/lib/python3.11/site-packages/ChatTTS/core.py", line 160, in sample_random_speaker
    return self.speaker.sample_random()
           ^^^^^^^^^^^^
AttributeError: 'Chat' object has no attribute 'speaker'
2024-10-10 08:50:48,821 xinference.core.model 3824773 DEBUG    After request speech, current serve request count: 0 for the model ChatTTS-1-0
2024-10-10 08:50:48,825 xinference.api.restful_api 3824795 ERROR    [address=localhost:40523, pid=3824773] 'Chat' object has no attribute 'speaker'
Traceback (most recent call last):
  File "/root/xorbitsai/actions-runner/_work/inference/inference/xinference/api/restful_api.py", line 1465, in create_speech
    out = await model.speech(
          ^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/inference_test/lib/python3.11/site-packages/xoscar/backends/context.py", line 231, in send
    return self._process_result_message(result)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/inference_test/lib/python3.11/site-packages/xoscar/backends/context.py", line 102, in _process_result_message
    raise message.as_instanceof_cause()
  File "/root/miniconda3/envs/inference_test/lib/python3.11/site-packages/xoscar/backends/pool.py", line 656, in send
    result = await self._run_coro(message.message_id, coro)
    ^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/inference_test/lib/python3.11/site-packages/xoscar/backends/pool.py", line 367, in _run_coro
    return await coro
  File "/root/miniconda3/envs/inference_test/lib/python3.11/site-packages/xoscar/api.py", line 384, in __on_receive__
    return await super().__on_receive__(message)  # type: ignore
    ^^^^^^^^^^^^^^^^^
  File "xoscar/core.pyx", line 558, in __on_receive__
    raise ex
  File "xoscar/core.pyx", line 520, in xoscar.core._BaseActor.__on_receive__
    async with self._lock:
    ^^^^^^^^^^^^^^^^^
  File "xoscar/core.pyx", line 521, in xoscar.core._BaseActor.__on_receive__
    with debug_async_timeout('actor_lock_timeout',
    ^^^^^^^^^^^^^^^^^
  File "xoscar/core.pyx", line 526, in xoscar.core._BaseActor.__on_receive__
    result = await result
    ^^^^^^^^^^^^^^^^^
  File "/root/xorbitsai/actions-runner/_work/inference/inference/xinference/core/model.py", line 96, in wrapped_func
    ret = await fn(self, *args, **kwargs)
    ^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/inference_test/lib/python3.11/site-packages/xoscar/api.py", line 462, in _wrapper
    r = await func(self, *args, **kwargs)
    ^^^^^^^^^^^^^^^^^
  File "/root/xorbitsai/actions-runner/_work/inference/inference/xinference/core/utils.py", line 69, in wrapped
    ret = await func(*args, **kwargs)
    ^^^^^^^^^^^^^^^^^
  File "/root/xorbitsai/actions-runner/_work/inference/inference/xinference/core/model.py", line 711, in speech
    return await self._call_wrapper_binary(
    ^^^^^^^^^^^^^^^^^
  File "/root/xorbitsai/actions-runner/_work/inference/inference/xinference/core/model.py", line 410, in _call_wrapper_binary
    return await self._call_wrapper("binary", fn, *args, **kwargs)
    ^^^^^^^^^^^^^^^^^
  File "/root/xorbitsai/actions-runner/_work/inference/inference/xinference/core/model.py", line 120, in _async_wrapper
    return await fn(*args, **kwargs)
    ^^^^^^^^^^^^^^^^^
  File "/root/xorbitsai/actions-runner/_work/inference/inference/xinference/core/model.py", line 424, in _call_wrapper
    ret = await asyncio.to_thread(fn, *args, **kwargs)
    ^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/inference_test/lib/python3.11/asyncio/threads.py", line 25, in to_thread
    return await loop.run_in_executor(None, func_call)
      ^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/inference_test/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
    ^^^^^^^^^^^^^^^^^
  File "/root/xorbitsai/actions-runner/_work/inference/inference/xinference/model/audio/chattts.py", line 93, in speech
    rnd_spk_emb = self._model.sample_random_speaker()
    ^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/inference_test/lib/python3.11/site-packages/ChatTTS/core.py", line 160, in sample_random_speaker
    return self.speaker.sample_random()
    ^^^^^^^^^^^^^^^^^
AttributeError: [address=localhost:40523, pid=3824773] 'Chat' object has no attribute 'speaker'
=============================== warnings summary ===============================
../../../../../miniconda3/envs/inference_test/lib/python3.11/site-packages/coverage/inorout.py:460
  /root/miniconda3/envs/inference_test/lib/python3.11/site-packages/coverage/inorout.py:460: CoverageWarning: --include is ignored because --source is set (include-ignored)
    self.warn("--include is ignored because --source is set", slug="include-ignored")

xinference/model/audio/tests/test_chattts.py::test_chattts
  /root/miniconda3/envs/inference_test/lib/python3.11/site-packages/passlib/utils/__init__.py:854: DeprecationWarning: 'crypt' is deprecated and slated for removal in Python 3.13
    from crypt import crypt as _crypt

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html

---------- coverage: platform linux, python 3.11.5-final-0 -----------
Coverage XML written to file coverage.xml

=========================== short test summary info ============================
FAILED xinference/model/audio/tests/test_chattts.py::test_chattts - RuntimeError: Failed to speech the text, detail: [address=localhost:40523, pid=3824773] 'Chat' object has no attribute 'speaker'
======================== 1 failed, 2 warnings in 29.18s ========================
Error: Process completed with exit code 1.

This is a known issue, we can ignore it now, I will review this PR ASAP.

amumu96 · 2024-10-12T09:10:57Z

Does Llama 3.2-Vision-Instruct work well？

Merged with upstream changes and made modifications to VLLM_SUPPORTED_VISION_MODEL_LIST

Added space before VLLMModel class for flake8 rule

vikrantrathore · 2024-11-05T02:13:31Z

@qinxuye Any updates on this PR for Llama 3.2 Vision model?
Now I am using my branch for running it, if you merge can have the same code base as main.

xinference/model/llm/llm_family_modelscope.json

…sion Updated the model_id in modelscope model link for Llama-3.2-90B-Vision

qinxuye

LGTM

XprobeBot added the feature label Sep 27, 2024

XprobeBot added this to the v0.15 milestone Sep 27, 2024

vikrantrathore mentioned this pull request Sep 27, 2024

[Feature] Support for Llama 3.2 Multi-modal and Lightweight Models #2372

Closed

vikrantrathore added 3 commits September 27, 2024 12:10

Fix: fixes a whitespace error in the core.py file to ensure consisten…

a2a81ab

…t code style.

Merge branch 'main' into intervl_merge

74ea90e

Fix: Update core.py to fix the flake 8 space error

17e8bb1

fix: remove trailing commas from Llama-3.2 model configuration

2dbe66e

- Updated `llm_family.json` and `llm_family_modelscope.json` to remove trailing commas in the Llama-3.2 model configuration.

fix: update FastAPI version in setup.cfg for vllm compatibility

ca26687

- Updated `setup.cfg` to require `fastapi>=0.114.1` to support the installation of `vllm>=0.6.2`, which depends on the updated FastAPI version.

XprobeBot modified the milestones: v0.15, v0.16 Oct 30, 2024

vikrantrathore added 3 commits November 5, 2024 10:02

Merge branch 'main' into intervl_merge

5942b4b

Fixed errors and merged with upstream changes

449bbf7

Merged with upstream changes and made modifications to VLLM_SUPPORTED_VISION_MODEL_LIST

Update core.py for fixing flake8 error

9195d24

Added space before VLLMModel class for flake8 rule

qinxuye reviewed Nov 5, 2024

View reviewed changes

xinference/model/llm/llm_family_modelscope.json Outdated Show resolved Hide resolved

Update llm_family_modelscope.json to fix the link to LLama-3.2-90B-Vi…

f40d261

…sion Updated the model_id in modelscope model link for Llama-3.2-90B-Vision

qinxuye approved these changes Nov 5, 2024

View reviewed changes

qinxuye merged commit ee98bc4 into xorbitsai:main Nov 5, 2024
12 of 13 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: Add support for Llama 3.2-Vision models #2376

feat: Add support for Llama 3.2-Vision models #2376

vikrantrathore commented Sep 27, 2024

vikrantrathore commented Oct 10, 2024

qinxuye commented Oct 10, 2024

vikrantrathore commented Oct 10, 2024

qinxuye commented Oct 10, 2024

vikrantrathore commented Oct 10, 2024

qinxuye commented Oct 10, 2024

vikrantrathore commented Oct 10, 2024

vikrantrathore commented Oct 11, 2024

qinxuye commented Oct 11, 2024

amumu96 commented Oct 12, 2024

vikrantrathore commented Nov 5, 2024

qinxuye left a comment

feat: Add support for Llama 3.2-Vision models #2376

feat: Add support for Llama 3.2-Vision models #2376

Conversation

vikrantrathore commented Sep 27, 2024

vikrantrathore commented Oct 10, 2024

qinxuye commented Oct 10, 2024

vikrantrathore commented Oct 10, 2024

qinxuye commented Oct 10, 2024

vikrantrathore commented Oct 10, 2024

qinxuye commented Oct 10, 2024

vikrantrathore commented Oct 10, 2024

vikrantrathore commented Oct 11, 2024

qinxuye commented Oct 11, 2024

amumu96 commented Oct 12, 2024

vikrantrathore commented Nov 5, 2024

qinxuye left a comment

Choose a reason for hiding this comment