
[Inference] Integrate vllm example #262

Merged: 25 commits merged into intel:main on Aug 16, 2024

Conversation

KepingYan (Contributor)

No description provided.

carsonwang (Contributor)

Thanks for the work! Can you also update all the model YAMLs so they use vLLM by default, unless a model is not supported? Remove the IPEX- and DeepSpeed-related configs from the YAMLs and disable both backends by default.
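
Below is a minimal sketch of what such a per-model YAML might look like after the requested change. The field names (`vllm.enabled`, `ipex.enabled`, `deepspeed`, `model_description`) are assumptions for illustration only; the actual schema used in this repo's configs, and the final shape chosen in the PR diff, may differ.

```yaml
# Hypothetical model YAML after the requested change (field names are
# assumed, not taken from the PR diff).
name: llama-2-7b-chat-hf
route_prefix: /llama-2-7b-chat-hf

# vLLM becomes the default serving backend for supported models.
vllm:
  enabled: true

# IPEX and DeepSpeed are disabled by default; their detailed tuning
# options are removed from the YAML per the review request.
ipex:
  enabled: false
deepspeed: false

model_description:
  model_id_or_path: meta-llama/Llama-2-7b-chat-hf
  tokenizer_name_or_path: meta-llama/Llama-2-7b-chat-hf
```

For a model that vLLM does not support, the same file would presumably keep `vllm.enabled: false` and fall back to the default (non-vLLM) serving path.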

KepingYan marked this pull request as ready for review on July 23, 2024 at 07:12.
KepingYan (Contributor, Author)

Gently ping @xwu99: all review comments are resolved.

xwu99 merged commit da6d9f9 into intel:main on Aug 16, 2024. All 13 checks passed.