vLLM Windows CUDA support [tested] #2158

Open: wants to merge 8 commits into main

Conversation

@fenglui fenglui commented Mar 23, 2025

vLLM Windows CUDA support

Uses the vllm-windows Windows wheels by SystemPanic: https://github.com/SystemPanic/vllm-windows

Install:

conda create -n vllm python=3.12
conda activate vllm
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124
pip install https://github.com/SystemPanic/vllm-windows/releases/download/v0.8.1/vllm-0.8.1+cu124-cp312-cp312-win_amd64.whl
pip install https://github.com/SystemPanic/flashinfer-windows/releases/download/v0.2.3/flashinfer_python-0.2.3+cu124torch2.6-cp312-cp312-win_amd64.whl
pip install --upgrade pillow
pip install --upgrade pandas
pip install --upgrade triton-windows
pip install grpcio==1.71.0
pip install "unsloth[windows] @ git+https://github.com/fenglui/unsloth.git"
pip install --no-deps git+https://github.com/huggingface/transformers.git
pip install trl==0.15.2
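
A quick way to confirm the wheels landed correctly (an optional sanity check, not part of the original steps):

import importlib.util
import torch
import vllm

# Verify that CUDA-enabled torch and the Windows vLLM wheel import,
# and that flashinfer is discoverable in this environment.
print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("vllm:", vllm.__version__)
print("flashinfer importable:", importlib.util.find_spec("flashinfer") is not None)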

Training test

Download https://github.com/unslothai/notebooks/blob/main/nb/Qwen2.5_(3B)-GRPO.ipynb

Remove the installation code block.

Add this code block at the top:

import os
os.environ["UNSLOTH_DISABLE_AUTO_UPDATES"] = "1"     # skip Unsloth's auto-update check
os.environ["VLLM_USE_V1"] = "0"                      # use the vLLM V0 engine
os.environ["VLLM_ATTENTION_BACKEND"] = "FLASHINFER"  # use the FlashInfer attention backend

# Disable libuv on Windows by default
os.environ["USE_LIBUV"] = os.environ.get("USE_LIBUV", "0")

Then you can run the rest of the training code.
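
For orientation, the notebook's model setup reduces to something like the sketch below (values are illustrative and taken loosely from the standard Unsloth GRPO example; the key part for this PR is fast_inference=True, which routes Unsloth generation through vLLM):

from unsloth import FastLanguageModel

# Illustrative sketch only; see the linked Qwen2.5_(3B)-GRPO.ipynb for the exact
# parameters and the GRPO trainer setup that follows.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "Qwen/Qwen2.5-3B-Instruct",
    max_seq_length = 1024,
    load_in_4bit = True,
    fast_inference = True,          # enable vLLM-backed fast inference
    gpu_memory_utilization = 0.6,   # illustrative value
)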

If vllm serve fails, add the environment variable USE_LIBUV with the value 0 to your Windows system environment variables.

And that's it: we can run Unsloth training with vLLM support on Windows.

fenglui added 2 commits March 23, 2025 10:04
change vllm installed check by transformers utils function
change vllm installed check by transformers utils function
@void-mckenzie
Contributor

Tested on Windows 11, working as intended.

@danielhanchen
Contributor

Oh ok! Great that vLLM works on Windows - it's maybe best we also add a print statement showing you can use https://github.com/SystemPanic/vllm-windows! Maybe we should add it to the README!

Comment on lines 342 to 345
from transformers.utils.import_utils import _is_package_available
_vllm_available = _is_package_available("vllm")
if _vllm_available == False:
    print("Unsloth: vLLM is not installed! Will use Unsloth inference!")

Contributor

Minor suggestion: can we have this check at the top of loader.py (like where we check transformers versions for model support)?
And maybe set it as a constant that we can later reuse in llama.py as well, instead of duplicating it?
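
Something along these lines (a sketch of the suggestion only; the constant name HAS_VLLM is hypothetical, not from this PR):

# Compute the flag once near the top of loader.py and reuse it (e.g. in llama.py)
# instead of duplicating the check.
from transformers.utils.import_utils import _is_package_available

HAS_VLLM = _is_package_available("vllm")  # hypothetical constant name

if not HAS_VLLM:
    print("Unsloth: vLLM is not installed! Will use Unsloth inference!")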

Author

Yes I think so.

Author

A new function is_vLLM_available has now been added to utils.

- from transformers.utils.import_utils import _is_package_available
- _vllm_available = _is_package_available("vllm")
- if _vllm_available == False:
+ if is_vLLM_available() == False:
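
A plausible shape for that helper in utils (a sketch; the actual implementation in the PR may differ):

from transformers.utils.import_utils import _is_package_available

def is_vLLM_available() -> bool:
    # True if the vllm package can be imported.
    return _is_package_available("vllm")
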
Contributor

NIT: The general Python way is to use if not is_vllm_available().
But this is fine as well...
