
Conversation

@tabtablabs-dev

Summary

  • add a resolve_device helper so the Qwen HF path can run on CUDA, MPS, or CPU without changing settings
  • default to the model's loaded device or settings.TORCH_DEVICE when provided
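
The PR itself isn't shown in this thread, but the fallback order the summary describes (explicit setting, then the model's loaded device, then the best available backend) can be sketched as plain Python. The flag arguments stand in for `torch.cuda.is_available()` / `torch.backends.mps.is_available()` so the decision logic is visible on its own; the names here are illustrative, not the actual helper from the PR.

```python
def resolve_device(preferred=None, model_device=None,
                   cuda_available=False, mps_available=False):
    # An explicit settings.TORCH_DEVICE-style override always wins.
    if preferred:
        return preferred
    # Otherwise, stay on whatever device the model is already loaded on.
    if model_device:
        return model_device
    # Finally, pick the best available backend: CUDA > MPS > CPU.
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"
```

For example, `resolve_device(mps_available=True)` falls through to `"mps"`, while `resolve_device(preferred="cuda:0", mps_available=True)` honors the explicit setting.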

Testing

  • ty check (fails: existing invalid-argument-type and typing errors unrelated to this change)


@Yu-ChangCheng left a comment

LGTM!

@andybbruno

andybbruno commented Nov 4, 2025

On my MacBook Pro (M3 Pro with 18GB of RAM) it fails. The error I get is the following:

/AppleInternal/Library/BuildRoots/01adf19d-fba1-11ef-a947-f2a857e00a32/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShaders/MPSCore/Utility/MPSCommandBufferImageCache.mm:1420:failed assertion `Failed to allocate private MTLBuffer for size 12544000000

@andybbruno

I somehow managed to get it working by removing the kwargs from the Qwen3VL model-loading function (see below):

def load_model():
    # device_map = "auto"
    # if settings.TORCH_DEVICE:
    #     device_map = {"": settings.TORCH_DEVICE}

    # kwargs = {
    #     "dtype": settings.TORCH_DTYPE,
    #     "device_map": device_map,
    # }
    # if settings.TORCH_ATTN:
    #     kwargs["attn_implementation"] = settings.TORCH_ATTN

    model = Qwen3VLForConditionalGeneration.from_pretrained(
        settings.MODEL_CHECKPOINT,
        # **kwargs
    )

So, I guess either the dtype or the device_map argument is causing issues on Apple Silicon chips.
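
Rather than dropping all kwargs, one possible middle ground is to only skip the suspect arguments when the target device is MPS. This is a sketch under the assumption that the dtype/device_map combination is what triggers the Metal buffer allocation failure above; the helper name and parameters are illustrative, mirroring the settings used in the snippet:

```python
def build_load_kwargs(torch_device=None, torch_dtype=None, torch_attn=None):
    # On MPS, passing dtype/device_map appears to trigger oversized Metal
    # buffer allocations, so return no kwargs and let from_pretrained
    # fall back to its defaults.
    if torch_device == "mps":
        return {}
    kwargs = {}
    if torch_dtype:
        kwargs["dtype"] = torch_dtype
    # Pin to the configured device if given, otherwise let HF shard freely.
    kwargs["device_map"] = {"": torch_device} if torch_device else "auto"
    if torch_attn:
        kwargs["attn_implementation"] = torch_attn
    return kwargs
```

The result would then be splatted into `from_pretrained(settings.MODEL_CHECKPOINT, **kwargs)`, so CUDA and CPU setups keep the original behavior while MPS gets the stripped-down call.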
