@0x404 (Contributor) commented Sep 22, 2024

ModelLoaderHuggerFace currently only supports reading tensors from a checkpoint and loading them into the model as-is, preserving each tensor's original dtype.

This PR adds an fp16_inference option, allowing ModelLoaderHuggerFace to load models in fp16 for fp16 inference.
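The option described above can be sketched roughly as follows. This is an illustrative sketch, not the PR's actual implementation: the function name `cast_checkpoint_fp16` is hypothetical, and NumPy arrays stand in for framework tensors. The key point is that only floating-point tensors are cast; integer tensors (e.g. token-id buffers) keep their dtype.

```python
import numpy as np

def cast_checkpoint_fp16(tensors, fp16_inference=True):
    # Hypothetical helper (not LibAI's real code): when fp16_inference
    # is set, cast floating-point checkpoint tensors to fp16 and leave
    # integer tensors untouched.
    if not fp16_inference:
        return tensors
    return {
        name: a.astype(np.float16) if np.issubdtype(a.dtype, np.floating) else a
        for name, a in tensors.items()
    }

state_dict = {
    "weight": np.zeros((2, 2), dtype=np.float32),
    "ids": np.zeros(3, dtype=np.int32),
}
fp16_state = cast_checkpoint_fp16(state_dict)
# fp16_state["weight"].dtype is float16; fp16_state["ids"].dtype stays int32
```

Casting at load time (rather than after the full fp32 model is materialized) also avoids transiently holding both precisions in memory.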

0x404 and others added 6 commits September 22, 2024 08:38
@ShawnXuan (Contributor) commented:

If the model is first loaded and then converted to fp16, memory usage drops sharply after the conversion.
