support qwen1.5
wangzhaode committed Mar 13, 2024
1 parent 9f4d25b commit ee51b89
Showing 4 changed files with 89 additions and 14 deletions.
20 changes: 18 additions & 2 deletions README.md
@@ -33,10 +33,13 @@ To export LLMs as `onnx` and `mnn` models, use [llm-export](https://github.com/wang
| internlm-chat-7b | [![Download][download-internlm-7b-onnx]][release-internlm-7b-onnx] | [![Download][download-internlm-chat-7b-mnn]][release-internlm-chat-7b-mnn] |
| Yi-6B-Chat | [![Download][download-yi-6b-chat-onnx]][release-yi-6b-chat-onnx] | [![Download][download-yi-6b-chat-mnn]][release-yi-6b-chat-mnn] |
| deepseek-llm-7b-chat | [![Download][download-deepseek-7b-chat-onnx]][release-deepseek-7b-chat-onnx] | [![Download][download-deepseek-7b-chat-mnn]][release-deepseek-7b-chat-mnn] |
-| Qwen-1_8B-Chat | [![Download][download-qwen-1.8b-onnx]][release-qwen-1.8b-onnx] | [![Download][download-qwen-1.8b-mnn]][release-qwen-1.8b-mnn] |
+| Qwen-1.8B-Chat | [![Download][download-qwen-1.8b-onnx]][release-qwen-1.8b-onnx] | [![Download][download-qwen-1.8b-mnn]][release-qwen-1.8b-mnn] |
| phi-2 | [![Download][download-phi-2-onnx]][release-phi-2-onnx] | [![Download][download-phi2-mnn-int4]][release-phi2-mnn-int4] |
| bge-large-zh | [![Download][download-bge-large-zh-onnx]][release-bge-large-zh-onnx] | [![Download][download-bge-large-zh-mnn]][release-bge-large-zh-mnn] |
| TinyLlama-1.1B-Chat | [![Download][download-tinyllama-1.1b-chat-onnx]][release-tinyllama-1.1b-chat-onnx] | [![Download][download-tinyllama-1.1b-chat-mnn-int8]][release-tinyllama-1.1b-chat-mnn-int8] |
+| Qwen1.5-0.5B-Chat | [![Download][download-qwen1.5-0.5b-onnx]][release-qwen1.5-0.5b-onnx] | [![Download][download-qwen1.5-0.5b-mnn]][release-qwen1.5-0.5b-mnn] |
+| Qwen1.5-1.8B-Chat | [![Download][download-qwen1.5-1.8b-onnx]][release-qwen1.5-1.8b-onnx] | [![Download][download-qwen1.5-1.8b-mnn]][release-qwen1.5-1.8b-mnn] |
+| Qwen1.5-4B-Chat | [![Download][download-qwen1.5-4b-onnx]][release-qwen1.5-4b-onnx] | [![Download][download-qwen1.5-4b-mnn]][release-qwen1.5-4b-mnn] |

Other versions:
- Qwen-1_8B-Chat-int8:[![Download][download-qwen-1.8b-mnn-int8]][release-qwen-1.8b-mnn-int8]
@@ -55,6 +58,10 @@ To export LLMs as `onnx` and `mnn` models, use [llm-export](https://github.com/wang
[download-phi-2-onnx]: https://img.shields.io/github/downloads/wangzhaode/llm-export/phi-2-onnx/total
[download-bge-large-zh-onnx]: https://img.shields.io/github/downloads/wangzhaode/llm-export/bge-large-zh-onnx/total
[download-tinyllama-1.1b-chat-onnx]: https://img.shields.io/github/downloads/wangzhaode/llm-export/tinyllama-1.1b-chat-onnx/total
+[download-phi-2-onnx]: https://img.shields.io/github/downloads/wangzhaode/llm-export/phi-2-onnx/total
+[download-qwen1.5-0.5b-onnx]: https://img.shields.io/github/downloads/wangzhaode/llm-export/qwen1.5-0.5b-chat-onnx/total
+[download-qwen1.5-1.8b-onnx]: https://img.shields.io/github/downloads/wangzhaode/llm-export/qwen1.5-1.8b-chat-onnx/total
+[download-qwen1.5-4b-onnx]: https://img.shields.io/github/downloads/wangzhaode/llm-export/qwen1.5-4b-chat-onnx/total
[release-chatglm-6b-onnx]: https://github.com/wangzhaode/llm-export/releases/tag/chatglm-6b-onnx
[release-chatglm2-6b-onnx]: https://github.com/wangzhaode/llm-export/releases/tag/chatglm2-6b-onnx
[release-chatglm3-6b-onnx]: https://github.com/wangzhaode/llm-export/releases/tag/chatglm3-6b-onnx
@@ -69,6 +76,9 @@ To export LLMs as `onnx` and `mnn` models, use [llm-export](https://github.com/wang
[release-phi-2-onnx]: https://github.com/wangzhaode/llm-export/releases/tag/phi-2-onnx
[release-bge-large-zh-onnx]: https://github.com/wangzhaode/llm-export/releases/tag/bge-large-zh-onnx
[release-tinyllama-1.1b-chat-onnx]: https://github.com/wangzhaode/llm-export/releases/tag/tinyllama-1.1b-chat-onnx
+[release-qwen1.5-0.5b-onnx]: https://github.com/wangzhaode/llm-export/releases/tag/qwen1.5-0.5b-chat-onnx
+[release-qwen1.5-1.8b-onnx]: https://github.com/wangzhaode/llm-export/releases/tag/qwen1.5-1.8b-chat-onnx
+[release-qwen1.5-4b-onnx]: https://github.com/wangzhaode/llm-export/releases/tag/qwen1.5-4b-chat-onnx
[download-chatglm-6b-mnn]: https://img.shields.io/github/downloads/wangzhaode/mnn-llm/chatglm-6b-mnn/total
[download-chatglm2-6b-mnn]: https://img.shields.io/github/downloads/wangzhaode/mnn-llm/chatglm2-6b-mnn/total
[download-chatglm3-6b-mnn]: https://img.shields.io/github/downloads/wangzhaode/mnn-llm/chatglm3-6b-mnn/total
@@ -85,6 +95,9 @@ To export LLMs as `onnx` and `mnn` models, use [llm-export](https://github.com/wang
[download-qwen-1.8b-mnn-int8]: https://img.shields.io/github/downloads/wangzhaode/mnn-llm/qwen-1.8b-mnn-int8/total
[download-tinyllama-1.1b-chat-mnn-int8]: https://img.shields.io/github/downloads/wangzhaode/mnn-llm/tinyllama-1.1b-chat-mnn-int8/total
[download-qwen-1.8b-apk]: https://img.shields.io/github/downloads/wangzhaode/mnn-llm/qwen-1.8b-apk/total
+[download-qwen1.5-0.5b-mnn]: https://img.shields.io/github/downloads/wangzhaode/mnn-llm/qwen1.5-0.5b-chat-mnn/total
+[download-qwen1.5-1.8b-mnn]: https://img.shields.io/github/downloads/wangzhaode/mnn-llm/qwen1.5-1.8b-chat-mnn/total
+[download-qwen1.5-4b-mnn]: https://img.shields.io/github/downloads/wangzhaode/mnn-llm/qwen1.5-4b-chat-mnn/total
[release-chatglm-6b-mnn]: https://github.com/wangzhaode/mnn-llm/releases/tag/chatglm-6b-mnn
[release-chatglm2-6b-mnn]: https://github.com/wangzhaode/mnn-llm/releases/tag/chatglm2-6b-mnn
[release-chatglm3-6b-mnn]: https://github.com/wangzhaode/mnn-llm/releases/tag/chatglm3-6b-mnn
@@ -101,6 +114,9 @@ To export LLMs as `onnx` and `mnn` models, use [llm-export](https://github.com/wang
[release-qwen-1.8b-mnn-int8]: https://github.com/wangzhaode/mnn-llm/releases/tag/qwen-1.8b-mnn-int8
[release-tinyllama-1.1b-chat-mnn-int8]: https://github.com/wangzhaode/mnn-llm/releases/tag/tinyllama-1.1b-chat-mnn-int8
[release-qwen-1.8b-apk]: https://github.com/wangzhaode/mnn-llm/releases/tag/qwen-1.8b-apk
+[release-qwen1.5-0.5b-mnn]: https://github.com/wangzhaode/mnn-llm/releases/tag/qwen1.5-0.5b-chat-mnn
+[release-qwen1.5-1.8b-mnn]: https://github.com/wangzhaode/mnn-llm/releases/tag/qwen1.5-1.8b-chat-mnn
+[release-qwen1.5-4b-mnn]: https://github.com/wangzhaode/mnn-llm/releases/tag/qwen1.5-4b-chat-mnn


### Speed
@@ -220,4 +236,4 @@ adb shell "cd /data/local/tmp && export LD_LIBRARY_PATH=. && ./cli_demo qwen-1.8
- [cpp-httplib](https://github.com/yhirose/cpp-httplib)
- [chatgpt-web](https://github.com/xqdoo00o/chatgpt-web)
- [ChatViewDemo](https://github.com/BrettFX/ChatViewDemo)
-- [nlohmann/json](https://github.com/nlohmann/json)
+- [nlohmann/json](https://github.com/nlohmann/json)
21 changes: 18 additions & 3 deletions README_en.md
@@ -33,10 +33,13 @@ Current supported models:
| internlm-chat-7b | [![Download][download-internlm-7b-onnx]][release-internlm-7b-onnx] | [![Download][download-internlm-chat-7b-mnn]][release-internlm-chat-7b-mnn] |
| Yi-6B-Chat | [![Download][download-yi-6b-chat-onnx]][release-yi-6b-chat-onnx] | [![Download][download-yi-6b-chat-mnn]][release-yi-6b-chat-mnn] |
| deepseek-llm-7b-chat | [![Download][download-deepseek-7b-chat-onnx]][release-deepseek-7b-chat-onnx] | [![Download][download-deepseek-7b-chat-mnn]][release-deepseek-7b-chat-mnn] |
-| Qwen-1_8B-Chat | [![Download][download-qwen-1.8b-onnx]][release-qwen-1.8b-onnx] | [![Download][download-qwen-1.8b-mnn]][release-qwen-1.8b-mnn] |
+| Qwen-1.8B-Chat | [![Download][download-qwen-1.8b-onnx]][release-qwen-1.8b-onnx] | [![Download][download-qwen-1.8b-mnn]][release-qwen-1.8b-mnn] |
| phi-2 | [![Download][download-phi-2-onnx]][release-phi-2-onnx] | [![Download][download-phi2-mnn-int4]][release-phi2-mnn-int4] |
| bge-large-zh | [![Download][download-bge-large-zh-onnx]][release-bge-large-zh-onnx] | [![Download][download-bge-large-zh-mnn]][release-bge-large-zh-mnn] |
| TinyLlama-1.1B-Chat | [![Download][download-tinyllama-1.1b-chat-onnx]][release-tinyllama-1.1b-chat-onnx] | [![Download][download-tinyllama-1.1b-chat-mnn-int8]][release-tinyllama-1.1b-chat-mnn-int8] |
+| Qwen1.5-0.5B-Chat | [![Download][download-qwen1.5-0.5b-onnx]][release-qwen1.5-0.5b-onnx] | [![Download][download-qwen1.5-0.5b-mnn]][release-qwen1.5-0.5b-mnn] |
+| Qwen1.5-1.8B-Chat | [![Download][download-qwen1.5-1.8b-onnx]][release-qwen1.5-1.8b-onnx] | [![Download][download-qwen1.5-1.8b-mnn]][release-qwen1.5-1.8b-mnn] |
+| Qwen1.5-4B-Chat | [![Download][download-qwen1.5-4b-onnx]][release-qwen1.5-4b-onnx] | [![Download][download-qwen1.5-4b-mnn]][release-qwen1.5-4b-mnn] |

Other version:
- Qwen-1_8B-Chat-int8:[![Download][download-qwen-1.8b-mnn-int8]][release-qwen-1.8b-mnn-int8]
@@ -55,6 +58,10 @@ Other version:
[download-phi-2-onnx]: https://img.shields.io/github/downloads/wangzhaode/llm-export/phi-2-onnx/total
[download-bge-large-zh-onnx]: https://img.shields.io/github/downloads/wangzhaode/llm-export/bge-large-zh-onnx/total
[download-tinyllama-1.1b-chat-onnx]: https://img.shields.io/github/downloads/wangzhaode/llm-export/tinyllama-1.1b-chat-onnx/total
+[download-phi-2-onnx]: https://img.shields.io/github/downloads/wangzhaode/llm-export/phi-2-onnx/total
+[download-qwen1.5-0.5b-onnx]: https://img.shields.io/github/downloads/wangzhaode/llm-export/qwen1.5-0.5b-chat-onnx/total
+[download-qwen1.5-1.8b-onnx]: https://img.shields.io/github/downloads/wangzhaode/llm-export/qwen1.5-1.8b-chat-onnx/total
+[download-qwen1.5-4b-onnx]: https://img.shields.io/github/downloads/wangzhaode/llm-export/qwen1.5-4b-chat-onnx/total
[release-chatglm-6b-onnx]: https://github.com/wangzhaode/llm-export/releases/tag/chatglm-6b-onnx
[release-chatglm2-6b-onnx]: https://github.com/wangzhaode/llm-export/releases/tag/chatglm2-6b-onnx
[release-chatglm3-6b-onnx]: https://github.com/wangzhaode/llm-export/releases/tag/chatglm3-6b-onnx
@@ -69,6 +76,9 @@ Other version:
[release-phi-2-onnx]: https://github.com/wangzhaode/llm-export/releases/tag/phi-2-onnx
[release-bge-large-zh-onnx]: https://github.com/wangzhaode/llm-export/releases/tag/bge-large-zh-onnx
[release-tinyllama-1.1b-chat-onnx]: https://github.com/wangzhaode/llm-export/releases/tag/tinyllama-1.1b-chat-onnx
+[release-qwen1.5-0.5b-onnx]: https://github.com/wangzhaode/llm-export/releases/tag/qwen1.5-0.5b-chat-onnx
+[release-qwen1.5-1.8b-onnx]: https://github.com/wangzhaode/llm-export/releases/tag/qwen1.5-1.8b-chat-onnx
+[release-qwen1.5-4b-onnx]: https://github.com/wangzhaode/llm-export/releases/tag/qwen1.5-4b-chat-onnx
[download-chatglm-6b-mnn]: https://img.shields.io/github/downloads/wangzhaode/mnn-llm/chatglm-6b-mnn/total
[download-chatglm2-6b-mnn]: https://img.shields.io/github/downloads/wangzhaode/mnn-llm/chatglm2-6b-mnn/total
[download-chatglm3-6b-mnn]: https://img.shields.io/github/downloads/wangzhaode/mnn-llm/chatglm3-6b-mnn/total
@@ -85,6 +95,9 @@ Other version:
[download-qwen-1.8b-mnn-int8]: https://img.shields.io/github/downloads/wangzhaode/mnn-llm/qwen-1.8b-mnn-int8/total
[download-tinyllama-1.1b-chat-mnn-int8]: https://img.shields.io/github/downloads/wangzhaode/mnn-llm/tinyllama-1.1b-chat-mnn-int8/total
[download-qwen-1.8b-apk]: https://img.shields.io/github/downloads/wangzhaode/mnn-llm/qwen-1.8b-apk/total
+[download-qwen1.5-0.5b-mnn]: https://img.shields.io/github/downloads/wangzhaode/mnn-llm/qwen1.5-0.5b-chat-mnn/total
+[download-qwen1.5-1.8b-mnn]: https://img.shields.io/github/downloads/wangzhaode/mnn-llm/qwen1.5-1.8b-chat-mnn/total
+[download-qwen1.5-4b-mnn]: https://img.shields.io/github/downloads/wangzhaode/mnn-llm/qwen1.5-4b-chat-mnn/total
[release-chatglm-6b-mnn]: https://github.com/wangzhaode/mnn-llm/releases/tag/chatglm-6b-mnn
[release-chatglm2-6b-mnn]: https://github.com/wangzhaode/mnn-llm/releases/tag/chatglm2-6b-mnn
[release-chatglm3-6b-mnn]: https://github.com/wangzhaode/mnn-llm/releases/tag/chatglm3-6b-mnn
@@ -101,7 +114,9 @@ Other version:
[release-qwen-1.8b-mnn-int8]: https://github.com/wangzhaode/mnn-llm/releases/tag/qwen-1.8b-mnn-int8
[release-tinyllama-1.1b-chat-mnn-int8]: https://github.com/wangzhaode/mnn-llm/releases/tag/tinyllama-1.1b-chat-mnn-int8
[release-qwen-1.8b-apk]: https://github.com/wangzhaode/mnn-llm/releases/tag/qwen-1.8b-apk

+[release-qwen1.5-0.5b-mnn]: https://github.com/wangzhaode/mnn-llm/releases/tag/qwen1.5-0.5b-chat-mnn
+[release-qwen1.5-1.8b-mnn]: https://github.com/wangzhaode/mnn-llm/releases/tag/qwen1.5-1.8b-chat-mnn
+[release-qwen1.5-4b-mnn]: https://github.com/wangzhaode/mnn-llm/releases/tag/qwen1.5-4b-chat-mnn

### Performance

@@ -213,4 +228,4 @@ adb shell "cd /data/local/tmp && export LD_LIBRARY_PATH=. && ./cli_demo qwen-1.8
- [cpp-httplib](https://github.com/yhirose/cpp-httplib)
- [chatgpt-web](https://github.com/xqdoo00o/chatgpt-web)
- [ChatViewDemo](https://github.com/BrettFX/ChatViewDemo)
-- [nlohmann/json](https://github.com/nlohmann/json)
+- [nlohmann/json](https://github.com/nlohmann/json)
47 changes: 42 additions & 5 deletions include/llm.hpp
@@ -240,18 +240,55 @@ class Llama2_7b : public Llm {
     virtual bool is_stop(int token_id) override;
 };

-class Qwen2_4b : public Llama2_7b {
+class Qwen2 : public Llama2_7b {
 public:
+    Qwen2() {
+        model_name_ = "Qwen2";
+        tokenizer_.reset(new HuggingfaceTokenizer);
+    }
+private:
+    virtual std::vector<int> tokenizer(const std::string& query) override;
+    virtual bool is_stop(int token_id) override;
+};
+
+class Qwen2_0_5b : public Qwen2 {
+public:
+    Qwen2_0_5b() {
+        model_name_ = "Qwen2_0.5b";
+        layer_nums_ = 24;
+        key_value_shape_ = {2, 1, 16, 0, 64};
+        hidden_size_ = 1024;
+    }
+};
+
+class Qwen2_1_8b : public Qwen2 {
+public:
+    Qwen2_1_8b() {
+        model_name_ = "Qwen2_1.8b";
+        layer_nums_ = 24;
+        key_value_shape_ = {2, 1, 16, 0, 128};
+        hidden_size_ = 2048;
+    }
+};
+
+class Qwen2_4b : public Qwen2 {
+public:
     Qwen2_4b() {
         model_name_ = "Qwen2_4b";
         layer_nums_ = 40;
         key_value_shape_ = {2, 1, 20, 0, 128};
         hidden_size_ = 2560;
-        tokenizer_.reset(new HuggingfaceTokenizer);
     }
-private:
-    virtual std::vector<int> tokenizer(const std::string& query) override;
-    virtual bool is_stop(int token_id) override;
 };
+
+class Qwen2_7b : public Qwen2 {
+public:
+    Qwen2_7b() {
+        model_name_ = "Qwen2_7b";
+        layer_nums_ = 32;
+        key_value_shape_ = {2, 1, 32, 0, 128};
+        hidden_size_ = 4096;
+    }
+};

 class TinyLlama : public Llama2_7b {
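
One pattern visible in the new constructors is that the third and fifth entries of `key_value_shape_` multiply out to `hidden_size_` in every variant (16 × 64 = 1024, 16 × 128 = 2048, 20 × 128 = 2560, 32 × 128 = 4096). The sketch below checks that relationship, assuming the shape is laid out as {key/value, batch, heads, sequence length, head dimension}. That layout is an interpretation, not something the diff states, and `QwenConfig`, `qwen_configs`, `attention_width`, `kv_elements_per_token`, and `check_configs` are hypothetical names introduced only for illustration.

```cpp
#include <array>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical plain-data mirror of the fields set in the Qwen2_* constructors.
struct QwenConfig {
    std::string name;
    int layer_nums;
    // Assumed layout: {key/value, batch, heads, seq_len (grows at runtime), head_dim}.
    std::array<int, 5> key_value_shape;
    int hidden_size;
};

// Values copied verbatim from the diff.
inline std::vector<QwenConfig> qwen_configs() {
    return {
        {"Qwen2_0.5b", 24, {2, 1, 16, 0, 64}, 1024},
        {"Qwen2_1.8b", 24, {2, 1, 16, 0, 128}, 2048},
        {"Qwen2_4b", 40, {2, 1, 20, 0, 128}, 2560},
        {"Qwen2_7b", 32, {2, 1, 32, 0, 128}, 4096},
    };
}

// heads * head_dim under the assumed layout.
inline int attention_width(const QwenConfig& c) {
    return c.key_value_shape[2] * c.key_value_shape[4];
}

// Number of cache elements appended per generated token across all layers,
// again under the assumed layout (keys and values, batch size 1).
inline long long kv_elements_per_token(const QwenConfig& c) {
    return static_cast<long long>(c.layer_nums) * c.key_value_shape[0] *
           c.key_value_shape[1] * attention_width(c);
}

// Sanity check: every variant in the diff satisfies heads * head_dim == hidden_size.
inline void check_configs() {
    for (const auto& c : qwen_configs()) {
        assert(attention_width(c) == c.hidden_size);
    }
}
```

Under these assumptions, the 0.5B variant would add 24 × 2 × 1024 = 49152 cache elements per token. A check like `check_configs()` could catch a mistyped head count or head dimension when a new size is added.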
15 changes: 11 additions & 4 deletions src/llm.cpp
@@ -47,9 +47,16 @@ Llm* Llm::createLLM(const std::string& path, std::string model_type) {
     } else if (model_type.find("codegeex2") != std::string::npos) {
         llm = new Chatglm2_6b;
         llm->model_name_ = "Codegeex2_6b";
-    } else if (model_type.find("qwen2") != std::string::npos) {
-        if (model_type.find("4") != std::string::npos) {
+    } else if (model_type.find("qwen1.5") != std::string::npos ||
+               model_type.find("qwen2") != std::string::npos) {
+        if (model_type.find("0.5b") != std::string::npos) {
+            llm = new Qwen2_0_5b;
+        } else if (model_type.find("1.8b") != std::string::npos) {
+            llm = new Qwen2_1_8b;
+        } else if (model_type.find("4b") != std::string::npos) {
             llm = new Qwen2_4b;
+        } else if (model_type.find("7b") != std::string::npos) {
+            llm = new Qwen2_7b;
         }
     } else if (model_type.find("qwen") != std::string::npos) {
         if (model_type.find("1.8") != std::string::npos) {
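
The branch above picks a class by substring search on `model_type`, and the order of the tests matters: `qwen1.5` and `qwen2` have to be checked before the broader `qwen` test, which would otherwise match them first. Below is a minimal standalone sketch of that selection logic, returning class names as strings instead of allocating objects. `select_qwen_class` and the `"Qwen (first generation)"` placeholder are hypothetical and not part of the repository; the rest of the first-generation branch is not shown in this hunk, so it is not modeled.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper mirroring the substring dispatch in Llm::createLLM.
// Returns the name of the class the branch would select, or an empty
// string when no size substring matches inside the qwen1.5/qwen2 branch.
inline std::string select_qwen_class(const std::string& model_type) {
    auto has = [&model_type](const char* needle) {
        return model_type.find(needle) != std::string::npos;
    };
    if (has("qwen1.5") || has("qwen2")) {
        if (has("0.5b")) return "Qwen2_0_5b";
        if (has("1.8b")) return "Qwen2_1_8b";
        if (has("4b")) return "Qwen2_4b";
        if (has("7b")) return "Qwen2_7b";
        return "";  // the diff assigns nothing to llm in this case
    }
    if (has("qwen")) {
        // Placeholder: the first-generation branch is only partly visible in the diff.
        return "Qwen (first generation)";
    }
    return "";
}

// Usage: the more specific prefix wins because it is tested first.
inline void dispatch_examples() {
    assert(select_qwen_class("qwen1.5-0.5b-chat") == "Qwen2_0_5b");
    assert(select_qwen_class("qwen-1.8b") == "Qwen (first generation)");
}
```

Because the match is by substring, a name such as `qwen1.5-14b-chat` would also satisfy the `4b` test in this sketch; whether that matters depends on which model names callers actually pass.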
@@ -734,15 +741,15 @@ bool Llama2_7b::is_stop(int token_id) {
     return token_id == 2;
 }

-std::vector<int> Qwen2_4b::tokenizer(const std::string& query) {
+std::vector<int> Qwen2::tokenizer(const std::string& query) {
     auto ids = tokenizer_encode(query);
     // auto prompt = "<|im_start|>user\n" + query + "<|im_end|>\n<|im_start|>assistant\n";
     ids.insert(ids.begin(), {151644, 872, 198});
     ids.insert(ids.end(), {151645, 198, 151644, 77091, 198});
     return ids;
 }

-bool Qwen2_4b::is_stop(int token_id) {
+bool Qwen2::is_stop(int token_id) {
     return token_id == 151645 || token_id == 151643;
 }
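
`Qwen2::tokenizer` builds the chat prompt from token ids rather than by concatenating strings. Read against the commented-out template, the prefix `{151644, 872, 198}` appears to correspond to `<|im_start|>user\n` and the suffix `{151645, 198, 151644, 77091, 198}` to `<|im_end|>\n<|im_start|>assistant\n`; this mapping is inferred from the comment and is not verified against a tokenizer here. The sketch below reproduces the wrapping and the stop check as free functions with hypothetical names (`wrap_chat_prompt`, `is_stop_token`), taking already-encoded query ids as input.

```cpp
#include <cassert>
#include <vector>

// Wraps encoded query ids with the chat-template ids used in the diff.
// The function name is hypothetical; the id values are copied from Qwen2::tokenizer.
inline std::vector<int> wrap_chat_prompt(const std::vector<int>& query_ids) {
    static const std::vector<int> prefix = {151644, 872, 198};
    static const std::vector<int> suffix = {151645, 198, 151644, 77091, 198};

    std::vector<int> ids;
    ids.reserve(prefix.size() + query_ids.size() + suffix.size());
    ids.insert(ids.end(), prefix.begin(), prefix.end());
    ids.insert(ids.end(), query_ids.begin(), query_ids.end());
    ids.insert(ids.end(), suffix.begin(), suffix.end());

    // Postcondition: the template always adds eight ids around the query.
    assert(ids.size() == query_ids.size() + 8);
    return ids;
}

// Same condition as Qwen2::is_stop in the diff: generation ends on either id.
inline bool is_stop_token(int token_id) {
    return token_id == 151645 || token_id == 151643;
}
```

Building the output once and appending to it avoids the front insertion used in the original, which shifts every existing element; for short prompts the difference is unlikely to be measurable, so this is a stylistic choice, not a fix.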
