
Commit

Migrate to web-rwkv v0.2
cryscan committed Aug 27, 2023
1 parent f92cafb commit a5ef311
Showing 6 changed files with 212 additions and 137 deletions.
100 changes: 68 additions & 32 deletions Cargo.lock

Some generated files are not rendered by default.

4 changes: 2 additions & 2 deletions Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "ai00_server"
version = "0.1.11"
version = "0.1.12"
edition = "2021"
authors = ["Gu ZhenNiu <[email protected]>", "Zhang Zhenyuan <[email protected]>"]
license = "MIT OR Apache-2.0"
@@ -18,7 +18,7 @@ axum = { git = "https://github.com/cryscan/axum", branch = "sse-leading-space" }
tower = { version = "0.4", features = ["util"] }
tower-http = { version = "0.4", features = ["full"] }
tokio = { version = "1", features = ["full"] }
web-rwkv = "0.1.18"
web-rwkv = "0.2.0"
memmap = "0.7"
regex = "1.8"
dialoguer = "0.10"
6 changes: 3 additions & 3 deletions README.md
@@ -97,14 +97,14 @@ QQ Group for communication: 30920262
* `--tokenizer`: Tokenizer path
* `--port`: Running port
* `--quant`: Specify the number of quantization layers
-* `--adaptor`: Adapter (GPU and backend) selection options
+* `--adapter`: Adapter (GPU and backend) selection options

### Example

-The server listens on port 3000, loads the full-layer quantized (32 > 24) 0.4B model, and selects adapter 0 (to get the specific adapter number, you can first not add this parameter, and the program will enter the adapter selection page).
+The server listens on port 3000, loads the full-layer quantized (32 > 24) 0.4B model, and automatically selects the high-performance adapter.

```bash
-$ cargo run --release -- --model assets/models/RWKV-4-World-0.4B-v1-20230529-ctx4096.st --port 3000 --quant 32 --adaptor 0
+$ cargo run --release -- --model assets/models/RWKV-4-World-0.4B-v1-20230529-ctx4096.st --port 3000 --quant 32 --adapter auto
```

## 📙Currently Available APIs
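The README change above only says that `--adapter auto` "selects the high-performance adapter." As a rough illustration of what that distinction usually means one level down, here is a minimal sketch in terms of the wgpu API that web-rwkv builds on: `auto` maps naturally to requesting an adapter with a high-performance power preference, while a numeric value maps to picking an entry from the enumerated adapter list. The `AdapterChoice` type, the `select_adapter` function, and the use of `pollster` are illustrative assumptions, not code from this repository.

```rust
// Illustrative sketch only (wgpu 0.17-era API); not taken from ai00_server or web-rwkv.
use wgpu::{
    Adapter, Backends, Instance, InstanceDescriptor, PowerPreference, RequestAdapterOptions,
};

/// Hypothetical parsed form of the `--adapter` flag.
enum AdapterChoice {
    Auto,         // `--adapter auto`: let wgpu pick a high-performance adapter
    Index(usize), // `--adapter 0`: pick the n-th enumerated adapter
}

fn select_adapter(choice: AdapterChoice) -> Option<Adapter> {
    let instance = Instance::new(InstanceDescriptor::default());
    match choice {
        // Ask wgpu for whichever adapter it considers high-performance
        // (typically the discrete GPU when one is present).
        AdapterChoice::Auto => pollster::block_on(instance.request_adapter(&RequestAdapterOptions {
            power_preference: PowerPreference::HighPerformance,
            force_fallback_adapter: false,
            compatible_surface: None,
        })),
        // Enumerate every adapter across all backends, print the list so the
        // user can see which index maps to which GPU, then pick by index.
        AdapterChoice::Index(n) => {
            let adapters: Vec<Adapter> =
                instance.enumerate_adapters(Backends::all()).into_iter().collect();
            for (i, adapter) in adapters.iter().enumerate() {
                println!("{i}: {:?}", adapter.get_info());
            }
            adapters.into_iter().nth(n)
        }
    }
}
```

Selecting by power preference avoids hard-coding an adapter index that can change from machine to machine, which is presumably why the examples switch to `auto`.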
6 changes: 3 additions & 3 deletions README_jp.md
@@ -95,14 +95,14 @@ OpenAIのChatGPT APIインターフェースと互換性があります。
* `--tokenizer`: トークナイザーのパス
* `--port`: 実行ポート
* `--quant`: 量子化レイヤーの数を指定
-* `--adaptor`: アダプター(GPUおよびバックエンド)の選択オプション
+* `--adapter`: アダプター(GPUおよびバックエンド)の選択オプション

### 例

-サーバーはポート3000でリッスンし、全レイヤー量子化(32 > 24)の0.4Bモデルをロードし、アダプター0を選択します(特定のアダプター番号を取得するには、最初にこのパラメーターを追加せず、プログラムがアダプター選択ページに入るまで待ちます)
+サーバーはポート3000でリッスンし、全レイヤー量子化(32 > 24)の0.4Bモデルをロードし、高性能アダプターを自動的に選択します

```bash
-$ cargo run --release -- --model assets/models/RWKV-4-World-0.4B-v1-20230529-ctx4096.st --port 3000 --quant 32 --adaptor 0
+$ cargo run --release -- --model assets/models/RWKV-4-World-0.4B-v1-20230529-ctx4096.st --port 3000 --quant 32 --adapter auto
```

## 📙現在利用可能なAPI
6 changes: 3 additions & 3 deletions README_zh.md
@@ -104,13 +104,13 @@
- `--tokenizer`: 词表路径
- `--port`: 运行端口
- `--quant`: 指定量化层数
-- `--adaptor`: 适配器(GPU和后端)选择项
+- `--adapter`: 适配器(GPU和后端)选择项

### 示例

-服务器监听3000端口,加载全部层量化(32 > 24)的0.4B模型,选择0号适配器(要查看具体适配器编号可以先不加该参数,程序会先进入选择页面)
+服务器监听3000端口,加载全部层量化(32 > 24)的0.4B模型,自动选择高性能适配器
```bash
-$ cargo run --release -- --model assets/models/RWKV-4-World-0.4B-v1-20230529-ctx4096.st --port 3000 --quant 32 --adaptor 0
+$ cargo run --release -- --model assets/models/RWKV-4-World-0.4B-v1-20230529-ctx4096.st --port 3000 --quant 32 --adapter auto
```


