3 changes: 3 additions & 0 deletions .gitignore
@@ -42,6 +42,9 @@ yarn-error.log*
config.json
.node

# Local dev app data
src-tauri/com.mine-kb.app/

release

oblite.so
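90 changes: 76 additions & 14 deletions README-ZH.md
@@ -24,7 +24,7 @@ MineKB is a cross-platform desktop application built on Tauri, designed to help users
- **Vector search**: Uses semantic search to quickly locate relevant document content
- **Streaming output**: Streams AI-generated answers in real time for a smooth user experience
- **Voice interaction**: Supports voice input to make knowledge queries more convenient
- **Local storage**: All data is stored in a local embedded database, protecting privacy and secur
- **Local storage**: All data is stored in a local embedded database, protecting privacy and security

## Basic Principles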

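@@ -46,7 +46,7 @@ MineKB employs a RAG (Retrieval-Augmented Generation) architecture, combining vector retrieval
- Generation of document embeddings using the Alibaba Cloud Bailian API

2. **Vector Storage**
- SeekDB 0.0.1.dev4 as an embedded vector database (accessed via Python subprocess)
- [seekdb-rs](https://github.com/ob-labs/seekdb-rs) embedded vector database (Rust native, no Python dependency)
- Native support for vector types and HNSW indexing for efficient vector retrieval
- Project-level data isolation and transaction support
- Vector column output and database existence validation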
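@@ -106,14 +106,12 @@ MineKB employs a RAG (Retrieval-Augmented Generation) architecture, combining vector retrieval
- `@tauri-apps/api 1.5` - Frontend API library
- `@tauri-apps/cli 1.5` - Command-line tools
- Enabled features: `path-all`, `http-all`, `dialog-all`, `fs-all`, `shell-open`
- **Python 3.8+** - SeekDB database operations (via subprocess communication)

**Database**
- **SeekDB 0.0.1.dev4** (Python) - AI-Native embedded vector database
- **seekdb-rs** (Rust) - AI-Native embedded vector database, no Python dependency
- Native support for vector types and HNSW indexing
- Hybrid search and full-text search support
- High-performance vector similarity computation
- Communication with Rust via the JSON-RPC protocol

### Rust Core Dependencies

@@ -122,8 +120,7 @@ MineKB employs a RAG (Retrieval-Augmented Generation) architecture, combining vector retrieval
- `docx-rs 0.4` - Word document processing

**Data Storage**
- `seekdb 0.0.1.dev4` (Python) - AI-Native embedded database with native vector indexing and HNSW retrieval
- JSON communication protocol - Rust to Python subprocess communication
- `seekdb-rs` (Rust) - AI-Native embedded database with native vector indexing and HNSW retrieval, no Python

**Vector Computation**
- SeekDB native vector indexing (HNSW) - Efficient vector similarity search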
@@ -166,25 +163,87 @@ MineKB employs a RAG (Retrieval-Augmented Generation) architecture, combining vector retrieval
## System Architecture

### Architecture Overview
<img src="https://mdn.alipayobjects.com/huamei_ytl0i7/afts/img/A*Cuf4RoPSfwMAAAAAT-AAAAgAejCYAQ/original">

```mermaid
graph TB
subgraph Frontend["Frontend Layer"]
UI[React UI Components]
State[State Management]
end

subgraph Command["Command Layer (Tauri)"]
CMD_Project[Project Commands]
CMD_Doc[Document Commands]
CMD_Chat[Conversation Commands]
CMD_Speech[Speech Commands]
end

subgraph Service["Service Layer (Rust)"]
SVC_Project[ProjectService]
SVC_Doc[DocumentService]
SVC_Conv[ConversationService]
SVC_Embed[EmbeddingService]
SVC_LLM[LLMClient]
SVC_Speech[SpeechService]
end

subgraph Data["Data Layer"]
Adapter[SeekDbAdapter]
Client[seekdb-rs Client]
DB[(Embedded SeekDB)]
Tables[Relational Tables]
VectorColl[Vector Collection + HNSW]
end

subgraph External["External Services"]
DashScope[Alibaba Cloud Bailian API<br/>Embedding + LLM]
end

UI --> Command
State --> Command
CMD_Project --> SVC_Project
CMD_Doc --> SVC_Doc
CMD_Chat --> SVC_Conv
CMD_Speech --> SVC_Speech

SVC_Doc --> SVC_Embed
SVC_Conv --> SVC_LLM
SVC_Project --> Adapter
SVC_Doc --> Adapter
SVC_Conv --> Adapter

Adapter --> Client
Client --> DB
DB --> Tables
DB --> VectorColl

SVC_Embed --> DashScope
SVC_LLM --> DashScope
```

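- **Frontend**: React + TypeScript, responsible for state and UI.
- **Command Layer**: Tauri commands (project, document, conversation, speech) connect the frontend to the Rust services.
- **Service Layer**: ProjectService, DocumentService, ConversationService, EmbeddingService, LLMClient, SpeechService.
- **Data Layer**: SeekDbAdapter uses the **seekdb-rs** async client to talk to the embedded SeekDB (SQL + vector collection); no Python.
- **External**: The Alibaba Cloud Bailian API provides embeddings and the LLM.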

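## Quick Start

### Requirements

- Node.js 16+
- Rust 1.70+
- Python 3.8+
**Build / development environment** (for local development or packaging):

> **Note**: SeekDB currently only releases Linux builds; macOS support is coming soon. macOS users are recommended to use the [UTM](https://mac.getutm.app) virtual machine manager to run [Ubuntu 20.x or later](https://mac.getutm.app/gallery/ubuntu-20-04).
- Node.js 16+ (frontend and Tauri CLI)
- Rust 1.70+ (Tauri backend)
- No Python required (seekdb-rs is Rust-native)

### Install Dependencies

```bash
# Install frontend dependencies
npm install

# Rust and Python dependencies are installed automatically during the build
# Rust dependencies are resolved automatically at build time
```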

### Configuration
@@ -201,6 +260,9 @@ cp src-tauri/config.example.json src-tauri/config.json
```bash
# Start development server
tnpm run tauri:dev

# Set the CONFIG_DIR environment variable to use a custom data directory
CONFIG_DIR=/path/to/your/data tnpm run tauri:dev
```

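### Build Application
@@ -234,7 +296,7 @@ cd src-tauri && cargo test
- ✅ **HNSW Indexing**: Professional vector indexing algorithm for faster and more accurate retrieval
- ✅ **AI-Native Features**: Built-in full-text search, hybrid search, and other AI capabilities
- ✅ **Better Scalability**: Supports larger datasets and more complex queries
- ✅ **Latest Version Features** (0.0.1.dev4): Vector column output, database validation, stable USE statement support
- ✅ **seekdb-rs** (Rust): Embedded client, no Python dependency, supports vector column output and database existence validation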

---

88 changes: 75 additions & 13 deletions README.md
@@ -46,7 +46,7 @@ MineKB employs a RAG (Retrieval-Augmented Generation) architecture, combining ve
- Generation of document embeddings using Alibaba Cloud Bailian API

2. **Vector Storage**
- SeekDB 0.0.1.dev4 as an embedded vector database (accessed via Python subprocess)
- [seekdb-rs](https://github.com/ob-labs/seekdb-rs) embedded vector database (Rust native, no Python)
- Native support for vector types and HNSW indexing for efficient vector retrieval
- Project-level data isolation and transaction support
- Vector column output and database existence validation
@@ -106,14 +106,12 @@ MineKB employs a RAG (Retrieval-Augmented Generation) architecture, combining ve
- `@tauri-apps/api 1.5` - Frontend API library
- `@tauri-apps/cli 1.5` - Command-line tools
- Enabled features: `path-all`, `http-all`, `dialog-all`, `fs-all`, `shell-open`
- **Python 3.8+** - SeekDB database operations (via subprocess communication)

**Database**
- **SeekDB 0.0.1.dev4** (Python) - AI-Native embedded vector database
- **seekdb-rs** (Rust) - AI-Native embedded vector database, no Python dependency
- Native support for vector types and HNSW indexing
- Hybrid search and full-text search support
- High-performance vector similarity computation
- Communication with Rust via JSON-RPC protocol

### Rust Core Dependencies

@@ -122,8 +120,7 @@ MineKB employs a RAG (Retrieval-Augmented Generation) architecture, combining ve
- `docx-rs 0.4` - Word document processing

**Data Storage**
- `seekdb 0.0.1.dev4` (Python) - AI-Native embedded database with native vector indexing and HNSW retrieval
- JSON communication protocol - Rust to Python subprocess communication
- `seekdb-rs` (Rust) - AI-Native embedded database with native vector indexing and HNSW retrieval, no Python

**Vector Computation**
- SeekDB native vector indexing (HNSW) - Efficient vector similarity search
@@ -166,25 +163,87 @@ MineKB employs a RAG (Retrieval-Augmented Generation) architecture, combining ve
## System Architecture

### Architecture Overview
<img src="https://mdn.alipayobjects.com/huamei_ytl0i7/afts/img/A*wk6ST4g16wYAAAAAgFAAAAgAejCYAQ/original">

```mermaid
graph TB
subgraph Frontend["Frontend Layer"]
UI[React UI Components]
State[State Management]
end

subgraph Command["Command Layer (Tauri)"]
CMD_Project[Project Commands]
CMD_Doc[Document Commands]
CMD_Chat[Conversation Commands]
CMD_Speech[Speech Commands]
end

subgraph Service["Service Layer (Rust)"]
SVC_Project[ProjectService]
SVC_Doc[DocumentService]
SVC_Conv[ConversationService]
SVC_Embed[EmbeddingService]
SVC_LLM[LLMClient]
SVC_Speech[SpeechService]
end

subgraph Data["Data Layer"]
Adapter[SeekDbAdapter]
Client[seekdb-rs Client]
DB[(Embedded SeekDB)]
Tables[Relational Tables]
VectorColl[Vector Collection + HNSW]
end

subgraph External["External Services"]
DashScope[Aliyun Bailian API<br/>Embedding + LLM]
end

UI --> Command
State --> Command
CMD_Project --> SVC_Project
CMD_Doc --> SVC_Doc
CMD_Chat --> SVC_Conv
CMD_Speech --> SVC_Speech

SVC_Doc --> SVC_Embed
SVC_Conv --> SVC_LLM
SVC_Project --> Adapter
SVC_Doc --> Adapter
SVC_Conv --> Adapter

Adapter --> Client
Client --> DB
DB --> Tables
DB --> VectorColl

SVC_Embed --> DashScope
SVC_LLM --> DashScope
```

- **Frontend**: React + TypeScript, responsible for state and UI.
- **Command Layer**: Tauri commands (project, document, conversation, speech) bridge the frontend and the Rust services.
- **Service Layer**: ProjectService, DocumentService, ConversationService, EmbeddingService, LLMClient, SpeechService.
- **Data Layer**: SeekDbAdapter uses the **seekdb-rs** async client to talk to the embedded SeekDB (SQL + vector collection); no Python.
- **External**: Aliyun Bailian API for embeddings and LLM.
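
The layering described above can be sketched in a few lines of Rust. This is a minimal, hypothetical illustration, not MineKB's code and not the seekdb-rs API: the `VectorStore` trait, `InMemoryStore`, and this simplified `DocumentService` are invented for the example, and the brute-force cosine scan only stands in for the query that an HNSW index would answer approximately.

```rust
// Hypothetical sketch: none of these types come from seekdb-rs or MineKB.

/// Data-layer abstraction that the service layer depends on.
trait VectorStore {
    fn insert(&mut self, id: &str, embedding: Vec<f32>);
    /// Returns the ids of the `k` most similar entries, most similar first.
    fn search(&self, query: &[f32], k: usize) -> Vec<String>;
}

/// Brute-force in-memory stand-in for the embedded database.
#[derive(Default)]
struct InMemoryStore {
    rows: Vec<(String, Vec<f32>)>,
}

/// Cosine similarity; returns 0.0 when either vector has zero length.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

impl VectorStore for InMemoryStore {
    fn insert(&mut self, id: &str, embedding: Vec<f32>) {
        self.rows.push((id.to_string(), embedding));
    }

    fn search(&self, query: &[f32], k: usize) -> Vec<String> {
        // Score every row, then sort by descending similarity.
        let mut scored: Vec<(f32, &String)> = self
            .rows
            .iter()
            .map(|(id, v)| (cosine_similarity(query, v), id))
            .collect();
        scored.sort_by(|a, b| b.0.total_cmp(&a.0));
        scored.into_iter().take(k).map(|(_, id)| id.clone()).collect()
    }
}

/// Service-layer type that only knows the trait, not the concrete store.
struct DocumentService<S: VectorStore> {
    store: S,
}

impl<S: VectorStore> DocumentService<S> {
    fn new(store: S) -> Self {
        Self { store }
    }

    fn index_chunk(&mut self, chunk_id: &str, embedding: Vec<f32>) {
        self.store.insert(chunk_id, embedding);
    }

    fn find_similar(&self, query_embedding: &[f32], k: usize) -> Vec<String> {
        self.store.search(query_embedding, k)
    }
}
```

Because the service is generic over the trait, a test can use the in-memory store while the application plugs in an adapter backed by the embedded database.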

## Quick Start

### Requirements

- Node.js 16+
- Rust 1.70+
- Python 3.8+
**Build / development environment** (for local dev or packaging):

> **Note**: SeekDB currently only releases Linux builds. macOS support is coming soon. macOS users are recommended to use [UTM](https://mac.getutm.app) virtual machine manager to run [Ubuntu 20.x or later](https://mac.getutm.app/gallery/ubuntu-20-04).
- Node.js 16+ (frontend and Tauri CLI)
- Rust 1.70+ (Tauri backend)
- No Python required (seekdb-rs is Rust-native)

### Install Dependencies

```bash
# Install frontend dependencies
npm install

# Rust and Python dependencies are automatically installed during build
# Rust dependencies are resolved at build time
```

### Configuration
@@ -201,6 +260,9 @@ cp src-tauri/config.example.json src-tauri/config.json
```bash
# Start development server
tnpm run tauri:dev

# Set the CONFIG_DIR environment variable to use a custom data directory
CONFIG_DIR=/path/to/your/data tnpm run tauri:dev
```
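
How the backend consumes `CONFIG_DIR` is not shown in this diff. A minimal sketch of one plausible approach, assuming a non-empty `CONFIG_DIR` simply overrides a default directory (the function names and the fallback rule are hypothetical):

```rust
use std::path::PathBuf;

/// Picks the data directory: a non-empty override wins, otherwise the default.
/// The override is passed in as a parameter so the rule can be tested
/// without touching the process environment.
fn resolve_data_dir(override_dir: Option<String>, default_dir: &str) -> PathBuf {
    match override_dir {
        Some(dir) if !dir.trim().is_empty() => PathBuf::from(dir),
        _ => PathBuf::from(default_dir),
    }
}

/// Reads the real CONFIG_DIR environment variable and applies the rule above.
fn data_dir_from_env(default_dir: &str) -> PathBuf {
    resolve_data_dir(std::env::var("CONFIG_DIR").ok(), default_dir)
}
```

Treating an empty or whitespace-only value as "unset" avoids accidentally writing data into the current working directory.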

### Build Application
@@ -234,7 +296,7 @@ cd src-tauri && cargo test
- ✅ **HNSW Indexing**: Professional vector indexing algorithm for faster and more accurate retrieval
- ✅ **AI-Native Features**: Built-in full-text search, hybrid search, and other AI capabilities
- ✅ **Better Scalability**: Supports larger datasets and more complex queries
- ✅ **Latest Version Features** (0.0.1.dev4): Vector column output, database validation, stable USE statement support
- ✅ **seekdb-rs** (Rust): Embedded client, no Python dependency, vector column output and database validation

---

19 changes: 5 additions & 14 deletions docs/COMPLETE_FIX_SUMMARY.md
@@ -1,8 +1,5 @@
# Complete Fix Summary

> **Historical document**: This document records the fix process of 2025-10-29, when SeekDB 0.0.1.dev2 was in use.
> **Current version**: Upgraded to SeekDB 0.0.1.dev4; the module name changed from `oblite` to `seekdb`.
> **Reference**: [SeekDB 0.0.1.dev4 Upgrade Guide](UPGRADE_SEEKDB_0.0.1.dev4.md)

## Overview

@@ -45,7 +42,6 @@
- ✅ `src-tauri/src/services/seekdb_package.rs`
- ✅ `src-tauri/src/services/python_env.rs`

**Detailed documentation**: `docs/FIX_PIP_INSTALL_ERROR.md`

---

@@ -68,7 +64,7 @@ ModuleNotFoundError: No module named 'oblite'
import oblite # fails

# Correct approach
import seekdb # import seekdb first
import pyseekdb
import oblite # only then can oblite be imported
```

@@ -169,17 +165,14 @@ ubuntu 53026 9.1 1.9 74151808 161456 ? Sl 03:45 0:06 mine-kb
### Python Code

3. **`src-tauri/python/seekdb_bridge.py`**
- Changed the import order: `import seekdb` first, then `import oblite`
- Connects using the pyseekdb client
- Rewrote the `handle_init()` method to add automatic database creation
- Uses `oblite.connect("")` to access the system context
- Executes `CREATE DATABASE IF NOT EXISTS` to ensure the database exists
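
The "create the database if it is missing" step is idempotent, so it can run on every initialization. A minimal sketch of building that statement safely, assuming the database name must be a plain identifier (the helper name and the validation rule are hypothetical and not taken from `seekdb_bridge.py`):

```rust
/// Builds an idempotent CREATE DATABASE statement.
/// Names that are not plain identifiers (ASCII letters, digits, underscores,
/// not starting with a digit) are rejected so that user input can never
/// inject extra SQL into the statement.
fn create_database_sql(name: &str) -> Result<String, String> {
    let mut chars = name.chars();
    let starts_ok = matches!(chars.next(), Some(c) if c.is_ascii_alphabetic() || c == '_');
    let rest_ok = chars.all(|c| c.is_ascii_alphanumeric() || c == '_');
    if starts_ok && rest_ok {
        Ok(format!("CREATE DATABASE IF NOT EXISTS {name}"))
    } else {
        Err(format!("invalid database name: {name:?}"))
    }
}
```

The resulting string would be executed on a connection to the system context before switching to the target database.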

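### Documentation

4. **`docs/FIX_PIP_INSTALL_ERROR.md`**
- Detailed analysis of the pip installation problem and its solution

5. **`docs/FIX_SEEKDB_DATABASE_ERROR.md`**
4. **`docs/FIX_SEEKDB_DATABASE_ERROR.md`**
- Detailed analysis of the SeekDB database problem and its solution

6. **`docs/COMPLETE_FIX_SUMMARY.md`** (this document)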
@@ -284,8 +277,7 @@ except Exception as e:

- [x] Python virtual environment exists
- [x] pip is available (`python -m pip --version`)
- [x] seekdb is installed (`python -c "import seekdb"`)
- [x] oblite can be imported (`python -c "import seekdb; import oblite"`)
- [x] pyseekdb is installed (`python -c "import pyseekdb"`)
- [x] Database instance directory exists or will be created automatically
- [x] Database is created automatically during initialization

@@ -312,9 +304,8 @@ except Exception as e:
## Related Resources

### Documentation
- [pip installation fix](./FIX_PIP_INSTALL_ERROR.md)
- [SeekDB database fix](./FIX_SEEKDB_DATABASE_ERROR.md)
- [SeekDB auto-install documentation](./SEEKDB_AUTO_INSTALL.md)
- [seekdb.md](./seekdb.md) - SeekDB / pyseekdb documentation

### Code Files
- `src-tauri/src/services/python_env.rs` - Python environment management