1 change: 1 addition & 0 deletions backend/database/attachment_db.py
@@ -272,6 +272,7 @@ def get_content_type(file_path: str) -> str:
'.html': 'text/html',
'.htm': 'text/html',
'.json': 'application/json',
+'.epub': 'application/epuub',

Review comment (Contributor):

There is a spelling error here; it should be application/epub. Please check.

'.xml': 'application/xml',
'.zip': 'application/zip',
'.rar': 'application/x-rar-compressed',
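
For reference, the IANA-registered media type for EPUB is `application/epub+zip`, which differs from both the value in the diff and the one suggested in the review comment. The following is a minimal sketch of an extension lookup that uses the registered value. `CONTENT_TYPES` and `guess_content_type` are hypothetical names chosen for illustration, and the fallback value is an assumption; this is not the project's actual `get_content_type` implementation.

```python
import os

# Hypothetical subset of an extension-to-MIME table (illustration only,
# not the project's real mapping).
CONTENT_TYPES = {
    ".html": "text/html",
    ".htm": "text/html",
    ".json": "application/json",
    ".epub": "application/epub+zip",  # registered media type for EPUB
    ".xml": "application/xml",
    ".zip": "application/zip",
}


def guess_content_type(file_path: str) -> str:
    """Return a MIME type based on the file extension, case-insensitively."""
    _, ext = os.path.splitext(file_path)
    # Assumed fallback for unknown extensions: a generic binary type.
    return CONTENT_TYPES.get(ext.lower(), "application/octet-stream")
```

Lower-casing the extension keeps `book.EPUB` and `book.epub` on the same code path.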
6 changes: 3 additions & 3 deletions doc/docs/en/sdk/data-process.md
@@ -43,10 +43,10 @@ def file_process(self,

## 📁 Supported File Formats

-- **Text files**: .txt, .md, .csv
-- **Documents**: .pdf, .docx, .pptx
+- **Text files**: .txt, .md, .csv, .json
+- **Documents**: .pdf, .docx, .pptx, .epub
- **Images**: .jpg, .png, .gif (with OCR)
-- **Web content**: HTML, URLs
+- **Web content**: HTML, URLs, XML
- **Archives**: .zip, .tar

## 💡 Usage Examples
4 changes: 3 additions & 1 deletion doc/docs/en/user-guide/knowledge-base.md
@@ -26,12 +26,14 @@ Create and manage knowledge bases, upload documents, and generate summaries. Kno
### Supported File Formats

Nexent supports multiple file formats, including:
-- **Text:** .txt, .md
+- **Text:** .txt, .md, .csv, .json
- **PDF:** .pdf
- **Word:** .docx
- **PowerPoint:** .pptx
+- **EPUB:** .epub
- **Excel:** .xlsx
- **Data files:** .csv
+- **Web content:** .html, .xml

## 📊 Knowledge Base Summary

4 changes: 2 additions & 2 deletions doc/docs/en/user-guide/start-chat.md
@@ -79,8 +79,8 @@ You can upload files during a chat so the agent can reason over their content:
- Or drag files directly into the chat area

2. **Supported File Formats**
-- **Documents:** PDF, Word (.docx), PowerPoint (.pptx), Excel (.xlsx)
-- **Text:** Markdown (.md), Plain text (.txt)
+- **Documents:** PDF, Word (.docx), PowerPoint (.pptx), Excel (.xlsx), EPUB (.epub), HTML (.html), XML (.xml)
+- **Text & Data:** Markdown (.md), Plain text (.txt), JSON (.json), CSV (.csv)
- **Images:** JPG, PNG, GIF, and other common formats

3. **File Processing Flow**
3 changes: 3 additions & 0 deletions doc/docs/zh/sdk/data-process.md
@@ -98,6 +98,9 @@ def file_process(self,
- `.odt` - OpenDocument文本
- `.pptx` - PowerPoint 2007及更高版本
- `.ppt` - PowerPoint 97-2003版本
+- `.xml` - XML数据文件
+- `.json` - JSON数据文件
+- `.csv` - 逗号分隔值文件

## 💡 使用示例

4 changes: 3 additions & 1 deletion doc/docs/zh/user-guide/knowledge-base.md
@@ -26,12 +26,14 @@

Nexent支持多种文件格式,包括:

-- **文本**: .txt, .md文件
+- **文本**: .txt, .md, .json文件
- **PDF**: .pdf文件
- **Word**: .docx文件
- **PowerPoint**: .pptx文件
- **Excel**: .xlsx文件
+- **EPUB** .epub文件
- **数据文件**: .csv文件
+- **Web content**: .html, .xml文件

## 📊 知识库总结

4 changes: 2 additions & 2 deletions doc/docs/zh/user-guide/start-chat.md
@@ -79,8 +79,8 @@ Nexent支持语音输入功能,让您可以通过语音与智能体交互。
- 或直接将文件拖拽到对话区域

2. **支持的文件格式**
-- **文档类**:PDF、Word (.docx)、PowerPoint (.pptx)、Excel (.xlsx)
-- **文本类**:Markdown (.md)、纯文本 (.txt)
+- **文档类**:PDF、Word (.docx)、PowerPoint (.pptx)、Excel (.xlsx), EPUB (.epub), HTML (.html), XML (.xml)
+- **文本类**:Markdown (.md)、纯文本 (.txt), JSON (.json), CSV (.csv)
- **图片类**:JPG、PNG、GIF 等常见图片格式

3. **文件处理流程**
@@ -233,7 +233,7 @@ const UploadArea = forwardRef<UploadAreaRef, UploadAreaProps>(
fileList,
onChange: handleChange,
customRequest: handleCustomRequest,
-accept: ".pdf,.docx,.pptx,.xlsx,.md,.txt,.csv",
+accept: ".pdf,.docx,.pptx,.xlsx,.md,.txt,.csv,.json,.epub,.xml,.html",
showUploadList: true,
disabled: disabled,
progress: {
9 changes: 5 additions & 4 deletions frontend/const/chatConfig.ts
@@ -9,6 +9,7 @@ export const chatConfig = {
"application/json",
"application/xml",
"text/markdown",
+"text/csv",
],

// Supported text file extensions
@@ -36,10 +37,10 @@ export const chatConfig = {
imageExtensions: ["jpg", "jpeg", "png", "gif", "webp", "svg", "bmp"],

// Supported document file extensions
-documentExtensions: ["pdf", "doc", "docx", "xls", "xlsx", "ppt", "pptx"],
+documentExtensions: ["pdf", "doc", "docx", "xls", "xlsx", "ppt", "pptx", "epub", "html", "xml"],

// Supported text document extensions
-supportedTextExtensions: ["md", "markdown", "txt"],
+supportedTextExtensions: ["md", "markdown", "txt", "csv", "json"],

// File icon mapping configuration
fileIcons: {
@@ -50,7 +51,7 @@ export const chatConfig = {
word: ["doc", "docx"],

// Plain text files
-text: ["txt"],
+text: ["txt", "epub"],

// Markdown files
markdown: ["md"],
@@ -62,7 +63,7 @@ export const chatConfig = {
powerpoint: ["ppt", "pptx"],

// HTML files
-html: ["html", "htm"],
+html: ["html", "htm", "xml"],

// Code files
code: ["css", "js", "ts", "jsx", "tsx", "php", "py", "java", "c", "cpp", "cs"],
19 changes: 17 additions & 2 deletions frontend/const/knowledgeBase.ts
@@ -120,7 +120,12 @@ export const FILE_EXTENSIONS = {
PPT: 'ppt',
PPTX: 'pptx',
TXT: 'txt',
-MD: 'md'
+MD: 'md',
+EPUB: 'epub',
+CSV: 'csv',
+HTML: 'html',
+XML: 'xml',
+JSON: 'json'
} as const;

// File type constants
@@ -131,6 +136,11 @@ export const FILE_TYPES = {
POWERPOINT: 'PowerPoint',
TEXT: 'Text',
MARKDOWN: 'Markdown',
+EPUB: 'EPUB',
+CSV: 'CSV',
+JSON: 'JSON',
+HTML: 'HTML',
+XML: 'XML',
UNKNOWN: 'Unknown'
} as const;

@@ -144,5 +154,10 @@ export const EXTENSION_TO_TYPE_MAP = {
[FILE_EXTENSIONS.PPT]: FILE_TYPES.POWERPOINT,
[FILE_EXTENSIONS.PPTX]: FILE_TYPES.POWERPOINT,
[FILE_EXTENSIONS.TXT]: FILE_TYPES.TEXT,
-[FILE_EXTENSIONS.MD]: FILE_TYPES.MARKDOWN
+[FILE_EXTENSIONS.MD]: FILE_TYPES.MARKDOWN,
+[FILE_EXTENSIONS.CSV]: FILE_TYPES.CSV,
+[FILE_EXTENSIONS.JSON]: FILE_EXTENSIONS.JSON,

Review comment (Contributor):

This looks like it should be FILE_TYPES.JSON. Please check.

+[FILE_EXTENSIONS.HTML]: FILE_TYPES.HTML,
+[FILE_EXTENSIONS.XML]: FILE_TYPES.XML,
+[FILE_EXTENSIONS.EPUB]: FILE_TYPES.EPUB
} as const;
8 changes: 4 additions & 4 deletions frontend/public/locales/en/common.json
@@ -75,10 +75,10 @@
"chatInput.thisFileTypeCannotBePreviewed": "This file type cannot be previewed",
"chatInput.fileCountExceedsLimit": "File count exceeds limit. Maximum {{count}} files allowed",
"chatInput.fileSizeExceedsLimit": "File \"{{name}}\" exceeds size limit. Maximum 10MB per file",
-"chatInput.unsupportedFileType": "File \"{{name}}\" is not a supported file type. Supported formats: images, documents (PDF, Word, Excel, PPT), text files, CSV/TSV, Markdown",
+"chatInput.unsupportedFileType": "File \"{{name}}\" is not a supported file type. Supported formats: images, documents (PDF, Word, Excel, PPT), text files, CSV/TSV, Markdown、JSON、HTML、XML",
"chatInput.unsupportedFileTypeSimple": "Unsupported file type",
"chatInput.dragAndDropFilesHere": "Drag and drop files here to upload",
-"chatInput.supportedFileFormats": "Supported formats: images, documents (PDF, Word, Excel, PPT), text files, CSV/TSV, Markdown",
+"chatInput.supportedFileFormats": "Supported formats: images, documents (PDF, Word, Excel, PPT, EPUB), text files, CSV/TSV, Markdown、JSON、HTML、XML",
"chatInput.sendMessageTo": "Send message to {{appName}}",
"chatInput.stopRecording": "Stop Recording",
"chatInput.startRecording": "Start Recording",
@@ -443,13 +443,13 @@
"knowledgeBase.hint.selectFirst": "Please select a knowledge base to upload files",
"knowledgeBase.hint.changeName": "Please modify the knowledge base name to continue",
"knowledgeBase.upload.dragHint": "Click or drag files to this area to upload and add knowledge to the knowledge base",
-"knowledgeBase.upload.supportedFormats": "Supports PDF, Word, PPT, Excel, MD, TXT file formats",
+"knowledgeBase.upload.supportedFormats": "Supports PDF, Word, PPT, Excel, MD, TXT, EPUB, CSV, JSON, HTML, XML file formats",
"knowledgeBase.upload.completed": "Upload completed",
"knowledgeBase.upload.fileCount": "{{count}} files",
"knowledgeBase.upload.status.uploading": "Uploading",
"knowledgeBase.upload.status.completed": "Completed",
"knowledgeBase.upload.status.failed": "Upload failed",
-"knowledgeBase.upload.invalidFileType": "Only PDF, Word, PPT, Excel, MD, TXT, CSV file formats are supported!",
+"knowledgeBase.upload.invalidFileType": "Only PDF, Word, PPT, Excel, MD, TXT, CSV, JSON, EPUB, HTML, XML file formats are supported!",
"knowledgeBase.check.nameError": "Failed to check knowledge base name",
"knowledgeBase.fetch.error": "Failed to fetch knowledge base information",
"knowledgeBase.fetch.retryError": "Failed to fetch knowledge base information, please try again later",
8 changes: 4 additions & 4 deletions frontend/public/locales/zh/common.json
@@ -75,10 +75,10 @@
"chatInput.thisFileTypeCannotBePreviewed": "此文件类型无法预览",
"chatInput.fileCountExceedsLimit": "文件数量超过限制,最多只能上传{{count}}个文件",
"chatInput.fileSizeExceedsLimit": "文件\"{{name}}\"超过大小限制,单个文件最大10MB",
-"chatInput.unsupportedFileType": "文件\"{{name}}\"不是支持的文件类型,支持的格式包括:图片、文档(PDF、Word、Excel、PPT)、纯文本、CSV/TSV、Markdown",
+"chatInput.unsupportedFileType": "文件\"{{name}}\"不是支持的文件类型,支持的格式包括:图片、文档(PDF、Word、Excel、PPT、EPUB)、纯文本、CSV/TSV、Markdown、JSON、HTML、XML",
"chatInput.unsupportedFileTypeSimple": "不支持的文件类型",
"chatInput.dragAndDropFilesHere": "文件拖动到此处即可上传",
-"chatInput.supportedFileFormats": "支持的格式包括:图片、文档(PDF、Word、Excel、PPT)、纯文本、CSV/TSV、Markdown",
+"chatInput.supportedFileFormats": "支持的格式包括:图片、文档(PDF、Word、Excel、PPT、EPUB)、纯文本、CSV/TSV、Markdown、JSON、HTML、XML",
"chatInput.sendMessageTo": "给 {{appName}} 发送消息",
"chatInput.stopRecording": "停止录音",
"chatInput.startRecording": "开始录音",
Expand Down Expand Up @@ -446,13 +446,13 @@
"knowledgeBase.hint.selectFirst": "请先选择一个知识库以上传文件",
"knowledgeBase.hint.changeName": "请修改知识库名称后继续",
"knowledgeBase.upload.dragHint": "点击或拖拽文件到此区域上传,为知识库添加知识",
-"knowledgeBase.upload.supportedFormats": "支持 PDF、Word、Excel、PPT、纯文本、CSV、TSV、Markdown 文件格式",
+"knowledgeBase.upload.supportedFormats": "支持 PDF、Word、Excel、PPT、纯文本、CSV、TSV、Markdown、JSON、EPUB、HTML、XML 文件格式",
"knowledgeBase.upload.completed": "上传完成",
"knowledgeBase.upload.fileCount": "{{count}} 个文件",
"knowledgeBase.upload.status.uploading": "上传中",
"knowledgeBase.upload.status.completed": "已完成",
"knowledgeBase.upload.status.failed": "上传失败",
-"knowledgeBase.upload.invalidFileType": "只支持 PDF、Word、PPT、Excel、MD、TXT、CSV 文件格式!",
+"knowledgeBase.upload.invalidFileType": "只支持 PDF、Word、PPT、Excel、MD、TXT、CSV、JSON、EPUB、HTML、XML 文件格式!",
"knowledgeBase.check.nameError": "检查知识库名称失败",
"knowledgeBase.fetch.error": "获取知识库信息失败",
"knowledgeBase.fetch.retryError": "获取知识库信息失败,请稍后重试",
8 changes: 7 additions & 1 deletion frontend/services/uploadService.ts
@@ -57,7 +57,13 @@ export const validateFileType = (file: File, t: TFunction, message: any): boolea
'text/markdown',
'text/plain',
'text/csv',
-'application/csv'
+'application/csv',
+'application/epub',
+'application/epub+zip',
+'text/html',
+'application/json',
+'application/xml',
+'text/xml'
];

// First check MIME type
6 changes: 5 additions & 1 deletion sdk/nexent/data_process/core.py
@@ -17,7 +17,7 @@ class DataProcessCore:

Supported file types:
- Excel files: .xlsx, .xls
-- Generic files: .txt, .pdf, .docx, .doc, .html, .htm, .md, .rtf, .odt, .pptx, .ppt
+- Generic files: .txt, .pdf, .docx, .doc, .html, .htm, .md, .rtf, .odt, .pptx, .ppt, .epub, .xml, .csv, .json

Supported input methods:
- In-memory byte data
@@ -147,6 +147,10 @@ def get_supported_file_types(self) -> Dict[str, List[str]]:
".odt",
".pptx",
".ppt",
+".epub",
+".json",
+".xml",
+".csv",
]

return {"excel": list(self.EXCEL_EXTENSIONS), "generic": generic_formats}
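
The dictionary returned by `get_supported_file_types` groups extensions by processing category. As a rough illustration of how a caller might consume that shape, the sketch below classifies a file path by its extension. `classify_extension` and the sample `supported` dictionary are hypothetical; they only mirror the `{"excel": [...], "generic": [...]}` structure visible in the diff and are not part of the SDK.

```python
import os
from typing import Dict, List, Optional


def classify_extension(file_path: str, supported: Dict[str, List[str]]) -> Optional[str]:
    """Return the category whose extension list contains the file's extension."""
    ext = os.path.splitext(file_path)[1].lower()
    for category, extensions in supported.items():
        if ext in extensions:
            return category
    # No category matched: the file type is treated as unsupported.
    return None


# Hypothetical sample data shaped like the return value shown in the diff.
supported = {
    "excel": [".xlsx", ".xls"],
    "generic": [".txt", ".pdf", ".epub", ".json", ".xml", ".csv"],
}
```

Returning `None` for unknown extensions leaves the decision to reject or fall back with the caller.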