feat: knowledge base for long-term memory (#1099) #1115
Open
bitloi wants to merge 41 commits into eigent-ai:main from bitloi:feature/knowledge-base-1099
+235 −6
Changes from 35 commits
2c18d7c feat: knowledge base for long-term memory (issue #1099) [bitloi]
d07d875 fix: use os.environ in knowledge_base for testability; add unit and A… [bitloi]
f45ca7c Merge origin/main into feature/knowledge-base-1099 [bitloi]
ece99c9 fix: resolve chat_service.py conflict with main (keep KB integration) [bitloi]
852ab43 Merge upstream/main, resolve chat_service.py (keep KB integration) [bitloi]
f678309 Merge branch 'main' into feature/knowledge-base-1099 [bitloi]
3b9c0e4 Merge branch 'main' into feature/knowledge-base-1099 [bitloi]
3d10e6d Merge branch 'main' into feature/knowledge-base-1099 [bitloi]
47f4f10 Merge branch 'main' into feature/knowledge-base-1099 [bitloi]
fbd480b Merge branch 'main' into feature/knowledge-base-1099 [bitloi]
b7af46a Merge branch 'main' into feature/knowledge-base-1099 [bitloi]
7216ce8 fix: resolve merge conflicts with main (router.py, chat_service.py) [bitloi]
4e2dcb2 Merge upstream/main into feature/knowledge-base-1099, resolve conflicts [bitloi]
c363187 Merge branch 'main' into feature/knowledge-base-1099 [bitloi]
3d2f4b6 Merge branch 'main' into feature/knowledge-base-1099 [bitloi]
d328676 Merge branch 'main' into feature/knowledge-base-1099 [bitloi]
b2fd959 PR feedback: rename to sqlite_toolkit, FTS5/BM25 search, add tool onl… [bitloi]
70cd6df Remove knowledge base from developer agent for now (per review) [bitloi]
b4bfc4e Merge branch 'main' into feature/knowledge-base-1099 [bitloi]
e195c47 Merge branch 'main' into feature/knowledge-base-1099 [bitloi]
dd804f2 Merge branch 'main' into feature/knowledge-base-1099 [bitloi]
606533f Merge branch 'main' into feature/knowledge-base-1099 [bitloi]
06d19b1 Merge branch 'main' into feature/knowledge-base-1099 [Wendong-Fan]
b4fba05 Merge branch 'main' into feature/knowledge-base-1099 [bitloi]
1b0f125 PR feedback: remove KB from chat context, rename tool to store_projec… [bitloi]
e134793 Merge branch 'main' into feature/knowledge-base-1099 [bitloi]
52d06b2 Merge branch 'main' into feature/knowledge-base-1099 [bitloi]
3086710 Merge branch 'main' into feature/knowledge-base-1099 [bitloi]
a275992 Merge upstream/main into feature/knowledge-base-1099 [bitloi]
ba023f5 Merge branch 'feature/knowledge-base-1099' of https://github.com/bitl… [bitloi]
8d8b573 Merge branch 'main' into feature/knowledge-base-1099 [bitloi]
3564835 refactor(knowledge-base): switch from SQLite to markdown file-based m… [bitloi]
60367cb Merge branch 'main' into feature/knowledge-base-1099 [nitpicker55555]
b2d810a refactor(memory): index-only prompt, no dedicated tools (reviewer fee… [bitloi]
3bd0880 Merge branch 'feature/knowledge-base-1099' of https://github.com/bitl… [bitloi]
0971ae6 Address nitpicker55555 review: remove unused memory helpers, wire pro… [bitloi]
4cbdaf2 chore(backend): remove ruff from dev dependencies [bitloi]
7e7991e Merge branch 'main' into feature/knowledge-base-1099 [bitloi]
516caa2 Replace knowledge_base_toolkit with use_project_memory flag [bitloi]
d000b44 Merge branch 'feature/knowledge-base-1099' of https://github.com/bitl… [bitloi]
dafac62 Revert linter-only changes in router.py (review feedback) [bitloi]
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,96 @@ | ||
| # ========= Copyright 2025-2026 @ Eigent.ai All Rights Reserved. ========= | ||
| # Licensed under the Apache License, Version 2.0 (the "License"); | ||
| # you may not use this file except in compliance with the License. | ||
| # You may obtain a copy of the License at | ||
| # | ||
| # http://www.apache.org/licenses/LICENSE-2.0 | ||
| # | ||
| # Unless required by applicable law or agreed to in writing, software | ||
| # distributed under the License is distributed on an "AS IS" BASIS, | ||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
| # ========= Copyright 2025-2026 @ Eigent.ai All Rights Reserved. ========= | ||
|
|
||
| """ | ||
| Long-term memory via markdown files (issue #1099). | ||
|
|
||
| Memory is architecture-level: .eigent/memory.md is the index; the agent | ||
| reads/writes .eigent/*.md via file operations. No tools are exposed; prompt | ||
| builders should inject MEMORY_ARCHITECTURE_PROMPT and get_index_for_prompt(). | ||
| """ | ||
|
|
||
| from __future__ import annotations | ||
|
|
||
| import logging | ||
| import os | ||
| from typing import Final | ||
|
|
||
| from camel.toolkits.base import BaseToolkit | ||
| from camel.toolkits.function_tool import FunctionTool | ||
|
|
||
| from app.agent.toolkit.abstract_toolkit import AbstractToolkit | ||
| from app.component.environment import env | ||
|
|
||
| logger = logging.getLogger(__name__) | ||
|
|
||
| _DEFAULT_WORKING_DIR: Final[str] = "~/.eigent" | ||
|
|
||
|
|
||
| def _resolve_working_directory(working_directory: str | None) -> str: | ||
| if working_directory is None or not str(working_directory).strip(): | ||
| working_directory = env("file_save_path", os.path.expanduser(_DEFAULT_WORKING_DIR)) | ||
| resolved = os.path.expanduser(str(working_directory).strip()) | ||
| try: | ||
| os.makedirs(resolved, exist_ok=True) | ||
| except OSError as e: | ||
| logger.warning("Could not create working directory %s: %s", resolved, e) | ||
| return resolved | ||
|
|
||
|
|
||
| class KnowledgeBaseToolkit(BaseToolkit, AbstractToolkit): | ||
| """ | ||
| Project long-term memory (architecture-only). Adds no tools; the agent | ||
| uses file operations on .eigent/*.md. Kept so "knowledge_base_toolkit" | ||
| remains selectable and prompt builders can inject the memory index. | ||
| """ | ||
|
|
||
| def __init__( | ||
| self, | ||
| api_task_id: str, | ||
| working_directory: str | None = None, | ||
| agent_name: str | None = None, | ||
| timeout: float | None = None, | ||
| ) -> None: | ||
| api_task_id = (api_task_id or "").strip() | ||
| if not api_task_id: | ||
| raise ValueError("api_task_id cannot be empty") | ||
|
|
||
| super().__init__(timeout=timeout) | ||
| self.api_task_id = api_task_id | ||
| self.working_directory = _resolve_working_directory(working_directory) | ||
| self.agent_name = (agent_name or "agent").strip() or "agent" | ||
|
|
||
| logger.debug( | ||
| "KnowledgeBaseToolkit initialized", | ||
| extra={ | ||
| "api_task_id": self.api_task_id, | ||
| "working_directory": self.working_directory, | ||
| "agent_name": self.agent_name, | ||
| }, | ||
| ) | ||
|
|
||
| def get_tools(self) -> list[FunctionTool]: | ||
| return [] | ||
|
|
||
|
|
||
| def get_tools( | ||
| api_task_id: str, | ||
| working_directory: str | None = None, | ||
| agent_name: str | None = None, | ||
| ) -> list[FunctionTool]: | ||
| return KnowledgeBaseToolkit( | ||
| api_task_id=api_task_id, | ||
| working_directory=working_directory, | ||
| agent_name=agent_name, | ||
| ).get_tools() | ||
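Since the toolkit above returns no tools, the memory can only reach the agent through the system prompt. Below is a minimal sketch of how a prompt builder might combine the two pieces named in the module docstring. `build_system_prompt`, its parameters, and the shortened stand-in for `MEMORY_ARCHITECTURE_PROMPT` are assumptions for illustration, not code from this PR; `memory_index` stands for whatever `get_index_for_prompt()` returns (a formatted string or `None`).

```python
from __future__ import annotations

# Shortened stand-in for memory_file.MEMORY_ARCHITECTURE_PROMPT (assumption).
MEMORY_ARCHITECTURE_PROMPT = (
    "Project long-term memory lives under .eigent/ in the project directory.\n"
    "- .eigent/memory.md is the index; read other .eigent/*.md files as needed.\n"
)


def build_system_prompt(base_prompt: str, memory_index: str | None) -> str:
    """Hypothetical prompt builder: base prompt, memory guidance, then index.

    The index section is skipped when there is no usable memory.md, so the
    agent still learns where memory lives even before anything is stored.
    """
    parts = [base_prompt.strip(), MEMORY_ARCHITECTURE_PROMPT.strip()]
    if memory_index and memory_index.strip():
        parts.append(memory_index.strip())
    return "\n\n".join(parts) + "\n"


if __name__ == "__main__":
    index = "=== Project memory index (.eigent/memory.md) ===\n- decisions.md\n"
    print(build_system_prompt("You are a coding agent.", index))
```

Keeping the guidance text separate from the index means the architecture prompt can be injected unconditionally, while the index is injected only when it exists.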
Collaborator
Maybe rename to long_term_memory
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,185 @@ | ||
| # ========= Copyright 2025-2026 @ Eigent.ai All Rights Reserved. ========= | ||
| # Licensed under the Apache License, Version 2.0 (the "License"); | ||
| # you may not use this file except in compliance with the License. | ||
| # You may obtain a copy of the License at | ||
| # | ||
| # http://www.apache.org/licenses/LICENSE-2.0 | ||
| # | ||
| # Unless required by applicable law or agreed to in writing, software | ||
| # distributed under the License is distributed on an "AS IS" BASIS, | ||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
| # ========= Copyright 2025-2026 @ Eigent.ai All Rights Reserved. ========= | ||
|
|
||
| """ | ||
| Markdown-based long-term memory for agents (issue #1099). | ||
|
|
||
| memory.md in the project's .eigent/ directory acts as an index: only a short | ||
| prefix (e.g. first 200 lines) is passed into the system prompt. Topic-specific | ||
| memories live in other .md files under .eigent/; the agent reads and writes | ||
| them on demand via file operations (no dedicated remember/read tools). | ||
| """ | ||
|
|
||
| from __future__ import annotations | ||
|
|
||
| import logging | ||
| import threading | ||
| from datetime import datetime | ||
| from pathlib import Path | ||
| from typing import Final | ||
|
|
||
| logger = logging.getLogger("memory_file") | ||
|
|
||
| _LOCK: Final[threading.Lock] = threading.Lock() | ||
| _MEMORY_FILENAME: Final[str] = "memory.md" | ||
| _EIGENT_DIR: Final[str] = ".eigent" | ||
| _DEFAULT_HEADER: Final[str] = "# Project Memory\n\nLong-term memory for this project.\n" | ||
| _MAX_ENTRY_LENGTH: Final[int] = 10000 | ||
| _DEFAULT_INDEX_LINES: Final[int] = 200 | ||
| _MAX_INDEX_LINES: Final[int] = 2000 | ||
|
|
||
| _CONTINUATION_NOTE: Final[str] = "\n\n...(further memory in .eigent/; read files as needed)\n" | ||
| _INDEX_HEADER: Final[str] = "=== Project memory index (.eigent/memory.md) ===\n" | ||
|
|
||
|
|
||
| class MemoryFileError(Exception): | ||
| """Base exception for memory file operations.""" | ||
|
|
||
|
|
||
| class MemoryReadError(MemoryFileError): | ||
| """Raised when reading the memory file fails.""" | ||
|
|
||
|
|
||
| class MemoryWriteError(MemoryFileError): | ||
| """Raised when writing or appending to the memory file fails.""" | ||
|
|
||
|
|
||
| def _validate_working_directory(working_directory: str) -> Path: | ||
| if not working_directory or not working_directory.strip(): | ||
| raise ValueError("working_directory cannot be empty") | ||
| path = Path(working_directory).expanduser().resolve() | ||
| if not path.exists(): | ||
| raise ValueError(f"working_directory does not exist: {path}") | ||
| if not path.is_dir(): | ||
| raise ValueError(f"working_directory is not a directory: {path}") | ||
| return path | ||
|
|
||
|
|
||
| def _validate_content(content: str, max_length: int = _MAX_ENTRY_LENGTH) -> str: | ||
| if not content or not content.strip(): | ||
| raise ValueError("content cannot be empty") | ||
| content = content.strip() | ||
| if len(content) > max_length: | ||
| raise ValueError(f"content exceeds maximum length of {max_length} characters") | ||
| return content | ||
|
|
||
|
|
||
| def get_memory_file_path(working_directory: str) -> Path: | ||
| """Return the path to the project's memory file (.eigent/memory.md).""" | ||
| base_path = _validate_working_directory(working_directory) | ||
| eigent_dir = base_path / _EIGENT_DIR | ||
| eigent_dir.mkdir(parents=True, exist_ok=True) | ||
| return eigent_dir / _MEMORY_FILENAME | ||
|
|
||
|
|
||
| def read_memory(working_directory: str) -> str | None: | ||
| """Read the full content of the memory file, or None if missing/invalid.""" | ||
| try: | ||
| memory_path = get_memory_file_path(working_directory) | ||
| except ValueError as e: | ||
| logger.warning("Invalid working directory: %s", e) | ||
| return None | ||
|
|
||
| if not memory_path.exists(): | ||
| return None | ||
|
|
||
| try: | ||
| content = memory_path.read_text(encoding="utf-8") | ||
| return content if content.strip() else None | ||
| except OSError as e: | ||
| logger.error("Failed to read memory file %s: %s", memory_path, e) | ||
| return None | ||
|
|
||
|
|
||
| def write_memory(working_directory: str, content: str) -> bool: | ||
| """Overwrite the memory file with the given content. Returns True on success.""" | ||
| try: | ||
| memory_path = get_memory_file_path(working_directory) | ||
| validated = _validate_content(content, max_length=_MAX_ENTRY_LENGTH * 10) | ||
| except ValueError as e: | ||
| logger.error("Validation failed: %s", e) | ||
| return False | ||
|
|
||
| with _LOCK: | ||
| try: | ||
| memory_path.write_text(validated, encoding="utf-8") | ||
| logger.info("Memory file updated", extra={"path": str(memory_path)}) | ||
| return True | ||
| except OSError as e: | ||
| logger.error("Failed to write memory file: %s", e) | ||
| return False | ||
|
|
||
|
|
||
| def append_memory(working_directory: str, entry: str) -> bool: | ||
| """Append a timestamped entry to the memory file. Returns True on success.""" | ||
| try: | ||
| memory_path = get_memory_file_path(working_directory) | ||
| validated = _validate_content(entry) | ||
| except ValueError as e: | ||
| logger.error("Validation failed: %s", e) | ||
| return False | ||
|
|
||
| timestamp = datetime.now().strftime("%Y-%m-%d %H:%M") | ||
| formatted = f"\n## {timestamp}\n\n{validated}\n" | ||
|
|
||
| with _LOCK: | ||
| try: | ||
| if not memory_path.exists(): | ||
| memory_path.write_text(_DEFAULT_HEADER, encoding="utf-8") | ||
| with memory_path.open("a", encoding="utf-8") as f: | ||
| f.write(formatted) | ||
| logger.info( | ||
| "Memory entry appended", | ||
| extra={"path": str(memory_path), "entry_length": len(validated)}, | ||
| ) | ||
| return True | ||
| except OSError as e: | ||
| logger.error("Failed to append to memory file: %s", e) | ||
| return False | ||
|
|
||
|
|
||
| MEMORY_ARCHITECTURE_PROMPT: Final[str] = """ | ||
| Project long-term memory lives under .eigent/ in the project directory. | ||
| - .eigent/memory.md is the index: it lists or summarizes memory topics (e.g. user_preferences.md, decisions.md). | ||
| - You can read any .eigent/*.md file when you need topic-specific information. | ||
| - To remember something: create or edit markdown files under .eigent/ (e.g. append to an existing topic file or create one). Use normal file operations (read/write/append) or shell commands; no dedicated memory tool is required. | ||
| """ | ||
|
|
||
|
|
||
| def get_index_for_prompt( | ||
| working_directory: str, | ||
| max_lines: int = _DEFAULT_INDEX_LINES, | ||
| ) -> str | None: | ||
| """ | ||
| Return the first max_lines of memory.md formatted for system-prompt injection. | ||
| Callers should use this instead of dumping the full file; topic-specific | ||
| content is read by the agent via file operations. | ||
| """ | ||
| if not working_directory or not working_directory.strip(): | ||
| return None | ||
| if max_lines <= 0: | ||
| return None | ||
| effective_max = min(max_lines, _MAX_INDEX_LINES) | ||
|
|
||
| content = read_memory(working_directory) | ||
| if not content: | ||
| return None | ||
|
|
||
| lines = content.splitlines() | ||
| if len(lines) > effective_max: | ||
| index_content = "\n".join(lines[:effective_max]) + _CONTINUATION_NOTE | ||
Collaborator
nit:
| else: | ||
| index_content = content | ||
|
|
||
| return _INDEX_HEADER + index_content + "\n" | ||
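To make the truncation behaviour of `get_index_for_prompt()` concrete, here is a simplified, self-contained re-implementation plus a small demo. It is a sketch under stated assumptions, not the PR's code: `index_for_prompt` and `demo` are hypothetical names, it skips the validation, logging and locking in the diff above, and it does not create `.eigent/` when the directory is missing.

```python
from __future__ import annotations

import tempfile
from pathlib import Path

# Constants copied from the diff above (leading underscores dropped).
INDEX_HEADER = "=== Project memory index (.eigent/memory.md) ===\n"
CONTINUATION_NOTE = "\n\n...(further memory in .eigent/; read files as needed)\n"
MAX_INDEX_LINES = 2000


def index_for_prompt(project_dir: str, max_lines: int = 200) -> str | None:
    """Return at most max_lines of .eigent/memory.md, or None if unusable."""
    if max_lines <= 0:
        return None
    memory_path = Path(project_dir) / ".eigent" / "memory.md"
    if not memory_path.is_file():
        return None
    content = memory_path.read_text(encoding="utf-8")
    if not content.strip():
        return None
    lines = content.splitlines()
    limit = min(max_lines, MAX_INDEX_LINES)  # hard cap, as in the diff
    if len(lines) > limit:
        # Keep the prefix and tell the agent that more memory exists on disk.
        body = "\n".join(lines[:limit]) + CONTINUATION_NOTE
    else:
        body = content
    return INDEX_HEADER + body + "\n"


def demo() -> str | None:
    """Write a five-line memory.md into a temp project and keep three lines."""
    with tempfile.TemporaryDirectory() as project_dir:
        eigent_dir = Path(project_dir) / ".eigent"
        eigent_dir.mkdir()
        text = "\n".join(f"- topic_{i}.md" for i in range(5)) + "\n"
        (eigent_dir / "memory.md").write_text(text, encoding="utf-8")
        return index_for_prompt(project_dir, max_lines=3)


if __name__ == "__main__":
    print(demo())
```

With `max_lines=3`, the result contains the header, the first three topic lines, and the continuation note; the remaining two lines stay on disk for the agent to read on demand.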