An AI memory-enhanced chatbot framework designed specifically for role-playing scenarios, with seamless SillyTavern integration to give your AI companions persistent memory capabilities.
- Event-Driven Memory: Automatically summarizes conversations into structured events with persistent storage
- Vector Retrieval: Semantic search based on ChromaDB for intelligent memory recall
- Memory Routing: AI automatically determines when to create, update, or retrieve memories
- SillyTavern Compatible: Fully compatible with OpenAI API format, can directly replace SillyTavern's backend
- Character Memory Enhancement: Let AI characters remember every interaction and important event with users
- Emotional Memory: Supports recording emotional states, mood changes, and significant moments
- Home Device Support: Runs on local Ollama deployments, no expensive GPU servers required
- Modular Design: Easily extensible with visual recognition, voice interaction, and other features
- Developer Friendly: Built on the mature LangChain and LangGraph frameworks, making it easy to extend and customize
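The vector-retrieval idea above can be sketched in pure Python: each memory is stored with an embedding, candidates are scored by cosine similarity against the query embedding, and only the best matches above a threshold are recalled. This is an illustration of the principle only (in the project, ChromaDB and the Ollama embedding model do this work; the vectors below are toy stand-ins):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query_vec, memories, top_k=1, similarity_threshold=0.7):
    """Return up to top_k stored memories whose similarity clears the threshold."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in memories]
    scored = [(score, text) for score, text in scored if score >= similarity_threshold]
    scored.sort(reverse=True)
    return [text for _, text in scored[:top_k]]

# Toy memory store: (event text, embedding) pairs
memories = [
    ("User's birthday is in May", [0.9, 0.1, 0.0]),
    ("User dislikes coffee", [0.0, 0.2, 0.9]),
]
print(retrieve([1.0, 0.0, 0.0], memories))  # → ["User's birthday is in May"]
```

The `top_k` and `similarity_threshold` parameters mirror the `vector_search` settings in the configuration file.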
- Python 3.8+
- Ollama (for local embedding models)
- Clone the project:

  ```bash
  git clone https://github.com/yourusername/Somnia.git
  cd Somnia
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Configure models:
  - Edit `config/api_setting.yml` to set your API keys and model configurations
  - Ensure Ollama is installed and running the embedding model

- Start the service:

  ```bash
  python server/start_api.py
  ```

- Configure SillyTavern:
  - Select "OpenAI" in SillyTavern's API settings
  - Set the API address to `http://localhost:8004/v1`

- Enjoy your AI companion's new memory-powered experience!
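Once the service is running, you can also talk to the endpoint directly from Python. Below is a minimal standard-library sketch that builds the same OpenAI-format request SillyTavern sends; the model name is passed through as-is, since the backend uses its own configured provider:

```python
import json
import urllib.request

API_URL = "http://localhost:8004/v1/chat/completions"

def build_chat_request(content: str, stream: bool = False) -> urllib.request.Request:
    """Build an OpenAI-format Chat Completions request for the local backend."""
    payload = {
        "model": "gpt-3.5-turbo",  # passed through; the backend uses its configured provider
        "messages": [{"role": "user", "content": content}],
        "stream": stream,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Send a message (requires the service to be running locally):
# with urllib.request.urlopen(build_chat_request("Hello!")) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```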
```yaml
# Main chat model
provider: "deepseek"
base_url: "https://api.deepseek.com/v1"
api_key: "your-api-key"
model: "deepseek-chat"

# Local embedding model (via Ollama)
embedding:
  model: "qwen3-embedding:0.6b"
  base_url: "http://localhost:11434"

# Memory-related settings
memory_recent_events_limit: 7  # Number of recent events
vector_search:
  top_k: 1                     # Number of retrieval results
  similarity_threshold: 0.7    # Similarity threshold
```

- LangChain & LangGraph: Workflow orchestration and AI application development
- ChromaDB: Vector database for storing and retrieving memories
- Ollama: Local model deployment
- LlamaIndex: Document indexing and retrieval
- FastAPI: High-performance API service
User Input → Memory Init → Event Summary → Memory Operation → Vector Retrieval → AI Reply → Logging
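The flow above can be sketched as a chain of node functions passing a shared state dict from one stage to the next. In the project this graph is built with LangGraph; the functions here are illustrative stand-ins that only mirror the node order, not the real node logic:

```python
# Illustrative pipeline: each node takes and returns the shared state dict.

def memory_init(state: dict) -> dict:
    state.setdefault("memories", [])
    return state

def event_summary(state: dict) -> dict:
    # The real node asks the LLM to summarize; here we just record the turn.
    state["event"] = f"User said: {state['user_input']}"
    return state

def memory_operation(state: dict) -> dict:
    state["memories"].append(state["event"])
    return state

def vector_retrieval(state: dict) -> dict:
    # Stand-in for the ChromaDB semantic search step.
    state["recalled"] = state["memories"][-1:]
    return state

def ai_reply(state: dict) -> dict:
    state["reply"] = f"(reply grounded in {len(state['recalled'])} memory)"
    return state

def logging_node(state: dict) -> dict:
    print(state["reply"])
    return state

PIPELINE = [memory_init, event_summary, memory_operation,
            vector_retrieval, ai_reply, logging_node]

def run(user_input: str) -> dict:
    state = {"user_input": user_input}
    for node in PIPELINE:
        state = node(state)
    return state
```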
- Drop-in replacement for SillyTavern's default backend
- Let characters remember your conversation history
- Support long-term memory and emotional development
- Vision Module: Integrate image recognition to let AI "see" your world
- Voice Module: Add voice synthesis and recognition features
- Custom Nodes: Develop specific functions based on BaseNode class
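A custom node might look like the sketch below. The interface of the project's `BaseNode` is assumed here (a single `process(state)` hook), so check the actual class before building on it; the `BaseNode` stand-in and the `MoodTrackerNode` name are hypothetical:

```python
class BaseNode:  # stand-in for the project's base class; interface is assumed
    def process(self, state: dict) -> dict:
        raise NotImplementedError

class MoodTrackerNode(BaseNode):
    """Hypothetical custom node that tags each turn with a naive mood guess."""
    POSITIVE = {"love", "great", "happy"}

    def process(self, state: dict) -> dict:
        words = set(state.get("user_input", "").lower().split())
        state["mood"] = "positive" if words & self.POSITIVE else "neutral"
        return state

node = MoodTrackerNode()
print(node.process({"user_input": "I love this!"})["mood"])  # → positive
```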
- Runs on home computers
- All data stays local, protecting your privacy
- Works offline, with no cloud service dependencies
After starting the service, visit: http://localhost:8004/docs
Compatible with OpenAI Chat Completions API format:
```bash
curl -X POST "http://localhost:8004/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ],
    "stream": true
  }'
```

Issues and Pull Requests are welcome!
- Fork the project
- Create a feature branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Create a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- st-memory-enhancement - Important inspiration source for memory enhancement ideas
- LangChain - AI application development framework
- SillyTavern - Role-playing chat interface
- ChromaDB - Vector database
- Ollama - Local model deployment
Give your AI companion true memory and start your intelligent conversation journey! 🚀✨