[Feature Request]: Automatic merging of the same entity under different names

### **Background**  
LightRAG currently merges entities only when their names match exactly (including capitalization). As a result, the same entity can appear as multiple disconnected nodes under different names, and may even form isolated subgraphs, which degrades query performance.  

### **Automated Entity Merging for Variant Names**  

To address this, we propose an automated entity merging approach for differently named but identical entities:  

1. **Vector Node Database Utilization**:  
   - Modify the node vector DB implementation to store an embedding of each entity name.  

2. **Similarity Threshold Configuration**:  
   - Set a minimum cosine similarity threshold (e.g., 0.8) for candidate selection.  

3. **Candidate Retrieval**:  
   - During merging, retrieve up to 10 of the most similar nodes, ranked by cosine similarity and keeping only those above the threshold.  

4. **LLM-Based Merge Validation**:  
   - Submit the current entity’s name/description along with candidate entities’ names/descriptions to an LLM.  
   - Task the LLM to:  
     - Determine whether merging is justified.  
     - If merging is approved, select the best candidate and return the consolidated entity name and description.  

5. **Iterative Merging with Depth Limitation (optional)**:  
   - Repeat the merge validation process for the newly consolidated entity returned by the LLM, stopping once a maximum depth is reached.  
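
The steps above could be sketched roughly as follows. This is only an illustration of the proposed flow, not LightRAG's actual API: `find_candidates`, `merge_entity`, the `judge` callable (standing in for the LLM call), the `embed` callable, and the dict layout of nodes are all hypothetical, and a brute-force in-memory scan stands in for the node vector DB.

```python
import math
from typing import Callable, Optional

Vector = list[float]
# Hypothetical LLM wrapper: takes the current entity and its candidates, returns
# None to reject merging, or {"merged_with": ..., "name": ..., "description": ...}.
Judge = Callable[[dict, list[dict]], Optional[dict]]


def cosine_similarity(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_candidates(entity: dict, nodes: list[dict],
                    threshold: float = 0.8, top_k: int = 10) -> list[dict]:
    """Steps 2-3: keep nodes above the threshold, then take the top_k most similar."""
    scored = [
        (cosine_similarity(entity["vector"], node["vector"]), node)
        for node in nodes
        if node["name"] != entity["name"]  # exact matches are already merged
    ]
    scored = [pair for pair in scored if pair[0] >= threshold]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [node for _, node in scored[:top_k]]


def merge_entity(entity: dict, nodes: list[dict], judge: Judge,
                 embed: Callable[[str], Vector], max_depth: int = 3,
                 threshold: float = 0.8, top_k: int = 10) -> tuple[dict, list[dict]]:
    """Steps 4-5: ask the judge to validate a merge, repeating up to max_depth times."""
    current, remaining = entity, list(nodes)
    for _ in range(max_depth):
        candidates = find_candidates(current, remaining, threshold, top_k)
        if not candidates:
            break
        decision = judge(current, candidates)
        if decision is None:  # the judge decided merging is not justified
            break
        # Drop the absorbed node and re-embed the consolidated name (step 1).
        remaining = [n for n in remaining if n["name"] != decision["merged_with"]]
        current = {
            "name": decision["name"],
            "description": decision["description"],
            "vector": embed(decision["name"]),
        }
    return current, remaining
```

Here `max_depth` bounds the number of LLM calls per inserted entity, and a real implementation would presumably also need to re-point the absorbed node's edges to the consolidated entity, which this sketch leaves out.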


[Feature Request]: Automatic merging of the same entity under different names #1323

Background

Automated Entity Merging for Variant Names

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

[Feature Request]: Automatic merging of the same entity under different names #1323

Description

Background

Automated Entity Merging for Variant Names

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions