Description
When adding a new edge with custom attributes defined in the edge type model, the extract_attributes function is not called if the edge is being created for the first time (i.e., no related edges exist in the graph). This means custom attributes are only extracted when similar edges already exist, which defeats the purpose of attribute extraction.
Steps to Reproduce
- Define a custom edge type with attributes:
    from pydantic import BaseModel, Field
    from typing import Optional, List

    class HasSpec(BaseModel):
        """Product has specification relationship."""
        value: str | List[str] = Field(description="Specification value")
        unit: Optional[str] = Field(description="Specification unit like L, dB, ℃, W")

    edge_types = {"HasSpec": HasSpec}
    edge_type_map = {("Product", "Specification"): ["HasSpec"]}
- Add an episode with entities and relationships:
    await graphiti.add_episode(
        name="Product Info",
        episode_body="DR-HHM003 has max runtime of 60 hours",
        source=EpisodeType.text,
        source_description="Product specification",
        reference_time=datetime.now(),
        entity_types=entity_types,
        edge_types=edge_types,
        edge_type_map=edge_type_map,
        group_id=group_id,
    )
- Query the created edge:
    edges = await graphiti.driver.execute_query("""
        MATCH ()-[e:RELATES_TO]->()
        WHERE e.name = 'HasSpec'
        RETURN e.attributes
    """)
Expected Behavior
The edge should have attributes populated with extracted values:
    {
        "value": "60",
        "unit": "hours"
    }
Actual Behavior
The edge is created with empty attributes:
    {}
Root Cause
In graphiti_core/utils/maintenance/edge_operations.py, the resolve_extracted_edge function has an early return that skips attribute extraction for new edges:
File: graphiti_core/utils/maintenance/edge_operations.py
Lines: 479-480
    async def resolve_extracted_edge(
        llm_client: LLMClient,
        extracted_edge: EntityEdge,
        related_edges: list[EntityEdge],
        existing_edges: list[EntityEdge],
        episode: EpisodicNode,
        edge_type_candidates: dict[str, type[BaseModel]] | None = None,
        custom_edge_type_names: set[str] | None = None,
    ) -> tuple[EntityEdge, list[EntityEdge], list[EntityEdge]]:
        if len(related_edges) == 0 and len(existing_edges) == 0:
            return extracted_edge, [], []  # ❌ Returns early without attribute extraction

        # ... rest of the function including:
        # - Edge type classification (lines 532-538)
        # - Attribute extraction (lines 583-603)

When there are no related or existing edges (first-time edge creation), the function returns immediately without:
1. Calling the LLM to classify the edge type
2. Calling `extract_attributes` to populate custom attributes
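To make the control flow concrete, here is a minimal standard-library sketch of the same early-return shape. `Edge`, `FakeLLM`, and `resolve` are hypothetical stand-ins invented for illustration, not graphiti's classes or functions, and the "extraction" is a canned answer rather than a real LLM call.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class Edge:
    """Hypothetical stand-in for EntityEdge, reduced to the fields used here."""
    fact: str
    attributes: dict = field(default_factory=dict)


class FakeLLM:
    """Hypothetical stand-in for the LLM client; counts how often it is called."""

    def __init__(self) -> None:
        self.calls = 0

    async def extract_attributes(self, fact: str) -> dict:
        self.calls += 1
        return {"value": "60", "unit": "hours"}  # canned answer, not a real extraction


async def resolve(llm: FakeLLM, edge: Edge, related: list, existing: list) -> Edge:
    # Same shape as the early return described above.
    if len(related) == 0 and len(existing) == 0:
        return edge  # attribute extraction is never reached
    edge.attributes = await llm.extract_attributes(edge.fact)
    return edge


async def main():
    llm = FakeLLM()
    fact = "DR-HHM003 has max runtime of 60 hours"
    first = await resolve(llm, Edge(fact), [], [])        # first-time edge
    second = await resolve(llm, Edge(fact), [first], [])  # a related edge now exists
    return first.attributes, second.attributes, llm.calls


first_attrs, second_attrs, calls = asyncio.run(main())
print(first_attrs, second_attrs, calls)
```

In this sketch the first edge keeps its empty attributes and the fake client is only invoked for the second edge, which mirrors the behavior reported above.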
Proposed Solution
Even when related_edges and existing_edges are empty, we should still perform edge type classification and attribute extraction. Here's a suggested fix:
    async def resolve_extracted_edge(
        llm_client: LLMClient,
        extracted_edge: EntityEdge,
        related_edges: list[EntityEdge],
        existing_edges: list[EntityEdge],
        episode: EpisodicNode,
        edge_type_candidates: dict[str, type[BaseModel]] | None = None,
        custom_edge_type_names: set[str] | None = None,
    ) -> tuple[EntityEdge, list[EntityEdge], list[EntityEdge]]:
        # Handle edge type classification and attribute extraction even for new edges
        if len(related_edges) == 0 and len(existing_edges) == 0:
            # Fast path for truly new edges, but still extract attributes
            if edge_type_candidates:
                # Classify edge type
                edge_types_context = [
                    {
                        'fact_type_name': type_name,
                        'fact_type_description': type_model.__doc__,
                    }
                    for type_name, type_model in edge_type_candidates.items()
                ]
                context = {
                    'existing_edges': [],
                    'new_edge': extracted_edge.fact,
                    'edge_invalidation_candidates': [],
                    'edge_types': edge_types_context,
                }
                llm_response = await llm_client.generate_response(
                    prompt_library.dedupe_edges.resolve_edge(context),
                    response_model=EdgeDuplicate,
                    model_size=ModelSize.small,
                    prompt_name='dedupe_edges.resolve_edge',
                )
                fact_type = EdgeDuplicate(**llm_response).fact_type
                candidate_type_names = set(edge_type_candidates.keys())
                custom_type_names = custom_edge_type_names or set()
                is_allowed_custom_type = fact_type in candidate_type_names
                if is_allowed_custom_type:
                    extracted_edge.name = fact_type
                    edge_model = edge_type_candidates.get(fact_type)
                    if edge_model is not None and len(edge_model.model_fields) != 0:
                        edge_attributes_context = {
                            'episode_content': episode.content,
                            'reference_time': episode.valid_at,
                            'fact': extracted_edge.fact,
                        }
                        edge_attributes_response = await llm_client.generate_response(
                            prompt_library.extract_edges.extract_attributes(edge_attributes_context),
                            response_model=edge_model,
                            model_size=ModelSize.small,
                            prompt_name='extract_edges.extract_attributes',
                        )
                        extracted_edge.attributes = edge_attributes_response
            return extracted_edge, [], []

        # ... rest of existing logic for when related_edges or existing_edges exist

Environment
- Graphiti Core Version: 0.24.1
- Python Version: 3.12
Additional Context
This issue significantly impacts the usability of custom edge types with attributes, as attributes are only extracted after at least one duplicate edge attempt, rather than on first creation.
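One way to guard against a regression is a test that passes empty `related_edges` and `existing_edges` and asserts that the LLM client was still awaited. The sketch below shows only the shape of such a test. `resolve_new_edge` and its `edge_model_fields` parameter are hypothetical, simplified stand-ins and not graphiti's actual function or signature. The client is mocked with the standard library's `AsyncMock`.

```python
import asyncio
from types import SimpleNamespace
from unittest.mock import AsyncMock


async def resolve_new_edge(llm_client, edge, related_edges, existing_edges, edge_model_fields):
    """Hypothetical, simplified stand-in for a patched resolver (not graphiti's code)."""
    if not related_edges and not existing_edges:
        if edge_model_fields:
            # New edges still get their attributes extracted before returning.
            edge.attributes = await llm_client.generate_response(
                {"fact": edge.fact}, prompt_name="extract_edges.extract_attributes"
            )
        return edge, [], []
    raise NotImplementedError("dedupe path omitted from this sketch")


def test_first_time_edge_gets_attributes():
    # Mocked client returning a canned response instead of calling a model.
    llm_client = SimpleNamespace(
        generate_response=AsyncMock(return_value={"value": "60", "unit": "hours"})
    )
    edge = SimpleNamespace(fact="DR-HHM003 has max runtime of 60 hours", attributes={})

    resolved, *rest = asyncio.run(
        resolve_new_edge(llm_client, edge, [], [], edge_model_fields=("value", "unit"))
    )

    llm_client.generate_response.assert_awaited_once()
    assert resolved.attributes == {"value": "60", "unit": "hours"}
    assert rest == [[], []]
    return resolved


resolved_edge = test_first_time_edge_gets_attributes()
```

Against the real code base, the same assertions would target `resolve_extracted_edge` with a mocked `llm_client`, adapted to whatever fixtures the project's test suite already provides.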