fix: enforce custom edge types for FalkorDB and BFS search #1128
Open: david-morales wants to merge 7 commits into getzep:main from david-morales:fix/falkordb-custom-edge-types
This PR fixes custom edge type enforcement when using FalkorDB as the graph backend.
## Problem
When using schema-defined edge types (e.g., SPOUSE_OF, BORN_IN, DIRECTED), all relationships were being saved as RELATES_TO instead of the defined types.
## Root Causes
1. The LLM prompt allowed inventing arbitrary edge types
2. FalkorDB's bulk save query hardcoded `:RELATES_TO` in the Cypher MERGE
3. No backend validation to catch LLM-invented types
## Solution
### 1. Stricter LLM Prompt (extract_edges.py)
- Enforces exact `fact_type_name` values from the schema
- Requires RELATES_TO as fallback when no schema type matches
- Explicitly rejects invented types and modified names
### 2. Backend Safety Net (edge_operations.py)
- Adds strict type checking when custom edge_types are defined
- Converts any LLM-invented types not in schema to RELATES_TO
- Ensures 100% schema compliance
### 3. FalkorDB Dynamic Edge Types (bulk_utils.py)
- Groups edges by type and runs separate queries per type
- FalkorDB requires static relationship types in Cypher MERGE statements
- Also adds JSON serialization for complex attribute types (dict/list)
### 4. Updated Test (test_edge_operations.py)
- Reflects new stricter behavior: unknown types convert to RELATES_TO
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <[email protected]>
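The safety net and the per-type bulk save described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: the edge dictionaries, the helper names (`normalize_edge_type`, `group_edges_by_type`, `build_merge_query`) and the exact Cypher text are hypothetical.

```python
import json
from collections import defaultdict

DEFAULT_EDGE_TYPE = "RELATES_TO"


def normalize_edge_type(name, allowed_types):
    """Keep a schema-defined type; map anything else to RELATES_TO."""
    return name if name in allowed_types else DEFAULT_EDGE_TYPE


def serialize_attributes(attrs):
    """JSON-encode dict/list values so every stored property is a scalar."""
    return {
        key: json.dumps(value) if isinstance(value, (dict, list)) else value
        for key, value in attrs.items()
    }


def group_edges_by_type(edges, allowed_types):
    """Bucket hypothetical edge dicts by their normalized relationship type."""
    groups = defaultdict(list)
    for edge in edges:
        edge_type = normalize_edge_type(edge["name"], allowed_types)
        groups[edge_type].append(
            {
                **edge,
                "name": edge_type,
                "attributes": serialize_attributes(edge.get("attributes", {})),
            }
        )
    return dict(groups)


def build_merge_query(edge_type):
    """Build one MERGE query with the relationship type written into the text.

    The type is interpolated because it must be static in the MERGE pattern,
    so it is checked to be a plain identifier first. The query shape itself
    is an assumption made for this sketch.
    """
    if not edge_type.isidentifier():
        raise ValueError(f"unsafe edge type: {edge_type!r}")
    return (
        "UNWIND $edges AS edge "
        "MATCH (s:Entity {uuid: edge.source_uuid}) "
        "MATCH (t:Entity {uuid: edge.target_uuid}) "
        f"MERGE (s)-[r:{edge_type} {{uuid: edge.uuid}}]->(t) "
        "SET r.fact = edge.fact"
    )
```

A caller would then run `build_merge_query(edge_type)` once per key of `group_edges_by_type(...)`, passing that group's list as the `$edges` parameter.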
The BFS search queries were hardcoded to only traverse RELATES_TO edges,
which meant custom edge types (LOCATED_IN, MEMBER_OF, SPOUSE_OF, etc.)
were never discovered during multi-hop traversal.
## Changes
- Use wildcard traversal `[*1..N]` instead of `[:RELATES_TO|MENTIONS*1..N]`
- Match any edge type `[e {uuid: ...}]` instead of `[e:RELATES_TO {uuid: ...}]`
- The Entity-to-Entity filter naturally excludes MENTIONS edges
## Impact
- BFS now discovers all edge types during traversal
- Multi-hop reasoning can follow typed relationships
- Improves path recall by ~5% in evaluation
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <[email protected]>
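To make the pattern change concrete, here is a small sketch of a helper that builds the variable-length relationship pattern either restricted to named types (the old behaviour) or as a wildcard (the new behaviour). The function name `bfs_path_pattern` and its parameters are hypothetical; only the two pattern strings come from the commit message.

```python
def bfs_path_pattern(max_depth, restrict_to=None):
    """Return a Cypher variable-length relationship pattern.

    With restrict_to=None the pattern matches any relationship type;
    otherwise it only matches the listed types.
    """
    if max_depth < 1:
        raise ValueError("max_depth must be at least 1")
    types = ":" + "|".join(restrict_to) if restrict_to else ""
    return f"[{types}*1..{max_depth}]"


# Old pattern: only RELATES_TO and MENTIONS edges are traversed.
old_pattern = bfs_path_pattern(3, ["RELATES_TO", "MENTIONS"])

# New pattern: every edge type is traversed.
new_pattern = bfs_path_pattern(3)
```

With a depth of 3, `old_pattern` is `[:RELATES_TO|MENTIONS*1..3]` and `new_pattern` is `[*1..3]`.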
The edge_fulltext_search function was hardcoded to only search the RELATES_TO relationship type. This meant custom edge types like SANCTION were invisible to BM25 search even when they had fulltext indexes.
Changes:
- Add edge_types parameter to EdgeSearchConfig
- Update edge_fulltext_search to query multiple relationship types
- For FalkorDB, query each edge type's fulltext index and combine results
- Update search.py to pass edge_types from config
- Update search_interface to support edge_types parameter
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <[email protected]>
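One way to combine per-type fulltext results is sketched below. It assumes each index returns a list of hits shaped like `{"uuid": ..., "score": ...}`; that shape and the function name `merge_fulltext_results` are hypothetical, and the PR may combine results differently.

```python
def merge_fulltext_results(results_by_type, limit):
    """Merge per-edge-type hit lists into one ranked, de-duplicated list.

    results_by_type maps an edge type name to a list of hit dicts.
    If the same uuid appears more than once, the highest-scoring hit wins.
    """
    best = {}
    for edge_type, hits in results_by_type.items():
        for hit in hits:
            uuid = hit["uuid"]
            if uuid not in best or hit["score"] > best[uuid]["score"]:
                best[uuid] = {**hit, "edge_type": edge_type}
    ranked = sorted(best.values(), key=lambda hit: hit["score"], reverse=True)
    return ranked[:limit]
```

Scores that come from separate indexes are not guaranteed to be directly comparable, so sorting on the raw score is a simplification in this sketch.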
Verifies that the edge_types parameter correctly filters fulltext search results by relationship type.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <[email protected]>
68cc16d to
66833b8
Compare
When custom edge_types are provided to add_episode or add_episode_bulk, automatically create fulltext indexes for each edge type if they don't already exist. This enables BM25 fulltext search on custom relationship types without requiring manual index creation.
Changes:
- Add ensure_edge_type_index() method to GraphDriver base class (no-op)
- Implement ensure_edge_type_index() in FalkorDriver to create fulltext indexes on (name, fact, group_id) for custom edge types
- Call ensure_edge_type_index() in add_episode and add_episode_bulk
- Add test for ensure_edge_type_index functionality
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <[email protected]>
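The no-op base method and the FalkorDB override might look roughly like the sketch below. Everything here is an assumption made for illustration: the class names `BaseDriverSketch` and `FalkorDriverSketch`, the synchronous `run_query` callable, the in-memory cache of indexed types, and the index-creation query text are not taken from the PR.

```python
class BaseDriverSketch:
    """Stand-in for a driver base class; backends override the hook if needed."""

    def ensure_edge_type_index(self, edge_type):
        # No-op by default: backends that need no per-type index do nothing.
        return None


class FalkorDriverSketch(BaseDriverSketch):
    """Stand-in for a FalkorDB driver that creates one index per edge type."""

    def __init__(self, run_query):
        # run_query is a hypothetical callable that executes a query string.
        self._run_query = run_query
        # Assumption: the default RELATES_TO index already exists.
        self._indexed_types = {"RELATES_TO"}

    def ensure_edge_type_index(self, edge_type):
        if edge_type in self._indexed_types:
            return None  # already handled, so repeated calls are cheap
        if not edge_type.isidentifier():
            raise ValueError(f"unsafe edge type: {edge_type!r}")
        # Assumed query text; check the FalkorDB documentation for exact syntax.
        self._run_query(
            f"CREATE FULLTEXT INDEX FOR ()-[e:{edge_type}]-() "
            "ON (e.name, e.fact, e.group_id)"
        )
        self._indexed_types.add(edge_type)
        return None
```

Calling the hook once per custom edge type before saving edges keeps the ingestion path free of backend checks, since other drivers inherit the no-op.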
The edge BFS search was only traversing outgoing edges (-[*1..N]->), which missed edges where the target node was the search origin.
For example, searching from "LTTE" would not find the SANCTION edge (Liberation Tigers of Tamil Eelam)-[SANCTION]->(LTTE) because it was an incoming edge to LTTE.
Changed the traversal pattern from directional (-[*1..N]->) to bidirectional (-[*1..N]-) in both the default and Neptune query paths.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <[email protected]>
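The effect of dropping the direction can be illustrated with a small in-memory breadth-first search. This is an analogy rather than the Cypher the PR changes: the function `bfs_edges` and the `(source, type, target)` tuple format are hypothetical, and the edge data reuses the example from the commit message.

```python
from collections import deque


def bfs_edges(edges, origin, max_depth, directed):
    """Collect edges reachable from origin within max_depth hops.

    edges is a list of (source, edge_type, target) tuples. When directed is
    False, edges are also followed from target back to source.
    """
    adjacency = {}
    for edge in edges:
        source, _edge_type, target = edge
        adjacency.setdefault(source, []).append((target, edge))
        if not directed:
            adjacency.setdefault(target, []).append((source, edge))

    found = []
    seen_nodes = {origin}
    queue = deque([(origin, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand past the depth limit
        for neighbor, edge in adjacency.get(node, []):
            if edge not in found:
                found.append(edge)
            if neighbor not in seen_nodes:
                seen_nodes.add(neighbor)
                queue.append((neighbor, depth + 1))
    return found


graph = [("Liberation Tigers of Tamil Eelam", "SANCTION", "LTTE")]
outgoing_only = bfs_edges(graph, "LTTE", 2, directed=True)
both_directions = bfs_edges(graph, "LTTE", 2, directed=False)
```

Here `outgoing_only` is empty because the only edge points into "LTTE", while `both_directions` contains the SANCTION edge.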
When bfs_origin_node_uuids is not provided, the search now runs in two phases:
1. Phase 1: Find nodes via BM25/cosine/BFS (parallel with episode/community)
2. Phase 2: Find edges using found node UUIDs as BFS origins
This ensures that edge BFS can traverse from nodes found by node search. Previously, edge BFS only used source nodes from BM25/cosine edge results, missing edges connected to nodes found only by node search.
Example: Searching "What organization is also known as LTTE?" would find the LTTE node but not traverse to its connected SANCTION edges.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <[email protected]>
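The two-phase flow can be sketched with `asyncio` as below. The function `two_phase_search` and the four search callables it receives are hypothetical placeholders for the real search functions, and the result shape is an assumption.

```python
import asyncio


async def two_phase_search(
    query,
    search_nodes,
    search_edges,
    search_episodes,
    search_communities,
    bfs_origin_node_uuids=None,
):
    """Run node search first when no BFS origins are given, then edge search."""
    if bfs_origin_node_uuids is not None:
        # Origins were supplied, so all four searches can run in parallel.
        nodes, edges, episodes, communities = await asyncio.gather(
            search_nodes(query, bfs_origin_node_uuids),
            search_edges(query, bfs_origin_node_uuids),
            search_episodes(query),
            search_communities(query),
        )
    else:
        # Phase 1: nodes, episodes and communities in parallel.
        nodes, episodes, communities = await asyncio.gather(
            search_nodes(query, None),
            search_episodes(query),
            search_communities(query),
        )
        # Phase 2: edge search seeded with the node UUIDs found in phase 1.
        origins = [node["uuid"] for node in nodes]
        edges = await search_edges(query, origins)
    return {
        "nodes": nodes,
        "edges": edges,
        "episodes": episodes,
        "communities": communities,
    }
```

The trade-off in this sketch is latency: edge search waits for node search to finish, in exchange for BFS origins that include nodes found only by node search.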
Summary
Problem
When defining custom edge types in the schema:
- All relationships were saved as `RELATES_TO` instead of the specified types like `SPOUSE_OF`, `BORN_IN`, etc.
- BFS search only traversed `RELATES_TO` edges, ignoring custom types during multi-hop traversal
Root Causes
- FalkorDB's bulk save query hardcoded `:RELATES_TO` in the Cypher MERGE
- BFS search hardcoded `[:RELATES_TO|MENTIONS*1..N]` path traversal
Solution
1. Stricter LLM Prompt (extract_edges.py)
- Enforces exact `fact_type_name` values from the schema
2. Backend Safety Net (edge_operations.py)
3. FalkorDB Dynamic Edge Types (bulk_utils.py)
4. BFS Search Fix (search_utils.py)
- Use wildcard traversal `[*1..N]` instead of `[:RELATES_TO|MENTIONS*1..N]`
- Match any edge type `[e {uuid: ...}]` instead of `[e:RELATES_TO {uuid: ...}]`
5. Updated Test (test_edge_operations.py)
BM25 Fulltext Search
Files:
Test: test_edge_fulltext_search_custom_edge_types
Note: Custom edge types must have fulltext indexes created for BM25 search to work.
Test plan