fix: handle Neo4j 5.x race condition during index creation #13

donbr · 2025-12-09T09:10:39Z

Summary

Fixes a race condition that causes the Graphiti MCP server to crash on startup when using Neo4j 5.x as the database backend.

Problem Description

Error: Neo.ClientError.Schema.EquivalentSchemaRuleAlreadyExists

Root Cause: Neo4j 5.x has a known issue where concurrent CREATE INDEX ... IF NOT EXISTS statements can fail even with the guard clause. The upstream graphiti_core library runs ~19 index creation queries in parallel via semaphore_gather(), triggering this race condition.

Impact: MCP server crashes during initialization, making Neo4j 5.x deployments non-functional.

Solution

Wrap build_indices_and_constraints() calls with try/except to catch and gracefully handle the EquivalentSchemaRuleAlreadyExists error. The indexes are successfully created by whichever parallel query completes first, so we can safely continue execution.

Changes

✅ src/server.py: Add race condition error handling (production entrypoint)
✅ src/graphiti_mcp_server.py: Add race condition error handling (legacy entrypoint)
✅ tests/test_neo4j_race_condition.py: Integration tests for error handling
✅ CLAUDE.md: Document the warning message in troubleshooting section

Testing

Before Fix:

$ docker compose -f docker/docker-compose-neo4j.yml up -d
# Container exits with code 1
neo4j.exceptions.ClientError: {neo4j_code: Neo.ClientError.Schema.EquivalentSchemaRuleAlreadyExists}

After Fix:

$ docker compose -f docker/docker-compose-neo4j.yml up -d
$ docker logs docker-graphiti-mcp-1
# Output:
2025-11-24 - graphiti_mcp_server - WARNING - Index creation race condition detected (Neo4j 5.x issue). Indexes likely already exist. Continuing...
2025-11-24 - graphiti_mcp_server - INFO - Successfully initialized Graphiti client

$ curl http://localhost:8000/health
{"status":"healthy","service":"graphiti-mcp"}

Integration Tests:

✅ Race condition error is caught and logged as warning
✅ Other errors are properly re-raised
✅ Server initializes successfully after handling race condition
✅ All existing tests pass

References

Upstream PR: fix(mcp): handle Neo4j 5.x index creation race condition getzep/graphiti#1081 (comment)
Neo4j Issue: Constraint creation is not idempotent operation neo4j/neo4j#13208

Checklist

Fix handles the specific EquivalentSchemaRuleAlreadyExists error
Other exceptions are re-raised (not silently swallowed)
Warning message is logged for visibility
Changes applied to both server entrypoints (factory + legacy)
Integration tests added
Documentation updated
Manual testing with Neo4j 5.26.0 successful
No impact on FalkorDB backend

🤖 Generated with Claude Code

## Summary - Wrap build_indices_and_constraints() with try/except in both entrypoints - Catch EquivalentSchemaRuleAlreadyExists error (Neo4j 5.x parallel index creation race condition) - Log warning and continue (indexes are created successfully despite error) - Add integration tests to verify error handling - Update CLAUDE.md troubleshooting documentation ## Changes - src/server.py: Add race condition error handling - src/graphiti_mcp_server.py: Add race condition error handling (legacy entrypoint) - tests/test_neo4j_race_condition.py: Integration tests for error handling - CLAUDE.md: Document the warning message in troubleshooting section ## Technical Details Neo4j 5.x has a known issue where concurrent CREATE INDEX ... IF NOT EXISTS statements can fail with EquivalentSchemaRuleAlreadyExists even though the guard clause should prevent this. The upstream graphiti_core library runs ~19 index creation queries in parallel via semaphore_gather(), triggering this race condition. The fix catches this specific error and continues execution, as the indexes are successfully created by whichever parallel query completes first. ## Testing - Integration tests verify graceful handling of race condition error - Integration tests verify other errors are properly re-raised - Manual testing with Neo4j 5.26.0 confirms server starts successfully - All existing tests pass ## References - Upstream PR: getzep/graphiti#1081 (comment) - Neo4j Issue: neo4j/neo4j#13208 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <[email protected]>

claude · 2025-12-09T09:12:06Z

Code Review: Neo4j 5.x Race Condition Fix

✅ Overall Assessment

This is a well-executed fix for a real production issue. The PR demonstrates good practices in error handling, testing, and documentation. The solution is pragmatic and appropriate for handling an upstream race condition.

Detailed Review

🎯 Code Quality: Excellent

Strengths:

✅ Consistent implementation across both entrypoints (src/server.py and src/graphiti_mcp_server.py)
✅ Clear, descriptive error messages with references to upstream issues
✅ Proper error propagation (only catches the specific race condition error)
✅ Minimal code changes focused on the problem
✅ Follows existing code style and conventions

Error Handling Pattern (src/server.py:157-168):

try:
    await self.client.build_indices_and_constraints()
except Exception as idx_error:
    if 'EquivalentSchemaRuleAlreadyExists' in str(idx_error):
        logger.warning(...)
    else:
        raise

This is the right approach - catch broadly, filter specifically, re-raise everything else.

🐛 Potential Issues: Minor Improvements Suggested

1. String-based Exception Detection

Issue: Using 'EquivalentSchemaRuleAlreadyExists' in str(idx_error) is fragile if the error message format changes.

Recommendation: Consider checking for the actual Neo4j exception type:

# More robust approach
try:
    from neo4j.exceptions import ClientError
    if isinstance(idx_error, ClientError) and 'EquivalentSchemaRuleAlreadyExists' in str(idx_error.code):
        # handle race condition
except ImportError:
    # Fallback to string checking if neo4j not installed
    if 'EquivalentSchemaRuleAlreadyExists' in str(idx_error):
        # handle race condition

However, given that this is an upstream Graphiti issue and the error comes wrapped through multiple layers, the current string-based approach is acceptable and pragmatic. Just document this tradeoff.

2. Test Specificity

Issue: Tests use generic Exception instead of actual Neo4j exception types.

Current (test_neo4j_race_condition.py:31):

mock_error = Exception('Neo.ClientError.Schema.EquivalentSchemaRuleAlreadyExists')

Suggestion: Import and use real Neo4j exceptions for more realistic testing:

try:
    from neo4j.exceptions import ClientError
    mock_error = ClientError.hydrate(
        code='Neo.ClientError.Schema.EquivalentSchemaRuleAlreadyExists',
        message='...'
    )
except ImportError:
    mock_error = Exception('Neo.ClientError.Schema.EquivalentSchemaRuleAlreadyExists')

Note: This is a nice-to-have, not a blocker. Current tests are sufficient.

⚡ Performance Considerations: No Concerns

Exception handling only occurs during initialization (one-time cost)
No impact on runtime performance after startup
No additional overhead in happy path
FalkorDB backend unaffected

🔒 Security Concerns: None

No security vulnerabilities introduced
Error messages don't leak sensitive information
Proper error propagation maintains security boundaries

🧪 Test Coverage: Good

Coverage:

✅ Test for successful race condition handling
✅ Test for non-race-condition errors being re-raised
✅ Test for proper warning log output
✅ Tests use proper mocking to avoid real database dependencies

Test Marker Usage:

@pytest.mark.integration
@pytest.mark.requires_neo4j

Appropriate markers for test categorization.

Minor suggestion: Consider adding a test that verifies the service continues to function normally after handling the race condition (e.g., verify a subsequent operation succeeds).

📚 Documentation: Excellent

CLAUDE.md update (lines 599-609):

✅ Clear explanation of the warning message
✅ Links to upstream issues (Neo4j and Graphiti)
✅ Reassures users it's harmless
✅ Well-placed in the Troubleshooting section

🎯 Best Practices Alignment

Checking against CLAUDE.md guidelines:

Practice	Status
Single quotes for strings	✅ Pass
100-char line length	✅ Pass
Proper error handling	✅ Pass
Async/await patterns	✅ Pass
Type hints	⚠️ Could add type hints to exception handlers
Google-style docstrings	✅ Pass (tests have clear docstrings)

📋 Recommendations

Priority: Optional Improvements

None of these block the PR - they're enhancements for future consideration:

Type Hints: Add type hints to exception variables

except Exception as idx_error:  # Could be: except Exception as idx_error: ClientError | Exception

Exception Specificity: Document the tradeoff of string-based detection in a code comment
Test Enhancement: Add a test verifying the service remains functional after handling the race condition

Logging Enhancement: Consider adding debug-level logging of the full exception for troubleshooting:

logger.warning('Index creation race condition detected...')
logger.debug(f'Full error: {idx_error}', exc_info=True)

✅ Approval Decision

APPROVED - Ready to Merge

This PR:

✅ Solves a critical production bug
✅ Includes comprehensive tests
✅ Has excellent documentation
✅ Follows project conventions
✅ Has no security concerns
✅ Minimal, focused changes

The minor suggestions above are optional enhancements that don't block merging.

🎉 Positive Highlights

Thoroughness: Both entrypoints updated consistently
Testing: Three well-designed tests covering different scenarios
Documentation: Clear troubleshooting section added
Pragmatism: Solves the immediate problem without over-engineering
Communication: Excellent PR description with before/after examples

Great work on this fix! 🚀

donbr merged commit 62768ca into main Dec 9, 2025
1 check passed

donbr deleted the fix/neo4j-race-condition branch December 10, 2025 06:18

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: handle Neo4j 5.x race condition during index creation #13

fix: handle Neo4j 5.x race condition during index creation #13

Uh oh!

donbr commented Dec 9, 2025

Uh oh!

claude bot commented Dec 9, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

fix: handle Neo4j 5.x race condition during index creation #13

fix: handle Neo4j 5.x race condition during index creation #13

Uh oh!

Conversation

donbr commented Dec 9, 2025

Summary

Problem Description

Solution

Changes

Testing

References

Checklist

Uh oh!

claude bot commented Dec 9, 2025

Code Review: Neo4j 5.x Race Condition Fix

✅ Overall Assessment

Detailed Review

🎯 Code Quality: Excellent

🐛 Potential Issues: Minor Improvements Suggested

1. String-based Exception Detection

2. Test Specificity

⚡ Performance Considerations: No Concerns

🔒 Security Concerns: None

🧪 Test Coverage: Good

📚 Documentation: Excellent

🎯 Best Practices Alignment

📋 Recommendations

Priority: Optional Improvements

✅ Approval Decision

APPROVED - Ready to Merge

🎉 Positive Highlights

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants