- 
                Notifications
    You must be signed in to change notification settings 
- Fork 2
Closed
Description
Problem
The current implementation of query_sql in cratedb_mcp/__main__.py attempts to restrict queries to read-only SELECT statements but uses a weak validation mechanism that can be bypassed.
As demonstrated in test_query_sql_permitted_exploit(), the current validation only checks for the substring 'select' in the query string, which allows SQL injection attacks where the word 'select' is included in the values of an INSERT statement or other data modification queries.
Suggested Solution
Replace the naive string check with a more robust validation using a dedicated SQL parser library like sqlparse:
import sqlparse
def query_sql(query: str) -> Dict[str, Any]:
    # Parse the SQL statement
    parsed = sqlparse.parse(query.strip())
    
    # Check if it's a SELECT statement
    if not parsed or parsed[0].get_type().upper() != 'SELECT':
        raise ValueError('Only SELECT statements are allowed')
    
    # Rest of the function...Benefits of using sqlparse:
- Properly handles SQL syntax
- Correctly identifies statement types
- More resilient against bypass attempts
- Handles edge cases like comments and whitespace
Implementation Steps
- Add sqlparseas a dependency inpyproject.toml
- Modify query_sqlto usesqlparsefor statement validation
- Update tests to verify the improved validation works correctly
References
- Identified in PR: Boilerplate: Add software tests and CI configuration #9
- Discussion: Boilerplate: Add software tests and CI configuration #9 (comment)
- sqlparsedocumentation: https://sqlparse.readthedocs.io/
Metadata
Metadata
Assignees
Labels
No labels