Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add security rules for memcpy in C/C++ and new test cases #49

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

ESS-ENN
Copy link
Collaborator

@ESS-ENN ESS-ENN commented Oct 28, 2024

Summary by CodeRabbit

  • New Features

    • Introduced new security rules for C and C++ to ensure proper use of the memcpy function, accounting for NUL terminators to prevent out-of-bounds reads.
  • Tests

    • Added comprehensive test cases to validate correct and incorrect usages of memcpy in both C and C++ contexts, ensuring that the NUL terminator is included in memory operations.
  • Documentation

    • Updated snapshots to reflect the new rules and test cases for better tracking and validation.

Copy link

coderabbitai bot commented Oct 28, 2024

Walkthrough

Two new rules have been added to enhance security in C and C++ code regarding the use of the memcpy function. These rules ensure that when copying strings, the length accounts for the NUL terminator to prevent out-of-bounds reads. Additionally, new test cases and snapshot files have been created to validate the correct usage of these rules, covering both valid and invalid scenarios for memory copying operations.

Changes

File Change Summary
rules/c/security/missing-nul-cpp-string-memcpy-copy-c.yml New rule added: id: missing-nul-cpp-string-memcpy-copy-c to check for NUL terminator in memcpy.
rules/cpp/security/missing-nul-cpp-string-memcpy-copy-cpp.yml New rule added: id: missing-nul-cpp-string-memcpy-copy-cpp to check for NUL terminator in memcpy.
tests/snapshots/missing-nul-cpp-string-memcpy-copy-c-snapshot.yml New snapshot added for testing memcpy with various parameters in C.
tests/snapshots/missing-nul-cpp-string-memcpy-copy-cpp-snapshot.yml New snapshot added for testing memcpy with various parameters in C++.
tests/c/missing-nul-cpp-string-memcpy-copy-c-test.yml New test case added to validate correct and incorrect usages of memcpy in C.
tests/cpp/missing-nul-cpp-string-memcpy-copy-cpp-test.yml New test case added to validate correct and incorrect usages of memcpy in C++.

Possibly related PRs

Suggested reviewers

  • petrisorcoderabbit

🐇 "In the code where strings do dwell,
A NUL terminator we must tell.
For memcpy's dance, let’s be wise,
To copy with care, avoid surprise!
With tests and rules, we hop along,
Ensuring our strings are safe and strong!" 🐇


🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@coderabbitai coderabbitai bot changed the title @coderabbitai Add security rules for memcpy in C/C++ and new test cases Oct 28, 2024
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🧹 Outside diff range and nitpick comments (8)
tests/c/missing-nul-cpp-string-memcpy-copy-c-test.yml (3)

2-6: Consider adding more edge cases to valid tests.

The current valid test cases correctly verify the inclusion of NUL terminator using both strlen() and size(). However, consider adding these additional test cases for better coverage:

  • Empty string: ""
  • String literal source
  • Very long string (near size_t max)

Add these test cases:

valid:
  - |
    memcpy(to, from.c_str(), strlen(from.c_str()) + 1);
  - |
    memcpy(to, from.c_str(), from.size() + 1);
+ - |
+   memcpy(to, "", 1);  // Empty string
+ - |
+   memcpy(to, "literal", 8);  // String literal
+ - |
+   memcpy(to, very_long.c_str(), very_long.size() + 1);  // Long string

7-19: Consider adding more security-critical invalid cases.

The current invalid test cases effectively catch missing NUL terminators and undefined lengths. However, consider adding these security-critical scenarios:

  • Partial string copy (e.g., memcpy(to, from.c_str(), n) where n < length)
  • Zero length copy
  • Integer overflow cases (e.g., size_t(-1))

Add these test cases:

invalid:
  - |
    memcpy(to, from.c_str(), from.size());
  - |
    memcpy(to, from.c_str(), strlen(from.c_str()));
  - |
    memcpy(to, from.c_str(), from.length());
  - |
    memcpy(to, from.c_str(), len_001);
  - |
    memcpy(to, from.c_str(), len_002);
  - |
    memcpy(to, from.c_str(), len_003);
+ - |
+   memcpy(to, from.c_str(), 5);  // Partial copy without NUL
+ - |
+   memcpy(to, from.c_str(), 0);  // Zero length
+ - |
+   memcpy(to, from.c_str(), SIZE_MAX);  // Overflow risk

1-19: Consider adding a comment about modern C++ alternatives.

While these tests are valuable for legacy code and C/C++ interop scenarios, consider adding a comment in the test file recommending modern C++ alternatives like std::string::copy() or std::copy() for new code. This helps developers make better choices when they reference these tests.

Add this comment at the top of the file:

+# Note: For modern C++ code, prefer std::string methods or std::copy over memcpy
+# These tests are primarily for C/C++ interoperability scenarios
 id: missing-nul-cpp-string-memcpy-copy-c
tests/cpp/missing-nul-cpp-string-memcpy-copy-cpp-test.yml (3)

2-6: Consider adding more comprehensive valid test cases.

The current valid cases correctly demonstrate including the NUL terminator using both strlen() and size(). Consider adding these additional scenarios to strengthen the test suite:

  1. Explicit size calculation with bounds checking
  2. Safe practices with destination buffer size verification

Add these additional valid test cases:

valid:
  - |
    memcpy(to, from.c_str(), strlen(from.c_str()) + 1);
  - |
    memcpy(to, from.c_str(), from.size() + 1);
+ - |
+   size_t size = from.size() + 1;
+   if (size <= to_size) {
+     memcpy(to, from.c_str(), size);
+   }
+ - |
+   const size_t src_size = from.size() + 1;
+   assert(to_size >= src_size);
+   memcpy(to, from.c_str(), src_size);

7-19: Enhance invalid test cases with more subtle scenarios.

The current invalid cases effectively cover basic mistakes and undefined variables. Consider adding these subtle scenarios that could slip through code review:

Add these additional invalid test cases:

invalid:
  # existing cases...
+ - |
+   memcpy(to, from.c_str(), from.size() + offset); // offset might be 0
+ - |
+   size_t len = from.size();
+   memcpy(to, from.c_str(), len); // indirect size without NUL
+ - |
+   memcpy(to, from.c_str(), from.size() + (condition ? 0 : 1)); // conditional NUL

These additions would help catch more subtle bugs where the NUL terminator might be missing in less obvious ways.


1-19: Consider adding documentation about security implications.

While the test cases effectively validate the rule, adding comments that explain the security implications would be valuable:

  1. Document the potential for buffer overflows
  2. Explain the risks of out-of-bounds reads
  3. Reference any relevant security standards or CVEs

This documentation would help developers understand why these checks are crucial.

Add a documentation section at the top of the file:

+# Security Test Cases for memcpy with C++ Strings
+# 
+# These tests validate proper handling of NUL terminators in string copies to prevent:
+# - Buffer overflows from missing string terminators
+# - Out-of-bounds reads leading to information disclosure
+# - Memory corruption in downstream string operations
+#
+# Related CWEs: CWE-119, CWE-120, CWE-126
+
 id: missing-nul-cpp-string-memcpy-copy-cpp
 valid:
   ...
rules/c/security/missing-nul-cpp-string-memcpy-copy-c.yml (1)

1-12: Consider enhancing the security context in messages.

While the current message and note are informative, consider:

  1. Adding severity impact to the message (e.g., "...can lead to buffer overflows and potential remote code execution")
  2. Including additional relevant CWE references:
    • CWE-119 (Improper Restriction of Operations within the Bounds of a Memory Buffer)
    • CWE-788 (Access of Memory Location After End of Buffer)
rules/cpp/security/missing-nul-cpp-string-memcpy-copy-cpp.yml (1)

159-163: Add pattern descriptions in comments.

The rule definition would benefit from comments explaining what each pattern catches, making it easier for future maintainers to understand the rule's coverage.

 rule:
   any:
+    # Catches direct strlen() usage with memcpy
     - matches: match_identifier
+    # Catches string size/length field usage
     - matches: match_identifier_from_size
+    # Catches specific length variable patterns
     - matches: match_identifier_len
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 3195f93 and dfa6f0f.

📒 Files selected for processing (6)
  • rules/c/security/missing-nul-cpp-string-memcpy-copy-c.yml (1 hunks)
  • rules/cpp/security/missing-nul-cpp-string-memcpy-copy-cpp.yml (1 hunks)
  • tests/snapshots/missing-nul-cpp-string-memcpy-copy-c-snapshot.yml (1 hunks)
  • tests/snapshots/missing-nul-cpp-string-memcpy-copy-cpp-snapshot.yml (1 hunks)
  • tests/c/missing-nul-cpp-string-memcpy-copy-c-test.yml (1 hunks)
  • tests/cpp/missing-nul-cpp-string-memcpy-copy-cpp-test.yml (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • tests/snapshots/missing-nul-cpp-string-memcpy-copy-cpp-snapshot.yml
🔇 Additional comments (11)
tests/c/missing-nul-cpp-string-memcpy-copy-c-test.yml (1)

1-1: LGTM! Clear and descriptive test case identifier.

The test case ID accurately reflects its purpose of checking for missing NUL terminators in C++ string memcpy operations.

tests/cpp/missing-nul-cpp-string-memcpy-copy-cpp-test.yml (1)

1-1: LGTM! Clear and descriptive test identifier.

The identifier accurately reflects the security check being tested.

rules/c/security/missing-nul-cpp-string-memcpy-copy-c.yml (2)

1-163: Verify test coverage for the security rule.

Please ensure comprehensive test cases exist for:

  1. True positives: Various unsafe memcpy patterns
  2. True negatives: Safe implementations that shouldn't trigger
  3. Edge cases: Custom string classes, different string types

Let's check for existing test files:

#!/bin/bash
# Search for corresponding test files
fd -g "*missing-nul-cpp-string-memcpy*test*"

# Check for example code in test files
rg -l "memcpy.*c_str" test/

Would you like help creating comprehensive test cases for this rule?


13-158: Consider additional string handling scenarios.

The current patterns might miss or incorrectly flag:

  1. std::string_view conversions to c_str()
  2. Custom string classes with similar methods
  3. Safe implementations using sizeof() with character arrays

Let's check for potential false positives:

rules/cpp/security/missing-nul-cpp-string-memcpy-copy-cpp.yml (3)

121-158: ⚠️ Potential issue

Review the specific length variable pattern.

The pattern at line 157 uses very specific variable names (len_001$|len_002$|len_003$). This seems overly restrictive and might miss valid cases.

#!/bin/bash
# Check for length variable naming patterns in the codebase
rg -g '*.{cpp,h,hpp}' '\blen_\d+\b' -A 2

Consider using a more generic pattern:

-                regex: "^len_001$|len_002$|len_003$"
+                regex: "^len_\d+$"

1-163: Verify rule coverage with common string handling patterns.

Please ensure the rule catches these common patterns:

  1. Direct string usage: memcpy(dst, str.c_str(), str.length())
  2. Temporary strings: memcpy(dst, std::string("test").c_str(), 4)
  3. String view usage: memcpy(dst, sv.data(), sv.length())
#!/bin/bash
# Search for various memcpy patterns with strings in the codebase
echo "Checking direct string usage..."
ast-grep --pattern 'memcpy($DST, $STR.c_str(), $STR.length())'

echo "Checking temporary string usage..."
ast-grep --pattern 'memcpy($DST, std::string($STR).c_str(), $LEN)'

echo "Checking string_view usage..."
ast-grep --pattern 'memcpy($DST, $STR.data(), $STR.length())'

14-67: 🛠️ Refactor suggestion

Enhance match_identifier pattern to handle string::length().

The current pattern only catches strlen() calls. Consider adding a pattern to catch std::string::length() usage as well, as it's a common C++ idiom.

tests/__snapshots__/missing-nul-cpp-string-memcpy-copy-c-snapshot.yml (4)

1-2: LGTM: Clear snapshot identifier.

The snapshot file is properly identified for the C-specific rule testing memcpy operations with C++ strings.


1-316: Verify test coverage for secure patterns.

While these test cases appear to demonstrate unsafe patterns, we should ensure there are corresponding positive test cases that show the secure way to handle string copying (i.e., including space for the NUL terminator).

Let's check for positive test cases:

#!/bin/bash
# Search for test cases that might include NUL terminator handling
echo "Searching for positive test cases..."
rg -A 5 "memcpy.*\+ 1\)" tests/

113-206: Verify the length variables' initialization.

The test cases using len_001 and len_002 need verification of how these variables are initialized. If they don't include space for the NUL terminator, they could lead to the same security issues.

Let's check the variable definitions:

#!/bin/bash
# Search for len_001 and len_002 variable definitions
echo "Searching for length variable definitions..."
rg "len_00[12] = " -A 2

3-57: Potential security risk: Length calculation doesn't account for NUL terminator.

The test case memcpy(to, from.c_str(), from.length()) demonstrates a common security vulnerability. Using from.length() excludes the NUL terminator, which could lead to:

  1. Buffer overflow if the destination buffer size matches the string length
  2. String truncation if the code assumes NUL termination

Let's confirm if this is intended as a negative test case:

✅ Verification successful

The previous command failed, so let's try a more targeted search to understand if this is a test case demonstrating a security issue or if we need to flag it as a real concern.


The output shows this is part of security rules and has corresponding test files. Let's examine the rule definition and test case to confirm if this is intentionally demonstrating a vulnerability pattern.


Security rule verification confirmed: NUL terminator omission is a valid concern

This is a snapshot of a security rule test case that correctly identifies a potential buffer overflow vulnerability. The rule definition and test file confirm that:

  • The pattern memcpy(to, from.c_str(), from.length()) is explicitly listed as an invalid case
  • Valid cases require adding 1 to the length to include the NUL terminator
  • The rule is designed to detect and prevent out-of-bounds reads (CWE-125)
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for the corresponding test file and rule definition
echo "Searching for related test and rule files..."
fd -e yml -e yaml . | rg -l "missing-nul-cpp-string-memcpy-copy"

Length of output: 291


Script:

#!/bin/bash
# Let's try a different approach to find the test case and rule definition
# First, let's check if this is a snapshot file and find its corresponding test
rg -l "memcpy.*from\.c_str.*from\.length" --type yaml

# Also search for any rule definitions related to this pattern
rg -l "missing-nul|cpp.*string.*memcpy" --type yaml

Length of output: 737


Script:

#!/bin/bash
# Let's check the rule definition to confirm this is indeed a security rule
cat rules/cpp/security/missing-nul-cpp-string-memcpy-copy-cpp.yml

# Also check the test file to see if this is marked as a negative test case
cat tests/cpp/missing-nul-cpp-string-memcpy-copy-cpp-test.yml

Length of output: 5550

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant