Releases · marirs/capa-rs

This PR improves the efficiency and accuracy of feature detection and gives users more control over JSON output generation. The changes refine both the string extraction logic and the way feature data is represented. Key enhancements:

Feature Detection Improvements

Optimized Unicode String Extraction: The extract_unicode_strings function now handles UTF-16LE and UTF-16BE encodings with targeted regex patterns, improving the accuracy of string detection.
Advanced Bytes Feature Evaluation: The evaluation method in BytesFeature now uses a sliding window, so a specified byte sequence can be matched at any offset within the data being examined.
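
The two techniques above can be illustrated with a minimal, standard-library-only sketch. It is not the project's code: the function names contains_bytes and extract_utf16le_ascii are hypothetical, and the UTF-16LE scan is done by hand here instead of with the regex patterns the PR describes.

```rust
/// Sliding-window search: true if `needle` occurs anywhere in `haystack`.
/// Hypothetical helper, not the actual `BytesFeature` evaluation code.
fn contains_bytes(haystack: &[u8], needle: &[u8]) -> bool {
    if needle.is_empty() {
        return true; // `windows(0)` would panic, so handle this case first
    }
    // `windows(n)` yields every contiguous n-byte slice of the haystack.
    haystack.windows(needle.len()).any(|w| w == needle)
}

/// Collects runs of printable ASCII characters encoded as UTF-16LE
/// (printable byte followed by 0x00) that are at least `min_len` long.
/// Assumption: only the ASCII subset is of interest in this sketch.
fn extract_utf16le_ascii(buf: &[u8], min_len: usize) -> Vec<String> {
    let min_len = min_len.max(1);
    let mut found = Vec::new();
    let mut current = String::new();
    let mut i = 0;
    while i + 1 < buf.len() {
        let (lo, hi) = (buf[i], buf[i + 1]);
        if hi == 0 && (0x20..=0x7e).contains(&lo) {
            current.push(lo as char);
            i += 2; // consume one 16-bit code unit
        } else {
            if current.len() >= min_len {
                found.push(std::mem::take(&mut current));
            } else {
                current.clear();
            }
            i += 1; // step one byte so unaligned strings are still found
        }
    }
    if current.len() >= min_len {
        found.push(current);
    }
    found
}
```

For UTF-16BE the same scan would apply with the two bytes of each code unit swapped.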

JSON Output Management

Enhanced JSON Generation for Map Features: With the new -f parameter, users can filter map features by type, making the JSON output more relevant and manageable. The filter takes effect when the feature map is enabled with the -m flag, and it requires an output path given with the -o parameter.
Clean String Function: A new function sanitizes extracted strings by removing null characters and non-printable ASCII characters from the output.
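
As a rough illustration of the sanitizing step, the sketch below drops null bytes and the other ASCII control characters from a string. The name clean_string_sketch is hypothetical, and keeping non-ASCII characters untouched is an assumption; the project's function may behave differently.

```rust
/// Removes NUL and other non-printable ASCII characters (0x00-0x1F, 0x7F).
/// Hypothetical sketch; non-ASCII characters are kept as they are.
fn clean_string_sketch(input: &str) -> String {
    input
        .chars()
        // `is_ascii_control` is true for NUL, tab, newline, DEL and similar.
        .filter(|c| !c.is_ascii_control())
        .collect()
}
```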

Safety and Usability Enhancements

Boundary Checks and Error Handling: Added checks that prevent buffer over-reads and integer overflows, particularly in the detect_ascii_len function.
CLI Options Expansion: The new filter_map_features option in CliOpts gives finer control over which features are processed.
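
The kind of boundary checking described above might look like the following sketch. Both functions are hypothetical stand-ins (the real detect_ascii_len signature is not shown in these notes); they only demonstrate checked slicing and checked arithmetic in place of raw indexing.

```rust
/// Length of the printable-ASCII run starting at `offset`.
/// Returns 0 when `offset` is past the end instead of panicking.
fn ascii_len_at(buf: &[u8], offset: usize) -> usize {
    // `get(offset..)` yields None for an out-of-range offset.
    match buf.get(offset..) {
        Some(tail) => tail
            .iter()
            .take_while(|b| (0x20..=0x7e).contains(*b))
            .count(),
        None => 0,
    }
}

/// Reads a little-endian u32 at `offset`, or None if it does not fit.
fn read_u32_le(buf: &[u8], offset: usize) -> Option<u32> {
    // `checked_add` guards against integer overflow on huge offsets.
    let end = offset.checked_add(4)?;
    // `get` guards against reading past the end of the buffer.
    let b = buf.get(offset..end)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}
```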

Why This Matters

Together, these updates improve how accurately the project detects and represents data features across a broader range of encodings, and they give users more control over the output. They also provide a foundation for future development.

Testing

Test cases cover a variety of scenarios, including different encoding formats, feature types, and JSON output configurations.

15 Feb 13:31

marirs

v0.3.12

2ac0eb2

v0.3.12

Thanks again to @jorgeaduran for his contribution :)

Enhanced Feature Extraction and Output Customization in Capa CLI

This PR optimizes and extends the Capa CLI tool. It improves feature extraction, particularly for .NET files, and adds CLI options for managing output and including feature data.

Key Changes

.NET-Aware Feature Extraction

The feature extraction logic now includes conditional checks for .NET files, so that file features are extracted according to the file type. This lets the tool work with a broader range of executable formats and improves the accuracy of the analysis.

CLI Options for Output Customization

JSON Output Path (-o or --output): Users can now specify a custom path for the JSON output using the -o or --output option followed by the desired file path. This allows for greater flexibility in how and where analysis results are saved. For example, specifying -o="path_to_json" will direct the tool to save the JSON output to the specified path.
Feature Map Inclusion (-m or --map-features): With the new --map-features flag, users can include a comprehensive map of the features found during analysis in the JSON output. This is particularly useful for detailed analyses where it matters which specific features were matched. To include the feature map in the JSON output, add -m to the command line.

Implementation Details

The feature collection process now streamlines the aggregation and indexing of rule matches, reducing complexity and improving analysis performance.
Conditional logic has been added so that file features are only included for .NET files, addressing the specific analysis requirements of these files.
The new CLI options for output customization give users more control over the analysis, enabling a more tailored and detailed examination of binary files.

Usage

To utilize the new output customization features, you can specify the JSON output path and decide whether to include the feature map in the output as follows:

capa_cli -o="path_to_json" -m [other options] <file_to_analyze>

Contributors

jorgeaduran

12 Feb 16:17

marirs

v0.3.11

87400ed

v0.3.11

Thanks to @jorgeaduran for some more code optimisations.

Contributors

jorgeaduran

11 Feb 16:08

marirs

v0.3.10

bbeab68

v0.3.10

Some code optimisations

11 Feb 06:13

marirs

v0.3.9

47901db

v0.3.9

Merged Pull Request #6 from @jorgeaduran. Thanks to @jorgeaduran.
Thanks to @mnaza for his contribution refactoring the bytes ABI in fancy_regex!

Key Improvements

.NET Analysis

Analysis Issues: Addressed critical bugs in .NET analysis, ensuring more accurate and reliable outcomes.
RwLock Usage: Transitioned to parking_lot for RwLock, enhancing concurrency control throughout the codebase.

Error Handling and Feature Extraction

Error Handling: Refined error handling mechanisms, particularly during feature extraction, to provide clearer insights into processing failures.
Feature Enhancements:
    Improved internal naming conventions for class features, ensuring consistency and readability.
    Enhanced the JSON output format, making the data more accessible and easier to integrate with other tools.
    Fixed the handling of RuleFeatureType::Namespace, correcting inaccuracies in feature categorization.

Optimization and Refactoring

PE Header Parsing: Replaced carve_pe with find_embedded_pe_headers, streamlining the extraction process.
Extractor Optimization: Modified the extractor to minimize redundant reads, improving performance.
New Features: Added StringFeature and updated extract_insn_api_features to include ApiFeature split by ::, broadening the analysis scope.
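
To make the "split by ::" idea concrete, here is a small sketch that derives extra candidate names from a qualified API name. The function api_feature_names and the exact set of names it returns are assumptions for illustration; the notes do not specify what extract_insn_api_features emits.

```rust
/// Returns the full API name plus each `::`-separated part.
/// Hypothetical sketch of splitting an API name on `::`.
fn api_feature_names(api: &str) -> Vec<String> {
    let mut names = vec![api.to_string()];
    if api.contains("::") {
        names.extend(
            api.split("::")
                .filter(|part| !part.is_empty())
                .map(|part| part.to_string()),
        );
    }
    names
}
```

With this sketch, a name such as System.IO.File::Delete would yield the full name, the type part System.IO.File, and the method part Delete, while an unqualified name is returned unchanged.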

Code Quality

Number Parsing Logic: Fixed sign handling errors and introduced parse_operand_to_number for more efficient number parsing from instruction operands.
Export Name Extraction: Optimized the extraction of export names, enhancing the clarity and utility of the analysis results.
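
Sign handling when parsing operand text can be sketched as below. This is not the project's parse_operand_to_number; the name parse_operand_sketch, the accepted formats (decimal and 0x-prefixed hex with an optional sign), and the i128 return type are all assumptions chosen to keep the example simple and overflow-free.

```rust
/// Parses "42", "-42", "0x2A" or "-0x2A" into a signed number.
/// Hypothetical sketch; returns None for anything else (e.g. a register name).
fn parse_operand_sketch(operand: &str) -> Option<i128> {
    let text = operand.trim();
    // Separate the sign first so that "-0x10" is handled correctly.
    let (negative, unsigned) = match text.strip_prefix('-') {
        Some(rest) => (true, rest),
        None => (false, text.strip_prefix('+').unwrap_or(text)),
    };
    let (radix, digits) = match unsigned
        .strip_prefix("0x")
        .or_else(|| unsigned.strip_prefix("0X"))
    {
        Some(hex) => (16, hex),
        None => (10, unsigned),
    };
    // Reject stray signs or other symbols inside the digit string.
    if !digits.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    // Parse the magnitude as unsigned, then apply the sign in a wider type.
    let magnitude = u64::from_str_radix(digits, radix).ok()? as i128;
    Some(if negative { -magnitude } else { magnitude })
}
```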

Impact

These changes improve the framework's usability, accuracy, and performance: they fix known issues, add new capabilities, and prepare the ground for future enhancements.

Contributors

mnaza and jorgeaduran

27 Nov 12:36

marirs

v0.3.8

1c8c3c7

v0.3.8

Updated

14 Oct 16:13

marirs

v0.3.7

7d58e11

v0.3.7

Added new features: property read/write

09 Aug 02:26

marirs

v0.3.6

9c20195

v0.3.6

Fixes to earlier release
