This directory contains the gas benchmarking infrastructure for Remitwise smart contracts. The system tracks CPU and memory costs for critical operations to detect performance regressions early in development.
Gas benchmarking helps ensure that contract operations remain efficient and predictable. Each benchmark measures:
- CPU Instructions: Computational cost of operations
- Memory Usage: Storage and temporary memory allocation costs
```
benchmarks/
├── README.md         # This documentation
├── baseline.json     # Baseline measurements for all operations
├── thresholds.json   # Regression detection thresholds
└── history/          # Historical benchmark data
```
Contains baseline CPU and memory costs for each benchmarked operation. These values are updated when legitimate performance improvements are made.
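The exact schema is project-specific; a plausible shape, keyed by contract, method, and scenario to mirror the per-benchmark JSON the test suites emit (all values here are illustrative, not real measurements):

```json
{
  "remittance_split": {
    "create_remittance_schedule": {
      "single_recurring_schedule": { "cpu": 12345, "mem": 6789 }
    }
  }
}
```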
Defines regression detection thresholds as percentage increases from baseline:
- `default`: 10% increase triggers a warning for most operations
- `contract_specific`: Custom thresholds per contract
- `method_specific`: Custom thresholds per method
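The three keys above come from this README; the values and nesting below are an illustrative sketch of what such a file might look like, not the actual configuration:

```json
{
  "default": 10,
  "contract_specific": {
    "reporting": 15
  },
  "method_specific": {
    "bill_payments.bulk_cleanup_bills": 20
  }
}
```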
```sh
# Run remittance_split schedule operation benchmarks
RUST_TEST_THREADS=1 cargo test -p remittance_split --test gas_bench -- --nocapture

# Run bill_payments benchmarks
RUST_TEST_THREADS=1 cargo test -p bill_payments --test gas_bench -- --nocapture

# Run reporting aggregation benchmarks
RUST_TEST_THREADS=1 cargo test -p reporting --test gas_bench -- --nocapture

# Run all gas benchmarks across contracts
./scripts/run_all_benchmarks.sh
```

Each benchmark outputs JSON with the following structure:
```json
{
  "contract": "remittance_split",
  "method": "create_remittance_schedule",
  "scenario": "single_recurring_schedule",
  "cpu": 12345,
  "mem": 6789
}
```

For CI parsing, gas suites may also emit lines prefixed with:
- `GAS_BENCH_RESULT:`: machine-readable benchmark result with baseline/threshold metadata
- `cpu regression ...` / `mem regression ...`: assertion failures when thresholds are exceeded
This keeps --nocapture logs easy to scrape in CI while preserving normal Rust test output.
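A minimal sketch of how a CI step might scrape those prefixed lines from captured test output. Only the `GAS_BENCH_RESULT: ` prefix comes from this README; the harness and sample log below are illustrative:

```rust
/// Extract machine-readable benchmark lines from `--nocapture` test output.
/// Returns the JSON payload of every line carrying the CI prefix.
fn extract_bench_results(log: &str) -> Vec<&str> {
    log.lines()
        .filter_map(|line| line.trim().strip_prefix("GAS_BENCH_RESULT: "))
        .collect()
}

fn main() {
    // Hypothetical log mixing normal Rust test output with a bench line.
    let log = "\
running 1 test
GAS_BENCH_RESULT: {\"contract\":\"reporting\",\"method\":\"get_split\",\"cpu\":12345,\"mem\":6789}
test bench_get_split ... ok";
    let results = extract_bench_results(log);
    assert_eq!(results.len(), 1);
    assert!(results[0].starts_with('{'));
    println!("{}", results[0]);
}
```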
The remittance split contract includes comprehensive benchmarks for schedule lifecycle operations:
- `create_remittance_schedule/single_recurring_schedule`: Basic schedule creation
- `create_remittance_schedule/11th_schedule_with_existing`: Scaling with existing schedules
- `modify_remittance_schedule/single_schedule_modification`: Update an existing schedule
- `cancel_remittance_schedule/single_schedule_cancellation`: Cancel an active schedule
- `get_remittance_schedules/empty_schedules`: Query with no schedules
- `get_remittance_schedules/5_schedules_with_isolation`: Query with data isolation
- `get_remittance_schedules/50_schedules_worst_case`: Worst-case query performance
- `get_remittance_schedule/single_schedule_lookup`: Single schedule retrieval
All benchmarks include security validations:
- Authorization: Tests verify proper authentication and authorization
- Data Isolation: Ensures users can only access their own data
- Input Validation: Runs methods with valid parameters to confirm validation logic is exercised
- Edge Cases: Covers boundary conditions and error scenarios
`bill_payments/tests/gas_bench.rs` includes dedicated regression coverage for:
- `archive_paid_bills/120_paid_1_unpaid_preserved`
- `restore_bill/single_archived_owner_restore`
- `bulk_cleanup_bills/mixed_age_20_of_30_deleted`
- `batch_pay_bills/mixed_batch_50_partial_success`
Security assumptions validated in these benches:
- Archive and cleanup are maintenance operations over paid/archived data only
- Restore is owner-only
- Batch pay preserves owner isolation and deterministic partial success
- Oversized batches are rejected (`BatchTooLarge`)
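The oversized-batch guard can be sketched in plain Rust. Only the `BatchTooLarge` error name comes from this README; the limit, types, and function name are assumptions for illustration:

```rust
/// Hypothetical maximum batch size; the real limit lives in the contract.
const MAX_BATCH_SIZE: usize = 50;

#[derive(Debug, PartialEq)]
enum BillError {
    BatchTooLarge,
}

/// Reject a batch-pay request whose bill list exceeds the configured limit.
fn check_batch(bill_ids: &[u32]) -> Result<(), BillError> {
    if bill_ids.len() > MAX_BATCH_SIZE {
        return Err(BillError::BatchTooLarge);
    }
    Ok(())
}

fn main() {
    assert!(check_batch(&[1, 2, 3]).is_ok());
    let oversized: Vec<u32> = (0..60).collect();
    assert_eq!(check_batch(&oversized), Err(BillError::BatchTooLarge));
}
```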
The system automatically detects regressions by comparing current measurements against baselines:
- Green: Within threshold (no action needed)
- Yellow: Exceeds threshold but < 25% increase (review recommended)
- Red: > 25% increase (investigation required)
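The traffic-light check above can be sketched as follows. The per-operation warning threshold and the 25% Red cutoff come from this README; the function shape is an assumption:

```rust
#[derive(Debug, PartialEq)]
enum Status {
    Green,
    Yellow,
    Red,
}

/// Classify a measurement against its baseline.
/// `warn_pct` is the per-operation threshold from thresholds.json;
/// 25% is the hard "investigation required" cutoff from this README.
fn classify(baseline_cpu: u64, current_cpu: u64, warn_pct: f64) -> Status {
    let increase_pct = if current_cpu > baseline_cpu {
        (current_cpu - baseline_cpu) as f64 * 100.0 / baseline_cpu as f64
    } else {
        0.0 // improvements and no-change never regress
    };
    if increase_pct > 25.0 {
        Status::Red
    } else if increase_pct > warn_pct {
        Status::Yellow
    } else {
        Status::Green
    }
}

fn main() {
    assert_eq!(classify(10_000, 10_500, 10.0), Status::Green); // +5%
    assert_eq!(classify(10_000, 11_500, 10.0), Status::Yellow); // +15%
    assert_eq!(classify(10_000, 13_000, 10.0), Status::Red); // +30%
}
```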
The reporting contract benchmarks cover the three heavy aggregation paths identified in issue #317, each run at three data sizes (small/medium/large) to expose O(n) complexity growth.
| Scenario | Description |
|---|---|
| `no_addresses_baseline` | Addresses not configured – O(1) storage miss, returns `Missing` |
| `with_split_4_categories` | Two cross-contract calls + four-category breakdown loop |
| Scenario | Items | Windows |
|---|---|---|
| `5_periods` | 5 | 4 |
| `25_periods` | 25 | 24 |
| `50_periods` | 50 | 49 |
Pure in-contract computation; no cross-contract calls. Scales linearly with history length.
| Scenario | Goals | Bills | Policies |
|---|---|---|---|
| `small_5_items` | 5 | 5 | 5 |
| `medium_25_items` | 25 | 25 | 25 |
| `large_50_items` | 50 | 50 | 50 |
Issues nine cross-contract calls per invocation: `get_all_goals` ×2, `get_unpaid_bills` ×1, `get_active_policies` ×2, `get_split` ×1, `calculate_split` ×1, `get_all_bills_for_owner` ×1, `get_total_monthly_premium` ×1.
| Scenario | Stored reports |
|---|---|
| `5_stored_reports` | 5 |
| `25_stored_reports` | 25 |
| `50_stored_reports` | 50 |
Dual O(n) map iteration: first over `REPORTS` to find candidates, then over `to_remove` to delete them.
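The two-pass pattern can be illustrated with a std `BTreeMap` standing in for the contract's storage map (the `REPORTS`/`to_remove` names come from this README; the key/value types and cutoff logic are assumptions):

```rust
use std::collections::BTreeMap;

/// Pass 1: O(n) scan to collect keys of reports older than `cutoff`,
/// avoiding mutation while iterating. Pass 2: O(n) deletion of those keys.
/// Returns how many reports were removed.
fn cleanup_reports(reports: &mut BTreeMap<u64, u64>, cutoff: u64) -> usize {
    let to_remove: Vec<u64> = reports
        .iter()
        .filter(|(_, &created_at)| created_at < cutoff)
        .map(|(&id, _)| id)
        .collect();
    for id in &to_remove {
        reports.remove(id);
    }
    to_remove.len()
}

fn main() {
    // report id -> creation timestamp (illustrative values)
    let mut reports = BTreeMap::from([(1, 100), (2, 900), (3, 50)]);
    let removed = cleanup_reports(&mut reports, 200);
    assert_eq!(removed, 2); // reports 1 and 3 are older than the cutoff
    assert_eq!(reports.len(), 1);
}
```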
`after_25_archived` – O(1) single instance-storage key read. Used to confirm the stats endpoint stays flat regardless of archive depth.
When adding new contract methods:
- Create a benchmark test in `contracts/{contract}/tests/gas_bench.rs`
- Add a baseline entry in `baseline.json`
- Set thresholds in `thresholds.json` if non-standard
- Document security assumptions in test comments
```rust
/// Benchmark: {Operation description}
/// Security: {Security validations performed}
#[test]
fn bench_{operation_name}() {
    let env = bench_env();
    let contract_id = env.register_contract(None, YourContract);
    let client = YourContractClient::new(&env, &contract_id);

    // Setup test data
    let owner = <Address as AddressTrait>::generate(&env);

    let (cpu, mem, result) = measure(&env, || {
        client.your_method(&owner, &param1, &param2)
    });

    // Validate result
    assert!(result.is_ok());

    println!(
        r#"{{"contract":"your_contract","method":"your_method","scenario":"test_scenario","cpu":{},"mem":{}}}"#,
        cpu, mem
    );
}
```

- Consistent Environment: Use `bench_env()` for reproducible conditions
- Realistic Data: Test with representative data sizes and patterns
- Worst-Case Scenarios: Include stress tests with maximum realistic loads
- Security Validation: Always verify security assumptions in benchmarks
- Clear Naming: Use descriptive scenario names that indicate test conditions
- Benchmark results are tracked in CI/CD pipelines
- Significant regressions trigger build failures
- Historical data enables trend analysis
- Performance improvements can be validated before deployment
- Ensure `RUST_TEST_THREADS=1` for consistent execution
- Check for external factors affecting the test environment
- Verify test data setup is deterministic
- Review recent code changes for performance impacts
- Check if test scenarios still match actual usage patterns
- Validate that baseline measurements are still accurate
Some operations may have inherently higher variance:
- Iteration-heavy operations (higher CPU threshold)
- Dynamic memory allocation (higher memory threshold)
- Complex calculations (higher CPU threshold)
Update `thresholds.json` with appropriate values based on operation characteristics.