Skip to content

paiml/compiled-rust-benchmarking

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

6 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Compiled Rust Binary Optimization Benchmarking

Version: 1.0.0 (All Phases Complete) Status: ✅ Production Ready Quality Standard: EXTREME TDD - Grade A

Rust Version License: MIT TDG Grade Tests Coverage


Executive Summary

Complete, production-ready scientific test harness for benchmarking Rust binary optimization techniques. Demonstrates 5-51x performance improvements and 91.7% binary size reductions through systematic optimization with full statistical validation.

Key Results

  • 🚀 51.33x maximum speedup (quicksort with lto-fat)
  • 15.06x average speedup (lto-fat across all workloads)
  • 📦 91.7% binary size reduction (size-ultra profile)
  • 📊 Statistical rigor: ANOVA, t-tests, confidence intervals
  • 100% success rate: 150/150 jobs completed, 580 measurements

Optimization Strategies & Results

Optimization Speedup

Figure: Comprehensive optimization results showing 15× average speedup (lto-fat), SIMD acceleration with Trueno (2-7× additional boost), and six proven optimization strategies validated across 10 workloads.


Detailed Analysis

Benchmark Results

Figure: Statistical analysis showing (A) Pareto frontier for speed vs size tradeoffs, (B) Top profile rankings with confidence intervals, (C) Maximum speedup by workload type, and (D) Performance heatmap across benchmarks and profiles.


Quick Start

Tiered Testing (certeza-style) 🚀

This project integrates certeza, trueno, and ruchy-docker best practices for sustainable quality:

# Tier 1: ON-SAVE (sub-second feedback)
make tier1   # Quick checks, unit tests, format - perfect for rapid iteration

# Tier 2: ON-COMMIT (1-5 minutes)
make tier2   # Full tests, coverage, property tests - run before committing

# Tier 3: ON-MERGE (hours)
make tier3   # Mutation testing, exhaustive validation - CI/CD only

# See all options
make help

Learn more: See INTEGRATION.md for comprehensive integration guide.

View Results Immediately

# Text-based visualizations (no setup required)
python3 text_visualizations.py

# Statistical analysis
cargo run --bin analyze-results

# Quick demo (6 jobs, ~10 seconds)
cargo run --bin pathfinder-demo

Run Full Study

# Execute all 150 pathfinder jobs (~8 minutes)
cargo run --bin full-pathfinder-execution

# Output: pathfinder_results.json with 580 measurements

Generate Visualizations

# One-time setup
python3 -m venv venv
source venv/bin/activate
pip install pandas matplotlib numpy

# Generate publication-ready plots (300 DPI)
python3 generate_visualizations.py

Key Findings

Best Overall Profile: lto-fat

  • 15.06x average speedup across all workloads
  • 53.4% smaller binaries than baseline
  • Winner or top-3 for most workload types

Workload-Specific Recommendations

Workload Type Best Profile Speedup
Memory random access lto-fat 51.33x
CPU iterative perf-ultra 25.81x
Memory cache-sensitive opt-s 22.64x
Data structures standard-release 11.69x
Serialization lto-fat 11.88x
String processing perf-ultra 11.04x
CPU recursive opt-s 4.32x
I/O bound size-z-native 1.99x

Size-Constrained Systems: size-ultra

  • 2.16x speedup (vs baseline)
  • 314 KB binary (91.7% size reduction)
  • Perfect for embedded/mobile applications

Statistical Validation

ANOVA: Workload Type Matters

  • F-statistic: 19.87 (highly significant, p < 0.001)
  • η² = 0.986: Workload type explains 98.6% of variance
  • Conclusion: Memory-bound workloads benefit far more than CPU-bound

Profile Comparisons

  • Top 5 profiles achieve 13-15x average speedup
  • No statistically significant difference overall (overlapping 95% CIs)
  • Differences emerge at workload-specific level

Measurement Quality

  • 88.7% stable measurements (CV < 10%)
  • CPU-bound benchmarks: 0.3-2% coefficient of variation
  • Memory-bound benchmarks: 1-5% CV
  • Excellent reproducibility across all non-I/O workloads

Project Structure

compiled-rust-benchmarking/
├── crates/
│   ├── analysis/          # Statistical analysis (32 tests)
│   ├── harness/           # Build, measure, execute (76 tests)
│   └── stats/             # Statistical utilities (8 tests)
├── benchmarks/            # 10 diverse benchmarks (49 tests)
│   ├── ackermann/         # CPU-bound recursive
│   ├── fibonacci/         # CPU-bound recursive
│   ├── prime-sieve/       # CPU-bound iterative
│   ├── matrix-mult/       # Memory-bound cache-sensitive
│   ├── quicksort/         # Memory-bound random access
│   ├── hashmap-ops/       # Data structure operations
│   ├── btreemap-ops/      # Data structure operations
│   ├── string-parse/      # String processing
│   ├── json-parse/        # Serialization
│   └── file-io/           # I/O-bound
├── visualizations/        # Generated plots (PNG, 300 DPI)
├── docs/                  # 12 comprehensive documentation files
│   ├── COMPLETE_PROJECT_SUMMARY.md
│   ├── FULL_PATHFINDER_RESULTS.md
│   ├── PHASE_4_SUMMARY.md
│   ├── EXECUTION_GUIDE.md
│   └── VISUALIZATION_GUIDE.md
├── *.csv                  # Visualization data exports
├── pathfinder_results.json    # 580 measurements
└── binary_sizes_fibonacci.json # Binary size data

Tools Provided

Execution Tools (8 binaries)

  1. pathfinder-demo - Quick validation (6 jobs)
  2. full-pathfinder-execution - Complete study (150 jobs)
  3. multi-benchmark-study - Cross-benchmark comparison (30 jobs)
  4. analyze-results - Statistical analysis
  5. show-build-matrix - Display 800-job matrix
  6. generate-configs - Export 80 configurations
  7. run-pathfinder - Pathfinder overview

Analysis Tools

  • text_visualizations.py - Terminal-based charts (no dependencies)
  • generate_visualizations.py - Publication-quality plots
  • export_visualization_data.py - CSV export
  • analyze_speedups.py - Quick speedup analysis

Quality Metrics

Metric Value Status
Total Tests 165 ✅ All passing
Test Coverage ~87% ✅ Exceeds 85% target
Clippy Warnings 0 ✅ Clean
Formatting 0 violations ✅ Clean
SATD 0 ✅ No debt
Dead Code 0% ✅ Clean
TDG Grade A ✅ 85+ points
Job Success Rate 100% ✅ 150/150
Measurement Stability 88.7% ✅ CV < 10%

Phases Complete

Phase Description Status Tests Achievement
0 Quality Infrastructure 13 PMAT + TDG
1 Benchmark Suite 58 10 benchmarks
2 Build Infrastructure 32 800-job matrix
2.5 Pathfinder 26 15 profiles
2.6 Validation - 30 tests
3 Execution - 51.33x speedup
4 Statistical Analysis 32 ANOVA, CIs
5 Visualizations - Text + graphical

All objectives achieved


Documentation

Getting Started

  • README.md - This file
  • EXECUTION_GUIDE.md - How to run studies
  • VISUALIZATION_GUIDE.md - How to create plots

Results

  • FULL_PATHFINDER_RESULTS.md - Complete analysis (420 lines)
  • MULTI_BENCHMARK_RESULTS.md - Cross-benchmark insights (285 lines)
  • PHASE_4_SUMMARY.md - Statistical analysis (550 lines)

Technical

  • PHASE_4_DESIGN.md - Statistical analysis design (380 lines)
  • PROJECT_STATUS.md - Development history (380 lines)

Summaries

  • COMPLETE_PROJECT_SUMMARY.md - Final comprehensive summary
  • FINAL_PROJECT_STATUS.md - Production readiness checklist

Total Documentation: ~3,900 lines across 12 files


Toyota Way Principles

This project embodies the Toyota Production System:

Genchi Genbutsu (現地現物) - Go and See

  • 580 real measurements collected
  • Binary sizes measured, not estimated
  • 51.33x max speedup discovered empirically

Muda (無駄) - Waste Elimination

  • Pathfinder reduced validation by 81% (150 vs 800 jobs)
  • Adaptive iterations saved 22% of execution time
  • Completed in 8 minutes (vs estimated 1-3 hours)

Kaizen (改善) - Continuous Improvement

  • Iterative optimization: 1x → 15x → 51x
  • Statistical rigor ensures improvements are real
  • CIs quantify uncertainty for better decisions

Jidoka (自働化) - Built-in Quality

  • 165 tests all passing
  • EXTREME TDD throughout all phases
  • Automatic stability detection (CV < 10%)

Muri (無理) - No Overburden

  • Efficient resource usage
  • Clear progress indicators and ETAs
  • Text visualizations work immediately (no setup)

Use Cases

For Rust Developers

Apply to your projects:

  1. Identify your workload type (CPU/Memory/I/O)
  2. Check recommendations table above
  3. Apply optimal profile to Cargo.toml
  4. Measure production impact

Example for memory-bound random access:

[profile.release]
inherits = "release"
opt-level = 3
lto = "fat"
codegen-units = 16

For Researchers

Extend the study:

  • Run full 800-job matrix (all 80 configurations)
  • Implement Bayesian analysis (factor importance)
  • Measure compile times
  • Add hardware profiling

Data available:

  • pathfinder_results.json - 580 measurements
  • 4 CSV files for visualization
  • Binary size data for 15 profiles

For Performance Engineers

Benchmark methodology:

  • Adaptive iteration (3-10 runs based on stability)
  • Structured output parsing
  • Statistical significance testing
  • Comprehensive documentation

Related Projects


Citation

If you use this infrastructure in research, please cite:

Rust Binary Optimization Benchmarking Infrastructure (2025)
Scientific test harness demonstrating 5-51x performance improvements
through systematic optimization with statistical validation.

License

MIT - see LICENSE file.


Contributing

Contributions welcome! Please ensure:

  • All tests pass (cargo test --workspace)
  • No clippy warnings (cargo clippy --workspace --all-targets -- -D warnings)
  • Code is formatted (cargo fmt --all)
  • EXTREME TDD maintained (test-first development)

Acknowledgments

Built with EXTREME TDD methodology, achieving:

  • 165 comprehensive tests
  • 87% test coverage
  • Grade A quality (TDG)
  • Zero warnings, zero debt
  • 100% reproducible results

🏆 EXTREME TDD Maintained Throughout All Phases! 🏆


Version: 1.0.0 Status: ✅ Production Ready - All Phases Complete Last Updated: 2025-11-10


For detailed instructions, see COMPLETE_PROJECT_SUMMARY.md

About

A project to speedup compiled rust and reduce size of compiles

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 2

  •  
  •