paiml/batuta

Batuta 🎵

Orchestration framework for converting ANY project (Python, C/C++, Shell) to modern, first-principles Rust

🔒 Quality Standards

Batuta enforces rigorous quality standards:

  • ✅ 529 total tests (487 unit + 36 integration + 6 benchmarks)
  • 🚀 Coverage target: 90% minimum, 95% preferred - approaching target
  • ✅ Core modules: 90-100% coverage (all converters, plugin, parf, backend, tools, types, report) - TARGET MET
  • ✅ Mutation testing validates test quality (100% on converters)
  • ✅ Zero defects tolerance via Certeza validation
  • ✅ Performance benchmarks (sub-nanosecond backend selection)
  • ✅ Security audits (0 vulnerabilities)

Coverage Breakdown:

  • Config module: 100% coverage
  • Analyzer module: 82.76% coverage
  • Types module: ~95% coverage
  • Report module: ~95% coverage
  • Backend module: ~95% coverage
  • Tools module: ~95% coverage
  • ML Converters (NumPy, sklearn, PyTorch): ~90-95% coverage
  • Plugin architecture: ~90% coverage
  • PARF analyzer: ~90% coverage
  • CLI (main.rs): 0% unit coverage (exercised by the 36 integration tests instead)

Quality Validation:

# Run certeza quality checks before committing
cd ../certeza && cargo run -- check ../Batuta

See IMPLEMENTATION.md for full quality metrics and improvement plans.


Batuta orchestrates 9 Pragmatic AI Labs transpiler and foundation library tools to enable semantic-preserving conversion of legacy codebases to high-performance Rust, complete with GPU acceleration, SIMD optimization, and ML inference capabilities.

🚀 Quick Start

# Install Batuta
cargo install batuta

# Analyze your project
batuta analyze --languages --dependencies --tdg

# Convert to Rust (coming soon)
batuta transpile --incremental --cache

# Optimize with GPU/SIMD (coming soon)
batuta optimize --enable-gpu --profile aggressive

# Validate equivalence (coming soon)
batuta validate --trace-syscalls --benchmark

# Build final binary (coming soon)
batuta build --release

📖 Documentation

Read The Batuta Book - Comprehensive guide covering:

  • Philosophy and core principles (Toyota Way applied to code migration)
  • The 5-phase workflow (Analysis → Transpilation → Optimization → Validation → Deployment)
  • Tool ecosystem deep-dives (all 9 Pragmatic AI Labs tools)
  • Practical examples and case studies
  • Configuration reference and best practices

🎯 What is Batuta?

Batuta is named after the conductor's baton – it orchestrates multiple specialized tools to convert legacy code to Rust while maintaining semantic equivalence. Unlike simple transpilers, Batuta:

  • Preserves semantics through IR-based analysis and validation
  • Optimizes automatically with SIMD/GPU acceleration via Trueno
  • Provides gradual migration through Ruchy scripting language
  • Applies Toyota Way principles (Muda, Jidoka, Kaizen) for quality

🧩 Architecture

Batuta orchestrates 9 core components from Pragmatic AI Labs:

Transpilers

  • Decy - C/C++ → Rust with ownership inference
  • Depyler - Python → Rust with type inference
  • Bashrs - Shell scripts → Rust CLI

Foundation Libraries

  • Trueno - Multi-target compute (CPU SIMD, GPU, WASM)
  • Aprender - First-principles ML in Rust
  • Realizar - ML inference runtime

Quality & Support Tools

  • Ruchy - Rust-oriented scripting for gradual migration
  • PMAT - Quality analysis & roadmap generation
  • Renacer - Syscall tracing for validation

📊 Commands

batuta analyze

Analyze your project to understand languages, dependencies, and code quality.

# Full analysis
batuta analyze --languages --dependencies --tdg

# Just detect languages
batuta analyze --languages

# Calculate TDG score only
batuta analyze --tdg

Output includes:

  • Language breakdown with line counts and percentages
  • Primary language detection
  • Transpiler recommendations
  • Dependency manager detection (pip, Cargo, npm, etc.)
  • Package counts per dependency file
  • TDG quality score (0-100) with letter grade
  • ML framework detection
  • Next steps guidance
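As a rough illustration of how a language breakdown and transpiler recommendation could be computed, here is a minimal sketch. It is not Batuta's analyzer: the extension table, the `Language` enum, and the function names `language_of`, `recommended_transpiler`, and `breakdown` are assumptions made for this example. Only the language-to-transpiler pairing comes from the Architecture section above.

```rust
use std::collections::BTreeMap;

/// Source languages named in this README, plus a catch-all.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Language {
    Python,
    C,
    Cpp,
    Shell,
    Other,
}

/// Hypothetical extension table; a real analyzer may use other signals.
pub fn language_of(file_name: &str) -> Language {
    match file_name.rsplit_once('.').map(|(_, ext)| ext) {
        Some("py") => Language::Python,
        Some("c") | Some("h") => Language::C,
        Some("cpp") | Some("cc") | Some("hpp") => Language::Cpp,
        Some("sh") | Some("bash") => Language::Shell,
        _ => Language::Other,
    }
}

/// Transpiler pairing as listed in the Architecture section.
pub fn recommended_transpiler(lang: Language) -> Option<&'static str> {
    match lang {
        Language::Python => Some("Depyler"),
        Language::C | Language::Cpp => Some("Decy"),
        Language::Shell => Some("Bashrs"),
        Language::Other => None,
    }
}

/// Sums line counts per language and returns (language, lines, percent),
/// sorted so the language with the most lines comes first.
pub fn breakdown(files: &[(&str, usize)]) -> Vec<(Language, usize, f64)> {
    let mut totals: BTreeMap<Language, usize> = BTreeMap::new();
    for (name, lines) in files {
        *totals.entry(language_of(name)).or_insert(0) += lines;
    }
    let total: usize = totals.values().sum();
    let mut rows: Vec<_> = totals
        .into_iter()
        .map(|(lang, lines)| {
            let pct = if total == 0 {
                0.0
            } else {
                100.0 * lines as f64 / total as f64
            };
            (lang, lines, pct)
        })
        .collect();
    rows.sort_by(|a, b| b.1.cmp(&a.1));
    rows
}
```

Calling `breakdown` with `(file name, line count)` pairs and passing the first row's language to `recommended_transpiler` gives the primary language and a suggested tool. Counting lines by extension is the simplest possible heuristic and is used here only to keep the sketch short.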

batuta init (Coming Soon)

Initialize a Batuta project and set up conversion configuration.

batuta init --source ./my-python-app --output ./my-rust-app

batuta transpile (Coming Soon)

Convert source code to Rust with incremental compilation and caching.

# Basic transpilation
batuta transpile

# Incremental mode with caching
batuta transpile --incremental --cache

# Specific modules only
batuta transpile --modules auth,api,db

# Generate Ruchy for gradual migration
batuta transpile --ruchy --repl

batuta optimize (Coming Soon)

Apply performance optimizations with GPU/SIMD acceleration.

# Balanced optimization (default)
batuta optimize

# Aggressive optimization
batuta optimize --profile aggressive --enable-gpu

# Custom GPU threshold
batuta optimize --enable-gpu --gpu-threshold 1000

Optimization profiles:

  • fast - Quick compilation, basic optimizations
  • balanced - Default, good compilation/performance trade-off
  • aggressive - Maximum performance, slower compilation
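To make the effect of `--enable-gpu` and `--gpu-threshold` concrete, the sketch below routes an operation to a backend by element count. It is a plain threshold rule, not the Mixture-of-Experts routing mentioned elsewhere in this README, and it is not Batuta's or Trueno's API. `Backend`, `OptimizeOptions`, `select_backend`, and the `SIMD_MIN_ELEMENTS` cut-off are all hypothetical names and values chosen for illustration.

```rust
/// Compute targets in the spirit of Trueno's list (WASM omitted for brevity).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Backend {
    Scalar,
    Simd,
    Gpu,
}

/// Hypothetical mirror of the `--enable-gpu` and `--gpu-threshold` flags.
#[derive(Debug, Clone, Copy)]
pub struct OptimizeOptions {
    pub enable_gpu: bool,
    pub gpu_threshold: usize,
}

/// Assumed cut-off below which vectorizing is not worthwhile (illustrative).
const SIMD_MIN_ELEMENTS: usize = 8;

/// Picks the GPU only when it is enabled and the workload reaches the
/// threshold; otherwise falls back to SIMD, then to scalar code.
pub fn select_backend(elements: usize, opts: OptimizeOptions) -> Backend {
    if opts.enable_gpu && elements >= opts.gpu_threshold {
        Backend::Gpu
    } else if elements >= SIMD_MIN_ELEMENTS {
        Backend::Simd
    } else {
        Backend::Scalar
    }
}
```

With `gpu_threshold: 1000`, a 999-element operation stays on SIMD and a 1000-element one moves to the GPU. The idea behind a threshold is that small workloads rarely repay the cost of transferring data to the device.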

batuta validate (Coming Soon)

Verify semantic equivalence between original and transpiled code.

# Full validation suite
batuta validate --trace-syscalls --diff-output --run-original-tests --benchmark

# Quick syscall validation
batuta validate --trace-syscalls
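One way to picture syscall-based validation is as a comparison of two recorded traces. The sketch below assumes each trace has already been reduced to an ordered list of syscall names and treats any difference as a divergence. This is a deliberate simplification: it is not how Renacer or Batuta work internally, and real traces carry arguments and runtime noise that a practical comparison would need to filter. `TraceDiff` and `compare_traces` are hypothetical names.

```rust
/// Result of comparing two syscall traces.
#[derive(Debug, PartialEq, Eq)]
pub enum TraceDiff {
    Equivalent,
    /// First position where the traces disagree, or where one ends early.
    DivergesAt {
        index: usize,
        original: Option<String>,
        transpiled: Option<String>,
    },
}

/// Walks both traces in lockstep and reports the first mismatch.
pub fn compare_traces(original: &[&str], transpiled: &[&str]) -> TraceDiff {
    let len = original.len().max(transpiled.len());
    for index in 0..len {
        let a = original.get(index).copied();
        let b = transpiled.get(index).copied();
        if a != b {
            return TraceDiff::DivergesAt {
                index,
                original: a.map(str::to_owned),
                transpiled: b.map(str::to_owned),
            };
        }
    }
    TraceDiff::Equivalent
}
```

Reporting the first divergent index, rather than a bare pass/fail, gives a starting point for locating the behavioral difference.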

batuta build (Coming Soon)

Build optimized Rust binaries with cross-compilation support.

# Release build
batuta build --release

# Cross-compile
batuta build --target x86_64-unknown-linux-musl

# WebAssembly
batuta build --wasm

batuta report (Coming Soon)

Generate comprehensive migration reports.

# HTML report (default)
batuta report

# Markdown for documentation
batuta report --format markdown --output MIGRATION.md

# JSON for CI/CD
batuta report --format json --output report.json

🏗️ 5-Phase Workflow

Batuta implements a 5-phase Kanban workflow based on Toyota Way principles:

Phase 1: Analysis

  • Detect project languages and structure
  • Calculate technical debt grade (TDG)
  • Identify dependencies and frameworks
  • Recommend transpilation strategy

Phase 2: Transpilation

  • Convert code to Rust/Ruchy using appropriate transpiler
  • Preserve semantics through IR analysis
  • Generate human-readable output
  • Support incremental compilation

Phase 3: Optimization

  • Apply SIMD vectorization (via Trueno)
  • Enable GPU acceleration for compute-heavy code
  • Optimize memory layout
  • Select backends via Mixture-of-Experts routing

Phase 4: Validation

  • Trace syscalls to verify equivalence (via Renacer)
  • Run original test suite
  • Compare outputs and performance
  • Generate diff reports

Phase 5: Deployment

  • Build optimized binaries
  • Cross-compile for target platforms
  • Package for distribution
  • Generate migration documentation
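The five phases form a strict sequence, which can be modeled as a small state machine. The following is a minimal sketch, not Batuta's orchestrator: the `Phase` enum mirrors the phase names above, while `next` and `run_pipeline` are hypothetical helpers. Stopping at the first failing phase is an assumption of this sketch, chosen to match the "quality checks at each phase" idea.

```rust
/// The five workflow phases, in execution order.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Phase {
    Analysis,
    Transpilation,
    Optimization,
    Validation,
    Deployment,
}

impl Phase {
    /// The phase that follows this one, or `None` after Deployment.
    pub fn next(self) -> Option<Phase> {
        match self {
            Phase::Analysis => Some(Phase::Transpilation),
            Phase::Transpilation => Some(Phase::Optimization),
            Phase::Optimization => Some(Phase::Validation),
            Phase::Validation => Some(Phase::Deployment),
            Phase::Deployment => None,
        }
    }
}

/// Runs the phases in order and stops at the first failure,
/// returning the phase that failed together with its error message.
pub fn run_pipeline<F>(mut run_phase: F) -> Result<(), (Phase, String)>
where
    F: FnMut(Phase) -> Result<(), String>,
{
    let mut current = Some(Phase::Analysis);
    while let Some(phase) = current {
        run_phase(phase).map_err(|e| (phase, e))?;
        current = phase.next();
    }
    Ok(())
}
```

A caller passes a closure that performs the work for each phase. If the closure returns an error during Validation, Deployment never runs and the caller learns which phase halted the pipeline.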

🎓 Toyota Way Principles

Batuta applies Lean Manufacturing principles to code migration:

Muda (Waste Elimination)

  • StaticFixer integration - Eliminate duplicate static analysis (~40% reduction)
  • PMAT adaptive analysis - Focus on critical code, skip boilerplate
  • Decy diagnostics - Clear, actionable error messages reduce confusion

Jidoka (Built-in Quality)

  • Ruchy strictness levels - Gradual quality at migration boundaries
  • Pipeline validation - Quality checks at each phase
  • Semantic equivalence - Automated verification via syscall tracing

Kaizen (Continuous Improvement)

  • MoE optimization - Continuous performance tuning
  • Incremental features - Deliver value progressively
  • Feedback loops - Learn from each migration

Heijunka (Level Scheduling)

  • Batuta orchestrator - Balanced load across transpilers
  • Parallel processing - Efficient resource utilization

Kanban (Visual Workflow)

  • 5-phase tracking - Clear stage visibility
  • Dependency management - Automatic task ordering

Andon (Problem Visualization)

  • Renacer integration - Runtime behavior analysis
  • TDG scoring - Quality visibility

📈 Example: Python ML Project

# 1. Analyze the project
$ batuta analyze --languages --dependencies --tdg

📊 Analysis Results
==================================================
Primary language: Python
Total files: 127
Total lines: 8,432

Dependencies:
  • pip (42 packages)
    File: "./requirements.txt"
  • ℹ ML frameworks detected - consider Aprender/Realizar for ML code

Quality Score:
  • TDG Score: 73.2/100 (B)

Recommended transpiler: Depyler (Python → Rust)

# 2. Transpile to Rust (coming soon)
$ batuta transpile --incremental

🔄 Transpiling with Depyler...
  ✓ Converted 127 files (3,891 warnings, 42 errors addressed)
  ✓ NumPy → Trueno: 23 operations
  ✓ sklearn → Aprender: 5 models
  ✓ PyTorch → Realizar: 2 inference pipelines

# 3. Optimize (coming soon)
$ batuta optimize --enable-gpu --profile aggressive

⚡ Optimizing...
  ✓ SIMD vectorization: 234 loops optimized
  ✓ GPU dispatch: 12 operations (threshold: 500 elements)
  ✓ Memory layout: 18 structs optimized

# 4. Validate (coming soon)
$ batuta validate --trace-syscalls --benchmark

✅ Validation passed!
  ✓ Syscall equivalence: 100%
  ✓ Output identical: ✓
  ✓ Performance: 4.2x faster, 62% less memory

🛠️ Development Status

Current Version: 0.1.0 (Alpha)

  • ✅ Phase 1: Analysis - Complete

    • ✅ Language detection
    • ✅ Dependency analysis
    • ✅ TDG scoring
    • ✅ Transpiler recommendations
  • 🚧 Phase 2: Core Orchestration - In Progress

    • ⏳ CLI scaffolding (complete)
    • ⏳ Transpilation engine
    • ⏳ 5-phase workflow
    • ⏳ PMAT integration
  • 📋 Phase 3: Advanced Pipelines - Planned

    • 📋 NumPy → Trueno
    • 📋 sklearn → Aprender
    • 📋 PyTorch → Realizar
  • 📋 Phase 4: Enterprise Features - Future

    • 📋 Renacer tracing
    • 📋 PARF reference finder
See roadmap.yaml for complete ticket breakdown (12 tickets, 572 hours).

🤝 Contributing

Batuta is part of the Pragmatic AI Labs ecosystem. Contributions are welcome!

# Clone and build
git clone https://github.com/paiml/Batuta.git
cd Batuta
cargo build --release

# Run tests
cargo test

# Install locally
cargo install --path .

📄 License

MIT License - see LICENSE for details.

🔗 Related Projects

  • Decy - C/C++ to Rust transpiler
  • Depyler - Python to Rust transpiler
  • Trueno - Multi-target compute library
  • PMAT - Quality analysis toolkit

🙏 Acknowledgments

Batuta applies principles from:

  • Toyota Production System - Muda, Jidoka, Kaizen, Heijunka, Kanban, Andon
  • Lean Software Development - Value stream optimization
  • First Principles Thinking - Rebuild from fundamental truths

Batuta - Because every great orchestra needs a conductor. 🎵
