
v0.3.0: MetaClaw Integration, CodeAgent v2, 50+ Bug Fixes


@Jiaaqiliu released this 17 Mar 13:50
· 200 commits to main since this release

What's New in v0.3.0

MetaClaw Cross-Run Learning Integration

  • New researchclaw/metaclaw_bridge/ module: skill injection, lesson-to-skill conversion, PRM quality gates, session lifecycle management
  • Pipeline failures → structured lessons → reusable skills, injected into all 23 stages
  • +18.3% pipeline robustness in controlled experiments
  • Opt-in via metaclaw_bridge.enabled: true, fully backward-compatible
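
The failure → lesson → skill flow above can be pictured with a minimal sketch. Everything here is hypothetical and for illustration only: the `Lesson` and `Skill` schemas, `lesson_to_skill`, and `inject_skills` are not the actual researchclaw/metaclaw_bridge/ API, and the `enabled` argument merely stands in for the metaclaw_bridge.enabled setting.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lesson:
    """A structured record of one pipeline failure (hypothetical schema)."""
    stage: str
    error: str
    fix: str


@dataclass(frozen=True)
class Skill:
    """A reusable hint derived from a lesson (hypothetical schema)."""
    source_stage: str
    guidance: str


def lesson_to_skill(lesson: Lesson) -> Skill:
    # Turn "what went wrong and what fixed it" into advice for future runs.
    return Skill(
        source_stage=lesson.stage,
        guidance=f"Avoid: {lesson.error}. Prefer: {lesson.fix}.",
    )


def inject_skills(prompt: str, skills: list[Skill], enabled: bool) -> str:
    # With the bridge disabled (or no skills yet) the prompt is unchanged,
    # which is what makes the feature opt-in and backward-compatible.
    if not enabled or not skills:
        return prompt
    hints = "\n".join(f"- [{s.source_stage}] {s.guidance}" for s in skills)
    return f"{prompt}\n\nLessons from earlier runs:\n{hints}"
```

In this sketch every stage prompt would pass through `inject_skills`, so a lesson learned in one stage is visible to all later ones.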

CodeAgent v2: Enhanced Code Generation

  • Enhanced Blueprint: deep implementation specs with per-file pseudocode, tensor shapes, generation order
  • Sequential File Generation: dependency-ordered with AST-based CodeMem
  • Hard Validation Gates: block identical ablations, hardcoded metrics, cross-file import errors
  • Targeted Error Repair: parses the traceback and fixes the failing code surgically instead of regenerating everything
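
As an illustration of the targeted-repair idea, the sketch below finds the innermost traceback frame that belongs to the generated project, which is the natural place to aim a fix. The function name `locate_failure` and the `project_files` parameter are assumptions made for this example, not CodeAgent's real interface.

```python
import re

# Matches the frame lines CPython prints in a traceback, e.g.:
#   File "model.py", line 12, in forward
FRAME_RE = re.compile(
    r'^\s*File "(?P<path>[^"]+)", line (?P<line>\d+)', re.MULTILINE
)


def locate_failure(traceback_text: str, project_files: set[str]):
    """Return (path, line) of the innermost frame inside the project, or None."""
    frames = [
        (m["path"], int(m["line"])) for m in FRAME_RE.finditer(traceback_text)
    ]
    # The last frames are closest to the error; skip library frames
    # until one of the project's own files appears.
    for path, line in reversed(frames):
        if path in project_files:
            return path, line
    return None
```

A repair step could then regenerate only the returned file, or only the region around the returned line, rather than the whole project.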

BenchmarkAgent & FigureAgent Improvements

  • BenchmarkAgent: domain-aware benchmarks, import validation, pretrained resize
  • FigureAgent: LLM output type safety, Paul Tol colorblind-safe palette, heatmap/ablation chart types
  • visualize.py full rewrite: academic styling, 300 DPI, 6 enhanced chart types
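
For readers unfamiliar with the Paul Tol palettes, here is a small sketch of assigning colorblind-safe colors to plot series. It assumes Tol's "bright" qualitative scheme; the release does not say which Tol scheme FigureAgent uses, and `assign_colors` is a hypothetical helper.

```python
# Paul Tol's "bright" qualitative scheme (seven colors).
TOL_BRIGHT = [
    "#4477AA",  # blue
    "#EE6677",  # red
    "#228833",  # green
    "#CCBB44",  # yellow
    "#66CCEE",  # cyan
    "#AA3377",  # purple
    "#BBBBBB",  # grey
]


def assign_colors(labels):
    """Map each series label to a palette color, cycling past seven series."""
    return {
        label: TOL_BRIGHT[i % len(TOL_BRIGHT)] for i, label in enumerate(labels)
    }
```

With matplotlib, each value can be passed as the `color=` argument of a plotting call, and `savefig(..., dpi=300)` produces output at the 300 DPI mentioned above.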

50+ Pipeline Bug Fixes (BUG-06 through BUG-51)

  • Metric direction, citation verify, CodeGen guard, condition drift, RL stability
  • BST ordering, raw metrics, bracket citations, arXiv categories
  • Docker HF permission, KD stability, ablation detection
  • references.bib fallback generation, ICML LaTeX template fix

Paper Quality Hardening (4 Rounds)

  • Post-compilation quality checks, weasel/duplicate word lint, NeurIPS checklist
  • LaTeX escaping, 7-dim AI-Scientist-style review, AI-slop detection (50+ phrase blocklist)
  • Related work depth checker, stats rigor validator, anti-boilerplate prompts
  • Cross-discipline support for 7 domains
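
A duplicate-word and phrase-blocklist lint of the kind listed above could look roughly like this. The two blocklist entries are placeholders (the release mentions 50+ phrases but does not list them), and `lint_text` is a hypothetical name.

```python
import re

# Placeholder entries only; the real blocklist is not published in these notes.
BLOCKLIST = ("delve into", "it is worth noting that")

# A word followed by whitespace and the same word again, e.g. "the the".
DUPLICATE_RE = re.compile(r"\b(\w+)\s+\1\b", re.IGNORECASE)


def lint_text(text: str) -> list[str]:
    """Return human-readable issues found in a paragraph of paper text."""
    issues = []
    for match in DUPLICATE_RE.finditer(text):
        issues.append(f"duplicate word: {match.group(1).lower()}")
    lowered = text.lower()
    for phrase in BLOCKLIST:
        if phrase in lowered:
            issues.append(f"blocklisted phrase: {phrase}")
    return issues
```

Returning a list of issues rather than raising keeps the check usable both as a hard gate and as feedback for a rewrite prompt.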

Docker Sandbox & Infrastructure

  • Network-policy-aware code generation & sandbox execution
  • Rate-limit defense for literature search APIs (OpenAlex → Semantic Scholar → arXiv cascade)
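
The cascade can be sketched as an ordered list of backends, each tried in turn when the previous one is throttled. The `RateLimited` exception, `cascade_search`, and the backend callables are assumptions for illustration; they are not the project's actual search clients.

```python
import time


class RateLimited(Exception):
    """Raised by a search backend when the remote API throttles the request."""


def cascade_search(query, backends, retries=1, backoff_s=0.0):
    """Try each (name, search_fn) pair in order and return the first success.

    Each backend gets `retries` extra attempts with exponential backoff
    before the cascade falls through to the next one.
    """
    for name, search in backends:
        for attempt in range(retries + 1):
            try:
                return name, search(query)
            except RateLimited:
                time.sleep(backoff_s * (2 ** attempt))
    raise RateLimited("all backends are rate limited")
```

Called with backends ordered as OpenAlex, Semantic Scholar, arXiv, this reproduces the fallback order described above; returning the backend name lets the caller record which source answered.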

Full Changelog: v0.2.0...v0.3.0