diff --git a/research-papers/index.html b/research-papers/index.html
index 730ec29..232e2e5 100644
--- a/research-papers/index.html
+++ b/research-papers/index.html
@@ -70,6 +70,14 @@

Research Papers

 Synthesis of Toby Ord's half-life framework with METR's exponential growth analysis. Shows that AI agents fail at a roughly constant rate per minute of task time (the half-life model) while the task length they can handle doubles every 7 months. Projects specific reliability thresholds: reaching 90% reliability requires cutting task duration to about 1/7 of the 50%-success duration, and current models complete 50-minute tasks at 50% success. Predicts month-long task automation by 2030, with practical architecture patterns for current reliability levels.
+
+
+
+ DreamGym: Scaling Agent Learning via Experience Synthesis
+
+
+ Breakthrough framework for training AI agents through experience synthesis. Introduces a reasoning-based experience model that simulates environment dynamics, enabling scalable reinforcement learning without costly real-world interactions. Achieves a 30%+ improvement on non-RL-ready tasks such as WebArena using zero real environment interactions, while matching the state of the art on traditional benchmarks. Addresses four critical challenges: costly rollouts, limited task diversity, unreliable rewards, and infrastructure complexity.
+
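The arithmetic behind the half-life summary in the hunk's context line can be sketched as follows. This is a minimal illustration, not code from the repository or from either paper: it assumes that "constant failure rate per minute" means an exponential survival curve, and the function names are hypothetical. The 50-minute half-life and 7-month doubling time are the figures quoted in the summary.

```python
import math


def success_probability(task_minutes, half_life_minutes):
    """Assumed constant-hazard model: success halves every half_life_minutes."""
    return 0.5 ** (task_minutes / half_life_minutes)


def horizon_for_reliability(reliability, half_life_minutes):
    """Longest task duration that still succeeds with the given probability."""
    return half_life_minutes * math.log(reliability) / math.log(0.5)


def projected_half_life(half_life_minutes, months_ahead, doubling_months=7.0):
    """Half-life after months_ahead, assuming it doubles every doubling_months."""
    return half_life_minutes * 2 ** (months_ahead / doubling_months)


T50 = 50.0  # minutes at 50% success, as quoted in the summary
t90 = horizon_for_reliability(0.9, T50)

print(f"90% horizon: {t90:.1f} min ({t90 / T50:.3f} of the 50% horizon)")
print(f"Half-life 7 months out: {projected_half_life(T50, 7):.0f} min")
```

Under these assumptions the 90% horizon is log(0.9)/log(0.5), about 0.152 of the 50% horizon, which the summary rounds to 1/7.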