Merged
8 changes: 8 additions & 0 deletions research-papers/index.html
@@ -70,6 +70,14 @@ <h1 class="heading-2">Research Papers</h1>
</td>
<td class="description">Synthesis of Toby Ord's half-life framework with METR's exponential growth analysis. Shows that AI agents fail at a roughly constant rate per minute of task time (the half-life model), while the task length they can handle doubles about every 7 months. Projects specific reliability thresholds: reaching 90% reliability requires cutting task duration to about 1/7 of the 50%-success length, and current models complete 50-minute tasks at 50% success. Predicts month-long task automation by 2030, with practical architecture patterns for current reliability levels.</td>
</tr>
<tr>
<td>
<a href="dreamgym_report.html" target="_blank" class="paper-link">
<strong>DreamGym: Scaling Agent Learning via Experience Synthesis</strong>
</a>
</td>
<td class="description">Breakthrough framework for training AI agents through experience synthesis. Introduces a reasoning-based experience model that simulates environment dynamics, enabling scalable reinforcement learning without costly real-world interactions. Achieves 30%+ improvement on non-RL-ready tasks like WebArena using zero real environment interactions, while matching state-of-the-art on traditional benchmarks. Addresses four critical challenges: costly rollouts, limited task diversity, unreliable rewards, and infrastructure complexity.</td>
</tr>
<tr>
<td>
<a href="deepseek_ocr_report.html" target="_blank" class="paper-link">