Coding agent visual + runtime verification: screenshot + console errors + simulator/emulator testing #453

## Problem

The coding agent produced a Snake game that throws runtime JavaScript errors and renders a blank canvas. Without verification, this output becomes training data that teaches the model to produce broken code. A falling loss means NOTHING if the output doesn't work.

## Current State

- Code generated: looks syntactically correct
- Runtime errors: `Cannot read properties of undefined (reading 'set')` at line 345, `Cannot read properties of undefined (reading 'length')` at line 147
- Visual result: blank black canvas (no snake, no food, no overlay)
- Training capture: `captureTraining=true` saved this run as a POSITIVE example; it should be NEGATIVE

## What's Needed

### 1. Browser Console Error Capture
After the coding agent opens a file in the browser:
- Capture `console.error` and uncaught exceptions via jtag
- Inject a small error collector script before loading the page
- Return errors as structured data in the tool result
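
A minimal sketch of such a collector follows. The names `installErrorCollector`, `CapturedError`, `ConsoleLike`, and `EventTargetLike` are hypothetical, and how the script is injected through jtag is not modelled here. It assumes the collector runs before any page script, wraps `console.error`, and listens for the standard `error` and `unhandledrejection` events.

```typescript
// Structured record that would be returned in the tool result (hypothetical shape).
interface CapturedError {
  kind: "console.error" | "uncaught" | "unhandledrejection";
  message: string;
  line?: number;
}

// Minimal structural types so the sketch can also run outside a browser.
interface ConsoleLike {
  error: (...args: unknown[]) => void;
}
interface EventTargetLike {
  addEventListener(type: string, listener: (event: any) => void): void;
}

// Install before any page script runs; returns the live list of captured errors.
function installErrorCollector(target: EventTargetLike, con: ConsoleLike): CapturedError[] {
  const errors: CapturedError[] = [];
  const originalError = con.error.bind(con);

  // Wrap console.error so explicit error logs are recorded and still printed.
  con.error = (...args: unknown[]) => {
    errors.push({ kind: "console.error", message: args.map(String).join(" ") });
    originalError(...args);
  };

  // Uncaught exceptions arrive as "error" events carrying a message and line number.
  target.addEventListener("error", (event) => {
    errors.push({ kind: "uncaught", message: String(event.message), line: event.lineno });
  });

  // Rejected promises with no handler arrive as "unhandledrejection" events.
  target.addEventListener("unhandledrejection", (event) => {
    errors.push({ kind: "unhandledrejection", message: String(event.reason) });
  });

  return errors;
}
```

In a page this would presumably be called with the global `window` and `console` objects. An empty list after the page has loaded and run for a while is the "no console errors" signal that grading relies on.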

### 2. Screenshot Verification
- `interface/screenshot` of the rendered page
- VisionDescriptionService describes what it sees
- Compare description against the prompt requirements
- "Blank canvas" ≠ "Snake game with score display"
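
The comparison step could start as simply as the sketch below. It assumes the vision description is plain text and that the prompt requirements have already been reduced to a list of required elements. `checkDescription`, `VisualCheck`, and the `BLANK_MARKERS` list are hypothetical, and keyword matching is a stand-in for whatever comparison the real pipeline uses.

```typescript
// Result of comparing a screenshot description against the prompt requirements.
interface VisualCheck {
  matches: boolean;
  missing: string[];
  looksBlank: boolean;
}

// Phrases suggesting nothing was rendered (hypothetical, deliberately short list).
const BLANK_MARKERS = ["blank", "empty", "nothing visible"];

function checkDescription(description: string, requiredElements: string[]): VisualCheck {
  const text = description.toLowerCase();

  // A description that reports a blank page fails regardless of other words in it.
  const looksBlank = BLANK_MARKERS.some((marker) => text.includes(marker));

  // Every required element must be mentioned somewhere in the description.
  const missing = requiredElements.filter((element) => !text.includes(element.toLowerCase()));

  return { matches: !looksBlank && missing.length === 0, missing, looksBlank };
}
```

The blank check is kept separate because plain keyword matching is fooled by negations: a description such as "blank black canvas (no snake, no food)" mentions both "snake" and "food". A model-based comparison would likely be more robust than this sketch.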

### 3. Automated Grading
- Runtime errors = automatic failure
- Visual mismatch = failure (blank canvas where a game should be)
- Passing = no errors + visual matches description
- Failed examples become negative training data (or are discarded)
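
The grading rules above can be sketched as a single function. The `grade` function and its input and output shapes are hypothetical; the sketch labels failures as negative examples, while discarding them instead is the other option listed above.

```typescript
// Inputs gathered by the earlier verification steps (hypothetical shape).
interface GradeInput {
  runtimeErrors: string[];
  visualMatches: boolean;
}

interface Grade {
  verdict: "pass" | "fail";
  reasons: string[];
  trainingLabel: "positive" | "negative";
}

function grade(input: GradeInput): Grade {
  const reasons: string[] = [];

  // Any runtime error is an automatic failure.
  if (input.runtimeErrors.length > 0) {
    reasons.push(`${input.runtimeErrors.length} runtime error(s)`);
  }

  // A screenshot that does not match the prompt is also a failure.
  if (!input.visualMatches) {
    reasons.push("visual mismatch");
  }

  // Passing requires no errors and a matching visual result.
  const verdict = reasons.length === 0 ? "pass" : "fail";
  return { verdict, reasons, trainingLabel: verdict === "pass" ? "positive" : "negative" };
}
```

Applied to the Snake run described under Current State (two runtime errors, blank canvas), this returns a `fail` verdict with both reasons and a `negative` label.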

### 4. Simulator/Emulator Testing (future)
For iOS/Android apps:
- iOS Simulator-driven tests (`xcrun simctl`)
- Android Emulator-driven tests (`adb`)
- Secure enclave testing for biometric features
- Screenshot capture from simulator/emulator
- Same visual verification pipeline
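
For the screenshot step, the sketch below only builds the command lines and leaves the process spawning to the caller. `screenshotCommand` and the `Command` shape are hypothetical. The commands themselves are the standard `xcrun simctl io <device> screenshot <file>` and `adb exec-out screencap -p`, the latter writing PNG bytes to stdout.

```typescript
type Platform = "ios" | "android";

// A command to run, plus an optional file that stdout should be saved to.
interface Command {
  program: string;
  args: string[];
  saveStdoutTo?: string;
}

function screenshotCommand(platform: Platform, outputPath: string, deviceId?: string): Command {
  if (platform === "ios") {
    // "booted" selects the currently running simulator when no device is given.
    return {
      program: "xcrun",
      args: ["simctl", "io", deviceId ?? "booted", "screenshot", outputPath],
    };
  }

  // adb prints the PNG to stdout, so the caller must write stdout to outputPath.
  const serial = deviceId ? ["-s", deviceId] : [];
  return {
    program: "adb",
    args: [...serial, "exec-out", "screencap", "-p"],
    saveStdoutTo: outputPath,
  };
}
```

Either way the result is a PNG file that can go through the same description and grading steps as a browser screenshot.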

### 5. Fix-and-Retry Loop
When verification fails:
- Feed errors back to the coding agent
- "Your code has these errors: [errors]. The screenshot shows: [description]. Fix it."
- Retry up to N times
- Only successful, verified code becomes training data
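
The loop above might look like the following sketch. `generateVerified`, `Generate`, `Verify`, and `Verification` are hypothetical names; the real agent and verifier interfaces are not specified in this issue, so both are passed in as functions.

```typescript
// Outcome of one verification pass (hypothetical shape).
interface Verification {
  passed: boolean;
  errors: string[];
  description: string;
}

type Generate = (prompt: string) => Promise<string>;
type Verify = (code: string) => Promise<Verification>;

// Build the feedback message from the template in the list above.
function feedbackPrompt(result: Verification): string {
  return `Your code has these errors: ${result.errors.join("; ")}. ` +
    `The screenshot shows: ${result.description}. Fix it.`;
}

// Returns verified code, or null if every attempt failed verification.
async function generateVerified(
  prompt: string,
  generate: Generate,
  verify: Verify,
  maxAttempts: number,
): Promise<string | null> {
  let currentPrompt = prompt;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const code = await generate(currentPrompt);
    const result = await verify(code);
    if (result.passed) {
      return code; // Only verified code leaves the loop.
    }
    // Assumes the agent keeps its previous code in context between attempts.
    currentPrompt = `${prompt}\n\n${feedbackPrompt(result)}`;
  }
  return null;
}
```

A `null` result means nothing from that run should be captured as a positive example.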

## The Rule

**No blind training.** Every code example must be:
1. Executed (not just compiled)
2. Visually verified (screenshot + description)
3. Error-free (no console errors, no crashes)
4. Functionally correct (output matches prompt)

Training on unverified code is training on garbage.

## Related
- #409 (sensory system — visual verification)
- #377 (Academy e2e — needs verification in the loop)
- #440 (`<think>` tags — model reasoning about its own code quality)
