Skip to content
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
31 changes: 31 additions & 0 deletions notes/pr8_update_summary.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,31 @@
# PR #8 Update Summary

## What this PR adds

- a trusted-remote handoff resume gate after TPI-008 recovery
- a stricter intake rule for concrete provider-supplied handoff data
- an unchanged execution-resume contract for TPI-004 once landing is verified
- a decision surface that prevents reopening Git auth work without fresh evidence

## What still remains

- a concrete provider-supplied attach or SSH route
- verified landing in `/workspace/parameter-golf`
- actual resume of the unchanged TPI-004 baseline/candidate evidence pass

## First required external handoff

The next turn should begin from the exact provider-supplied attach route or SSH tuple and immediately verify landing in `/workspace/parameter-golf`.

## Boundary carried forward

- remote review surfaces are now treated as trustworthy again
- Git auth recovery is not reopened unless a new concrete regression appears
- monkey model framing remains the only public-facing concept

## Current turn result

- the trusted remote handoff gate remains valid
- no concrete provider-supplied handoff package is present in this turn
- the branch therefore stays in an absent-state wait
- current label remains `continue_sharpening`
65 changes: 65 additions & 0 deletions notes/tpi_009_resume_contract.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,65 @@
# TPI-009 Resume Contract

## Objective

Define the first commands to run once a concrete provider-supplied handoff package is accepted on top of the restored remote review surfaces.

## Preconditions

- PR #6 is trusted as the latest TPI-007 absent-handoff surface
- PR #7 is trusted as the TPI-008 recovery surface
- PR #8 is trusted as the current TPI-009 handoff gate
- the next turn contains a concrete attach route or exact SSH tuple

## First verification commands

```bash
pwd
ls /workspace
cd /workspace/parameter-golf
git rev-parse --abbrev-ref HEAD
python3 -c "import torch, datasets, sentencepiece; print('deps-ok')"
nvidia-smi
```

## Resume commands after landing verification

### Baseline
```bash
RUN_ID=tpi004_baseline_stride1024 \
DATA_PATH=/workspace/parameter-golf/data/datasets/fineweb10B_sp1024/ \
TOKENIZER_PATH=/workspace/parameter-golf/data/tokenizers/fineweb_1024_bpe.model \
VOCAB_SIZE=1024 \
TRAIN_SEQ_LEN=1024 \
EVAL_STRIDE=1024 \
MAX_WALLCLOCK_SECONDS=600 \
TRAIN_LOG_EVERY=50 \
VAL_LOSS_EVERY=200 \
torchrun --standalone --nproc_per_node=1 train_gpt.py
```

### Candidate
```bash
RUN_ID=tpi004_candidate_stride128 \
DATA_PATH=/workspace/parameter-golf/data/datasets/fineweb10B_sp1024/ \
TOKENIZER_PATH=/workspace/parameter-golf/data/tokenizers/fineweb_1024_bpe.model \
VOCAB_SIZE=1024 \
TRAIN_SEQ_LEN=1024 \
EVAL_STRIDE=128 \
MAX_WALLCLOCK_SECONDS=600 \
TRAIN_LOG_EVERY=50 \
VAL_LOSS_EVERY=200 \
torchrun --standalone --nproc_per_node=1 train_gpt.py
```

## Gate

Do not reopen Git auth or provider rediscovery after a valid handoff package is supplied. Attach and verify landing immediately.

Do not run the baseline or candidate command before:

- `pwd` and `ls /workspace` succeed
- `/workspace/parameter-golf` is reachable
- branch identity is confirmed
- dependency check succeeds
- `nvidia-smi` confirms the expected runtime
64 changes: 64 additions & 0 deletions notes/tpi_009_resume_decision.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,64 @@
# TPI-009 Resume Decision

## Status

waiting_external_handoff

## Objective

Record whether the trusted remote review surfaces can now be used to resume the unchanged TPI-004 evidence pass.

## Required decision fields

- remote review surface trusted or not: yes
- concrete provider handoff package received or not: not yet
- landing path verified or not: not yet
- TPI-004 resume ready or not: not yet

## Classification

- `promote_to_execution_resume`
- `continue_sharpening`
- `move_to_failure_ledger`

## Accepted branch states

- `accepted`
- concrete provider handoff package is present
- landing is verified
- promote immediately to `promote_to_execution_resume`
- `absent`
- no concrete provider handoff package is present
- stop cleanly
- do not reopen rediscovery
- do not start execution
- `failure-after-attach`
- a concrete provider handoff package is present
- attach or landing is actually attempted and fails
- move to `move_to_failure_ledger`

## Current reading

- TPI-008 removed the push/auth blocker.
- The current blocker is now external provider handoff only.
- No new attach route is included in this PR itself.

## Classification result for the branch state

- `continue_sharpening`

## Absent-state handling

- stop cleanly at the trusted remote handoff gate
- no Git auth reopening
- no environment reselection
- no model-side changes
- no unchanged TPI-004 execution until landing is verified

## Promotion condition

Promote to `promote_to_execution_resume` immediately once the handoff package is concrete enough to attach and verify `/workspace/parameter-golf`.

## Failure condition

Move to `move_to_failure_ledger` only if a newly attempted concrete handoff route fails after trust in the remote review surface has already been restored.
56 changes: 56 additions & 0 deletions notes/tpi_009_trusted_remote_handoff_gate.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,56 @@
# TPI-009 Trusted Remote Handoff Gate

## Objective

Resume the suspended TPI-004 evidence pass only from a trusted remote review surface and a concrete provider-supplied handoff package.

## Public-facing name

`MonkeyModel_TrustedRemoteHandoff`

## Starting assumptions

- TPI-008 restored remote reflection and Git auth for normal HTTPS push
- PR #6 is again a trustworthy surface for the latest TPI-007 absent-handoff state
- no model-side or tokenizer-side change is needed before attach and landing verification

## Accepted handoff package

Any acceptable handoff in this loop must include at least one of the following:

1. exact provider-supplied attach command, or
2. exact SSH tuple with host, username, and port

and must also include:

- pod identifier or display name
- expected landing path or explicit confirmation that `/workspace/parameter-golf` is the target workspace

This minimum package is mandatory. Partial metadata does not advance the loop.

## Rejection rule

Reject all of the following as insufficient for progress in this loop:

- repeating that local SSH keys exist
- generic Runpod instructions without the concrete current pod route
- generic provider dashboard advice without the concrete current pod route
- reopening Git auth inspection without a fresh concrete regression
- reopening old endpoint rediscovery without new concrete provider data
- reopening environment selection or model changes before landing verification

## Immediate next action after acceptance

1. run `pwd`
2. run `ls /workspace`
3. `cd /workspace/parameter-golf`
4. run `git rev-parse --abbrev-ref HEAD`
5. run `python3 -c "import torch, datasets, sentencepiece; print('deps-ok')"`
6. run `nvidia-smi`
7. only after landing and dependency verification, resume the unchanged TPI-004 baseline/candidate evidence pass

## Current branch reading

- trusted remote surfaces are restored
- no concrete provider-supplied handoff package is present in this turn
- current label remains `continue_sharpening`