# Examples

Examples produced from natural-language prompts using these skills. Some folders contain end-to-end Substreams projects (buildable and runnable); others are write-ups or operational notes documenting what the agent did, what worked, and what didn't.

All runs use claude-sonnet-4-6. For the buildable Substreams project examples, the code is reproducible: `cd` into the folder and run `substreams build`.
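As a sketch, reproducing one of the buildable examples looks like the following. The folder and module names here are illustrative, not taken from this repo; the `substreams build` and `substreams run` subcommands and the `-e`/`-s`/`-t` flags are from the standard Substreams CLI.

```shell
# Enter an example folder (name is hypothetical; use the actual folder).
cd t1.2-usdc-transfers

# Compile the Rust module and produce the .spkg package.
substreams build

# Optionally stream a few blocks against a public endpoint to spot-check
# output (endpoint, module name, and block range are illustrative).
substreams run -e mainnet.eth.streamingfast.io:443 \
  substreams.yaml map_transfers -s 12292922 -t +10
```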

## Ethereum

| Example | Skill(s) | Result |
| --- | --- | --- |
| T1.1 — Block stats | substreams-dev | Build · Run · 100% match |
| T1.2 — USDC transfers | substreams-dev | Build · Run · 100% match |
| T2.1 — NFT mints | substreams-dev | Build · Run · 100% match |
| T2.2 — Uniswap V2 swaps + token metadata | substreams-dev | Build · Run · 96% match (MKR bytes32 `symbol()` edge case) |
| T2.3 — Postgres SQL sink | substreams-dev, substreams-sql | Build · Run · 100% match |
| T3.1 — Uniswap V3 swaps + USD price | substreams-dev | Build · Run · 100% match |
| T3.2 — Cross-DEX volume aggregation (graph_out) | substreams-dev | Build · Run · 100% match |
| T6.1 — Uniswap V2 swaps from Solidity source (no ABI) | substreams-dev | Build · Run · 100% match |

## Solana

| Example | Skill(s) | Result |
| --- | --- | --- |
| T5.1 — Slot stats | substreams-dev | Build · Run · 100% match |
| T5.2 — SPL USDC transfers | substreams-dev | Build · Run · 100% match |
| T5.3 — Raydium CLMM swaps | substreams-dev | Build · Run · 100% match |
| T5.4 — Pump.fun launches (Anchor) | substreams-dev | Build · Run · 100% match |
| T6.2 — Marinade deposits from Anchor source (no IDL) | substreams-dev | Build · Run · 100% match |

## Sink deployment

| Example | Skill(s) | Result |
| --- | --- | --- |
| T7.1 — Deploy SQL sink to Postgres | substreams-sink-deploy | Sink installed, schema applied, 537 rows match golden |
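For orientation, the SQL sink flow generally has two steps: apply the schema, then stream. A minimal sketch with the standard `substreams-sink-sql` binary, assuming a local Postgres; the DSN and package path below are illustrative, not values from the T7.1 run.

```shell
# Apply the schema declared in the package to the target database
# (DSN and .spkg path are hypothetical placeholders).
substreams-sink-sql setup \
  "psql://user:pass@localhost:5432/mydb?sslmode=disable" \
  ./my-substreams.spkg

# Start streaming: reads blocks, runs the modules, and writes rows
# into the Postgres tables created above.
substreams-sink-sql run \
  "psql://user:pass@localhost:5432/mydb?sslmode=disable" \
  ./my-substreams.spkg
```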

## Cautionary tales (vague prompts)

| Example | What happened |
| --- | --- |
| T4.1 — "Track whale activity" | Agent shipped silently with a hardcoded threshold, token universe, and time range. Did not ask clarifying questions. |
| T4.2 — "I want Uniswap data in my database" | Agent built a full V3→Postgres pipeline without asking about chain, version, or fields. |

These illustrate a known limitation: the skill text tells the agent to ask for clarification on vague prompts, but the model's default posture dominates. Vague prompts produce confident guesses, not questions. Be specific.

## Related

- /EVAL.md — summary of the test pass against this skill set