Merged
Changes from 7 commits
19 commits
6a01e9f
Commit base e2e across SDKs
patniko Feb 18, 2026
5f008ad
Add C# samples to 8 representative scenarios
patniko Feb 18, 2026
f006da3
Fix go.mod replace paths for test scenarios
patniko Feb 19, 2026
674719c
Replace copilot-core with Copilot CLI across all test scenarios
patniko Feb 19, 2026
1db7315
Add scenario build verification workflow for PR checks
patniko Feb 19, 2026
e3856b1
Add full language parity: all 34 scenarios × 4 languages
patniko Feb 19, 2026
3275b12
Add CI caching and justfile targets for scenario builds
patniko Feb 19, 2026
db37c0f
Quality fixes: C# in all verify.sh, fix stubs, consistent patterns
patniko Feb 19, 2026
6cee34a
Strengthen Python CI: py_compile + import check instead of AST-only
patniko Feb 19, 2026
e61e9a6
Update scenario model references to claude-sonnet-4.6
patniko Feb 19, 2026
a00d0d7
fix: remove soft-pass fallbacks in verify.sh scenario scripts
patniko Feb 19, 2026
49f199a
Strengthen tools scenario verifications
patniko Feb 19, 2026
997a73b
Support latest .NET
patniko Feb 19, 2026
0f6ecd3
Move to haiku
patniko Feb 19, 2026
e88df1b
Fix scenario tests: paths, verifications, streaming, and parallel exe…
patniko Feb 19, 2026
08adc4e
Merge remote-tracking branch 'origin/main' into add-test-scenarios
patniko Feb 19, 2026
39f0ec4
Restore go.sum files needed for CI builds
patniko Feb 19, 2026
d3cd6a3
Revert ToolName addition to Go PermissionRequest — use Extra map instead
patniko Feb 19, 2026
ccda964
fix: use o4-mini for reasoning-effort scenario tests
patniko Feb 19, 2026
183 changes: 183 additions & 0 deletions .github/workflows/scenario-builds.yml
@@ -0,0 +1,183 @@
name: "Scenario Build Verification"

on:
  pull_request:
    paths:
      - "test/scenarios/**"
      - "nodejs/src/**"
      - "python/copilot/**"
      - "go/**/*.go"
      - "dotnet/src/**"
      - ".github/workflows/scenario-builds.yml"
  push:
    branches:
      - main
    paths:
      - "test/scenarios/**"
      - ".github/workflows/scenario-builds.yml"
  workflow_dispatch:
  merge_group:

permissions:
  contents: read

jobs:
  # ── TypeScript ──────────────────────────────────────────────────────
  build-typescript:
    name: "TypeScript scenarios"
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v6

      - uses: actions/setup-node@v6
        with:
          node-version: 22

      - uses: actions/cache@v4
        with:
          path: ~/.npm
          key: ${{ runner.os }}-npm-scenarios-${{ hashFiles('test/scenarios/**/package.json') }}
          restore-keys: |
            ${{ runner.os }}-npm-scenarios-

      # Build the SDK so local file: references resolve
      - name: Build SDK
        working-directory: nodejs
        run: npm ci --ignore-scripts

      - name: Build all TypeScript scenarios
        run: |
          PASS=0; FAIL=0; FAILURES=""
          for dir in $(find test/scenarios -path '*/typescript/package.json' -exec dirname {} \; | sort); do
            scenario="${dir#test/scenarios/}"
            echo "::group::$scenario"
            if (cd "$dir" && npm install --ignore-scripts 2>&1); then
              echo "✅ $scenario"
              PASS=$((PASS + 1))
            else
              echo "❌ $scenario"
              FAIL=$((FAIL + 1))
              FAILURES="$FAILURES\n $scenario"
            fi
            echo "::endgroup::"
          done
          echo ""
          echo "TypeScript builds: $PASS passed, $FAIL failed"
          if [ "$FAIL" -gt 0 ]; then
            echo -e "Failures:$FAILURES"
            exit 1
          fi

Review comment (Contributor):

This seems a bit odd - is it really doing a build? Looks like it just does npm install
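
One way to address this would be to follow the install with a type-check. The sketch below is only an illustration: it assumes each scenario directory carries a `tsconfig.json` and lists `typescript` as a dependency, neither of which this diff confirms, and `ts_verify_cmd` is a hypothetical helper name, not something in the PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the PR): pick the verification command
# for one TypeScript scenario directory.
# Assumption: a scenario that ships a tsconfig.json can be type-checked
# with `npx tsc --noEmit` once its dependencies are installed.
ts_verify_cmd() {
  local dir="$1"
  if [ -f "$dir/tsconfig.json" ]; then
    # Install, then compile without emitting output files.
    echo "npm install --ignore-scripts && npx tsc --noEmit"
  else
    # No tsconfig.json: fall back to the install-only check used today.
    echo "npm install --ignore-scripts"
  fi
}
```

Inside the loop, the step could then run `(cd "$dir" && eval "$(ts_verify_cmd .)")` in place of the bare `npm install`, so a scenario with type errors would be counted as a failure.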


  # ── Python ──────────────────────────────────────────────────────────
  build-python:
    name: "Python scenarios"
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v6

      - uses: actions/setup-python@v6
        with:
          python-version: "3.12"

      - name: Syntax-check all Python scenarios
        run: |
          PASS=0; FAIL=0; FAILURES=""
          for main in $(find test/scenarios -path '*/python/main.py' | sort); do
            dir=$(dirname "$main")
            scenario="${dir#test/scenarios/}"
            echo "::group::$scenario"
            if python3 -c "import ast, sys; ast.parse(open('$main').read()); print('syntax ok')" 2>&1; then
              echo "✅ $scenario"
              PASS=$((PASS + 1))
            else
              echo "❌ $scenario"
              FAIL=$((FAIL + 1))
              FAILURES="$FAILURES\n $scenario"
            fi
            echo "::endgroup::"
          done
          echo ""
          echo "Python builds: $PASS passed, $FAIL failed"
          if [ "$FAIL" -gt 0 ]; then
            echo -e "Failures:$FAILURES"
            exit 1
          fi

  # ── Go ──────────────────────────────────────────────────────────────
  build-go:
    name: "Go scenarios"
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v6

      - uses: actions/setup-go@v6
        with:
          go-version: "1.24"
          cache: true
          cache-dependency-path: test/scenarios/**/go.sum

      - name: Build all Go scenarios
        run: |
          PASS=0; FAIL=0; FAILURES=""
          for mod in $(find test/scenarios -path '*/go/go.mod' | sort); do
            dir=$(dirname "$mod")
            scenario="${dir#test/scenarios/}"
            echo "::group::$scenario"
            if (cd "$dir" && go build ./... 2>&1); then
              echo "✅ $scenario"
              PASS=$((PASS + 1))
            else
              echo "❌ $scenario"
              FAIL=$((FAIL + 1))
              FAILURES="$FAILURES\n $scenario"
            fi
            echo "::endgroup::"
          done
          echo ""
          echo "Go builds: $PASS passed, $FAIL failed"
          if [ "$FAIL" -gt 0 ]; then
            echo -e "Failures:$FAILURES"
            exit 1
          fi

  # ── C# ─────────────────────────────────────────────────────────────
  build-csharp:
    name: "C# scenarios"
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v6

      - uses: actions/setup-dotnet@v5
        with:
          dotnet-version: "8.0.x"

      - uses: actions/cache@v4
        with:
          path: ~/.nuget/packages
          key: ${{ runner.os }}-nuget-scenarios-${{ hashFiles('test/scenarios/**/*.csproj') }}
          restore-keys: |
            ${{ runner.os }}-nuget-scenarios-

      - name: Build all C# scenarios
        run: |
          PASS=0; FAIL=0; FAILURES=""
          for proj in $(find test/scenarios -name '*.csproj' | sort); do
            dir=$(dirname "$proj")
            scenario="${dir#test/scenarios/}"
            echo "::group::$scenario"
            if (cd "$dir" && dotnet build --nologo 2>&1); then
              echo "✅ $scenario"
              PASS=$((PASS + 1))
            else
              echo "❌ $scenario"
              FAIL=$((FAIL + 1))
              FAILURES="$FAILURES\n $scenario"
            fi
            echo "::endgroup::"
          done
          echo ""
          echo "C# builds: $PASS passed, $FAIL failed"
          if [ "$FAIL" -gt 0 ]; then
            echo -e "Failures:$FAILURES"
            exit 1
          fi
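
The four jobs in this workflow repeat the same pass/fail tally loop with a different command each time. As a rough sketch of how that loop could be shared, the function below reads scenario directories from standard input and runs one command in each. `run_all` and its arguments are hypothetical names chosen for illustration; nothing like it exists in the PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the PR).
# Usage: <list of directories, one per line> | run_all LABEL CMD
# Runs CMD inside each directory and prints a pass/fail tally.
run_all() {
  local label="$1" cmd="$2" pass=0 fail=0 failures="" dir
  while IFS= read -r dir; do
    [ -z "$dir" ] && continue
    # Redirect stdin so CMD cannot consume the remaining directory list.
    if (cd "$dir" && bash -c "$cmd") </dev/null >/dev/null 2>&1; then
      pass=$((pass + 1))
    else
      fail=$((fail + 1))
      failures="$failures $dir"
    fi
  done
  echo "$label builds: $pass passed, $fail failed"
  # Non-zero exit status when any directory failed, so CI marks the step red.
  [ "$fail" -eq 0 ]
}
```

Reading with `while IFS= read -r` rather than `for x in $(find ...)` keeps paths that contain spaces intact, which the loops in the workflow would split.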
1 change: 1 addition & 0 deletions .gitignore
@@ -1,3 +1,4 @@

# Documentation validation output
docs/.validation/
.DS_Store
109 changes: 109 additions & 0 deletions justfile
@@ -117,3 +117,112 @@ validate-docs-go:
validate-docs-cs:
    @echo "=== Validating C# documentation ==="
    @cd scripts/docs-validation && npm run validate:cs

# Build all scenario samples (all languages)
scenario-build:
    #!/usr/bin/env bash
    set -euo pipefail
    echo "=== Building all scenario samples ==="
    TOTAL=0; PASS=0; FAIL=0

    build_lang() {
      local lang="$1" find_expr="$2" build_cmd="$3"
      echo ""
      echo "── $lang ──"
      while IFS= read -r target; do
        [ -z "$target" ] && continue
        dir=$(dirname "$target")
        scenario="${dir#test/scenarios/}"
        TOTAL=$((TOTAL + 1))
        if (cd "$dir" && eval "$build_cmd" >/dev/null 2>&1); then
          printf " ✅ %s\n" "$scenario"
          PASS=$((PASS + 1))
        else
          printf " ❌ %s\n" "$scenario"
          FAIL=$((FAIL + 1))
        fi
      done < <(find test/scenarios $find_expr | sort)
    }

    # TypeScript: npm install
    (cd nodejs && npm ci --ignore-scripts --silent 2>/dev/null) || true
    build_lang "TypeScript" "-path '*/typescript/package.json'" "npm install --ignore-scripts"

    # Python: syntax check
    build_lang "Python" "-path '*/python/main.py'" "python3 -c \"import ast; ast.parse(open('main.py').read())\""

    # Go: go build
    build_lang "Go" "-path '*/go/go.mod'" "go build ./..."

    # C#: dotnet build
    build_lang "C#" "-name '*.csproj' -path '*/csharp/*'" "dotnet build --nologo -v quiet"

    echo ""
    echo "══════════════════════════════════════"
    echo " Scenario build summary: $PASS passed, $FAIL failed (of $TOTAL)"
    echo "══════════════════════════════════════"
    [ "$FAIL" -eq 0 ]

# Run the full scenario verify orchestrator (build + E2E, needs real CLI)
scenario-verify:
    @echo "=== Running scenario verification ==="
    @bash test/scenarios/verify.sh

# Build scenarios for a single language (typescript, python, go, csharp)
scenario-build-lang LANG:
    #!/usr/bin/env bash
    set -euo pipefail
    echo "=== Building {{LANG}} scenarios ==="
    PASS=0; FAIL=0

    case "{{LANG}}" in
      typescript)
        (cd nodejs && npm ci --ignore-scripts --silent 2>/dev/null) || true
        for target in $(find test/scenarios -path '*/typescript/package.json' | sort); do
          dir=$(dirname "$target"); scenario="${dir#test/scenarios/}"
          if (cd "$dir" && npm install --ignore-scripts >/dev/null 2>&1); then
            printf " ✅ %s\n" "$scenario"; PASS=$((PASS + 1))
          else
            printf " ❌ %s\n" "$scenario"; FAIL=$((FAIL + 1))
          fi
        done
        ;;
      python)
        for target in $(find test/scenarios -path '*/python/main.py' | sort); do
          dir=$(dirname "$target"); scenario="${dir#test/scenarios/}"
          if python3 -c "import ast; ast.parse(open('$target').read())" 2>/dev/null; then
            printf " ✅ %s\n" "$scenario"; PASS=$((PASS + 1))
          else
            printf " ❌ %s\n" "$scenario"; FAIL=$((FAIL + 1))
          fi
        done
        ;;
      go)
        for target in $(find test/scenarios -path '*/go/go.mod' | sort); do
          dir=$(dirname "$target"); scenario="${dir#test/scenarios/}"
          if (cd "$dir" && go build ./... >/dev/null 2>&1); then
            printf " ✅ %s\n" "$scenario"; PASS=$((PASS + 1))
          else
            printf " ❌ %s\n" "$scenario"; FAIL=$((FAIL + 1))
          fi
        done
        ;;
      csharp)
        for target in $(find test/scenarios -name '*.csproj' -path '*/csharp/*' | sort); do
          dir=$(dirname "$target"); scenario="${dir#test/scenarios/}"
          if (cd "$dir" && dotnet build --nologo -v quiet >/dev/null 2>&1); then
            printf " ✅ %s\n" "$scenario"; PASS=$((PASS + 1))
          else
            printf " ❌ %s\n" "$scenario"; FAIL=$((FAIL + 1))
          fi
        done
        ;;
      *)
        echo "Unknown language: {{LANG}}. Use: typescript, python, go, csharp"
        exit 1
        ;;
    esac

    echo ""
    echo "{{LANG}} scenarios: $PASS passed, $FAIL failed"
    [ "$FAIL" -eq 0 ]
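
A note on the `build_lang` helper in the `scenario-build` recipe: it receives its `find` arguments as one string containing single quotes (for example `"-path '*/typescript/package.json'"`) and expands it unquoted. Bash does not perform quote removal on the result of an expansion, so `find` would receive the quote characters as part of the pattern and is likely to match nothing. A minimal sketch of an alternative, passing the arguments as separate words instead; `list_targets` is a hypothetical name used only for illustration:

```shell
#!/usr/bin/env bash
# Hypothetical discovery helper (not part of the PR).
# Usage: list_targets ROOT FIND_ARGS...
# Every find argument arrives as its own word, so patterns quoted at the
# call site reach find without literal quote characters.
list_targets() {
  local root="$1"
  shift
  find "$root" "$@" | sort
}
```

Called as `list_targets test/scenarios -name '*.csproj' -path '*/csharp/*'`, the shell strips the quotes before `find` sees the patterns, and the sorted output can feed the same `while IFS= read -r target` loop the recipe already uses.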
84 changes: 84 additions & 0 deletions test/scenarios/.gitignore
@@ -0,0 +1,84 @@
# Dependencies
node_modules/
.venv/
vendor/

# E2E run artifacts (agents may create files during verify.sh runs)
**/sessions/**/plan.md
**/tools/**/plan.md
**/callbacks/**/plan.md
**/prompts/**/plan.md

# Build output
dist/
target/
build/
*.exe
*.dll
*.so
*.dylib

# Go
*.test
fully-bundled-go
app-direct-server-go
container-proxy-go
container-relay-go
app-backend-to-server-go
custom-agents-go
mcp-servers-go
no-tools-go
virtual-filesystem-go
system-message-go
skills-go
streaming-go
attachments-go
tool-filtering-go
permissions-go
hooks-go
user-input-go
concurrent-sessions-go
session-resume-go
stdio-go
tcp-go
gh-app-go
cli-preset-go
filesystem-preset-go
minimal-preset-go

# Python
__pycache__/
*.pyc
*.pyo
*.egg-info/
*.egg
.eggs/

# TypeScript
*.tsbuildinfo
package-lock.json

# C# / .NET
bin/
obj/
*.csproj.nuget.*

# IDE / OS
.DS_Store
.idea/
.vscode/
*.swp
*.swo
*~

# Multi-user scenario temp directories
**/sessions/multi-user-long-lived/tmp/

# Logs
*.log
npm-debug.log*
infinite-sessions-go
reasoning-effort-go
reconnect-go
byok-openai-go
token-sources-go