Skip to content

fix(ci): bootstrap npm from tarball when runner's bundled npm is broken#3010

Merged
amikofalvy merged 1 commit intomainfrom
fix/release-npm-bootstrap
Apr 4, 2026
Merged

fix(ci): bootstrap npm from tarball when runner's bundled npm is broken#3010
amikofalvy merged 1 commit intomainfrom
fix/release-npm-bootstrap

Conversation

@amikofalvy
Copy link
Copy Markdown
Collaborator

Summary

  • The GitHub runner image ubuntu24/20260329.72 ships a broken npm 10.9.7 (missing promise-retry module), causing npm install -g npm@latest to fail with MODULE_NOT_FOUND in the release workflow
  • 5 consecutive release failures since ~19:59 UTC today — all blocked at the "Ensure npm 11.5.1+ for OIDC support" step
  • Adds a fallback that downloads the npm tarball directly from the registry and bootstraps from it, bypassing the broken bundled npm entirely
  • Normal upgrade path is tried first so the fallback only activates when needed

Test plan

  • Merge to main and verify the release workflow succeeds (the changeset will trigger a publish run)
  • Confirm npm version is 11.5.1+ in the workflow logs

🤖 Generated with Claude Code

The GitHub runner image ubuntu24/20260329.72 ships a broken npm 10.9.7
(missing promise-retry module), causing `npm install -g npm@latest` to
fail with MODULE_NOT_FOUND. This blocks all releases.

Add a fallback that downloads the npm tarball directly from the registry
and uses it to self-install, bypassing the broken bundled npm entirely.
The normal path is tried first so the fallback is only used when needed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@vercel
Copy link
Copy Markdown

vercel bot commented Apr 4, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
agents-api Ready Ready Preview, Comment Apr 4, 2026 1:06am
agents-docs Ready Ready Preview, Comment Apr 4, 2026 1:06am
agents-manage-ui Ready Ready Preview, Comment Apr 4, 2026 1:06am

Request Review

@changeset-bot
Copy link
Copy Markdown

changeset-bot bot commented Apr 4, 2026

🦋 Changeset detected

Latest commit: d221765

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 10 packages
Name Type
@inkeep/agents-core Patch
@inkeep/agents-api Patch
@inkeep/agents-manage-ui Patch
@inkeep/agents-cli Patch
@inkeep/agents-sdk Patch
@inkeep/agents-work-apps Patch
@inkeep/ai-sdk-provider Patch
@inkeep/create-agents Patch
@inkeep/agents-email Patch
@inkeep/agents-mcp Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@pullfrog
Copy link
Copy Markdown
Contributor

pullfrog bot commented Apr 4, 2026

TL;DR — Adds a fallback mechanism to the release workflow so that npm can be upgraded even when the runner's bundled npm binary is broken. If npm install -g npm@latest fails, the script fetches the latest npm tarball directly from the registry and bootstraps from it.

Key changes

  • Add tarball-based npm bootstrap fallback in release.yml — wraps the existing npm install -g npm@latest in a conditional; on failure, resolves the latest npm version via the registry API, downloads the tarball, and installs npm from the extracted npm-cli.js.
  • Add changeset for @inkeep/agents-core — patch-level changeset documenting the CI fix.

Summary | 2 files | 1 commit | base: mainfix/release-npm-bootstrap


Resilient npm upgrade in the release workflow

Before: The release workflow ran npm install -g npm@latest unconditionally — if the runner's bundled npm was corrupted or broken, the step would fail and block the entire release pipeline.
After: The step first attempts the normal upgrade; if that fails, it fetches the latest npm version from the registry, downloads the tarball with curl, extracts it, and uses node to run the extracted npm-cli.js for a clean install — then logs the resulting version either way.

The fallback uses node -e "fetch(...)" to resolve the latest version (Node 18+ has global fetch), avoiding a dependency on jq or other tools that may not be present on the runner.

.github/workflows/release.yml · .changeset/frightened-white-guineafowl.md

Pullfrog  | View workflow run | Triggered by Pullfrog | Using Claude Opus𝕏

Copy link
Copy Markdown
Contributor

@pullfrog pullfrog bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Solid workaround for a real runner breakage. The fallback logic is sound — try the happy path first, bootstrap from tarball if it fails. Two items to address: the changeset shouldn't exist (this is a CI-only change, not a published package change), and the fallback could fail silently if the registry fetch returns empty.

Pullfrog  | Fix all ➔Fix 👍s ➔View workflow run | Using Claude Opus𝕏

"@inkeep/agents-core": patch
---

Fix release workflow npm bootstrap for OIDC publishing
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This changeset should be removed. The change is to .github/workflows/release.yml — internal CI tooling, not published code in @inkeep/agents-core. Per repo conventions, internal tooling/scripts changes do not get changesets. Merging this would trigger a spurious patch release of agents-core with zero code changes to the package.

If the goal is to trigger a publish run to verify the fix, a workflow_dispatch on the release workflow achieves the same thing without a no-op version bump.

echo "npm upgraded via bundled npm"
else
echo "::warning::Bundled npm broken, bootstrapping from tarball"
NPM_VER=$(node -e "fetch('https://registry.npmjs.org/npm/latest').then(r=>r.json()).then(d=>console.log(d.version))")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If fetch fails (network error, registry outage, non-JSON response), NPM_VER will be empty and the subsequent curl + install commands will silently produce garbage URLs or cryptic errors. Add a guard:

Suggested change
NPM_VER=$(node -e "fetch('https://registry.npmjs.org/npm/latest').then(r=>r.json()).then(d=>console.log(d.version))")
NPM_VER=$(node -e "fetch('https://registry.npmjs.org/npm/latest').then(r=>r.json()).then(d=>console.log(d.version))")
if [ -z "$NPM_VER" ]; then echo '::error::Failed to resolve npm version from registry'; exit 1; fi
Suggested change
NPM_VER=$(node -e "fetch('https://registry.npmjs.org/npm/latest').then(r=>r.json()).then(d=>console.log(d.version))")
NPM_VER=$(node -e "fetch('https://registry.npmjs.org/npm/latest').then(r=>r.json()).then(d=>console.log(d.version))")
if [ -z "$NPM_VER" ]; then echo '::error::Failed to resolve npm version from registry'; exit 1; fi

Copy link
Copy Markdown
Contributor

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review Summary

(3) Total Issues | Risk: Medium

🟠⚠️ Major (2) 🟠⚠️

Inline Comments:

  • 🟠 Major: .github/workflows/release.yml:49 Error suppression via 2>/dev/null hides legitimate failures
  • 🟠 Major: .changeset/frightened-white-guineafowl.md:1-5 Changeset should not exist for CI-only changes

🟡 Minor (1) 🟡

Inline Comments:

  • 🟡 Minor: .github/workflows/release.yml:53 Consider using curl for version lookup instead of Node fetch

🚫 REQUEST CHANGES

Summary: The PR correctly addresses a legitimate CI failure caused by a broken npm in the GitHub runner image. The fallback mechanism is sound in principle, but needs two fixes before merging:

  1. Remove the changeset — Per AGENTS.md, "Internal tooling/scripts changes" should NOT have changesets. This workflow fix doesn't change any code in @inkeep/agents-core.

  2. Improve error detection — The blanket 2>/dev/null error suppression masks legitimate failures. Consider checking for the specific MODULE_NOT_FOUND error pattern so genuine network/auth failures are still surfaced.

The curl-based version lookup is a nice-to-have improvement but not blocking.

Discarded (2)
Location Issue Reason Discarded
.github/workflows/release.yml:53-55 Tarball download lacks checksum verification Acceptable for temporary workaround; HTTPS + official registry provides reasonable security, and adding dynamic hash verification from the same registry doesn't add meaningful security.
.github/workflows/release.yml:53 Node fetch() may fail in older runners Duplicate of the curl suggestion which was placed as Minor
Reviewers (2)
Reviewer Returned Main Findings Consider While You're Here Inline Comments Pending Recs Discarded
pr-review-devops 4 0 0 0 3 0 1
pr-review-standards 1 0 0 0 0 0 1
Total 5 0 0 0 3 0 2

Note: Changeset issue was raised by both reviewers; deduplicated in final output.

- name: Ensure npm 11.5.1+ for OIDC support
run: npm install -g npm@latest
run: |
if npm install -g npm@latest 2>/dev/null; then
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 MAJOR: Error suppression hides legitimate failures

Issue: The 2>/dev/null redirection discards all stderr output, hiding legitimate errors unrelated to the broken npm issue (network failures, permission issues, registry auth problems).

Why: When the primary upgrade path fails for any reason other than the specific MODULE_NOT_FOUND bug, the workflow silently falls back to the tarball path. This masks real problems and makes debugging harder.

Fix: Consider checking for the specific error pattern instead of blanket suppression:

if npm install -g npm@latest 2>&1 | tee /tmp/npm-upgrade.log; then
  echo "npm upgraded via bundled npm"
elif grep -qE 'MODULE_NOT_FOUND|Cannot find module' /tmp/npm-upgrade.log; then
  echo "::warning::Bundled npm broken, bootstrapping from tarball"
  # ... fallback logic
else
  echo "::error::npm upgrade failed for unknown reason"
  cat /tmp/npm-upgrade.log
  exit 1
fi

Refs:

echo "npm upgraded via bundled npm"
else
echo "::warning::Bundled npm broken, bootstrapping from tarball"
NPM_VER=$(node -e "fetch('https://registry.npmjs.org/npm/latest').then(r=>r.json()).then(d=>console.log(d.version))")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Minor: Consider using curl for version lookup

Issue: The fallback uses node -e "fetch(...)" which requires Node.js 18+ with native fetch. While the workflow specifies Node 22, using curl would be more robust and consistent with the tarball download.

Why: This avoids any Node.js environment dependency in the fallback path. If npm is broken, other aspects of the Node environment might also be affected.

Fix:

Suggested change
NPM_VER=$(node -e "fetch('https://registry.npmjs.org/npm/latest').then(r=>r.json()).then(d=>console.log(d.version))")
NPM_VER=$(curl -fsSL https://registry.npmjs.org/npm/latest | jq -r .version)

Refs:

  • The runner already has jq available for JSON parsing

Comment on lines +1 to +5
---
"@inkeep/agents-core": patch
---

Fix release workflow npm bootstrap for OIDC publishing
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 MAJOR: Changeset should not exist for CI-only changes

Issue: This changeset attributes a CI workflow fix to @inkeep/agents-core, but per AGENTS.md, changesets should NOT be created for "Internal tooling/scripts changes".

Why: Creating a changeset for this workflow-only change will trigger an unnecessary patch version bump of @inkeep/agents-core. This creates changelog noise and version churn for consumers who see no actual code change in the package.

Fix: Delete this changeset file. The workflow fix does not require a version bump of any package.

From AGENTS.md:

When NOT to create a changeset:

  • Documentation-only changes
  • Test-only changes
  • Internal tooling/scripts changes
  • Changes to ignored packages

@github-actions github-actions bot deleted a comment from claude bot Apr 4, 2026
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 4, 2026

Preview URLs

Use these stable preview aliases for testing this PR:

These point to the same Vercel preview deployment as the bot comment, but they stay stable and easier to find.

Raw Vercel deployment URLs

@amikofalvy amikofalvy added this pull request to the merge queue Apr 4, 2026
Merged via the queue into main with commit 3237c45 Apr 4, 2026
38 checks passed
@amikofalvy amikofalvy deleted the fix/release-npm-bootstrap branch April 4, 2026 01:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant