Skip to content

fix(ci): unblock docs-only PRs from required check deadlock#2641

Closed
koala73 wants to merge 1 commit intomainfrom
fix/ci-docs-only-skip
Closed

fix(ci): unblock docs-only PRs from required check deadlock#2641
koala73 wants to merge 1 commit intomainfrom
fix/ci-docs-only-skip

Conversation

@koala73
Copy link
Copy Markdown
Owner

@koala73 koala73 commented Apr 3, 2026

Summary

Fix: Move path filtering from workflow-level paths-ignore to job-level if: conditions via a lightweight changes job that queries the GitHub API for changed files. When a job is skipped via if:, GitHub reports it as passed for required checks.

Also updates deploy-gate to treat skipped conclusions as passing.

Test plan

Move paths-ignore from workflow triggers to job-level `if:` conditions.
When a workflow uses paths-ignore at the trigger level, GitHub never
creates the check run, so required checks stay "Waiting" forever.

Job-level skips via `if:` report as passed, satisfying branch protection.

Also update deploy-gate to treat "skipped" conclusions as passing.
@vercel
Copy link
Copy Markdown

vercel bot commented Apr 3, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Actions Updated (UTC)
worldmonitor Ignored Ignored Apr 3, 2026 5:08am

Request Review

@koala73
Copy link
Copy Markdown
Owner Author

koala73 commented Apr 3, 2026

Superseded — CI fix cherry-picked into #2639 so the workflow changes trigger real GitHub Actions check runs (manual status posts don't satisfy branch protection's app requirement).

@koala73 koala73 closed this Apr 3, 2026
@greptile-apps
Copy link
Copy Markdown

greptile-apps bot commented Apr 3, 2026

Greptile Summary

This PR fixes a well-known GitHub Actions deadlock: when paths-ignore prevents a workflow from firing at all, GitHub never creates the required check run, leaving branch protection stuck "Waiting for status to be reported" forever. The fix moves path-filtering logic from the workflow trigger level into a lightweight changes job that calls the GitHub Pulls API, then gates biome, unit, and typecheck on its output via job-level if: conditions. Because GitHub creates a check run with conclusion: skipped for jobs whose if: condition evaluates to false, required-check branch protection correctly treats them as passed. The deploy-gate is also updated to accept skipped alongside success when evaluating whether to post a green commit status.

Key changes:

  • lint-code.yml, test.yml, typecheck.yml: Remove paths-ignore from the pull_request trigger; introduce a shared changes job that outputs code=true/false by calling gh api …/pulls/{number}/files and filtering with the same path pattern that was previously in paths-ignore.
  • deploy-gate.yml: Replace the two-condition if with a for loop so any value that is neither success nor skipped triggers a failure status — enabling docs-only PRs whose jobs were skipped to receive a green gate.

Issues found:

  • If the changes job itself fails (transient API error, rate limit, jq parse failure), GitHub marks all downstream jobs (unit, typecheck, biome) as skipped due to dependency failure. The deploy-gate then sees skipped conclusions and posts success, silently bypassing tests on a code PR.
  • ${{ github.event_name }}, ${{ github.repository }}, and ${{ github.event.number }} are interpolated directly into the run: shell scripts rather than being surfaced as env: variables, which is against GitHub's security hardening recommendations.

Confidence Score: 4/5

Safe to merge with awareness of the changes-job failure edge case, which has a low probability but could allow a code PR to bypass checks if the GitHub API call fails transiently.

The core logic is correct and well-reasoned: job-level if: skipping produces check runs that branch protection accepts, and the gate correctly handles skipped conclusions. The one actionable concern is that a failing changes job causes downstream jobs to be skipped, which the gate treats as passing — low-probability but high-impact. The expression interpolation issue is a best-practice violation but not exploitable with these specific context values.

.github/workflows/test.yml (and identically lint-code.yml, typecheck.yml) — the changes job's error-handling path deserves a safe-by-default fallback.

Important Files Changed

Filename Overview
.github/workflows/deploy-gate.yml Adds skipped as an accepted conclusion in the for-loop that checks unit/typecheck results, correctly unblocking docs-only PRs from the gate; logic is sound but a transient changes-job failure upstream could cause the gate to incorrectly pass.
.github/workflows/lint-code.yml Replaces workflow-level paths-ignore with a changes job that calls the GitHub API; biome is now conditionally skipped via if:, which resolves the required-check deadlock. Contains the expression-interpolation and failure-fallback concerns shared with all three workflow files.
.github/workflows/test.yml Same changes-job pattern as lint-code.yml; unit job correctly gated on changes.outputs.code == 'true'. Shares the failure-fallback and expression-injection concerns.
.github/workflows/typecheck.yml Identical changes-job pattern applied to the typecheck job; straightforward and consistent with the other workflows.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    PR[Pull Request opened / updated] --> TW[Test workflow triggered]
    PR --> TCW[Typecheck workflow triggered]
    PR --> LCW[Lint Code workflow triggered]

    TW --> C1[changes job\ngh api …/pulls/N/files]
    TCW --> C2[changes job\ngh api …/pulls/N/files]
    LCW --> C3[changes job\ngh api …/pulls/N/files]

    C1 -->|code=true| U[unit job\nruns normally]
    C1 -->|code=false| US[unit job\nskipped → conclusion: skipped]

    C2 -->|code=true| TC[typecheck job\nruns normally]
    C2 -->|code=false| TCS[typecheck job\nskipped → conclusion: skipped]

    C3 -->|code=true| B[biome job\nruns normally]
    C3 -->|code=false| BS[biome job\nskipped → conclusion: skipped]

    U --> DG[deploy-gate\nworkflow_run completed]
    US --> DG
    TC --> DG
    TCS --> DG

    DG --> CHECK{unit == success or skipped?\nAND typecheck == success or skipped?}
    CHECK -->|yes| GS[gate: success ✅\nbranch protection passes]
    CHECK -->|no| GF[gate: failure ❌\nbranch protection blocks]
    CHECK -->|either pending| GP[gate: pending ⏳\nwait for next trigger]
Loading

Comments Outside Diff (1)

  1. .github/workflows/deploy-gate.yml, line 47-53 (link)

    P2 pending description is misleading for docs-only PRs

    For docs-only PRs the "Test" and "Typecheck" workflows both complete almost simultaneously (only the lightweight changes job runs). On the first gate fire, if the second workflow hasn't posted its check-run yet, unit or typecheck might still be pending. The gate posts pending and exits, then correctly posts success on the second trigger — this is functionally fine.

    However, the description field on the pending status still says "Waiting for unit + typecheck to complete" even when both will ultimately be skipped. This is cosmetically misleading but functionally harmless — worth noting if these descriptions are surfaced in a dashboard.

Reviews (1): Last reviewed commit: "fix(ci): unblock docs-only PRs from requ..." | Re-trigger Greptile

Comment on lines +29 to +30
CODE=$(echo "$FILES" | grep -vcE '\.md$|^docs/|^src-tauri/|^CHANGELOG\.md$|^LICENSE$|\.github/workflows/(build-desktop|docker-publish)\.yml$' || echo 0)
echo "code=$( [ "$CODE" -gt 0 ] && echo true || echo false )" >> "$GITHUB_OUTPUT"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 changes failure silently skips required checks

If the changes job fails (e.g. transient gh api rate-limit, network error, or a jq parse error), GitHub marks the downstream unit / typecheck / biome jobs as skipped because their needs: dependency did not succeed. The deploy-gate then sees unit=skipped and typecheck=skipped, treats both as passing, and posts a success gate status — effectively letting a broken code PR through without any test or type coverage.

Consider adding an explicit fallback that sets code=true on any API failure so the downstream jobs run rather than skip:

- id: diff
  env:
    GH_TOKEN: ${{ github.token }}
  run: |
    if [ "${{ github.event_name }}" = "push" ]; then
      echo "code=true" >> "$GITHUB_OUTPUT"
      exit 0
    fi
    FILES=$(gh api "repos/${{ github.repository }}/pulls/${{ github.event.number }}/files" \
      --paginate --jq '.[].filename') || { echo "code=true" >> "$GITHUB_OUTPUT"; exit 0; }
    CODE=$(echo "$FILES" | grep -vcE '\.md$|^docs/|^src-tauri/|^CHANGELOG\.md$|^LICENSE$|\.github/workflows/(build-desktop|docker-publish)\.yml$' || echo 0)
    echo "code=$( [ "$CODE" -gt 0 ] && echo true || echo false )" >> "$GITHUB_OUTPUT"

The same applies to the identical changes jobs in lint-code.yml and typecheck.yml.

Comment on lines +23 to +30
if [ "${{ github.event_name }}" = "push" ]; then
echo "code=true" >> "$GITHUB_OUTPUT"
exit 0
fi
FILES=$(gh api "repos/${{ github.repository }}/pulls/${{ github.event.number }}/files" \
--paginate --jq '.[].filename')
CODE=$(echo "$FILES" | grep -vcE '\.md$|^docs/|^src-tauri/|^CHANGELOG\.md$|^LICENSE$|\.github/workflows/(build-desktop|docker-publish)\.yml$' || echo 0)
echo "code=$( [ "$CODE" -gt 0 ] && echo true || echo false )" >> "$GITHUB_OUTPUT"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Expression interpolation inside run: shell scripts

${{ github.event_name }}, ${{ github.repository }}, and ${{ github.event.number }} are interpolated directly into the shell script before execution. While all three are GitHub-controlled values (not user-supplied), this pattern is flagged by actionlint and GitHub's own security guidance because it bypasses shell quoting entirely. The preferred approach is to surface them as env: variables (as done for GH_TOKEN) and reference them as $VAR in the shell.

- id: diff
  env:
    GH_TOKEN: ${{ github.token }}
    EVENT_NAME: ${{ github.event_name }}
    GH_REPO: ${{ github.repository }}
    PR_NUMBER: ${{ github.event.number }}
  run: |
    if [ "$EVENT_NAME" = "push" ]; then
      echo "code=true" >> "$GITHUB_OUTPUT"
      exit 0
    fi
    FILES=$(gh api "repos/$GH_REPO/pulls/$PR_NUMBER/files" \
      --paginate --jq '.[].filename')
    CODE=$(echo "$FILES" | grep -vcE '\.md$|^docs/|^src-tauri/|^CHANGELOG\.md$|^LICENSE$|\.github/workflows/(build-desktop|docker-publish)\.yml$' || echo 0)
    echo "code=$( [ "$CODE" -gt 0 ] && echo true || echo false )" >> "$GITHUB_OUTPUT"

The same pattern appears in lint-code.yml (lines 23-30) and typecheck.yml (lines 23-30).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant