fix(leaderboard): add GitLab repository support for URLs and display names#350

Merged
kami619 merged 3 commits into ambient-code:main from kami619:bugfix/gitlab-leaderboard-support
Mar 26, 2026

@kami619
Collaborator

@kami619 kami619 commented Mar 26, 2026

Description

The leaderboard pipeline assumed all repositories were on GitHub, causing GitLab-hosted repos to display broken links (https://github.com/redhat/builder instead of https://gitlab.com/redhat/rhel-ai/wheels/builder) and truncated names.

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Refactoring (no functional changes)
  • Performance improvement
  • Test coverage improvement

Related Issues

Affected entries: redhat/rhel-ai/wheels/builder (#2), redhat/rhel-ai/rhai/pipeline (#11)
Unblocks PR #347 (13 more GitLab repos)

Changes Made

Changes:

  • Leaderboard generator now reads repository.url from the assessment JSON and converts SSH URLs to HTTPS, falling back to GitHub-style URLs for backward compatibility
  • Validation workflow detects GitHub vs GitLab and uses git ls-remote for non-GitHub repo verification
  • Submit CLI accepts GitLab SSH/HTTPS URLs with deep paths
  • Added 17 regression tests for URL conversion and GitLab support
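
As a rough illustration of the SSH-to-HTTPS conversion described in the first bullet, here is a minimal Python sketch. The function name `to_https_url` and both regular expressions are assumptions made for this example; the PR's actual helper is not shown on this page and may cover more cases.

```python
import re
from typing import Optional

# scp-like SSH remote, e.g. git@gitlab.com:group/subgroup/repo.git
_SCP_LIKE = re.compile(r"^git@(?P<host>[^:/]+):(?P<path>.+?)(?:\.git)?/?$")

# URL-style remote (ssh://, http://, https://) with optional userinfo and port
_URL_LIKE = re.compile(
    r"^(?:ssh|https?)://(?:[^@/]+@)?(?P<host>[^:/]+)(?::\d+)?/(?P<path>.+?)(?:\.git)?/?$"
)


def to_https_url(remote: str) -> Optional[str]:
    """Return an HTTPS browse URL for a git remote, or None if unrecognized.

    Hypothetical helper for illustration only; it keeps the full path so
    GitLab subgroups (deep paths) are not truncated.
    """
    remote = remote.strip()
    for pattern in (_SCP_LIKE, _URL_LIKE):
        match = pattern.match(remote)
        if match:
            return f"https://{match.group('host')}/{match.group('path')}"
    return None


if __name__ == "__main__":
    print(to_https_url("git@gitlab.com:redhat/rhel-ai/wheels/builder.git"))
    print(to_https_url("https://github.com/org/repo"))
```

Keeping the whole path, rather than only the first and last segments, is what lets a deep GitLab path survive the conversion.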

Testing

  • Unit tests pass (pytest)
  • Integration tests pass
  • Manual testing performed
  • No new warnings or errors

Checklist

  • My code follows the project's code style
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published

Screenshots (if applicable)

Additional Notes

…names

The leaderboard pipeline assumed all repositories were on GitHub, causing
GitLab-hosted repos to display broken links (https://github.com/redhat/builder
instead of https://gitlab.com/redhat/rhel-ai/wheels/builder) and truncated
names.

Changes:
- Leaderboard generator now reads repository.url from assessment JSON and
  converts SSH URLs to HTTPS, with fallback to GitHub for backwards compat
- Validation workflow detects GitHub vs GitLab and uses git ls-remote for
  non-GitHub repo verification
- Submit CLI accepts GitLab SSH/HTTPS URLs with deep paths
- Added 17 regression tests for URL conversion and GitLab support

Affected entries: redhat/rhel-ai/wheels/builder (#2), redhat/rhel-ai/rhai/pipeline (ambient-code#11)
Unblocks PR ambient-code#347 (13 more GitLab repos)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@coderabbitai

coderabbitai bot commented Mar 26, 2026

Warning

Rate limit exceeded

@kami619 has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 10 minutes and 48 seconds before requesting another review.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: fd06b0a3-e3cc-41b8-9c24-6f5ddf1fdc9d

📥 Commits

Reviewing files that changed from the base of the PR and between ef09358 and 858e10b.

📒 Files selected for processing (1)
  • src/agentready/cli/submit.py

Warning

.coderabbit.yaml has a parsing error

The CodeRabbit configuration file in this repository has a parsing error and default settings were used instead. Please fix the error(s) in the configuration file. You can initialize chat with CodeRabbit to get help with the configuration file.

💥 Parsing errors (1)
Validation error: String must contain at most 250 character(s) at "tone_instructions"
⚙️ Configuration instructions
  • Please see the configuration documentation for more information.
  • You can also validate your configuration using the online YAML validator.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Walkthrough

This pull request extends the submission and leaderboard system to support both GitHub and GitLab repositories. Changes include host detection in CI workflows, multi-host URL parsing helpers, conditional verification logic branched by repository host type, and updated tests covering both platforms.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Workflow Infrastructure: .github/workflows/leaderboard.yml | Added "Detect repository host" step with URL pattern matching to identify GitHub vs. GitLab. Reworked verification steps to branch by HOST: GitHub uses GitHub API for public/private checks, non-GitHub uses git ls-remote for public accessibility. Updated host-specific access verification and adjusted clone operations to use derived HTTPS clone_url. |
| Leaderboard Data Generation: scripts/generate-leaderboard-data.py | Added URL parsing helpers (git_url_to_https(), repo_display_name_from_url()) to convert git remote URLs to HTTPS browse URLs and extract host/org[/repo...] display paths. Updated leaderboard entry construction to derive repository metadata from latest["repository"]["url"] with fallback to GitHub-style assumptions when URL data is unavailable. |
| CLI Submission Module: src/agentready/cli/submit.py | Extended host support beyond GitHub with SUPPORTED_HOSTS constant and repository URL parser (_parse_repo_url()). Added _repo_browse_url() helper and updated extract_repo_info() to return 6 values including host and full_path. Modified submission functions to branch verification logic: GitHub API checks for GitHub, git ls-remote for non-GitHub hosts. Updated PR body generation to include browse URL and host labeling. |
| Unit Tests: tests/unit/test_cli_submit.py, tests/unit/test_generate_leaderboard.py | Updated existing extract_repo_info tests to unpack new 6-value return tuple and assert host/full_path values. Added GitLab SSH and HTTPS repository parsing test cases. Created new test module validating URL parsing helpers and end-to-end leaderboard generation for both GitHub and GitLab with fallback behavior. |
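
The fallback behavior summarized for the leaderboard generator can be sketched roughly as follows. This is not the PR's code: `split_remote` and `leaderboard_link` are hypothetical names, and the shape of the returned dictionary is an assumption; only the `latest["repository"]["url"]` lookup and the GitHub-style fallback come from the summary above.

```python
from typing import Optional, Tuple
from urllib.parse import urlparse


def split_remote(remote: str) -> Optional[Tuple[str, str]]:
    """Split a git remote into (host, path); hypothetical helper."""
    remote = remote.strip()
    if remote.startswith("git@") and ":" in remote:
        # scp-like SSH form: git@host:path
        host, _, path = remote[len("git@"):].partition(":")
    else:
        parsed = urlparse(remote)
        host, path = parsed.hostname or "", parsed.path
    path = path.strip("/")
    if path.endswith(".git"):
        path = path[: -len(".git")]
    return (host, path) if host and path else None


def leaderboard_link(latest: dict, org: str, repo: str) -> dict:
    """Prefer the URL recorded in the assessment; otherwise assume GitHub."""
    url = (latest.get("repository") or {}).get("url")
    parts = split_remote(url) if url else None
    if parts is None:
        # Older assessments without a usable URL: GitHub-style assumption
        return {"repo": f"{org}/{repo}", "url": f"https://github.com/{org}/{repo}"}
    host, path = parts
    return {"repo": path, "url": f"https://{host}/{path}"}


if __name__ == "__main__":
    latest = {"repository": {"url": "git@gitlab.com:redhat/rhel-ai/wheels/builder.git"}}
    print(leaderboard_link(latest, "redhat", "builder"))
    print(leaderboard_link({}, "redhat", "builder"))
```

With the URL present, the display name keeps the full GitLab path; without it, the result matches the pre-existing GitHub-only behavior.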

Sequence Diagram(s)

sequenceDiagram
    actor User
    participant CLI as CLI (submit.py)
    participant GitHub as GitHub API
    participant GitLab as GitLab (git ls-remote)
    participant Repo as Repository

    User->>CLI: submit(repo_url)
    CLI->>CLI: extract_repo_info(assessment_data)
    CLI->>CLI: _parse_repo_url(repo_url) → host, org/repo
    CLI->>CLI: _repo_browse_url(host, path) → browse_url
    
    alt host == "github.com"
        CLI->>GitHub: verify repo (API)
        GitHub->>GitHub: check public/private, collaborator access
        GitHub-->>CLI: accessibility confirmed
    else host == "gitlab.com"
        CLI->>GitLab: git ls-remote --exit-code clone_url HEAD
        GitLab->>Repo: check public access
        Repo-->>GitLab: HEAD ref found
        GitLab-->>CLI: public repo confirmed
    end
    
    CLI->>CLI: generate_pr_body(host, full_path, browse_url)
    CLI->>Repo: clone/submit assessment
    Repo-->>CLI: submission complete
    CLI-->>User: display browse_url & results
sequenceDiagram
    participant Workflow as GitHub Actions
    participant DetectHost as "Host Detection"
    participant VerifyRepo as "Verify Repository"
    participant VerifyAccess as "Verify Access"
    participant GitHub as GitHub API
    participant Git as git ls-remote
    participant Clone as Clone & Assess

    Workflow->>DetectHost: infer host from REPO_URL
    DetectHost->>DetectHost: pattern matching (github.com vs gitlab.com)
    DetectHost-->>Workflow: HOST, CLONE_URL
    
    Workflow->>VerifyRepo: check if repo exists & public
    
    alt HOST == "github.com"
        VerifyRepo->>GitHub: gh repo view --json isPrivate
        GitHub-->>VerifyRepo: public repo confirmed
    else HOST != "github.com"
        VerifyRepo->>Git: git ls-remote --exit-code CLONE_URL HEAD
        Git-->>VerifyRepo: public repo confirmed
    end
    
    VerifyRepo-->>Workflow: repo verified
    Workflow->>VerifyAccess: check submitter access
    
    alt HOST == "github.com"
        VerifyAccess->>GitHub: gh api to check collaborator/owner
        GitHub-->>VerifyAccess: access confirmed or denied
    else HOST != "github.com"
        VerifyAccess-->>Workflow: emit warning (skip verification)
    end
    
    Workflow->>Clone: Re-run assessment
    Clone->>Clone: git clone CLONE_URL
    Clone-->>Workflow: assessment complete
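
The non-GitHub branch in both diagrams relies on `git ls-remote --exit-code <clone_url> HEAD`. A minimal Python sketch of that check is below; the function names, the timeout, and the choice to disable credential prompts via `GIT_TERMINAL_PROMPT=0` are assumptions for this example, not details taken from the PR.

```python
import os
import subprocess
from typing import List


def ls_remote_command(clone_url: str) -> List[str]:
    """Build the git command shown in the diagrams above."""
    return ["git", "ls-remote", "--exit-code", clone_url, "HEAD"]


def is_publicly_reachable(clone_url: str, timeout: float = 30.0) -> bool:
    """Return True if HEAD can be listed without credentials.

    Hypothetical helper: a non-zero exit status, a missing git binary,
    or a timeout are all treated as "not reachable".
    """
    # Assumption: prevent git from blocking on an interactive credential prompt
    env = dict(os.environ, GIT_TERMINAL_PROMPT="0")
    try:
        result = subprocess.run(
            ls_remote_command(clone_url),
            env=env,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def verification_method(host: str) -> str:
    """Mirror the branching in the diagrams: API for GitHub, ls-remote otherwise."""
    return "github-api" if host == "github.com" else "git-ls-remote"


if __name__ == "__main__":
    print(verification_method("gitlab.com"))
    print(ls_remote_command("https://gitlab.com/redhat/rhel-ai/wheels/builder.git"))
```

Note that this only shows the repository is readable anonymously; as the second diagram indicates, submitter access is not verified for non-GitHub hosts.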

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 69.44% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately summarizes the main change: adding GitLab repository support for URLs and display names in the leaderboard pipeline. |
| Description check | ✅ Passed | The description is directly related to the changeset, explaining the GitLab support issue, changes made, and testing performed. |

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
scripts/generate-leaderboard-data.py (1)

189-199: ⚠️ Potential issue | 🟡 Minor

Inconsistent use of repo_name in most_improved entries.

The most_improved list still uses the directory-derived repo_name (Line 191) rather than the URL-derived display_name. This creates inconsistency: main leaderboard entries show full GitLab paths like redhat/rhel-ai/wheels/builder, but the most_improved section would show truncated redhat/builder.

Proposed fix
                     most_improved.append(
                         {
-                            "repo": repo_name,
+                            "repo": display_name,
                             "improvement": round(improvement, 1),
                             "from_score": float(oldest["overall_score"]),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/generate-leaderboard-data.py` around lines 189 - 199, The
most_improved entries are using the directory-derived repo_name instead of the
URL-derived display_name; update the dictionary added to most_improved (the
block that sets "repo": repo_name, "improvement": ..., etc.) to use "repo":
display_name so the most_improved section matches the main leaderboard naming;
ensure display_name is the variable in scope where the append occurs (in the
same loop that computes improvement and uses submissions, oldest, latest).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/leaderboard.yml:
- Around line 107-114: The current grep checks (grep -q "github\.com" /
"gitlab\.com") can match those strings anywhere in the URL; update the checks to
match the actual host portion of REPO_URL (so malicious paths won't match).
Replace the two grep lines with anchored/URL-aware regex checks such as using
grep -E (or grep -Eq) against a pattern that ensures the hostname is github.com
or gitlab.com (for example matching ://...github\.com(/|$) or allowing optional
userinfo/www), i.e. change the grep commands that reference REPO_URL to stricter
regexes and leave the echo "host=github"/"host=gitlab" writes to GITHUB_OUTPUT
unchanged.

In `@src/agentready/cli/submit.py`:
- Around line 99-105: The click.echo call inside the host check uses an
unnecessary f-string (no placeholders) causing ruff F541; change the first
click.echo argument to a regular string literal (remove the leading f) in the
conditional that checks SUPPORTED_HOSTS (the block that prints "Error:
Unsupported repository host. Only GitHub and GitLab are supported." and echoes
repo_url), leaving the repo_url echo unchanged.

---

Outside diff comments:
In `@scripts/generate-leaderboard-data.py`:
- Around line 189-199: The most_improved entries are using the directory-derived
repo_name instead of the URL-derived display_name; update the dictionary added
to most_improved (the block that sets "repo": repo_name, "improvement": ...,
etc.) to use "repo": display_name so the most_improved section matches the main
leaderboard naming; ensure display_name is the variable in scope where the
append occurs (in the same loop that computes improvement and uses submissions,
oldest, latest).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: ed6d5cf5-8432-41be-a048-cccb70365a72

📥 Commits

Reviewing files that changed from the base of the PR and between c6a4b4b and ef09358.

📒 Files selected for processing (5)
  • .github/workflows/leaderboard.yml
  • scripts/generate-leaderboard-data.py
  • src/agentready/cli/submit.py
  • tests/unit/test_cli_submit.py
  • tests/unit/test_generate_leaderboard.py

Comment on lines +107 to 114
if echo "$REPO_URL" | grep -q "github\.com"; then
  echo "host=github" >> "$GITHUB_OUTPUT"
elif echo "$REPO_URL" | grep -q "gitlab\.com"; then
  echo "host=gitlab" >> "$GITHUB_OUTPUT"
else
  echo "::error::Unsupported repository host in URL: $REPO_URL"
  exit 1
fi

⚠️ Potential issue | 🟡 Minor

Grep patterns may match unintended hosts.

The patterns grep -q "github\.com" and grep -q "gitlab\.com" match these strings anywhere in the URL. A malicious URL like https://evil.com/fake-github.com/org/repo would incorrectly be classified as GitHub. Consider using stricter patterns.

Proposed fix using anchored patterns
-          if echo "$REPO_URL" | grep -q "github\.com"; then
+          if echo "$REPO_URL" | grep -qE '^(git@|https?://)github\.com[:/]'; then
             echo "host=github" >> "$GITHUB_OUTPUT"
-          elif echo "$REPO_URL" | grep -q "gitlab\.com"; then
+          elif echo "$REPO_URL" | grep -qE '^(git@|https?://)gitlab\.com[:/]'; then
             echo "host=gitlab" >> "$GITHUB_OUTPUT"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- if echo "$REPO_URL" | grep -q "github\.com"; then
-   echo "host=github" >> "$GITHUB_OUTPUT"
- elif echo "$REPO_URL" | grep -q "gitlab\.com"; then
-   echo "host=gitlab" >> "$GITHUB_OUTPUT"
- else
-   echo "::error::Unsupported repository host in URL: $REPO_URL"
-   exit 1
- fi
+ if echo "$REPO_URL" | grep -qE '^(git@|https?://)github\.com[:/]'; then
+   echo "host=github" >> "$GITHUB_OUTPUT"
+ elif echo "$REPO_URL" | grep -qE '^(git@|https?://)gitlab\.com[:/]'; then
+   echo "host=gitlab" >> "$GITHUB_OUTPUT"
+ else
+   echo "::error::Unsupported repository host in URL: $REPO_URL"
+   exit 1
+ fi
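
To make the difference between the two checks concrete, the sketch below reproduces both in Python: a substring test equivalent to the original `grep -q`, and the anchored pattern from the proposed fix. The function names are made up for this example; the regular expressions are the ones from the suggestion above.

```python
import re
from typing import Optional

# Anchored patterns copied from the proposed fix
HOST_PATTERNS = {
    "github": re.compile(r"^(git@|https?://)github\.com[:/]"),
    "gitlab": re.compile(r"^(git@|https?://)gitlab\.com[:/]"),
}


def detect_host_loose(repo_url: str) -> Optional[str]:
    """Substring match, like the original grep -q: matches anywhere in the URL."""
    for name in ("github", "gitlab"):
        if f"{name}.com" in repo_url:
            return name
    return None


def detect_host_strict(repo_url: str) -> Optional[str]:
    """Anchored match: the host must directly follow the scheme or git@ prefix."""
    for name, pattern in HOST_PATTERNS.items():
        if pattern.match(repo_url):
            return name
    return None


if __name__ == "__main__":
    spoofed = "https://evil.com/fake-github.com/org/repo"
    print(detect_host_loose(spoofed))   # classified as github
    print(detect_host_strict(spoofed))  # rejected
    print(detect_host_strict("git@gitlab.com:redhat/rhel-ai/wheels/builder.git"))
```

One trade-off worth noting: the anchored pattern also rejects forms such as `ssh://git@github.com/...` or `https://www.github.com/...`, so those would need their own alternatives if they are expected as input.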

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions
Contributor

📉 Test Coverage Report

| Branch | Coverage |
| --- | --- |
| This PR | 66.6% |
| Main | 66.8% |
| Diff | ⚠️ -0.2% |

Coverage calculated from unit tests only

@kami619 kami619 merged commit 47d8e71 into ambient-code:main Mar 26, 2026
11 checks passed
@kami619 kami619 deleted the bugfix/gitlab-leaderboard-support branch March 26, 2026 02:50
github-actions bot pushed a commit that referenced this pull request Mar 26, 2026
# [2.31.0](v2.30.1...v2.31.0) (2026-03-26)

### Bug Fixes

* **assessors:** support all YAML file naming conventions in dbt assessors ([3ff475a](3ff475a))
* **leaderboard:** add GitLab repository support for URLs and display names ([#350](#350)) ([47d8e71](47d8e71)), closes [#2](#2) [#11](#11) [#347](#347)

### Features

* add python-wheel-build/fromager to leaderboard ([#346](#346)) ([6a9fab1](6a9fab1))
* add redhat/builder to leaderboard ([#348](#348)) ([480a4a4](480a4a4))
* add redhat/rhai-pipeline to leaderboard ([#349](#349)) ([e305a0f](e305a0f))
* add redhat/rhel-ai AIPCC productization repos to leaderboard ([#347](#347)) ([9b07e37](9b07e37))
* **assessors:** add first-class dbt SQL repository support ([8660e6b](8660e6b))
@github-actions
Contributor

🎉 This PR is included in version 2.31.0 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀
