Skip to content

Add MLPerf Training and BigCode Evaluation Harness#239

Closed
alvinreal wants to merge 1 commit intomainfrom
research/add-mlperf-training-benchmark-2026-04-13
Closed

Add MLPerf Training and BigCode Evaluation Harness#239
alvinreal wants to merge 1 commit intomainfrom
research/add-mlperf-training-benchmark-2026-04-13

Conversation

@alvinreal
Copy link
Copy Markdown
Owner

This PR adds elite-tier evaluation projects to the "Evaluation, Benchmarks & Datasets" category:

Added Projects:

  1. MLPerf Training (1,755⭐, Apache-2.0, active Apr 2026)

    • Industry-standard ML training benchmarks from MLCommons
    • Reference implementations for measuring system performance across diverse training workloads
    • Complements existing MLPerf Inference entry
  2. BigCode Evaluation Harness (1,034⭐, Apache-2.0, active Jul 2025)

    • Framework for evaluating autoregressive code generation models
    • Supports HumanEval, MBPP, DS-1000, and other code benchmarks
    • From the BigCode project (StarCoder, SantaCoder)

Both projects meet all elite-tier criteria:

  • ✅ 1000+ GitHub stars
  • ✅ Active development (commits within 6 months)
  • ✅ OSI-approved license (Apache-2.0)
  • ✅ Production-grade tooling used by the AI community

Validation: 0 errors, 0 warnings

@alvinreal
Copy link
Copy Markdown
Owner Author

Closing this PR because BigCode Evaluation Harness (bigcode-project/bigcode-evaluation-harness) does not meet the elite-tier activity criteria. Last push was 2025-07-22 (over 6 months ago). MLPerf Training (mlcommons/training) is valid (1755 stars, active Apr 2026), but we require ALL projects in a PR to meet criteria. Please resubmit with only MLPerf Training or wait until BigCode has new activity.

@alvinreal alvinreal closed this Apr 14, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant