
Update Runtime Budgets and Step Hints #838

Open
wants to merge 20 commits into dev
Conversation

fsschneider
Contributor

Depends on #830, #831, #832, and #833.

This PR updates the runtime budgets and step hints according to our working group discussion and issue #836.

For runtimes and step hints, I have always rounded up to the nearest integer; a short sketch of the budget computation follows the list below.

The new budgets are:

External tuning

  • Criteo 1TB: 7,703s (was 7,703s, no reduction)
  • fastMRI: 4,430s (was 8,859s, reduced to 50%)
  • ResNet: 66,159s (was 63,008s, extended by 5%)
  • ViT: 69,768s (was 77,520s, reduced to 90%)
  • Conformer: 58,015s (was 61,068s, reduced to 95%)
  • DeepSpeech: 44,405s (was 55,506s, reduced to 80%)
  • OGBG: 12,011s (was 18,477s, reduced to 65%)
  • WMT: 43,336s (was 48,151s, reduced to 90%)
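As a minimal sketch (not code from the repo), the new external tuning budgets above can be reproduced from the old budgets, the per-workload factors, and the round-up rule. The dictionary keys and variable names below are illustrative only; the seconds and factors are the ones listed above.

```python
import math

# Old external tuning runtime budgets in seconds (from the list above).
OLD_BUDGETS_SEC = {
    'criteo1tb': 7703,
    'fastmri': 8859,
    'imagenet_resnet': 63008,
    'imagenet_vit': 77520,
    'librispeech_conformer': 61068,
    'librispeech_deepspeech': 55506,
    'ogbg': 18477,
    'wmt': 48151,
}

# Factor applied to each old budget (1.05 = extended by 5%, 0.5 = reduced to 50%, ...).
RUNTIME_FACTORS = {
    'criteo1tb': 1.0,
    'fastmri': 0.5,
    'imagenet_resnet': 1.05,
    'imagenet_vit': 0.9,
    'librispeech_conformer': 0.95,
    'librispeech_deepspeech': 0.8,
    'ogbg': 0.65,
    'wmt': 0.9,
}

# Always round up to the nearest integer, as described above.
new_budgets = {
    workload: math.ceil(OLD_BUDGETS_SEC[workload] * factor)
    for workload, factor in RUNTIME_FACTORS.items()
}
# e.g. new_budgets['fastmri'] == 4430 and new_budgets['ogbg'] == 12011
```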

Self-tuning
Change the factor from 3x to 1.5x. This still keeps the baseline and Schedule-Free AdamW within finite workload scores (except for ResNet, which could not be achieved even with the 3x budget); a sketch of the change follows.
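A minimal sketch of the self-tuning change, assuming the self-tuning runtime budget is obtained by multiplying the external tuning budget by a constant factor (the names below are illustrative, and the round-up is assumed to apply here as well):

```python
import math

SELF_TUNING_FACTOR = 1.5  # previously 3.0

def self_tuning_budget_sec(external_tuning_budget_sec: int) -> int:
    """Self-tuning runtime budget for a workload, rounded up to a whole second."""
    return math.ceil(external_tuning_budget_sec * SELF_TUNING_FACTOR)

# Example with the new OGBG external tuning budget from the list above:
# self_tuning_budget_sec(12011) == 18017
```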

@fsschneider fsschneider requested a review from a team as a code owner February 3, 2025 11:22

github-actions bot commented Feb 3, 2025

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

@fsschneider fsschneider added the 🛑 AlgoPerf Leaderboard label (Blocking rolling AlgoPerf Leaderboard) on Feb 3, 2025