Skip to content
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
9 changes: 9 additions & 0 deletions .github/workflows/create-challenge.yml
Original file line number Diff line number Diff line change
Expand Up @@ -73,6 +73,15 @@ jobs:

Pick the next available challenge number that is NOT already taken by a merged challenge or an open PR. Also avoid creating a challenge on the same topic as any pending PR, even if the number differs.

THEME — REAL-WORLD INFERENCE KERNELS:
Focus on challenges inspired by real-world ML inference workloads. Think about the building blocks of modern neural networks (transformers, diffusion models, LLMs, vision models) and the GPU kernels that make them fast. Good examples:
- Fused inference kernels: fused SwiGLU/GeGLU MLP blocks, flash attention, paged attention, speculative decoding verification, quantized matmul (INT8/INT4), fused QKV projection, KV-cache updates
- Sequence/token operations: top-k/top-p sampling, beam search step, KV-cache rotation, causal masking
- Model architecture blocks: full transformer decoder block (like the existing GPT-2 challenge), mixture-of-experts routing, LoRA forward pass
- Online/streaming algorithms: online softmax, streaming attention (process new queries without storing entire rows), continuous batching, prefix caching

Look at `challenges/medium/74_gpt2_block/` as the gold standard for this style of challenge. The solver should implement a meaningful, self-contained inference building block — not a toy operation.

HARD RULES:
- Do NOT create trivial element-wise challenges. We have way too many (sigmoid, relu, silu, clipping, etc). If your idea is just "apply f(x) to every element", pick something else.
- Do NOT duplicate existing challenges — check both the merged challenges in the repo AND the open PRs listed above.
Expand Down
Loading