v0.6.6.post1

@github-actions github-actions released this 27 Dec 06:24
2339d59

This release restores functionality for other quantized MoE models, fixing a regression introduced as part of the initial DeepSeek V3 support 🙇.

What's Changed

  • [Docs] Document Deepseek V3 support by @simon-mo in #11535
  • Update openai_compatible_server.md by @robertgshaw2-neuralmagic in #11536
  • [V1] Use FlashInfer Sampling Kernel for Top-P & Top-K Sampling by @WoosukKwon in #11394
  • [V1] Fix yapf by @WoosukKwon in #11538
  • [CI] Fix broken CI by @robertgshaw2-neuralmagic in #11543
  • [misc] fix typing by @youkaichao in #11540
  • [V1][3/N] API Server: Reduce Task Switching + Handle Abort Properly by @robertgshaw2-neuralmagic in #11534
  • [BugFix] Deepseekv3 broke quantization for all other methods by @robertgshaw2-neuralmagic in #11547

Full Changelog: v0.6.6...v0.6.6.post1